Today I am at Digital Scholarship: A Day of Ideas, which is a day of “talks and discussions for staff and PhD students in HSS, to inspire and share ideas for digital research, teaching and scholarship. An exciting programme of invited speakers working in the field of digital scholarship will present their ideas and their work”, taking place at The Business School at the University of Edinburgh. The event has been arranged as part of the excellent Digital HSS programme of activities. The full programme is available here (and I’ve also linked to the related abstracts in the titles for today’s talks below).
::: Update: The videos are live on YouTube here :::
The event is also being webcast and can be viewed here: http://www.digital.hss.ed.ac.uk/?page_id=504
As this is a liveblog the usual caveats apply re: typos, errors, etc. And please do leave me comments and corrections!
The introduction for today is from Prof Dorothy Miell.
It seems appropriate that today’s event is taking place in Innovative Learning Week. We will have the opportunity to hear about some of the most innovative work taking place across the world in digital humanities and social sciences.
This is only one of a series of events that have been organised in the college – take a look at the Digital HSS website for more information and you will also see that some of our speakers are giving workshops across the university over the next few days. We’ve also had workshops running on an ongoing basis last semester and this semester. We also have a live Twitter feed on the wall today which will be up whilst our speakers are talking.
Dr Sian Bayne has organised today’s event. She’s our first dean for Digital Scholarship and it is a tribute to her and her team (Janet Rennie, Annie Hetherington, Dawn Smith) that there has been so much interest and enthusiasm for this programme and for today’s events.
I’m particularly delighted that we have real world leaders here today. Having this group here is a real opportunity to hear about their work, particularly new research work, new potential for collaborations, new ways to engage with other people engaging with our works. And I’m sure we’ll discuss work to make our own work and selves more visible.
Today you will also see a number of posters in the lobby – do take a look and huge thank you to those who have contributed to these.
And hopefully this time next year we may have another follow up event of this kind.
Now over to Sian.
In the first section we are focusing on Digital humanities chaired by Dr Sian Bayne:
We are so excited to have Melissa Terras with us here today. She’s a researcher in electronic communication at UCL. And she’s also just been appointed co-director for the Centre for Digital Humanities at UCL. Melissa has a background that has included degrees in humanities and computer science so she is well placed to talk about Digital Humanities. She is deputy editor of Digital Humanities Quarterly, and acts as a hub for digital humanities work via her speaking schedule and her very active blog and tweeting!
Melissa’s talk will help us think about what we mean by digital humanities and what its scope is.
The Vision After the Sermon: establishing an institutional digital humanities presence – Dr Melissa Terras, Reader in Electronic Communication and Co-Director of the UCL Centre for Digital Humanities, University College London
Let’s go back about 20 years. The web was up and running but no-one really had the internet at home. When I was studying art history the way you were tested on art history was to see an image and be able to name it, know all about it, memorise its information essentially – Melissa tests the room and I fail at the details but someone is able to give the exact year of the image she’s showing – The Vision After the Sermon by Gauguin.
So the research process was to look through the library, to look at slides on very few lightboxes, visits to galleries. Now you can use Wikipedia, view an image on the gallery pages, find out more about the image and its history, you can view related paintings, you can see all the paintings in chronological order. You can see all impressionist painting at home for free in high res no matter where you are! You can see books in Google Books, you can see images on Flickr – amazing for sculptures – and you can comment, you can download images and make a 3D rendering. You can use an image to search for related images.
If you’re interested and have the skills you can look at sophisticated scans of underpainting etc. You can tweet the world if you just want to be pointed at the right experts. You can shout out on a forum!
So much has changed in the last 20 years. But we’ve been terrible at looking at this – how it is changing our disciplines and how it’s changing what we can really ask. And we also haven’t stepped back and thought about digital humanities – who is the digital humanist here or the digital social scientist? Is it the person googling? Is it the researcher? Is it the people providing the infrastructure? Is it the commercial companies?
Right now it’s a big tent, everyone is a digital humanist. But I have an issue with that. Everyone can use Google, everyone can use these tools. But we have to position ourselves correctly and be clear about what the specialist elements are here.
At the end of December I created an infographic about digital humanities – who does things in this space – there are 3000 subscribers to the Journal of Digital Scholarship in the Humanities. There are 2794 followers of @DHNow on Twitter… but the numbers vary depending on where you dig… it’s an approximation. So we have this big space of Digital Humanities and a smaller pool who self-consciously label their practice “Digital Humanities”. So there are hugely interesting tools for research and collaboration here but I’m very much in that smaller pool.
So what does Digital Humanities do?
Well there’s using Digital Infrastructure – the basics here, that’s use of websites, databases, online repositories, online tools – that’s just broad digital literacy!
Then there is creating Digital Resources – making content – websites, databases, digitisation, corpora, XML, TEI.
Then we have those developing Digital Infrastructure – developing TEI, EpiDoc, etc., programming and experimenting with new infrastructure and applications, user testing.
And then we have developing computational methods – computational algorithms and theory.
That last category, computational methods, is very much my area but this isn’t a matter of one area being better than another. But as you go up that ladder you need greater computational skills. Really using Digital Infrastructure is about digital literacy, the next two are digital humanities and social sciences, and that last category is really computer science. But that whole chain is part of the Digital Humanities and Social Sciences world.
So in terms of what we do to support digital humanities there is space to train at all levels. Where people are creating Digital Resources we can exploit that content. We can contribute to Digital Infrastructure and computational methods – we can provide those hard questions and real research problems to be addressed by computer sciences.
And this is where the UCL Centre for Digital Humanities comes in. We try to match up people to ensure the most can be made from a really exciting project – that the digital and computational aspects and the humanities researchers work together.
But first, our logo. We use it everywhere, it’s on cushions. It’s really important that if you are an advocate for digital methods you have to show that you are using digital methods yourself.
We now have a Digital Humanities MA/MSc. 50% are specific but the other 50% can be anywhere at college – as long as it involves computational methods and/or one module in pure humanities. We are really bootstrapping digital humanities here to make the degree match what they are interested in. We had to get 8 students to make sure it paid its way, but we have 25 so it’s doing really well.
We also pull together information on all of the work across the university that is doing work in digital humanities – these cross any of the schools working in this area. We try to be the hub for all digital activity across college. And obviously we do our own research ourselves. I maintain my research process but since we set up this centre more and more of my work is cross-disciplinary in nature. So one of my projects, QRator, is with the Zoology museum and we are asking questions here and researching them through use of iPads in the museum. We are just beginning a project with the Imperial War Museum which will build on this.
Another project I’m working on is Transcribe Bentham – they said it was a digitisation project but they’d never get funding, I said “no! you make it a crowd sourcing project”. In the last year that this has been live we’ve had just over 1 million words transcribed by the general public – it’s now cost effective, producing more information than we could have afforded to pay someone to do. BTW we have Jeremy Bentham’s body somewhere at UCL but that’s a whole other talk…
And there’s another project coming up that I can’t talk about… but it is with the centre for spatial technology… keep an eye out for something called Textal. But that’s all I can say.
And I’m also doing hardcore computing methods… this is work on multi spectral imaging – we use it for documents, a medical staffer uses it in blood flow and a drunken conversation – drunken conversations are very useful for research btw – led to us getting a PhD student in for work on best practice on multi spectral imaging.
So, we advocate for digital humanities… what do we do?
Well once a month we have a Digital Excursion – we go off to an event or a venue for a meet up, there’s a short talk but it’s about meeting each other, building connections. And when the next
Decoding Digital London – a discussion group of a paper in a pub. Mostly postgrad and PhD students at UCL and Kings. A couple of hundred people are on the mailing list, about 30 folks show up each month.
And we do Painless Introduction To… open events for anyone in college each month – lunchtime talks that raise our profile and introduce specific topics.
We have a bunch of posters – see Flickr – that we put up around college. Things that raise profile and make people think about Digital Humanities, direct traffic towards us. One of our admins puts up posters in random places once a month!
1. Eat your own dogfood: Your website should be up to date. Use the methods you promote – like tweeting etc. Use good xml. We are now the best practice website for UCL – that’s great for reaching out if quite funny for us.
2. Have the capacity to deal with problems: You need good admin support. We need a centre co-ordinator to set up and maintain these events, to keep things going on.
3. You have to get out there, reach out, be highly visible. You have to know who folks are, they won’t approach you.
4. It takes time. It’s a leap of faith for the college and an investment of faith and time.
5. It takes resources. You need money for this stuff. Not a lot. The centre builds on what I and my co-director Claire Warwick were already doing but it took cash to get it in action.
6. And you need institutional support and commitment for a few years, it will take that long to get going.
So that’s us. We bootstrap digital humanities. We help connect up stuff that couldn’t happen before.
Q1) I really enjoyed that. We’ve had a few visits from Digital Humanists. They do seem to be further ahead at playing with digital tools etc. in comparison with social sciences – why do you think that is?
A1) I can talk about digital classics. There has been digital work here for probably 15-20 years. But it’s a really inter-disciplinary field that brings in literature, geographers, historians etc. So adding in those digital tools helps answer questions and that makes a difference. Medievalists have taken to it similarly. But you also need good people, advocates and leaders in this area. People need to be open to it and you need people to drive it forwards.
Q2) That was very interesting and I, like you, have a background at Glasgow University. At a point in time there was a big computing in history thing going on at Glasgow in the late 80s and early 90s. There was a journal, Computing and History, which I don’t think is there at the moment. There was work on coding, big data sets etc. And that kind of died, the discipline moved on and it fell out of fashion. You are talking about a resurgence in a particular different area.
A2) It worries me a little. Digital humanities wasn’t cool until about 2 years ago. It worries me to be in fashion as yes, we could go out of fashion. I think that previous excitement was about the first wave of the web. People saw loads of opportunities but the technology was expensive to create, horrendously expensive like £100 per digitised page. And the tools weren’t up to all those questions. Now we are in a different place. Our undergraduates have also been online – wireless online no less. Graduate students see this as a no brainer, of course they use these tools. Data storage is cheap. But it is going to go somewhere, we need to prove our worth and the value of what we do. Some projects are blue sky and don’t have a solid outcome. Digital methods are becoming a part of everyday scholarship now so it’s how we teach those methods and best practice that really matters here.
Q3) One of the words you mentioned is data and one of the interesting things that is emerging is quantitative data that is useful for all fields. What are your thoughts on quantitative data specialists?
A3) Large scale data sets in the humanities traditionally have been very different to those in the sciences. They are scrappy, hand collected etc. They aren’t like a telescope looking at a star and producing consistent data and metadata. There are lots of questions to ask of big data sets in the humanities. That fuzzier data is actually fascinating to look at from a computational methods perspective. And there are large bodies of data in the humanities that are in a cleaner state. But we need those in the humanities to come forward to prod at that. We need to be more involved in understanding textual data and textual data analysis. The other thing is that there used to be a fantastic Arts and Humanities Data Service where you could store data sets. But the funding got pulled just at the point where it was really starting to be well used.
Q3) I think in the humanities there is this perception of noisiness of data. There is automated data in sciences but in the pioneering sciences my colleagues would be shocked to think that there isn’t huge noise in their data. That’s always the story in big data. Science gets very far along before analysis can be automated. Scientists have been thinking about these things for centuries because our data is really noisy too.
Q4) I’m another social scientist, slightly sceptical, slightly enthusiastic. I wanted to raise a slightly general question. That was a fantastic talk. But at an event like this we will hear the success stories, the people who’ve done great, but not about the digital casualties. The people who try but fail. The people who can’t self regulate when they find all this information. They can’t choose the right digital solution for their problems, they get lost in all that stuff.
A4) That’s where you need a Centre for Digital Humanities who are embedded in this stuff, to point you in the right direction, to help you find the right method or tool. People who can advise you. We don’t expect you to know everything or engage in everything directly. Most digital humanities projects are interdisciplinary and involve a team. We encourage you to find people to help you, help you scope things into a feasible form, help you apply for suitable funding.
Q4) I guess my question is: if we have people like you enthusing about these tools, don’t we also need people being sensible and cautious here, giving a counterpoint?
A4) Absolutely. I mean I’m as guilty of that as anyone else – on Twitter at midnight and at 6am today. But that’s where training, where digital literacy skills come in. Understanding what is to be trusted and how they can judge credibility etc. We point to other people and resources, being a hub for our colleagues.
Q5) Thanks Melissa that was a fabulous paper! I’m from Trinity College Dublin and looking at this from across the Irish Sea… When you look at HATI and the loss of the AHDS etc. What happened with losses here is that the UK lost its voice in Europe, it’s not signing up to DARIAH. I think the UK is being left behind. How are people like you and your centre lobbying and representing that voice?
A5) I think we had a good infrastructure in the UK so we didn’t really take as much of an interest in Europe as we should have, but we’ve just appointed someone to do that. Lobbying is really important. Last week there was a paper for the House of Lords calling for fast broadband – mentioning shopping etc. BUT not mentioning humanities, arts, etc. We need to feed into that. I’m involved in forhumanities.org, which tries to show the value of humanities in general. There is lobbying to be done. Humanities should be on the agenda for politicians. It’s not just the AHDS that is important here but also the humanities in general. We have some sites that help reach out but we are not doing enough to reach out and influence and we should be doing more.
In the next section the focus is Digital ethnographies chaired by Prof Charlie Jeffery who has just started introducing the session by saying that Innovative Learning Week has been going well so far. Our next speaker is TL Taylor and she is based at the Center for Computer Games Research. I am interested in play in the social sciences and I’m also quite excited about the possibilities for innovative teaching around play and games.
Today I’m showing my work on communities but tomorrow I’ll be talking about professional computer gaming. I’ve also written a book on EverQuest. And I also have a book I’ve co-authored on conducting digital ethnography coming out in August.
So I want to talk about ethnography and play. I find ethnography really useful in researching online, it has real resonance. I think feeling “inside” in ethnography can be a challenge. Play has helped me think about my own ethnographic research and methods.
So MMOGs (Massively Multiplayer Online games) are large scale virtual gaming worlds. These are generally a software and a subscription package but increasingly a free to play model as well. So to give you an idea of what I’m talking about I thought I would show you a video of one of these worlds and how they look. This video is of 40 people playing together. We see a few minutes but TL points out that this is like a 15 minute film of adventures in these worlds.
When we talk about play we think about “fun”. We think of it as being apart from everyday life, as having bounded purposes. But actually even in young children it’s more complex than that, it is part of life, it involves multiple complex experiences and emotions. To always link it to “fun” can undermine the complexity of what goes on. There’s some great work in Games Studies here that looks at why “fun” isn’t quite the right label for multiple reasons. But it’s important to note that here I’m talking about play in the very specific context of these MMOGs.
With games we may find them in different ways but when we play we engage in a specific context: are we choosing to play alone or collectively, on our own or with friends, under our own or a role playing identity, are we on a European or US server? And when we play we really play with our identities. Does it fit into our notion, our sense of ourselves? Our choices are deeply tied to our identities that are deeply tied to our social context. We experiment with new genres, we experiment with new communities. Here we most closely hit that journey of an ethnographer entering into a new domain. When I think about my own ethnography work there is a sense of serendipity, entering something new, something else, but perhaps not something wholly unfamiliar. But we don’t connect to the “magical” aspect of game play.
Ethnographers have to manage their identity in difficult ways, immersing ourselves and experimenting but continuing to hold the role of ethnographer. We have to learn the rules of the system, how to play our character, how to act in the world. We stumble on techniques, skills, methods that surprise and please us as we work. Most of our gaming lives are filled with controlled risk. They are filled with control and repetition. We have to understand the rules, the norms, the habits. We learn ourselves, from other players, occasionally from manuals, from others online and from experiment. That experimentation sets up our habits etc. And as we enter we are encultured here, and that is into a particular server, a particular group. You are not born a gamer, you become that. And that is deeply rooted in social context. Our understanding is all part of that social process. The entire process is about taking in so much that you can interpret it and understand it and to track something new that hasn’t been tested before. We can find that our colleagues ask dumb questions but if we are lucky we do get review of our work, feedback etc.
But is there an equivalent in ethnography for the stumbling experimentation of game play? We comport ourselves in ways that secure our entry into the space. We think about how we present and communicate ourselves. But a risky comparison is around ethical dilemmas. Experimenting with ideas is tougher there. That formulation of experimentation is outside of ethnography. But gender play etc. to examine a phenomenon can be tempting. If we think about David Myers’ experiment in City of Heroes – it felt like provocative griefing but for him it seemed like a positive experiment. Experimentation can really cause issues for the ethical responsibility of the ethnographer.
However experimentation can also be understood from the perspective of the puzzling. People take notes of process, they record and analyze their play, they engage in theory creating.
Analytic puzzling. We certainly do this – we sit there puzzling through the materials, grabbing snippets of conversation, documents, evidence of all sorts. One of the most interesting ambiguities between gaming and ethnography is in this area. So to show this off I want a couple of volunteers. People to play “Caster”, “Ranger”, “Magician”, “Warrior/Cleric”, and the “Beastlord”. So this is an extract from the game where they are killing monsters…
This is proving to be an amusing exercise – we have players across the room! And in the game dialogue players are being “killed” then “resurrected” but also chatting…
Comment from the floor: There’s an interesting aspect here as there are two people in this dialogue who are a couple and there is a really interesting family communication going on.
And in that last line it looks like these people are also messaging in-world as well as talking to the group…
Comment: that negotiation and comment on real vs. game identities is very interesting.
Comment: And they have so many characters in game at the same time
Yup, they are deploying their son’s character as well as multiple of their characters. For ethnography it’s so hard to patch together what the heck is going on in that place, deeply complex.
There is a communal enjoyment of history, stories and playful myth – gamers tell each other stories of their adventures in these games. We can find satisfaction and joy like these gamers, happiness at mastering skills and languages or practices of a community. As an ethnographer my experience is often pleasurable – from the point of view of playing that game, the embodied aspect of my play which is in sync with the game is very satisfying. And at the same time you smile when you hear someone produce a fantastic quotable quote, encapsulating something important. And we have pleasure from our dual status of player and ethnographer.
We are embodied and affective. We develop a ludic instinct that is almost intuitive. We can fall into routine patterns of actions – keyboard, mouse, eyes, etc. are all embodied and material practice of play. But it’s also affective – from adrenaline rushes to defeat to sublime experience. When we play we are active in the space, games can take over our thought. Sometimes we could be absent minded though, parallel to conversation. But in all those interactions we take on the role of the character. But this can be problematic. We experience the field and we do so only through the embodiment of our characters. We can be exhausted by the field but that can be very positive. Others in the field experience us as researchers in the field. We oscillate between being community members and engaged embodied presences – it is our strength. But the deeply participant component has long led ethnographers to be concerned with accountability, ethical practice etc. For me MMOs in particular have brought this into sharp relief – being part of a raid team in WoW provokes strong questions about agency and activity etc. Typically as players you must carry your own way, you come to embody part of the collective. There is no alternative. We may think of ourselves as the observing outsider/visitor but the game has no comparable role, radical reciprocity comes in here.
Frustration and failure. You can find yourself unable to progress. To those who wish for games to be as emotionally engaging as films: I think you have never engaged with MMO gaming communities before. As a player you can be so frustrated, so annoyed with the game. As ethnographers that happens too, we need to be open about that. We have to talk about the gaps between what we think and what is. Our moments of not understanding or frustration can help us to engage with something new. You can miss opportunities, you can fail to record something, you can misread a situation. We feel conflicted with our roles. In early stages of ethnographic work we live in a double state – we are keen for interpretive handles.
Transgression and transformation. In a game we grasp for tactics that will give us an advantage, a sweet spot to stand in for protection, guides for a quest that evade the exploration designers may intend. We can automate things, add patches and programs, hacking type approach. Adding this stuff in radically transforms play and can radically change our game play. And we have to make a decision between the ethical line between game play and cheating. Players make these decisions themselves, often formed as part of community perspectives/values.
As ethnographers are we engaged in transgression or transformation? Well our work is right on the edge of boundaries in many ways. The people we study can see our stance as weird or unnatural, we obsess over everyday details. We have to interrogate our own pleasures. We can unwittingly break rules or norms – and then study that rather than moving on. We document, log, archive. We explore the boundary spaces and the line between legitimate and transgressive. Generally any transgression we provoke is sort of a soft transgression, rarely would we ever shake things up. In terms of transformation our presence certainly shapes what happens, we write stories, history and accounts and they can filter back to the community who will then interpret and wrestle with those. But there is also how we are changed by our work. We may find things we enjoy, find strengths or flaws. We rarely come away unchanged. Transformation is an inherent part of ethnography as it is with play.
Myth and unknowable. There is great myth making and there is so much of an MMO that we never see, our view is always incomplete. We can, as ethnographers, be left with myths from the field, there is material we will never have access to. We are always left with some degree of incompleteness.
Leaving. At some point we drift away from a game or find ourselves wanting to. We no longer learn things, we lose our pleasure, we may find the community dropping off. We find ourselves dropping in to check back on the game and friends. We may archive the experience in some way if we were really attached. We may stage our leaving. As ethnographers we may anticipate feelings and practices. Our sense of discovery can be lost. But we can find we collect artifacts that remind us of our time. Ultimately we leave.
There is some overlap and some gaps between play and ethnography. Each domain is constituted by what I would call assemblage work, human and non human, we are part of a complex network of agents, actions and connections. Both are dynamic emergent space for possibility.
Q1) You described how one feels when one plays a game. How important for achieving the goals in the game is action? All of the games you described are games of action – I assume that contest is key to this?
A1) Games are quite heterogeneous, action is quite heterogeneous, and contest is quite heterogeneous. My only response is that all of those nodes are quite diverse, they vary across game play and the experience you bring to this.
Q1) I was thinking of games in terms of arguments but that goes somewhere else…
Q2) Firstly a plug: TL will be presenting a paper at the school of sociology on professional gaming. Your talk reminded me of the Failed Humanist and also the Failed Musician – the hidden digital detritus of unreleased, undeveloped materials. What does that mean for you with all these pseudo invisible practices?
A2) This is why ethnography is so rich for me. It’s about finding those gaps. For me being present, being there is so important. I showed you that film of that raiding group. If you just saw the success you would miss the months and months of attempts. In fact there is a great new book coming out from Mark Chen on attempts and failure and the lived embodied culture of rich meaning. Ethnography really shines for understanding these processes.
Q3) You were talking about the affect. How do you interpret this in a community that is online? You don’t have information from a physical space.
A3) So you don’t have the corporeal body in front of you? We have research on digital embodiment but also have more work to do here. When I talked to people in the 90s people would talk about how they felt in their embodiments although we have these multiple embodiments and spaces. And if you work in avatar spaces people talk about personal space of their avatars. But we do this with participant observation, interview techniques.
Q4) I was struck by the role of non online digital embodiments has changed things over time…
A4) When we did work on MUDs – text based multiplayer games – we did find people who were friends who would play again. And in EverQuest and particularly WoW we find people interacting with family. Games and online games are transforming how families interact with each other. I play games with my partner on our iPads, even though we are co-located it’s just become part of everyday life. But there has been a surge of this sort of familial interaction in games
Q5) I’m fascinated by this quick interchange between the gaming context and our everyday action. I want to suggest there is an extension here in the real world. I study innovative start ups and many use virtual teams. I interview and observe organisations where their actual work context is surprisingly gamelike. They do entirely online tasks and work with other individuals or organisations, all by email or text and all reported online. We did a report on success and failure in those contexts and my instinct is that some of what you are finding could be applied or compared to the “real world” and business environment. I just wanted to put that out there as a cross disciplinary thing.
A5) Just the proliferation of these technologies makes a big difference. A few years back there was debate about whether being one of these raid leaders, leading 40 people across the world 5 hours a night, 3 days a week, could be on your resume, showing your skills in leadership. Really interesting.
Q6) I was wondering, within the games, how do you use your ethnographic techniques to know about these people and their lives away from the game? And how can you verify what they are telling you? And I’m interested in non users of technology and how people leave and come back to these spaces.
A6) The non users is an area where there is lots of work to be done. And in games thinking about non gamers in a household with gamers could be really interesting for understanding domestic arrangements. I’m not a big believer that you have to meet someone offline that you are studying online. People can lie to us offline as well. It depends on what your research question is as to whether that’s useful really. But gamers do go to conventions and meet ups.
Q7) My partner and I live thousands of miles apart so we use online gaming to explore creativity – we both work in museums – so I was wondering about the use of games for teaching.
A7) I’m not a specialist here. Historically there have been real issues in mapping teaching to gaming. The most interesting work is stuff like that at University of Wisconsin, Madison where people are coming to their gaming environments with their questions, exploring ludic qualities, a long way from the area of the simplistic “educational game” that I’m sure we’re all familiar with.
After a very tasty lunch we are now moving on to Digital collections chaired by Prof Alvin Jackson. Our next speaker Jane Ohlmeyer is a leading researcher in anglo-atlantic history. She is working with the 1641 Depositions and will be talking about this project today.
The 1641 Depositions: the growth of a flagship project – Prof Jane Ohlmeyer, Erasmus Smith’s Professor of Modern History, Trinity College, Dublin
I spent a very happy decade teaching in Aberdeen and was also a trustee of the National Library of Scotland so it’s fantastic to come back to Edinburgh today! I’ve referred to the project as a flagship but perhaps that’s not quite right yet.
I am myself not a digital humanist, I’m a historian. I’m not proud of that as my colleagues in IT have grasped history. But I have been working in policy in Ireland and funding for infrastructure. We’ve had about €5 million for the 1641 Depositions and we’ve secured €28 million for the infrastructure for digital humanities from the Irish government. And I am now Vice Provost for Global Relations and I’m in charge of fund raising for the university so I’m busy fund raising for digital humanities – and we’re about to advertise a new chair in digital humanities. Industry really connects with it.
So, firstly a bit on the 1641 Depositions. This was one of the most bloody and brutal episodes in Irish history, more people lost their lives then than in any incident since. Consequently the Depositions are some of the most controversial documents in Irish history. And the voices are almost exclusively from Protestants, very little from Catholics so we only get one voice here.
I wanted to show you and read you part of one of these depositions – note the spelling variation, the voice in use, the sentence structure etc. These are harrowing accounts. There are over 8000 in all. They were bound in 18th century bindings but have now been rebound. They represent every county, every class etc. They came to Trinity in 1741 as part of the 100 year “celebrations”. There are other depositions as well as these, for all of Ireland but especially for Cork. But these depositions are documents of loss of goods and chattels, military activity, and alleged crimes of the Irish rebels. They are a unique source of information for causes and events surrounding 1641 but also much wider social material – these were sworn testimonies, people give their address, sometimes their profession and they sign or mark the papers. Huge amounts of rich social data.
But the data is really dirty data – highly marked up, loads of noise, hugely varied spelling and writing styles. Our commercial partners love that complex data but we’ll come back to that.
Collection took place at different times – the nature of the evidence is highly problematic, some were taken for use in war crimes tribunals, others collected at other times.
What the depositions prove once and for all is that the massacres were not pre-meditated. But because the 1641 rebellion is such a lynchpin of loyalist politics, it has influenced the shaping of Protestant identities – in quite a negative way. We didn’t want to publish them to get the loyalists worked up. They were brought up as highly political documents. We just wanted to make them available. Some of the engravings on my slides here are from The Tears of Ireland – they look familiar if you know the Thirty Years War – that’s because they were lifted and text added. Very propagandist materials. Some key imagery here.
These documents are significant in Ireland and across the Irish world. Unlike the UK we don’t have national funding in the same way. We partnered with the Universities of Cambridge and Aberdeen and we had €600k from the AHRC and this was topped up by €400k from the Irish government. We also worked with IBM throughout the project. We employed 3 full time researchers for 3 years transcribing the depositions. These are as accurate as they get. I’d be interested to try crowd sourcing this, though it would be a challenge. We were a really cross disciplinary team and worked with the library.
We wanted conservation. We wanted digitisation. We wanted transcription and computer based mark up. And I also wanted to print a hard copy as well – as a historian I want to be assured of a copy being available. The AHDS were advising us at the time and they were unable to reassure us of a lasting digital copy here. It’s worth noting that the Depositions have been up for publication before – they’ve tried several times. The Depositions were to have been published in the 1930s, and it was tried again in the 1960s… third time lucky.
So we have the website up now and we have 62k active users. You get the digital depositions along with a transcription. We do not have normalised spelling but we are working on this as part of a European project. There can be 47 variations of a person’s name for instance, which is an issue for interrogating the digital text. They are phenomenal in the classroom though, and they are used across the world. And we have a whole gang of PhD and Masters students working on these documents.
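To make the spelling-variation problem concrete, here is a minimal sketch of how variant spellings of a name might be grouped. This is not the project’s normalisation method, which the talk does not describe; it simply uses `SequenceMatcher` from Python’s standard-library `difflib`. The function name `group_variants`, the similarity threshold and the sample spellings are all assumptions invented for illustration.

```python
from difflib import SequenceMatcher


def group_variants(names, threshold=0.6):
    """Greedily cluster spellings by similarity to each group's first member.

    The threshold is an arbitrary, hypothetical choice; real early modern
    data would need tuning and manual checking.
    """
    groups = []
    for name in names:
        for group in groups:
            score = SequenceMatcher(None, name.lower(), group[0].lower()).ratio()
            if score >= threshold:
                group.append(name)
                break
        else:
            # No existing group was similar enough, so start a new one
            groups.append([name])
    return groups


# Hypothetical spellings, invented for illustration (not taken from the depositions)
sample = ["Phelim", "Phelym", "Felim", "Phelime", "Rory", "Rorie"]
for group in group_variants(sample):
    print(group)
```

A greedy pass like this is order-dependent and crude, which is part of why normalisation of this kind of material is treated as a research problem rather than a quick clean-up step.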
Trinity has committed to maintain the website in perpetuity – whatever that might mean online. Certainly 10 years – maybe 100?
We have received lots of publicity for the project – loads of press in the US in particular, closer to home a more mixed response. But there is huge interest in the project and it will be featured by Andy Wood in his next series of the Social History of Britain.
When we launched the publication we tried to ensure both sides of Irish politics were invited, and they gave extraordinary responses to the project: both Mary McAleese and Ian Paisley gave powerful talks. A truly memorable evening.
The project has been an incredible journey and has now spurred a whole series of projects. Funding from AHRC was brought in to conduct Language and Literature Evidence of the data – the researcher leading this was looking at early modern language in the depositions. Viewing a word cloud of the depositions you see words like “pounds” and “worth” which do help you realise the true meaning of the depositions. We have a whole series of funded schools programmes and events. We have done these around Ireland but not in Portadown – still too hot an issue there but we did one in Ulna.
Another of our projects that has followed is CULTivating Understanding and Research through Adaptivity – CULTURA, funded under EU FP7 STREP, 2011-14. The project is looking at the models of user and contextual input – going really well at the moment, very interesting work.
Another of the projects is digitising the “Petty maps” from the 1650s and we are linking the geospatial data of the depositions to the maps of Ireland – maps are political and Ireland is very well mapped. We are also connecting up resources that are effectively the Domesday Book for Ireland that give you the landholding in 1641 and the landholding again in 1670. So now we’re really cooking with gas!
We are really open to working with and sharing data from these projects by the way, there is real value in connecting resources.
Here is the thing I have really gained in these projects. The researcher really needs access to a wide range of primary sources, and access to knowledge, expertise, methodologies and practices across domains and disciplines. We need accepted standards and to follow best practice. We want long term preservation and sustainability. And we need to experiment and innovate, and that means we really need the infrastructure there.
I was agitating for an Irish AHDS and I was told to look at work in Europe so Ireland became a founding partner in DARIAH and on the back of that Ireland funded the Digital Humanities Observatory and that was founded by Susan Schreibman. It went live almost at the moment AHDS was cut. In Ireland we really ganged together in wanting that infrastructure on a national level and in our institutions. We had €28 million to build our own infrastructure. Trinity College Dublin established the Trinity Long Room Hub – a centre for digital humanities. And we are looking to recruit digital humanists: we are recruiting for a lectureship and expect more to follow. It’s a really key area at the moment and industry is very interested here.
DARIAH is now moving into implementation stage. The UK was the driver for DARIAH to be honest. And I think – and I can say it as I’m Irish – that the UK government has been very short-sighted here in cutting funding for AHDS. As the only English speaking country signed to DARIAH we are able to attract funding and support for the teaching, research and practice. It should be led by the UK, we were the new kids on the block, but I hope that changes. I think the UK is not getting good value from Europe in a variety of digital contexts. The digital humanists’ voices are not being heard. But back to Ireland…
Currently there is no policy – where is Digital Ireland? But Ireland is continuing to support digital humanities… why? Well we made a mistake when we didn’t sign up for Lisbon and we still want to be seen as a poster child for Europe. DARIAH helps with that engagement. We are good value for money and political value. Also we have engagement with industry. We are blessed with having the European headquarters of Microsoft, IBM, Google, etc. just spitting distance from us. Many of their CEOs are Trinity College Dublin graduates, we are very lucky. We work closely with IBM, with Microsoft. For the depositions we worked with IBM on natural language processing. For the Depositions we had Google, Intel, Microsoft and IBM all around the same table and we said we wanted to work with all of them. They carved it up. Google wanted into crowdsourcing, IBM was into natural language processing. We work in a really collaborative way. To manage those relationships we insist that no one charges for anything we do, that it’s very open source.
I’m not saying we have everything right but all of this has happened in the last four or five years so it’s been really fast. My interest is as a historian. As a historian it’s not about data or being a technology flagship, it’s about understanding humankind. Whenever I end a talk on the Depositions I show an image of My Lai, Vietnam in 1969 – a photo of 15 bodies. I try to get my audience to think about that scene, on a different scale, when they think back to the Depositions. My next piece of work will be looking at the history of atrocities and what the Depositions might be able to tell us about them.
Q1) Could I ask about the relationship between the depositions and IBM – what do you both get out of it?
A1) Out of IBM we have access to first rate technology and people for free. You’d be charged €4000 per day for that sort of stuff. You get them as partners in funding proposals. That adds value in relation to knowledge management. IBM meanwhile is interested in understanding the web. They want to understand the discussion, the natural language of the web and they really need that dirty data to build that understanding. They are not in it for the fun – I’d like to think that they are but these are tools that are useful for their commercial activity as well.
Q2) Given the amount of financial resources that have gone into this project how scalable are these sorts of projects? How feasible is it for others to have access to the technology from the projects?
A2) The European projects are deliberately scalable and transferable, especially to cultural heritage. There are commercial implications of course – the computer scientists do look at that angle though. In terms of the scholarly community we are always happy to share our work. The technologies are not yet scalable and transferable but hopefully they will be. For me I care that we have the transcription and the research that I can now do that was unimaginable a decade ago. Technology has revolutionised the way that we as humanities scholars work.
Q3) I was intrigued by the point about future proofing you raised?
A3) At heart I’m a sceptic. For me the future is paper or microfilm. The money to maintain or preserve digital data is on a different scale. I’d love to think that we had a good handle on preserving digital data but I don’t think we do now. I want the full life cycle and that it can be sustained. The future is paper for me – a book is much more feasible to maintain. But we have in Ireland the National Audio Visual Repository which is looking to preserve and maintain data.
Q4) I was most impressed by your talk, especially the consensus you have found, could you say a bit more about how we could do that?
A4) Well it’s much easier in a small country like Ireland, it’s easy to have interactions there. Much harder in somewhere as big as the UK. But maybe it’s about Digital Scotland, Digital Wales, Digital England… Our institutions do compete but we cooperate in that competition. We don’t have a REF. When we want research money we have to go for big chunks of money. Most humanists work on their own and need time and a laptop but once we had that money there was stuff we could all benefit from. We do fight, we do disagree. But DARIAH is hugely useful. And we can encourage one funder to attract support from another. Cultural institutions and industry backing was also hugely valuable. Leadership is hugely important. Brussels can be so dreary – you have to lobby and convince people and you have to stick in there. Institutional and government backing is so important.
Q5) I’m a freelance book indexer. You mentioned the Wordle. How have you found the new technology, the new techniques for understanding the data? How’s that moving?
A5) When I saw the Wordle I was sceptical but it captures something important. Our students come in with new tools, new ideas. Students will data mine, but the concern is that they could lose or forget about context; that’s the important thing to remember.
Next we turn to Digitality across the disciplines chaired by Prof Chris Breward. Chris is principal of Edinburgh College of Art at the University of Edinburgh and is delighted to introduce Chris Speed who works in the areas of architecture but also works with geography, culture and society. He’s involved in huge numbers of projects – he’s a bit exhausting! – and he is the winner of this year’s Principal’s Medal for Services to the University Community. He is PI on numerous projects funded by JISC and AHRC such as Tales of Things and Walking Through Time. As a newcomer to Edinburgh the latter is particularly useful for me! Chris’ work provides really compelling examples of ways to make digital things both vivid and material.
Liquid Media: trans-disciplinary projects in digital technology – Dr Chris Speed, Lecturer and Reader in Digital Architecture, eca, University of Edinburgh
So… I’m going to unpack various projects. Things are a bit unrehearsed, it could be chaos!
I was supposed to do fine art but ended up doing an alternative practice degree, did some early Apple Mac coding. In that odd alternative other space I think with colleagues we started thinking about the idea of digital humanity. And we started to think about ubiquity – being everywhere all the time. With history… ghosts are gonna get you and inform the future.
One of the earliest things I was interested in was the notion of “The Street”. The idea of the consensual, of reclaiming space. But there probably isn’t a consensual street. People walk through at different times and with different priorities. Is there a consensual context? Ubiquity challenges that.
An early student project in Plymouth looked at a street where you could kind of do everything you wanted – because of bombing in the war the whole town kind of moved to one street. So we recorded people discussing what was on that street, what they recalled. Which showed how much those spaces were not consensual. The landscape was remembered and understood differently by different people with different priorities. Then we started looking at who goes into which shops, how many people go into a space at what speed. We started to slow down that landscape, or speed it up according to access. So we sped up popular places, slowed down quieter shops. There was never a consensual space. You see this complex condition we face. Digital media is now lacing all these discrete time bases. I mean you’re all rotting at different speeds. At 19 you have your peak!
So now to think about the Challenger disaster (we are viewing a video). If you look back at this it’s very curious. When NASA released the transcript they threaded two things through. The crew and NASA talking about data – you only hear NASA, you don’t hear the crew members that are listed in that transcript. There is a lot of data but a lot of context bound up with that. What is fascinating about the human condition of this data… the voice is reading the data from a screen, he’s reading that off. And you can hear the gaps. But data continues after the shuttle has exploded. That gap of 12 seconds is our gap between data and us. My hunch is someone tapped that chap on the shoulder and told him to stop reading… but that gap is closing…
Everything now is coded spaces. Even in a lift we are coded. Lifts are programmed to default to the boss’s floor. And in emergencies to do differently. We did an experiment where we hacked the lift! When you enter a lift it’s hierarchical, you skip the floors of others and don’t find out about them. We installed a “random” button – that cost us £6000 to do. Pressing it sets you at a random floor. It’s designed to mix things up. To break up hegemony of time and space.
At the turn of the millennium we had lots of freak outs – the Millennium Bug was a cultural construct for us to freak out. We saw loads of zombie and disaster movies appear, such was our fear of disruptions. We live in a world of networked artefacts: connections cannot be cut. Things are connected all the time everywhere but you can reconceive and manipulate those spaces. Things like Inception and The Adjustment Bureau absolutely play with the idea of non linear space and time. There is a public psyche to talk about the past as past can be present. So I started following @DriverRoss – a real person who died years ago in WWII but his jottings are Twitter-sized so he tweets on.
And now we have ghosts, becoming and haunting. Learning to live with the ghosts. Walking Through Time. If you watch people in Edinburgh they have devices. People tend to frame their images so carefully – they pick out the older parts of the city in some way. They walk their own narrative. One of the most amazing things are the trees. Trees that were planted hundreds of years ago are like portholes to those times. You can hug the same tree!
We need to shape those arrows, those things we are waiting for – I’ve been waiting for the iPhone 5 to appear for ages now! – we need to shape what we want from data. Cue a video from IBM on the Internet of Things. But if you look at how the internet of things is described in that IBM video and in an EU video they envision connectivity in linear narratives – like the tracing of a batch of prawns in a supermarket. It’s another cradle to grave idea. They never talk about what happens after consumption…
A few years ago we were lucky enough to be awarded an EPSRC grant for an internet of things project called Tales of Things. We worried about old things – the new stuff’s being tagged by shops and industry – but what about the past? Tales of Things lets you bring an object, tell a story, and generate a QR code. When you scan it, it tells you a story of the thing. But it was becoming more a website than about the things. So in 2010 we did a project in Oxfam where we captured 50 objects and sold them with stories attached. You could use a phone or, in this shop, could use a magic wand to scan and play the audio in the shop. These things are what we begin to mediate.
So to play an example this is a jumper that had been donated by someone who had worn it, met a boyfriend at a barbecue, been together, and had broken up. Suddenly that sweater smelled like barbecues, it was a great sweater for short love affairs! And it looks so much different from some new Primark sweater now. And if I scan this dress… we hear that this dress, very nice as it is, was Annie Lennox’s and was worn to a Nelson Mandela event. You’ll pay £170 for that dress where Oxfam could perhaps have charged £30 otherwise. We did a test store in Selfridges and celebrities donated items. Oxfam saw that they had a 52% increase of sales when stories were attached.
People think barcodes are one way, but they’re not: it’s a two way web interface!
We did some work with the Scotland: a Changing Nation exhibit at the National Museums of Scotland which is their 20th century collections – really important for adding new memories. We had to create ghosts for this – a copy, a thing, that people could engage with and tell their stories of those artefacts and leave their comments (because objects had to stay in cases unfortunately). You can tag and share and contest your material relationships. That non consensual space. The world works because it’s not consensual. This creates weird implications for things…
Possessions aren’t the things that they might be. Tracey Emin’s tent was burnt in a fire at the Saatchi storage. It’s no longer there but I have memories, I can take them to the story. A carriage clock is a thing but it could be any thing (cue music from The Omen), it all gets scary!
And there was a final thing I was going to mention… Watch the news on Sunday! Chris can’t tell us more stuff it’s embargoed.
So this stuff is really happening. There is a significant paradigm shift taking place here.
We presented Tales of Things at UbiComp – they said Oxfam will never go for it, 50% of their stuff is from dead people. But Oxfam were fine with that. They sell stuff to help people out, to prevent death. We will get ghost stories and that’s just fine.
Q1) That was great. Given the challenge of identifying the border of an object, what are views on ownership of objects?
A1) Property is theft – is that where I go with that? I have to be careful, there are huge ethical issues here. Whilst the Big Society was being talked about Wikileaks was bringing down that type of formal organisation. There will be activity in many directions. But that’s going to happen when modern paradigms move to something more complex.
Q2) I wanted to ask about democratisation of voice. The creator also has stories. How can we recognise the creator’s stories as being perhaps more valuable?
A2) There is something important there. We’re doing something with the British Museum. There is noise there. We have to tune things… we may need facilities for tuning. So with the dance halls we had loads of Abba stories – one girl just sang as their story. If our next project comes to fruition it will be about noise, tuning, and curation.
Q3) Isn’t there a sense that properties aren’t theft, they are loss. Perhaps we can imagine that nothing is lost. What does that mean for our desire to retrieve things…
A3) Another really good question. A few years ago there was a book called Delete which was the antithesis to this. In terms of deletion… people are good at letting go. We bury people when they die not just for practical reasons but for emotional reasons. So we have a Totem pole for memories in Wester Hailes which comes with a 12 hour delete – needed to prevent bullying. We have to think about how we deal with the balance of ubiquity with the need and want to forget and tuning is part of that too.
Q4) The Oxfam story is great but slightly concerned me… sometimes the media interest is in the novelty, and one is concerned that by Week 2 or Year 5 everyone will have moved on and the stories will be mundane…
A4) Well the nature of these experiments is to come in, to make an intervention, and to see what happens. But Oxfam is in this to make money. But perhaps the project is part of a wider wave. I’ve seen some of you in shops. You try on a shirt, and then you buy the same one but in plastic as if it’s different or changes it! It’s abject. I was going to show you Michael Haneke’s Caché… there’s a whole stream of popular culture about living with ghosts.
Q5) You talked about the thing becoming the story. But digitally there is an issue of the authentic – the Gay Girl in Damascus issue. So with that dress how do you verify that story is true, that it’s not another dress? And especially when there is money attached!
A5) Well the QR code is supposed to help as unique – albeit reprintable – but the story is authentic, the conversation is authentic. If we go live with 750 shops in Oxfam the research will be in mining the data. Richard Coyne says things aren’t things, they’re events. A pint and a chat is an event, the pint isn’t a thing…
The final section of the day is simply Digital data, chaired by Dr Miles Osborne. He is introducing our last speaker Dr Erez Lieberman Aiden who is being joined in his presentation by his partner Aviva (sp?). Miles has been Googling Erez: he holds 37 patents, including a computerized shoe insert that can tell before you fall! But today he will be talking about the Google Books projects and how books change over time.
Culturomics: quantitative analysis of culture using millions of digitized books – Dr Erez Lieberman-Aiden, fellow at the Harvard Society of Fellows and Visiting Faculty at Google
Apologies if my mic falls off, I have weirdly shaped ears… however… I have been doing quantitative analysis of culture using Google Books as the corpus – work I’ve been doing with Jean-Baptiste Michel – and my wife Aviva will also be talking about analysis she has done on this body of material. Back in 2002-4 we were thinking about the issues of libraries – how do we understand everything in a library? How do we see how culture changes?
When I come to this issue I come to it as a scientist. And we come to the question of “How does culture change?”. But it’s such a vague idea. So how about “How does language change?”. So we look at a regular equation here – past = present + “ed” etc. But very frequent verbs tend not to obey the rule (get vs got). But irregular verbs regularize over time it seems.
So we came up with a plan to track this stuff… This is a big text not in the English language. We have loads of undergrads at Harvard with spare time. We’ll put up flyers and ask them to read and record any irregular verb that appears. We recruited undergrads. But we had a marketing fail. We had 1 undergrad but fortunately she was a WonderGrad! So she instead identified 11 grammar texts – from the Old English and Middle English eras – to see how many irregular verbs there had been and what the change was. As you look at this table of frequency we see that lower frequency verbs are more likely to regularise – very obvious in the table here. So regularisation follows a simple trend which sort of makes sense: you use a word less, you don’t really know it is irregular. But we didn’t realise we could predict a sort of half-life for irregular verbs: “if a verb is 100 times less frequent it regularizes 10 times as fast”.
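The quoted rule implies that a verb’s half-life scales with the square root of its frequency (100 times rarer, 10 times faster). A minimal sketch of that arithmetic follows. It assumes simple exponential decay, which is what “half-life” suggests but which the talk did not spell out; the function names and the 300-year figure are hypothetical, chosen only for illustration.

```python
import math


def relative_half_life(frequency_ratio):
    """Half-life ratio implied by the square-root rule.

    frequency_ratio is (frequency of verb A) / (frequency of verb B);
    the result is (half-life of A) / (half-life of B).
    """
    return math.sqrt(frequency_ratio)


def fraction_still_irregular(years, half_life):
    """Share of verbs in a frequency class still irregular after `years`,
    assuming exponential decay (an assumption, not stated in the talk)."""
    return 0.5 ** (years / half_life)


# A verb 100 times less frequent has one tenth of the half-life
print(relative_half_life(1 / 100))

# Hypothetical half-life of 300 years: half the verbs remain after 300 years
print(fraction_still_irregular(300, 300))
```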
And we made a pretty image to accompany this idea – and it made the cover of Nature! Which was exciting as language study was still being part of the sciences. So we got more ambitious. So we wanted to look at the change of language over time.
On our y axis for awesome it is like way high. On the x axis of practical it’s like way unpractical. So we’d gone for that less awesome and more practical secondary sources approach. BUT Google had started digitising books and so we got in touch with them about doing basic quantitative analysis.
So a quick look at where those books came from… 129 million books have been published since the invention of the printing press. Many are in libraries. Google has scanned 15 million books (as of when we published on this) and OCR scanned them. What jumped out at us was the variability of OCR and metadata. So we threw the dubious ones out – that was about two thirds. So we had 5 million texts to work with. In 11pt font it would go to the moon and back 5 times. We had a substantial proportion of the printed books of the world. Armed with that sort of hyperbole we said “Stand Back, I’m going to do Science”.
Firstly there was the issue of data that can be released. But 5 million books = 5 million(+) authors and a lawsuit Google didn’t want! So fulltext here wasn’t going to be doable – that would be totally awesome but not practical. So instead to release this work we would use n-grams. The n is the number of words in a phrase so “the” is a 1-gram, “united states of america” is a 4-gram. And we would quantify historical change in use of those particular n-grams. So the instances of the word, normalised by dividing by the total words that year, let you see usage over time. So using “thrived” and “throve” as an example we get a great picture of change. And we don’t have 1 of these, we have 2 billion n-grams now.
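As a rough illustration of the counting described here, the sketch below extracts n-grams from a toy corpus and normalises each year’s count by that year’s total word count. It is not the Google Books pipeline: the function names, the whitespace tokenisation and the two invented sentences standing in for whole years of books are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined into a phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(corpus_by_year, phrase):
    """Occurrences of `phrase` per year, divided by that year's total words."""
    n = len(phrase.split())
    result = {}
    for year, text in corpus_by_year.items():
        tokens = text.lower().split()  # naive tokenisation, for illustration only
        count = Counter(ngrams(tokens, n))[phrase.lower()]
        result[year] = count / len(tokens) if tokens else 0.0
    return result


# Invented toy corpus: one sentence stands in for each year's books
toy = {
    1800: "the crops throve and the town throve",
    1900: "the crops thrived and the town grew",
}
print(relative_frequency(toy, "throve"))
print(relative_frequency(toy, "thrived"))
```

Dividing by the yearly total matters because far more books were published in later years, so raw counts alone would make almost every word look as though it were becoming more common.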
So… I don’t know if the liberal media tells you about this but the verb “chide” (“chid”/“chode” versus “chided”) is the fastest-changing verb on the planet – it has changed in 200 years, it’s an incredibly fast and distinguished verb.
Sneak Snuck in… So this is an exception. If we look at “snuck” and “sneaked” then sneaked is still dominant but snuck is coming up… and this is data to 2000, it’s moving fast, may be in the lead now!
Some heartening news… The US, leading exporter of verbs! So looking at “-t pattern” – use of “burnt” or “learnt” rather than “burned” or “learned”. As in the room it’s about 50/50 in UK usage but in the US most would say burned or learned.
N-grams can be quite abstract “spirit of the nation” for instance, is it probable, abstract concepts? We wanted to have a think…
Let’s talk about 1950. If we look at books from the 1850s, 1870s, 1910s etc. people didn’t care about 1950. But in the late 1940s it suddenly picked up. And never more so than in 1950. It was so fascinating that people debriefed long after. But around 1955 the bubble burst, things were looking to pass out of trend. This is the story of every year on record. It falls out of memory quickly and then continues a slow decline. But we have this quantitative data. So we plotted the half life of a year. And we found that these are getting shorter and shorter… it’s hard to interpret this sort of data. It’s really challenging. Maybe we are losing interest in the past more rapidly than we have before. But it’s challenging data.
Another example – another of Aviva’s fantastic ideas – things get invented at various points in time. We look at the telephone and the radio and when they were invented. So we show mentions of these. The telephone takes ages to get going. The radio immediately attracts excitement. So we split technology into three chunks of time of invention. Inventions spread much faster than they did and it’s robust data here.
So we were kind of curious to get advice from the data to live lives in a more principled way… So we looked at most famous people of all time in the data. We looked in every year on record to find 25 most famous people for that year. And found there are many routes to fame. So Ernest Rutherford became famous for experiments in scatter, Orville Wright for inventing the plane, Marcel Proust for his writing. But there is sort of a pattern to knowledge of people. They aren’t known, slow interest, fast fame and slow decline…
So it looks like in 1865 people get famous at age 34, it doubles for 4 years, then at the age of 70 it starts to decline… Over time it’s changing. The doubling times are now faster but so is the decay – fame is more fleeting now.
Let’s think about careers… What should you do? To become famous fast you want to be an actor. Looking at the 19th century you come to fame in your 20s. Authors get more famous but it takes longer. However politicians don’t get famous until their fifties, and then get super famous! Academics? Well you can be as famous as the (not very) famous actors but it takes terribly long. But mathematics is really a bad choice. From a fame pursuit standpoint, you don’t get famous young, nor even old…
On that sobering note it’s over to Aviva.
Culture can impose constraints on things in the context of suppression. So if you look at Marc Chagall: in English he has a typical, straightforward fame pattern. He gets pretty famous, more famous, and eventually slows. But he was a Jewish artist and if you look at his profile in German he gets famous and then disappears for some time before reappearing again.
Similarly Jesse Owens in English isn’t famous but reaches rapid fame at the Olympics. If you look in German his pattern is similar… but begins only after 1945 when the Nazis lose power.
But it’s not just about Nazi Germany. In Soviet Russia we see the pattern of Stalin suppressing Trotsky in the literature. It shows the effectiveness of the suppression of information in fact. If we look at Tiananmen Square we see that in English there is huge discussion of the protests in the 80s. In Chinese it barely features – although the building of a mausoleum in Tiananmen Square 10 years earlier did get discussion in the Chinese texts. So it indicates possible suppression.
The Nazis had a censorship list for intellectuals, and artworks – they also created a degenerate art exhibit – to show and mock art. And we wanted to see the impact of this on literature. We looked at the appearance of these names etc. They drop off from the mid 30s to the mid 40s. And indeed names of Nazis barely featured then peaked at this time.
So looking at targets of suppression de novo. Looking at Henri Matisse – we can plot expected fame level based on pre and post 1940 fame and then we can see actual fame – and we call the difference the suppression index. So if the index is around 1 then you are probably not someone who has been suppressed. If less than 1 then you were probably subject to suppression. And more than 1 you may be a beneficiary of suppression – someone whose profile has been raised by suppression of others.
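As a rough illustration of the suppression index, here is a small Python sketch. The notes only say that expected fame is based on fame before and after the period and that the index compares actual to expected, so the details below (averaging a fixed window of years either side, and taking the ratio of actual to expected) are assumptions, and the function name and numbers are hypothetical.

```python
from statistics import mean


def suppression_index(series, start, end, window=5):
    """Ratio of actual to expected fame over the years start..end inclusive.

    series: {year: relative frequency of a person's name}.
    Expected fame is assumed to be the average of the mean frequency in the
    `window` years before `start` and the `window` years after `end`.
    Around 1: no sign of suppression. Below 1: possible suppression.
    Above 1: possible beneficiary of others being suppressed.
    """
    before = [series[y] for y in range(start - window, start) if y in series]
    after = [series[y] for y in range(end + 1, end + 1 + window) if y in series]
    during = [series[y] for y in range(start, end + 1) if y in series]
    if not (before and after and during):
        raise ValueError("need data before, during and after the period")
    expected = (mean(before) + mean(after)) / 2
    if expected == 0:
        raise ValueError("expected fame is zero, so the index is undefined")
    return mean(during) / expected
```

With invented values where a name appears at 10 before and after a period but only at 2 during it, the index comes out at 0.2, well below 1, which in the speakers' terms would suggest suppression.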
Back to Erez for quick questions – newspapers are excluded, just use books. Words are normalised effectively by hand. Can you find suppression without looking for it? Yes you can – as you’ve just seen. We expect to launch a new n-gram interface to explore these and we will be compiling all our publications on this work – and we have loads of materials to share on this work.
Matej Kren is an artist who creates sculptures with big piles of books. We said you have to submit your photo to Science – that’s our image we’ve been using today on these slides. And that was used as a Science cover.
So we designed a tool and just as we were publishing our work Google also produced a version – the Google Books Ngram Viewer (books.google.com/ngrams) and you can access this and see change over time.
The beft query ever! – Fabulous graph of beft or best. What’s going on? Well it’s just that in old printing an “s” at the beginning or middle of a word looked kind of like an “f”, so the Google OCR thinks it is an “f”. So that’s not a change at all… frustrating as the supplemental material to our paper covered this and that was fine but people blogged and tweeted as though we hadn’t thought of this.
We call this an Argh! situation – apparently Argh is more annoying and AAArgh ish in later years except for a surge of AArgh in the 80s.
How many of you read xkcd? (some of us do) Well my ambition to be part of xkcd has finally happened! They did a strip on sustainable – the word sustainable is unsustainable!
JB and I call this Culturomics!
The historical record is being digitised – from all sorts of forms. The results of this digitisation are changing the methods and research we can do. And it will result in change that will transform our idea of culture, society, literature.
Q1) So, you have all these words. What is special about the words whose frequency changes?
A1) That’s a really interesting question. Were I to say something entirely abstract… words like “word” or “go” are used ubiquitously in lots of contexts. You see words that are somehow connected to a movement or idea and they are the ones that move really fast – like sustainable.
Q2) Have you done Wordles for each decade?
A2) Yes, we’ve done lots of Wordles – the guy who came up with Wordles sat at a desk across from them. And you can see some really interesting stuff. I should shout out to Ben Schmidt at the Digital Observatory at Harvard who is looking at TV shows like Mad Men etc. which are historical. And he does analysis of screamers – totally wrong turns of phrase, wrong linguistic constructions. So “just might” was used in Downton Abbey but it should have been “might just”. It’s super wrong for that era but not for today. Subtle things that it’s hard to pick up on but these things emerge really clearly. We hope to clean up historical shows on TV!
Q3) I confess to being an anthropologist… one of the worries I have about my discipline… we share with you the fascination with culture and cultural change. Most anthropologists are numerophobes – they don’t like numbers! And yet they are of course closet positivists and numerophiles as they use phrases like “more than”, “less than”, “frequently” that imply measuring the world. Do you have a widget I could use to prove to anthropologists that they are using numbers when they think they aren’t?
A3) That’s a fascinating question. These are incredibly deep questions really but in terms of a short term perspective… it’s nice to play with the ngram view… we could look for quantitative anthropology here… hmm… so we can try the Google Ngrams, or we can see Bookworm, which is a tool to look for particular words in a particular corpus. This only looks at public domain works so looks at limited collections. So you can do analyses… but it poops out in 1920 when books cease to be public domain. So if I look at anthropology books only and look for quantity or quantitative you can see changes over time… so there’s a tool for you to play with!
The day is now coming to a close with Dr Sian Bayne and Dr Sarah Anderson of ERI. Sarah is going to begin by talking about funding sources for taking ideas forward.
So first of all I am ashamed that I missed the last talk as I did my PhD on word analysis. Anyway… in your pack you will find a one pager on funding sources and the digital humanities. We’ll also email it out. The key thing is that you don’t need to look for funds directly labelled “digital humanities”, there are many others that are useful. But I also wanted to draw your attention to specific funds – the ESRC has an e-Social Science programme and a digital social research strategy and you can apply for £5k for small collaborations. The Digging into Data programme is also a major funder for digital humanities. JISC, a UK funder, is reviewing that programme but is optimistic about a third phase of the project. And finally I just want to thank all the organisers today.
And back to Sian. I mainly wanted to thank everyone who has come to speak today, to all who have been organising and attending today. See you next year!
A final note from me (that’s Nicola again btw): a gigantic thank you to all who arranged and spoke and came to the event today: it was fantastic!