Today I am at the Digital Scholarship Day of Ideas, organised by the Digital Scholarship programme at University of Edinburgh. I’ll be liveblogging all day so, as usual, I welcome additions, corrections, etc.
Welcome & Introduction – Melissa Terras, Professor of Digital Cultural Heritage, University of Edinburgh
Hi everyone, it is my great pleasure to welcome you to the Digital Day of Ideas 2018 – I’ve been on stage here before as I spoke at the very first one in 2012. I am introducing the day but want to give my thanks to Anouk Lang and Professor James Loxley for putting the event together and their work in supporting digital scholarship. Today is an opportunity to focus on digital research methods and work.
Later on I am pleased that we have speakers from sociology and economic sociology, and the nexus of that with digital techniques, areas which will feed into the Edinburgh Futures Institute. We’ll also have opportunity to talk about the future of digital methods, and particularly what we can do here to support that.
Lynn Jameson – Introduction
Susan Halford is professor of sociology but also director of the institution-wide Web Science Institute.
Symphonic Social Science and the Future of Big Data Analytics – Susan J Halford, Professor of Sociology & Director of Web Science Institute, University of Southampton
Abstract: Recent years have seen ongoing battles between proponents of big data analytics, using new forms of digital data to make computational and statistical claims about the social world, and many social scientists who remain sceptical about the value of big data, its associated methods and claims to knowledge. This talk suggests that we must move beyond this, and offers some possible ways forward. The first part of the talk takes inspiration from a mode of argumentation identified as ‘symphonic social science’ which, it is suggested, offers a potential way forward. The second part of the talk considers how we might put this into practice, with a particular emphasis on visualisation and the role that this could play in overcoming disciplinary hierarchies and enabling in-depth interdisciplinary collaboration.
It’s a great pleasure to be here in very sunny Edinburgh, and to be speaking to such a wide ranging audience. My own background is geography, politics, English literature, sociology and in recent years computer sciences. That interdisciplinary background has been increasingly important as we start to work with data, new forms of data, new types of work with data, and new knowledge – but let’s query that – from that data. All this new work raises significant challenges especially as those individual fields come from very different backgrounds. I’m going to look at this from the perspective of sociology and perhaps the social sciences, I won’t claim to cover all of the arts and humanities as well.
My talk today is based on work that I have been doing with Mike Savage on “big data” and the new forms of practice emerging around these new forms of data, and the claims being made about how we understand the social world. In this world there has been something of a stand off between data scientists and social scientists. Chris Anderson (in 2008), a writer for Wired, essentially claimed “the data will speak for itself” – you won’t need the disciplines. Many have pushed back hard on this. The push back is partly methodological: these data do not capture every aspect of our lives, they capture partial traces, often lacking in demographic detail (do we care? sociologists generally do…) and we know little of its promise. And it is very hard to work with this data without computational methods – tools for pattern recognition generally, not usually thorough sociological approaches. And these methods can produce concerning, sometimes ethically problematic, results that are presented as unproblematic. So, this is highly challenging. John Goldthorpe says “whatever big data may have for “knowing capitalism” its value to social science has… remained open to questions…”.
Today I want to move beyond that stand off. The divisiveness and siloing of disciplines is destructive for the disciplines – it’s not good for social science and it’s not good for big data analytics either. From a social science perspective, that position marginalises social sciences, sociology specifically, and makes us unable to take part in this big data paradigm which – love it or loathe it – has growing importance, influence, and investment. We have to take part in this for three major reasons: (1) it is happening anyway – it will march forward with or without us; (2) these new data and methods do offer new opportunities for social sciences research and; (3) we may be able to shape big data analytics as the field emerges – it is very much in formation right now. It’s also really bad for data science not to engage with the social sciences… Anderson and others made these claims ten years ago… Reality hasn’t really borne that out. In commercial contexts – recommendations, behaviour tracking and advertising – the data and analysis is doing that. But in actually drawing understanding from the world, it hasn’t really happened. And even the evangelists have moved on… Wired itself has moved to saying “big data is a tool, but should not be considered the solution”. Jeff Hammerbacher (co-credited for coining the term “data science” in 2008) said in 2013 “the best minds of my generation are thinking about how to make people click ads… that sucks”.
We have a wobble here, a real change in the discourse. We have a call for greater engagement with domain experts. We have a recognition that data are only part of the picture. We need to build a middle ground between those two positions of data science and social science. This isn’t easy… It’s really hard for a variety of reasons. There are bodies buried here… But rather than focus on that, I want to focus on how we take big steps forward here…
The inspiration here is three major social science projects: Bowling Alone (Robert Putnam); The Spirit Level (Richard Wilkinson and Kate Pickett); Capital (Thomas Piketty). These projects have made huge differences, influencing public policy and, in the case of Bowling Alone, really reshaping how governments make policy. These aren’t by sociologists. They aren’t connected as such. The connection we make in our paper is that we see a new style of social science argumentation – and we see it as a way that social scientists may engage in data analytics.
There are some big similarities between these books. They are all data driven. Think about how sociology at the end of the 20th century was highly theoretical… At the beginning of the 21st century we see data driven works. And they haven’t done their own research generating data here, they have drawn on existing research data. Piketty has drawn together diverse tax data… But also Jane Austen quotes… Not just mixed methods but huge repurposing. These books don’t make claims for causality based on data, their claims for causality are supported by theory. However they present data throughout to support their arguments. Data is key, with images to hold the data together. There is a “visual consistency”. The books each have a key graph that essentially summarises the book. Putnam talks about social capital, Piketty talks about the rise and fall of wealth inequality in the 20th century.
In each of these texts data, method and visualisation are woven into a repeated refrain, combined with theory as a composite whole to make powerful arguments about the nature of social life and social change over the long term. We call this a “Symphonic Aesthetic” as different instruments and refrains build, come in and go… and the whole is greater than the sum of the parts.
OK, that’s an observation about the narrative… But why does that matter? We think it’s a way to engage with and disrupt big data. There are similarities: re-purposing multiple and varied “found” data sources; an emphasis on correlation; use of visualisation. There are differences too: theoretical awareness; choice of data; temporality is different – big data has huge sets of data looking at tiny focused and often real time moments. Social Science takes long term comparisons – potentially over 100 years. The role of correlation is different. Big data analytics looks for a result (at least in the early stage), in symphonic aesthetics there is a real interest in correlation through statistical and theoretical understandings. Practice of visualisation varies as well. In big data it is the results, in symphonic aesthetics it is part of the process, not the end of the process.
Those similarities are useful but there is much still to do: symphonic authors do not use new forms of digital data, their methods cannot simply be applied, big data demand new and unfamiliar skills and collaborations. So I want to talk about the prospective direction of travel around data; method; theory; visualisation practice.
So, firstly, data. If we talk about symphonic aesthetics we have to think about critical data pragmatism. That is about lateral thinking – redirection of what data exist already. And we have to move beyond naivety – we cannot claim they are “naturally occurring” mirrors/telescopes etc. They are deliberate socio-technical constructions. And we need to understand what the data are and what they are not: socio-technical processes of data construction (eg carefully constructed samples); understanding and using demographic biases (go with the biases and use the data as appropriate, rather than claiming they are representative; or maybe ignore that, look at network construction, flows, mobilities – e.g. John Murrey’s work).
Secondly method. We have to be methodologically plural. Normally we do mixed methods – some quantitative, some qualitative. But most of us aren’t yet trained for computational methods, and that is a problem. Many of the most interesting things about these data – their scale, complexity etc. – are not things we can accommodate in our traditional methods. We need to extend our repertoire here. So social network analysis has a long and venerable history – we can apply the more intensive smaller version of large scale social network analysis. But we also need machine learning – supervised (with training sets) and unsupervised (without). This allows you to seek evidence of different perhaps even contradictory patterns. But also machine learning can help you find the structures and patterns in the data – which you may well not know in data sets at this scale.
We have this quote from Ari Goldberg (2015): “sociologists often round up the usual suspects. They enter the metaphorical crime scene every day, armed with strong and well-theorised hypotheses about who the murderer should or at least plausibly might be.”
To be very clear I am not suggesting we outsource analysis to computational methods: we need to understand what the methods are doing and how.
Thirdly, theory. We have to use abductive reasoning – a constant interplay between data, method and theory. Initial methods may be informed by initial hunches, themes, etc. We might use those methods to see if there is something interesting there… Perhaps there isn’t, or perhaps you build upon this. That interplay and iterative process is, I suspect, something sociologists already do.
So, how do we bring this all together in practice? Most sociologists do not have a sophisticated understanding of the methods; and most computer scientists may understand the methods but not the theoretical elements. I am suggesting something end to end, with both sociologists and computer scientists working together.
It isn’t the only answer but I am suggesting that visualisation becomes an analytical method, rather than a “result”. And thinking about a space for work where both sociological and computer science expertise are equally valid rather than combative. At best visualisations are “instruments for reasoning about quantitative information. Often the most effective way to describe, explore and summarise a set of numbers – even a very large set – is to look at pictures of those numbers” (Tufte 1998). Visualisations as interdisciplinary boundary objects. Beyond a mode of argumentation… visualisation becomes a mode of practice.
An example of this was a visualisation of the network of a hashtag that was collaborative with my colleague Ramin, which developed over time as we asked each other questions about how the data was presented and what that means…
In conclusion, sociology flourished in the C20th. Developing methods, data and theory that gave us expertise in “the social” (a near monopoly). This is changing – new forms of data, new forms of expertise… And claims being made which we may, or may not, think are valid. And that stands on the work of sociologists. But there is some promise in the idea of symphonic aesthetic: for data science – data science has to be credible and there is recognition of that – see for instance Cathy O’Neil’s work on data science, “Weapons of Math Destruction”, which also pushes in this direction; for sociological research – but not all of it, these won’t be the right methods for everyone; for public sociology – this is being used in lots of ways already, algorithmic sentencing debates, Cambridge Analytica… There is a real place for sociologists to reshape sociology in the public understanding. There are big epistemological implications here… Changing the data and methods changes what we study… But it has always been like that. Big data can do something different – not necessarily better, but different.
Q1) I was really interested in your comments about visualisations as a method… Johanna Drucker talks about visual technology and visual discourse – and issues of visualisations as being biased towards positivistic approaches, and advocates for getting involved in the design of visualisation tools.
A1) I’m familiar with these concepts. That work I did with Ramin is early speculative work… But it builds and is based on classic social network analysis so yes, I agree, that reflects some issues.
Q2 – Tim Squirrel) I guess my question is about the trade off between access and making meaningful critiques. Often sociology is about critiquing power and methods by which power is transmitted. The more data proliferates, the more the data is locked behind doors – like the kind of data Facebook holds. And in order to access that data you have to compromise the kinds of critiques you can make. How do you navigate that narrow channel, to make critiques without compromising those…
A2) The field is quite unsettled… It looked settled a year ago but I think Cambridge Analytica will have major impact… That may make the doors more closed… Or perhaps we will see these platforms – for instance Facebook – understanding that to retain credibility it has to create a segregation between their own use of the data, and research (not funded by Facebook), so that there is proper separation. But I’m not naive about how that will work in practice… Maybe we have to tread a careful line… And maybe that does mean not being critical in all the ways we might be, in every paper. Empirical data may help us make critical cases across the diverse range of scholarship taking place.
Q3 – Jake Broadhurst) Data science has been used in the social world already, how do we keep up and remain relevant?
A3) It is a pressing challenge. The academy does not have the scale or capacity to address data science in the way the private sector does. One of the big issues is ethics… And how difficult it is for academics to navigate ethics of social media and social data. And it is right that we are bound to ethical processes in a way data scientists and even journalists are not. But it is also absolutely right that our ethics committees have to understand new methods, and the realities of the gold standard consent and other options where that is not feasible.
The discussion we are having now, in the wake of Cambridge Analytica, is crucial. Two years ago I’d ask students what data they felt was collected, they just didn’t know. And understanding that is part of being relevant.
Q4 – Karen Gregory) If you were taking up a sociology PhD next year, how would you take that up?
A4) My official response would be that I’d do a PhD in Web Science. We have a programme at University of Southampton, taking students from a huge array of backgrounds, and giving them all the same theoretical and methodological backgrounds. They then have to have 2 supervisors, from at least 2 different disciplines for their PhD.
Q5 – Kate Orton Johnson) How do we tackle the structures of HE that prevent those interdisciplinary projects, creating space, time, collaborative push to create the things that you describe?
A5) It’s a continuous struggle. Money helps – we’ve had £10m from EPSRC and that really helps. UKRI could help – I’m sceptical but hopeful about interdisciplinary possibilities here. Having PhD supervision across really different disciplines is a beautiful thing, you learn so much and it leads to new things. Universities talk about interdisciplinary work but the reality doesn’t always match up. Money helps. Interdisciplinary research helps. Collaboration on small scales – conference papers etc. also help.
Q6 – David, research in AI and Law) I found your comments about dialogues between data scientists and social scientists… How can you achieve similar with law scholars and data scientists… Especially if trying to avoid hierarchical issues.
A6) Law and data science is a really interesting space right now… GDPR but also algorithmic accountability – legal aspects of equality, protected categories, etc. Very few users of big data have faced up to the risks of how they use the data, and potential for legal challenge on the basis of discrimination. You have to find joint enthusiasm areas, and fundable areas, and that’s where you have to start.
The Economics Agora Online: Open Surveys and the Politics of Expertise – Tod van Gunten, Lecturer in Economic Sociology, University of Edinburgh
Abstract: In recent years, research centres in both the United States and United Kingdom have conducted open online surveys of professional economists in order to inform the public about expert opinion. Media attention to a US-based survey has centred on early research claiming to show a broad policy consensus among professional economists. However, my own research shows that there is a clear alignment of political ideology in this survey. My talk will discuss the value and limitations of these online surveys as tools for informing the public about expert opinion.
Thank you for the invitation to speak today, and for Susan’s great and inspiring talk. I wouldn’t claim the label “symphonic” for this talk, but I think there is something of that spirit in this talk. This project is based on found and repurposed data. It isn’t particularly “big” data… But the “found” aspect of the data raises profound questions. Data never holds the answers on its own, it is always crucial to understand method and context. Visualisation is a big part of this. And it is about public sociology – so it hasn’t just been published in journals but in popular press as well.
I am an economist who studies economists as a sociological object in their own right. So, this is a famous moment in 2008 when the Queen, in the midst of the largest global financial crisis since 1929, asked an economist “why did nobody notice it”. Because she is the queen, the British Academy convened a panel to respond to this question. And they said that lots of people did a good job, but no-one had it as their job to put everything together. Meanwhile with Brexit we’ve seen economists as a profession receiving substantial criticism.
Economists are hugely influential, we study them because it is the politics of expertise. It is the most politically influential social science. So, I’m going to talk about properties we would like politically influential experts to have:
- A high level of professional consensus within the relevant community of experts. Gold standard here is climate science. If we have a community of experts that all agree, there seems to be a need for action. That’s a good principle.
- Form policy opinions independently of their own political ideology. We will receive and have confidence in advice from an independent expert more than someone presenting their own views.
- Acknowledge professional debate in expressing their views. That they acknowledge that issues are not settled issues.
So in this paper I want to look at how we may use data to measure these aspects. And I’ll be going through some theory around the cultural structure of belief spaces and how this relates to data, big data in the context of economics (but this theory can be used in other contexts as well).
I want to open on the “economics agora” online. I want to talk about two surveys here – these are open online surveys of economists since the financial crisis. It is no coincidence that these have emerged at this time. These surveys are in the UK and in the USA. And unusually the results include publishing the full responses, and the names of the responders – by their consent. These are famous/well known individuals in their field. This allows us to do more… Bring in data that is not in the survey – the CVs of the respondents for instance so including universities, political activities, their co-authorship network, etc. The survey organisers’ goal is to inform the public, but finding patterns in the data requires aggregation and analysis. This isn’t just individual responses, but understanding the context of the data. And again, this isn’t big data, this is quite small data. But these approaches apply to big data too.
So one of these surveys is the Chicago Booth IGM Economic Experts Panel. Each month they put a question to 40 economists about some issue of the moment – the impact of autonomous cars for instance. The second survey is run by the Centre for Macroeconomics, based in London, and again they ask a panel for responses. Typically the UK/European survey shows much more disagreement than the US survey.
There are a lot of issues with these surveys: they are small (the UK/EU one is expanding) and non-random samples; deliberately elitist samples (US survey – “top 7” economics departments in US universities, mainly Ivy League) – why would you take this sample? Well you wouldn’t really… But you have very high status economists. The UK survey has a much wider range in its samples. I think these surveys are great… But I think they should do a better job! Another problem is that you have a high rate of “softball” questions – in the US survey, not in the UK/EU surveys. For instance “imposing new US tariffs on steel and aluminum will improve Americans’ welfare” – it’s timely but we already know that there is high consensus here. We need to ask harder questions! And finally we need to think about the motivations of the people who produce the data – the survey designers are looking to raise the profile of the profession. In a Wall Street Journal article the designers of the US survey talked about wanting to counteract the idea of a lack of consensus in the field – and they are the ones asking the questions.
Gordon and Dahl (2013) looked at views and consensus in the field based on the surveys. They presented this as being a “remarkably high degree of consensus” and little variance across schools and departments. And thus look at how influential this field should be. This got big pick up… the Washington Post picked it up. Nobel winning economist Paul Krugman picked this up in his opinion column in the Economist. He is on record (New York Times 2009) as saying pretty much the opposite – that there is polarisation between the “saltwater” economists in the Keynesian camp, and the “freshwater” economists who are very much the opposite.
So, a bit of theory… What do we mean by consensus, polarisation, factions etc? How do groups of people structure their belief systems? We do have twenty years of literature and theory here around understanding belief systems. This goes back to political scientists in the 1960s. Philip Converse (1964) found that most American voters do not adhere to a coherent political ideology – this is still the case. Their belief systems are disorganised or “unconstrained” – so one belief does not let you predict another belief. So for instance comparing a belief that you should “reduce immigration” and “reduce corporate tax” – could show little correlation, those beliefs don’t automatically go together. Now if you are a voter in the UK in 2018 there probably is more alignment. That pattern is a “constrained or aligned” correlation. If you look at polarisation you see clusters of correlation.
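To make the constrained/unconstrained distinction concrete, here is a minimal sketch (mine, not the speaker's) that treats "constraint" as the correlation between answers to two belief items. The respondent data, the 1 to 5 agreement scale and the item names are all hypothetical, invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical answers from eight respondents on a 1-5 agreement scale.
# In the "unconstrained" group one belief tells you nothing about the other.
unconstrained = {
    "reduce_immigration": [1, 5, 1, 5, 3, 3, 1, 5],
    "reduce_corporate_tax": [1, 1, 5, 5, 1, 5, 3, 3],
}
# In the "aligned" group the two beliefs move together.
aligned = {
    "reduce_immigration": [1, 1, 2, 2, 4, 4, 5, 5],
    "reduce_corporate_tax": [1, 2, 1, 2, 4, 5, 4, 5],
}

r_unconstrained = pearson(unconstrained["reduce_immigration"],
                          unconstrained["reduce_corporate_tax"])
r_aligned = pearson(aligned["reduce_immigration"],
                    aligned["reduce_corporate_tax"])

print(f"unconstrained r = {r_unconstrained:.2f}")
print(f"aligned r = {r_aligned:.2f}")
```

With more than two items the same idea extends to a full correlation matrix, and polarisation would show up as blocks of respondents with strongly correlated answers.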
So, that paper on economists looks for clusters. I looked at polarisation to look at latent ideology, noting partisanship (where known involvement in e.g. being part of political left or right leaning think tanks etc. – or marked as “none”), current department (freshwater vs saltwater) and belief dimension. Unsurprisingly those involved in Republican/conservative organisations and those with backgrounds in Democratic/liberal organisations were very different, leaning right and left respectively. This is the same data that generated that paper that showed consensus and little variance.
There is a high degree of consensus in this survey but you can also see ideological alignment. That can be consistent. But it depends on what you think, and what you ask. The UK survey – more recently expanded to Europe – shows much less consensus. This could mean there is more consensus in the US than in Europe; but it could also mean that the questions being asked in the UK survey are harder questions. The UK survey asks very complex questions… e.g. “Do you agree that, in a period of great uncertainty and after a prolonged period of weak real wage growth, monetary policy makers can afford to wait for greater certainty about real wage developments and building inflationary pressure before raising interest rates?”. So, you can’t measure consensus without a comparison with another group. You can see consensus on a question, not of a group/community or set of beliefs.
So, looking at a recent UK/EU survey on anti-establishment views vs monetary conservatism, you can see a diversity of views here.
So, back to those qualities. Professional consensus is harder to measure than it first appears.
Respondents are asked to give both their vote and their level of confidence. So, when experts give an opinion on hot topics you’d really want a low confidence score to show you don’t have a partisan respondent on your hands. Looking at the data here in the US surveys we see a lot of overly confident responses. Respondents with a stronger ideological disposition (aligned belief structure) exhibit systematic overconfidence. In general, across all questions, when asked politically salient questions they state higher confidence than on questions with little/no political salience.
By way of conclusion… Am I joining ranks with Michael Gove “people in this country have had enough of experts”? No. I would say something more nuanced. If, arguably, professions in general, and economists in particular, have lost political legitimacy, then professional over-reach (“look how much consensus we have”) is not the answer. Claiming consensus where none exists is over-reach. Transparency about professional debate is always better than overstating consensus. Political legitimacy is a scarce resource and should be treated as such.
The economics agora online is a useful tool for studying the beliefs of an important community of experts… but survey designers should up their game. If you want an “unbiased” expert, choose someone whose belief structure is unconstrained. You probably want someone in the middle – people whose belief systems are not correlated. You need a theory of how groups form beliefs…. So read cultural sociology!
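As a rough illustration of the confidence comparison described here (a sketch of mine, not the speaker's analysis), the snippet below computes mean stated confidence for politically salient and non-salient questions. The records, the 1 to 10 confidence scale and the salience coding are hypothetical assumptions, not survey data.

```python
from statistics import mean

# Hypothetical survey rows: which question, whether it was hand-coded as
# politically salient, and the respondent's stated confidence (assumed 1-10).
responses = [
    {"question": "tariffs", "salient": True, "confidence": 9},
    {"question": "tariffs", "salient": True, "confidence": 8},
    {"question": "minimum_wage", "salient": True, "confidence": 7},
    {"question": "minimum_wage", "salient": True, "confidence": 8},
    {"question": "autonomous_cars", "salient": False, "confidence": 5},
    {"question": "autonomous_cars", "salient": False, "confidence": 6},
    {"question": "ticket_resale", "salient": False, "confidence": 4},
    {"question": "ticket_resale", "salient": False, "confidence": 5},
]


def mean_confidence(rows, salient):
    """Average stated confidence over rows with the given salience flag."""
    return mean(r["confidence"] for r in rows if r["salient"] is salient)


salient_mean = mean_confidence(responses, True)
other_mean = mean_confidence(responses, False)
confidence_gap = salient_mean - other_mean

print(f"salient: {salient_mean}, non-salient: {other_mean}, gap: {confidence_gap}")
```

A positive gap would be consistent with the pattern described, though a real analysis would need to control for question difficulty and respondent effects.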
Q1) In thinking about the resistance to “naturally occurring data” and the idea of an “unbiased expert” – do you have a sense that that isn’t possible… Rather than getting that, should we instead shift the conversation to make the politics relevant – to be clear in a way that makes the numbers make sense…
A1) If we choose which experts to listen to, which do we listen to…
Q1) It was interesting to think of economists as “not political” – if that’s the conversation… I think the non-biased expert… That raises issues. We query that that even exists… Maybe we can shift the conversation.
A1) I guess I would want to push back a little bit. I am sympathetic that there is no unbiased expert but… I do a lot of work on economists on how they influence policy. I think the world does need economists, especially for monetary policy, technical aspects of policy. So, having some tools to understand this profession, how they structure beliefs… We need more tools to unpack that set of questions… I’m trying to find ways to study this profession using quantitative tools and qualitative tools and understand impact on politics and society.
Q2) You mentioned a graph to show polarisation – how did you do that?
A2) This is not based on data, this is based on theoretical patterns… A series of plots using a test data set to illustrate the patterns of the theory – it’s theoretical rather than empirical data.
Q3) A slight follow up… How much have you played with non linear tools… Consensus and confidence… Research on scientific knowledge shows that people who know a little about science have higher confidence than those who know more… That could impact that data on confidence.
A3) We did look at non-linearity – doesn’t make a big difference to some measures here.
Q4) What definition of “expert” are you using, and why?
A4) People with PhDs in economics. In the US case they are high status people in the field… In the UK/EU case it is broader. Most work as professors of economics, some work in the private sector in financial sectors. For my purposes it’s holding a PhD in economics… In other work I’ve done on organisations in Latin America you have senior political elites with those credentials, a lot don’t; boundary work becomes more important here.
Q5) I think some of the Chicago questions also go to the public. Have you looked at that?
A5) It’s not publicly available… I’ve been thinking about asking for that. But it would be interesting to know if members of the public structure their belief systems differently. There is some work that compares public beliefs to these questions.
Q6) I work on spatial models around expert agreement and disagreement – interesting measures there and on polarisation. Also dimensionality reduction. Since you are trying to identify latent ideological positions… Not sure if you’ve looked at that. Political behaviour research has
Q7) I wanted to ask about how much the very different types of respondents and samples you have between the US and UK/EU surveys. I was particularly wondering about the high status nature of the US experts and how much that status plays a part… You talked about doing some social network and contextual work here so I was wondering the degree to which their network and co-authorship and professional standing feeds into wanting to be seen to take a particular view, or visibly agree.
A7) The social network part, and co-authorship data is going to lead to a paper. We found people who are closer in co-authoring papers are ideologically closer – not totally surprising… So there is a social approval thing and a selection vias. We think that is the more likely interpretation here – the homophily effect. They co-author non-political papers, they still pick ideologically aligned authors. The status thing is interest… The UK/EU experts is less hierarchical – maybe reflects practice. In terms of monitoring each others responses… I think it’s more contrarian thing… They want to find ways to disagree… They can add comments… So lots of “My colleagues all think this, but if you think about it this other way you get this opposite response”.
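One simple way to check for the homophily pattern mentioned in this answer is to compare ideological distance between co-author pairs and non-co-author pairs. This is a toy sketch under my own assumptions: the economists "A" to "E", their ideology scores and the co-authorship ties are all invented, and this is not the method used in the forthcoming paper.

```python
from itertools import combinations
from statistics import mean

# Hypothetical latent ideology scores (negative = left, positive = right).
ideology = {"A": -0.8, "B": -0.6, "C": 0.1, "D": 0.7, "E": 0.9}

# Hypothetical co-authorship ties, stored as unordered pairs.
coauthors = {
    frozenset(("A", "B")),
    frozenset(("B", "C")),
    frozenset(("D", "E")),
}

linked_gaps = []    # ideological distance between co-authors
unlinked_gaps = []  # ideological distance between everyone else

for a, b in combinations(sorted(ideology), 2):
    gap = abs(ideology[a] - ideology[b])
    if frozenset((a, b)) in coauthors:
        linked_gaps.append(gap)
    else:
        unlinked_gaps.append(gap)

print(f"mean gap, co-authors: {mean(linked_gaps):.2f}")
print(f"mean gap, non-co-authors: {mean(unlinked_gaps):.2f}")
```

A smaller average gap among co-authors is what homophily would predict, but a comparison like this cannot by itself separate selection (choosing like-minded co-authors) from influence or social approval.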
Q8) My question/comment is about the “unconstrained” idea space – it feels funny and attractive… But also quite negative… Unconstrained… Disorganised… But you are talking about it as a positive quality. But does that suggest they haven’t thought this stuff through?
A8) I’m glad you asked this. This question came up in the 1960s and it was seen as terrible that the ideologies didn’t align to political parties… The field has turned on its head now. In the 1960s though this was seen as politically naive. Actually more educated voters are seen to have more constrained beliefs… But with the economists that unconstrained belief system is good as it shows that they are not bringing in their partisan/ideological standpoint. There is a contradiction there. The idea that the more information you have, the more constrained your belief system should be… But only to a point. There is a really interesting paper by ? de Surrey and Ari Goldberg that compares ideological voters, the unconstrained voters, and they find a third group that is e.g. politically liberal and economically conservative. This is a really interesting area of the literature. There are a bunch of new methods that are getting us nearer that question…
We broke for lunch and workshops at this point…
Workshops: Parallel workshop sessions – please see descriptors below.
- Text Analysis for the Tech Beginner – Suzanne Black, PhD student in LLC
- An Introduction to Digital Manufacture – Mike Boyd (uCreate Studio Manager, UoE)
- ‘I have the best words’: Twitter, Trump and Text Analysis – Dave Elsmore (EDINA)
- An Introduction to Databases, with Maria DB & Navicat – Bridget Moynihan (LLC, UoE)
- Introduction to Data Visualisation in Processing – Jules Rawlinson (Music, ECA, UoE)
- Jupyter Notebooks and The University of Edinburgh Noteable service – Overview and Introduction – James Reid (EDINA)
- Obtaining and working with Facebook Data – Simon Yuill (Goldsmiths)
I attended the Introduction to Data Visualisation in Processing workshop which was really interesting, and left me wanting to have a further play to see where it may potentially be useful.
Round Table Discussion
- Melissa Terras (MT), Professor of Digital Cultural Heritage
- Kirsty Lingstadt (KL), Head of Digital Library and Depute Director of Library and University Collections
- Ewan McAndrew (EM), Wikimedian in Residence
- Tim Squirell (TS), PhD Student, Science, Technology and Innovation Studies working on communities and expertise and negotiations of those concepts.
MT: I wanted to start with quite a personal place… I realised last year that I was sort of grieving for the internet. I grew up with the internet, it’s been a big part of my life and friendships… But the internet has taken a different turn… And there is a need to step away from that a bit to stay sane. There is a need to step back and reflect, and think about the University Space. I feel maybe we could have stepped in… The questions of Facebook, Twitter, the use of data… The human nature of trust… And how we use and engage and archive and preserve some of these spaces… I think that makes it interesting to an academic in the digital space right now.
EM: I think the idea of the web was quite sour after Cambridge Analytica. Tim Berners-Lee spoke on Channel 4 News about how it’s not enough to build and run the open web, but we have to look critically at what is being done with it, what people are building. I also thought of the Scottish Referendum, and Glasgow Strathclyde University, which called upon all librarians to support political literacy. But that could be “universities” not just “libraries” – there is a need for much more information literacy as a service almost.
KL: The role of the university is about knowledge and supporting and preserving knowledge, with the library central to that… As the digital world changes we need those skills of information literacy, to think critically about what we see on the web, and how we understand that. That’s an important thread the library offers and supports. The arts, humanities and social sciences really support that development of critical engagement, literacy, context and the origins of big data. I was very much chiming with CILIPS work on information literacy – the university library has a really important part to play here…
TS: I want to make three brief points on engagement, expertise and access. One of the things I’ve observed on the web around online communities, is that there is a tendency not to notice a community until something happens. I study some quite extreme communities, including the involuntary celibate community, and you can’t raise interest until people go out and kill people. We really need to see more engagement and understanding, not as an object of interest. The second point is about experts and what that means… I think that reification of expertise is naive at best, and often dangerous. Only engaging with experts, or corroborating your beliefs, or feeling that you only engage with an expert class, overlooks the way most people engage with issues. And finally on access… In light of Cambridge Analytica, Facebook has shut down access for all but their own Facebook programme (with funding councils) of research. Doing that means only people working at the companies, or the elite universities with particular track records…
Comment: Interesting that you mentioned Tim Berners-Lee as he was the reason Web Science got set up at Southampton. The narrative was… I invented the web (discuss) and it has gone wrong (discuss). That was a perspective that didn’t problematise information or communication etc. The idea was that we would reengineer the web (discuss) as if it is technical, not a complex socio-technical network. I’m not being negative but supporting your statements. The restructuring of Information Technology GCSE was a travesty – there was no attempt at critical engagement, just at programming. And it is really important that we envision what we want the web to be. There is no fixed idea of the web. We have gone down the rabbit hole of behavioural tracking and advertising as the only economic model… But we could play with that. I would make a pitch for Utopianism… With Donna Haraway: looking at the trouble and thinking about what else we could do.
Comment: I wondered about… that sense of the internet as being what we hoped it could be… But also the issue of the attack on net neutrality in the US, and immediate recognition that that isn’t ok… How do we back away, not engage in the toxic parts of the internet… But also save the parts that are worth saving… Keeping an eye on legislation? Do we protect without participating?
MT: I immediately started to think of how we talk about bitcoin – very utopian visions and turning it into a profit making machine, as has happened in the internet… How do we build structures that can be used to make money… Without that consuming the rest of it… The internet is consuming all the other stuff… I think bitcoin will be the same… The same people who had money 200 years ago, will be the same people who’ll make money now… Partly information literacy, partly being cynical, being civic… Being alive to issues…
TS: I am going to say two contradictory sounding things… So many of these issues seem to be engineering solutions to social problems. I was at a conference with someone talking about a blockchain based education network, with a smart contract to validate credentials. Taking the human out of the process, in order to improve the situation. Bitcoin is supposed to be trustless… But at some point you have a human interface, it will fail… You will always face problems you couldn’t spot – unless you spoke to a social scientist. But what goes with that for us as social scientists is the need to engage with the engineering side of things… Lots of “if only we could have known what would happen with Cambridge Analytica”, but we’ve known about that for years… We struggle to be listened to by policy makers when compared with businesses who have legitimate routes in, and argue for a lack of accountability. Platforms are not neutral, you can engineer the behaviours available in the space. You have to understand the feedback loop between administration and engineering.
EM: Thinking about democratisation… And thinking about utopian visions… Putting my wikimedian hat on… I think that it has been amazing to see the work done by students here… There is real benefit to having a very transparent space online where you can query or change or contribute to the world. Wikipedia is committed to keeping the human element at its core. One of the ways that Wikipedia checks and balances the data is that you can’t edit a page unless you’ve had an account for four days.
KL: That’s where libraries of all kinds come in – a space or platform to trace the source, the archive materials… And digital data… Data curation and longer term lifecycles… Digital content being created… To check, to contribute.
Comment: There’s an interesting underlying narrative that the web has gone wrong, and that the economy has gone wrong… As if these structured inequalities are accidental but they are not, they are deliberate. We need a critical historical narrative of the web and how this has taken place…. And the historical narrative of where the web has come from. We need more engagement from the humanities here… There are underlying themes here.
Comment: From literary and fan fiction studies we have for years been talking to a literature and community that exists online and how that interacts online. Fan fiction is often written by women, by BME and LGBTQ and non-binary people… We have a cry of “own the servers” to avoid exploitation… Could anyone comment on that type of utopian vision – the local and the global… Who accesses the data…
KL: From my context of the library, it’s about putting materials out there to access what they need as equitably as possible… But that’s difficult… For archives and personal material there are restrictions and limitations for good reason… We haven’t cracked that perfectly… It is a challenge, there isn’t an easy answer to it…
EM: From a Wikipedia angle… Wikipedia had a conversation within and around the community about where the community is going by 2030… Where they were going, what they needed to do to share and access knowledge around the world… To enable better understanding… To more civic and better societies. But there are huge disparities of access. Out of that came the sense of knowledge not as a product but as a service. And the idea of knowledge equity – in terms of access but recognising only 10% of editors are female, it’s Northern Hemisphere orientated, only 2.5% of geotagged content relates to Africa. It’s not shying away from that, instead trying to address that over time… Which is why Wiki Project Medicine has created “the internet in a box” to enable access to a downloaded medical version of the content to improve access to information.
Comment: From Biological sciences background… My question underpins everything here… We haven’t really touched on digital preservation, it’s a big and worrying thing. I’ve listened to comment on big gaps in digital data, it’s really difficult in the long term. How will that be affected by GDPR and what can be done there in terms of preservation and access? We are looking more and more at the cloud… The carbon footprint of ICT is expected to be 40% by 2040. Thinking about preservation and the more and more carbon intensive nature of the web, what can universities do to tackle these issues…
KL: Digital preservation is close and dear to us. It is challenging and not easy. It’s not a commodity you can just buy, there isn’t one way to do this. We are trying to tackle certain areas. We are trying to preserve the university’s history. We also look actively at research data produced by the University. Addressing those two areas, there is still a huge area of web output and web archiving there… There is interest in the University output, but less interest in the wider context. We acknowledge that agenda and push it up in the university – and digital humanities helps here, and that means access to information which helps us make our case. GDPR does present complexity, it does mean working with encryption… For company/global content that’s broader.
Comment: In terms of the issue of experts… I think it’s interesting to see experts by credentials, or by reputation… And how that relates to the internet… It seems like a great way to be a self-made expert… To promote yourself as an expert because you have a blog. You may have stature and influence… But that’s very different from a PhD or an academic expertise… I’m interested that part of being an expert is admitting when you don’t know something… It seems the public wants experts to tell you the answer right now… What is the role of the internet here right now?
TS: I have a lot of thoughts on this. It’s basically my PhD. If I ramble… Stop me… I think this is fundamentally about the way we reconceptualise expertise… There is the idea of it being reified, as rare and based on credentials, and that being in conflict with other types of self-made influentials. Steven Taylor has a paper on experts across three types, including this group of self-made experts… They come to represent a much larger group of experts – it hasn’t democratised broadcast but it’s certainly opened up and broadened the field somewhat. When we understand expertise as only credentialed people in specific organisations, we limit communication. We have to be able to engage as compellingly as these people who are able to weaponise, essentially, nonsense and see how we can be as engaging as them. We have to be provocative and interesting. We can’t expect people to just come and ask the right experts. The burden shouldn’t be on audiences, the burden should be on “experts” to be palatable and appealing as experts.
MT: The anti expertise thing isn’t a new thing too… It goes right back to founding of universities, particularly in the Victorian era… I have a book coming out on professors in children’s literature, and accompanying anthology, and every single story is “the professor is rubbish”. All of them. All about not trusting experts, just when expertise is being formalised… The general populace ridiculing them… The internet has boosted that again. But a positive thing… Crowd sourcing is a positive development… We did a few crowd sourcing projects that truly changed access and use of information – work that used to only be done by palaeographers, looking at Jeremy Bentham’s papers… The internet helped us speed that all up… If we have the right platforms, the right structures, we can do the right things… But we can’t let “expertise is rubbish” perpetuate.
EM: Again with digital preservation, there is a cost attached… There may be volunteers… If there is a platform or a lack of cost… You can do a lot. And archive a lot in public ways…
KL: I was going to add that the cultural heritage sector has an interesting relationship with working with the community… But there is this tension about who can contribute, how, and who can do it best. But the crowd is full of enthusiasm… As long as work is provenanced… That is a really good way to positively use the web.
Comment: In response to the Cambridge Analytica stuff… And why didn’t they listen to the social scientists… Isn’t GDPR an example of the law doing as good a job as it could… And data ownership… Legislative work in Europe on copyright and data ownership… If we want to set the right example, it’s not enough to throw up our hands in horror… You have to engage in legislative process… Laws do have an impact in cyberspace.
Comment: Business models – and how do we change that – it shapes the platform. Investment doesn’t go in equally – and as universities we do start ups, we do engagement with industry. How do we move beyond all of these businesses being set up by young wealthy guys, and opening that up… And reconceptualising success as more than just exit, and data as asset – and that being personal data. I also wanted to note that web archiving does take place – with the Internet Archive who operate in the more permissive US copyright context (and mirrored in Canada – they were concerned that Trump might interfere with the archive). There is a small but politically aware web archiving community but part of making that and any platform work is about acknowledging that there is cost to running platforms, to archiving materials…
Comment: That idea of “an expert” – surely we reconceptualise the expert as a distributed thing.
MT: And with that I’d like to thank the panel and draw this to a close. We hope to have some announcements in the next year about expanding this work, and this day takes place in an environment that contributed to my coming to Edinburgh, with the City Deal, and with the work driving Edinburgh to be the Data Driven Innovation capital of Europe.