Oct 13 2015
 
Michael Dewar, Data Scientist at The New York Times, presenting at the Data Science for Media Summit held by the Alan Turing Institute and University of Edinburgh, 14th October 2015.

Today I am at the “Data Science for Media Summit” hosted by The Alan Turing Institute & University of Edinburgh and taking place at the Informatics Forum in Edinburgh. This promises to be an event exploring data science opportunities within the media sector, and the attendees are already proving to be a diverse mix of media, researchers, and others interested in media collaborations. I’ll be liveblogging all day – the usual caveats apply – but you can also follow the tweets on #TuringSummit.

Introduction – Steve Renals, Informatics

I’m very happy to welcome you all to this data science for media summit, and I just wanted to explain that idea of a “summit”. This is one of a series of events from the Alan Turing Institute, taking place across the UK, to spark new ideas, new collaborations, and build connections. So researchers understanding areas of interest for the media industry. And the media industry understanding what’s possible in research. This is a big week for data science in Edinburgh, as we also have our doctoral training centre so you’ll also see displays in the forum from our doctoral students.

So, I’d now like to hand over to Howard Covington, Chair, Alan Turing Institute.

Introduction to the Alan Turing Institute (ATI) – Howard Covington, Chair, ATI

To introduce the ATI I’m just going to cut to our mission: to make the UK the world leader in data science and data systems.

The ATI came about from a government announcement in March 2014, then a bidding process leading to the universities being chosen in January 2015, and a joint venture agreement between the partners (Cambridge, Edinburgh, Oxford, UCL, Warwick) in March 2015. Andrew Blake, the institute’s director, takes up his post this week; he was previously the head of research for Microsoft R&D in the UK.

Those partners already have about 600 data scientists working for them and we expect the ATI to be an organisation of around 700 data scientists as students etc. come in. And the idea of the data summits – there are about 10 around the UK – is for you to tell us your concerns, your interests. We are also hosting academic research sessions for researchers to propose their ideas.

Now, I’ve worked in a few start ups in my time and this is going at pretty much as fast a pace as you can go.

We will be building our own building, behind the British Library opposite the Francis Crick building. There will be space at that HQ for 150 people. There is £67m of committed funding for the first 5 years – from companies and organisations with a deep interest who are committing time and resources to the institute. And we will have our own building in due course.

The Institute sits in a wider ecosystem that includes: Lloyd’s Register – our first partner, who sees huge amounts of data coming from sensors on large structures; GCHQ – working with them on the open stuff they do, and using their knowledge in keeping data safe and secure; EPSRC – a shareholder and partner in the work. We also expect other partners coming in from various areas, including the media.

So, how will we go forward with the Institute? Well, we want to do both theory and impact. So we want major theoretical advances, but we will devote time equally to practical, impactful work. Maths and Computer Science are both core, but we want to be a broad organisation across the full range of data science, reflecting that we are a national centre. But we will have to take a specific interest in particular areas. There will be an ecosystem of partners. And we will have a huge training programme with around 40 PhD students per year, and we want those people to go out into the world to take data science forward.

Now, the main task of our new director, is working out our science and innovation strategy. He’s starting by understanding where our talents and expertise already sit across our partners. We are also looking at the needs of our strategic partners, and then the needs emerging from the data summits, and the academic workshops. We should then soon have our strategy in place. But this will be additive over time.

When you ask someone what data science is, the definition is ever changing and variable. So I have a slide here that breaks the rules of slide presentations really, in that it’s very busy… But data science is very busy. So we will be looking at work in this space, and going into more depth, for instance on financial sector credit scoring; predictive models in precision agriculture; etc. Cutting across all of these are similarities that span many fields. Security and privacy is one such area – we can only go as far as it is appropriate to go with people’s data, an issue both for the ATI and for individuals.

I don’t know if you think that’s exciting, but I think it’s remarkably exciting!

We have about 10 employees now, we’ll have about 150 this time next year, and I hope we’ll have opportunity to work with all of you on what is just about the most exciting project going on in the UK at the moment.

And now to our first speaker…

New York Times Labs – Keynote from Mike Dewar, Data Scientist

I’m going to talk a bit about values, and about the importance of understanding the context of what it is we do. And how we embed what we think is important into the code that we write, the systems that we design and the work that we do.

Now, the last time I was in Edinburgh was in 2009, when I was doing a Post Doc working on modelling biological data, based on video of flies. There was loads of data, a mix of disciplines, and we were market focused – the project became a data analytics company. And, like much other data science, it was really rather invasive – I knew huge amounts about the sex life of fruit flies, far more than one should need to! We were predicting behaviours, understanding correlations between environment and behaviour.

I now work at New York Times R&D and our task is to look 3-5 years ahead of current NYT practice. We have several technologists there, but also colleagues who are really designers. That has forced me to adapt a bit… I am a classically trained engineer – trained to go out into the world, find the problem, and then solve it by finding some solution, some algorithm to minimise a cost function. But it turns out in media, where we see decreasing ad revenue and increasing subscription, that we need to do more than minimise the cost function… That basically leads to click bait. So I’m going to talk about three values that I think we should be thinking about, and projects within each area. So, I shall start with Trust…

Trust

It can be easy to forget that much of what we do in journalism is essentially surveillance, so it is crucial that we do our work in a trustworthy way.

So the first thing I want to talk about is a tool called Curriculum, a Chrome browser plug-in that observes everything I read online at work. It takes chunks of text, aggregates them with what others are reading, and projects that onto a screen in the office. So, firstly, the negative… I am very aware I’m being observed – it’s very invasive – and that layer of privacy is gone, which shapes what I do (and it ruins Christmas!). But it also shares what everyone is doing, a sense of what collectively we are working on… It is built in such a way as to make it inherently trustworthy in four ways: it’s open source, so I can see the code that controls this project; it is fantastically clearly written and clearly architected, so reading the code is actually easy – it’s well commented, I’m able to read it; it respects existing boundaries on the web – it does not read https (so my email is fine) and respects incognito mode; and I know how to turn it off – also very important.

In contrast to that I want to talk about Editor. This is a text editor like any other… except whatever you type is sent to a series of micro services which look for similarity against the NYT keyword corpora, and then send results back to the editor – enabling a tight mark up of the text. The issue is that the writer is used to writing alone, then sending to production. Here we are asking the writer to share their work in progress and send it to central AI services at the NYT, so making that trustworthy is a huge challenge, and we need to work out how best to do this.
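(A minimal sketch of the kind of round trip described – the endpoint, payload and response format here are invented for illustration, not the actual NYT services.)

```python
import requests

# Hypothetical endpoint standing in for the kind of tagging micro-service
# described above; the URL and response shape are assumptions.
TAGGING_SERVICE = "https://example.org/tag"

def annotate(draft_text):
    """Send work-in-progress text to a tagging service and return the
    annotations (matched keyword, character offsets) it sends back."""
    response = requests.post(TAGGING_SERVICE, json={"text": draft_text}, timeout=5)
    response.raise_for_status()
    return response.json().get("annotations", [])

if __name__ == "__main__":
    draft = "The mayor spoke at City Hall about the new transit plan."
    for tag in annotate(draft):
        print(tag)  # e.g. {"keyword": "City Hall", "start": 19, "end": 28}
```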

Legibility

Data scientists have a tendency towards the complex. I’m no different – show me a new tool and I’ll want to play with it and I enjoy a new toy. And we love complex algorithms, especially if we spent years learning about those in grad school. And those can render any data illegible.

So we have [NAME?], an infinite scrolling browser – when you scroll you can continue on. And at the end of each article an algorithm offers 3 different recommendation strands… It’s like a choose-your-own-adventure experience. So we have three recommended articles, based on very simple recommendation engines, which renders them legible. These are the “style graph” – things that are similar in style; the “collaborative filter” – readers like you also read; and the “topic graph” – similar in topic. These are all based on the nodes and edges of the connections between articles. They are simple, legible concepts, and easy to run, so we can use them across the whole NYT corpus. They are understandable, so they have a much better chance of resonating with our colleagues.
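(A rough sketch of one way such strands could work, assuming each graph is a precomputed adjacency list of weighted article-to-article edges – the data and structure here are invented, not the NYT’s actual implementation.)

```python
# Each strand (style graph, collaborative filter, topic graph) is assumed
# to be a precomputed adjacency list mapping an article ID to neighbouring
# articles with edge weights. The data below is purely illustrative.
style_graph = {"a1": [("a7", 0.9), ("a3", 0.4)]}
collab_graph = {"a1": [("a5", 0.8), ("a7", 0.6)]}
topic_graph = {"a1": [("a2", 0.7), ("a9", 0.5)]}

def recommend(article_id, graph, k=1):
    """Return the top-k neighbours of an article within a single strand."""
    neighbours = sorted(graph.get(article_id, []), key=lambda edge: edge[1], reverse=True)
    return [neighbour for neighbour, _ in neighbours[:k]]

# One recommendation per strand, as offered at the end of each article
for name, graph in [("style", style_graph),
                    ("collaborative", collab_graph),
                    ("topic", topic_graph)]:
    print(name, recommend("a1", graph))
```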

As a counter point, we were tasked with looking at behavioural segmentation – to see how we can build different products for different groups. Typically segmentation is done with demography. We were interested, instead, in using just the data we had, the behavioural data. We arranged all of our pageviews into sessions (arriving at a page through to leaving the site). So, for each session we represented the data as a transition matrix, to understand the probability of moving from one page to the next… So we can perform clustering of behaviours… Looking at this we can see that there are some clusters that we already knew about… We have the “one and dones” – read one article then move on. We found the “homepage watchers”, who sit on the homepage and use it as a launching point. The rest the NYT didn’t have names for… So we now have the “homepage bouncer” – going back and forth from the front page – and the “section page starter” as well, for instance.
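(A minimal sketch of the idea with invented data: sessions as sequences of page types, each turned into a transition matrix, flattened, and clustered – here with scikit-learn’s k-means.)

```python
import numpy as np
from sklearn.cluster import KMeans

# Page types and made-up sessions, purely for illustration
PAGES = ["home", "article", "section"]
IDX = {p: i for i, p in enumerate(PAGES)}

def transition_matrix(session):
    """Row-normalised counts of moves from page type i to page type j."""
    counts = np.zeros((len(PAGES), len(PAGES)))
    for a, b in zip(session, session[1:]):
        counts[IDX[a], IDX[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

sessions = [
    ["home", "article", "home", "article"],   # looks like a "homepage bouncer"
    ["article"],                               # a "one and done"
    ["section", "article", "article"],         # a "section page starter"
    ["home", "article", "home", "section"],
]
features = np.array([transition_matrix(s).ravel() for s in sessions])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)  # cluster assignment per session
```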

This is simple k-means clustering – very simple, but dynamic and effective. However, this is very, very radical at the NYT amongst non data scientists. It’s hard to make it resonate, to drive any behaviour or design in the building. We have a lot of work to do to make this legible and meaningful for our colleagues.

The final section I want to talk about is Live…

Live

In news we have to be live, we have to work in the timescales of seconds to a minute. In the lab that has been expressed as streams of data – never ending sequences of data arriving at our machines as quickly as possible.

So, one of our projects, Delta, produces a live visualisation of every single page view of the NYT – a pixel per person, starting on the globe, then pushing outwards. If you’ve visited the NYT in the last year or so, you’ve generated a pixel on the globe in the lab. We use this to visualise the work of the lab. We think the fact that this is live is very visceral. We always start with the globe… But then we show a second view, using the same pixels in the context of sections, of the structure of the NYT content itself. And that can be explored with an Xbox controller. Being live makes it relevant and timely, helping us understand current interests and content. It ties people to the audience, and encourages other parts of the NYT to build some of these live experiences… But one of the tricky things is that it is hard to use live streams of data, hence…

Streamtools, a tool for managing live streams of data. It should be reminiscent of Simulink or LabVIEW etc. [when chatting to Mike earlier I suggested it was a super-pimped, realtime Yahoo Pipes and he seemed to agree with that description too]. It’s now on its third incarnation and you can come and explore a demo throughout today.

Now, as data scientists, when we bring our systems to the table we need to be aware that what we build embodies our own values. And I think that for data science in media we should be building systems that are trustworthy, tools which are legible to others, and those that are live.

Find out more at nytlabs.com.

Q&A

Q1) I wanted to ask about expectations. In a new field it can be hard to manage expectations. What are your users’ expectations for your group and how do you manage that?

A1) In R&D we have one data scientist and a bunch of designers. We make speculative futures, build prototypes, bring them to the NYT, to the present, to help them make decisions about the future. In terms of data science in general at the NYT… Sometimes things look like magic and look lovely but we don’t understand how they work; in other places it’s much simpler, e.g. counting algorithms. But there’s no risk of a data science winter, we’re being encouraged to do more.

Q2) NYT is a paper of record, how do you manage risk?

A2) Our work is informed by a very well worded privacy statement that we respect and build our work on. But the other areas of ethics etc. are still to be looked at.

Q3) Much of what you are doing is very interactive and much of data science is about processing large sets of data… So can you give any tips, for someone working with terabytes of data, for working with designers?

A3) I think a data scientist is essentially creating a palette of colours for your designer to work with. And forcing yourself to explain that to the designer is useful, and enables those colours to be used. And we encourage the idea that there isn’t just one solution, we need to try many. That can be painful as a data scientist as some of your algorithms won’t get used, but that gives some great space to experiment and find new solutions.

Data Journalism Panel Session moderated by Frank O’Donnell, Managing Editor of The Scotsman, Edinburgh Evening News and Scotland on Sunday

We’re going to start with some ideas of what data journalism is

Crina Boros, Data Journalist, Greenpeace

I am a precision journalist, and I have just joined Greenpeace having worked at Thomson Reuters, BBC Newsnight etc. I am not a data scientist, nor simply a journalist; I am a precision journalist working with data. At Greenpeace data is being used for investigative journalism purposes, in areas no longer or rarely picked up by mainstream media, to find conflicts of interest, and to establish facts and figures for use in journalism and in campaigning. And it is a way to protect human sources and enable journalists in their work. I have, in my role, both used data that exists and created data when it does not exist. And I’ve sometimes worked with data that was never supposed to see the light of day.

Evan Hensleigh, Visual Data Journalist, The Economist

I was originally a designer and came into information visualisation and data journalism by a fairly convoluted route. The Economist has been running since the 1840s and we like to say that we’ve been doing data science since we started. We were founded at the time of the Corn Laws, in opposition to those laws, and visualised their impact as part of that.

The way we now tend to use data is to illustrate a story we are already working on. For instance working on articles on migration in Europe, and looking at fortifications and border walls that have been built over the last 20 to 30 years lets you see the trends over time – really bringing to life the bigger story. It’s one thing to report current changes, but to see that in context is powerful.

Another way that we use data is to investigate changes – a colleague was looking at changes in ridership on the Tube, and the rise of the rush hour – and then use that to trigger new articles.

Rachel Schutt, Chief Data Scientist, Newscorp

I am not a journalist but I am the Chief Data Scientist at Newscorp, and I’m based in New York. My background is a PhD in statistics, and I used to work at Google in R&D and algorithms. And I became fascinated by data science so started teaching an introductory course at Columbia, and wrote a book on this topic. And what I now do at Newscorp is to use data as a strategic asset. So that’s about using data to generate value – around subscriptions, advertising etc. But we also have data journalism so I increasingly create opportunities for data scientists, engineers, journalists, and in many cases a designer so that they can build stories with data at the core.

We have data scientists but also data engineers – so hybrid skills around engineering, statistical analysis, etc.; sometimes an individual’s skills cross those borders, sometimes it’s different people. And we also have those working more in design and data visualisation. So, for instance, we are now getting data dumps – the Clinton emails, transcripts from Ferguson etc. – and we know those are coming so can build tools to explore them.

A quote I like is that data scientists should think like journalists (from DJ Patil) – in any industry. In Newscorp we also get to learn from journalists, which is very exciting. But the idea is that you have to be investigative, be able to tell a story…

Emily Bell says “all algorithms are editorial” – because value judgements are embedded in those algorithms, and you need to understand the initial decisions that go with that.

Jacqui Maher, Interactive Journalist, BBC News Labs
I was previously at the NYT, mainly at the Interactive News desk in the newsroom. An area crossing news, visualisation, data etc. – so much of what has already been said. And I would absolutely agree with Rachel about the big data dumps and looking for the story – the last dump of emails I had to work with were from Sarah Palin for instance.

At the BBC my work lately has been on a concept called “Structured Journalism” – when we report on a story we put together all these different entities into a very unstructured set of data: audio, video etc. Many data scientists will try to extract that structure back out of the corpus… So we are looking at how we might retain the structure that is in a journalist’s head as they are writing the story. So, digital tools that will help journalists during the investigative process. And ways to retain connections, structures etc. And then, what can we do with that… What can make it more relevant to readers/viewers – context pieces, ways of adding context in a video (a tough challenge).

If you look at work going on elsewhere, for instance the Washington Post’s work on IS, they are looking at how to similarly add context, how they can leverage previous reporting without having to do it from scratch.

Q&A/Discussion

Q1 – FOD) At a time when we have to cut staff in media, in newspapers in particular, how do we justify investing in data science, or how do we use data science?

A1 – EH) Many of the people I know came out of design backgrounds. You can get pretty far just using available tools. There are a lot of useful tools out there that can help your work.

A1 – CB) I think this stuff is just journalism, and these are just another set of tools. But there is a misunderstanding here: you don’t just press a button and get a story. You have to understand that it takes time – there’s a reason that it is called precision journalism. And sometimes the issue is that the data is just not available.

A1 – RS) Part of the challenge is about traditional academic training and what is and isn’t included there… But there are more academic programmes on data journalism now. It’s a skillset issue. I’m not sure, on a pay basis, whether data journalists should get paid more than other journalists…

A1 – FOD) I have to say in many newsrooms journalists are not that numerate. Give them statistics, even percentages and that can be a challenge. It’s almost a badge of honour as wordsmiths…

A1 – JM) I think most newsrooms have an issue of silos. You also touched on the whole “math is hard” thing. But to do data journalism you don’t need to be a data scientist. They don’t have to be an expert on maths, stats, visualisation etc. At my former employer I worked with Mike – who you’ve already heard from – who could enable me to cross that barrier. I didn’t need to understand the algorithms, but I had that support. You do see more journalist/designer/data scientists working together. I think eventually we’ll see all of those people as journalists though as you are just trying to tell the story using the available tools.

Q2) I wanted to ask about the ethics of data journalism. Do you think there is a developing field of ethics specific to data journalism?

A1 – JM) I think that’s a really good question in journalism… But I don’t think it’s specific to data journalism. When I was working at the NYT we were working on the Wikileaks data dumps, and there were huge ethical issues there, around the information that was included in terms of names, in terms of risk. And in the end the methods you might take – whether blocking part of a document out – the technology might vary but the ethical issues are the same.

Q2 follow up FOD) And how were those ethical issues worked out?

A1 – JM) Having a good editor is also essential.

A1 – CB) When I was at Thomson Reuters I was involved in running women’s rights surveys to collate data, and when you do that you need to apply research ethics, with advice from those appropriately positioned to give it.

A1 – RS) There is an issue that traditionally journalists are trained in ethics but data scientists are not. We have policies in terms of data privacy… But there is much more to do. And it comes down to the person who is building a data model – you have to be aware of the possible impact and implications of that model. And risks also of things like the Filter Bubble (Pariser 2011).

Q3 – JO) One thing that came through listening to ? and Jacqui: it’s become clear that data is a core part of journalism… You can’t get the story without the data. So, is there a competitive advantage to being able to extract that meaning from the data – is there a data science arms race here?

A3 – RS) I certainly look out at the NYT and other papers and admire what they do, but of course the reality is messier than the final product… But there is some of this…

A3 – JM) I think that if you don’t engage with data then you aren’t keeping up with the field, you are doing yourself a professional disservice.

A3 – EH) There is a need to keep up. We are a relatively large group, but nothing like the scale of the NYT… So we need to find ways to tell stories that they won’t tell, or to have a real sense of what an Economist data story looks like. Our team is about 12 or 14, which is a pretty good size.

A3 – RS) Across all of our businesses there are 100s in data science roles, of whom only a dozen or so are on data journalism side.

A3 – JM) At the BBC there are about 40 or 50 people on the visual journalism team. But there are many more in data science in other roles, people at the World Service. But we have maybe a dozen people in the lab at any given moment.

Q4) I was struck by the comment about legibility, and this is a little bit related to transparency in data. Data is already telling a story, there is an editorial dimension, and that is added to in the presentation of the data… And I wonder how you can do that to improve transparency.

A4 – JM) There are many ways to do that… To show your process, to share your data (if appropriate). Many share code on GitHub. And there is a question there though – if someone finds something in the data set, what’s the feedback loop.

A4 – CB) In the past where I’ve worked we’ve shared a document on the step by step process used. I’m not a fan of sharing on GitHub, I think you need to hand hold the reader through the data story etc.

Q5) Given that journalism is about holding companies to account… In a world where, e.g., Google are the new power brokers, who will hold them to account? I think data journalism needs a merge between journalism, data science, and design… Sometimes that can be in one person… And what do you think about journalism playing a role in holding new power brokers to account?

A5 – EH) There is a lot of potential. These companies publish a lot of data and/or make their data available. There was some great work on FiveThirtyEight about Uber, based on a Freedom of Information request, to essentially fact check Uber’s own statistics and reporting of activities.

Q6) Over the years we’ve (Robert Gordon University) worked with journalists from various organisations. I’ve noticed that there is an issue, not yet raised, that journalists are always looking for a particular angle in data as they work with it… It can be hard to get an understanding from the data, rather than using the data to reinforce bias etc.

A6 – RS) If there is an issue of taking a data dump from e.g. Twitter to find a story… Well, dealing with that bias does come back to training. But yes, there is a risk of journalists getting excited, wanting to tell a novel story, without being checked by colleagues correcting the analysis.

A6 – CB) I’ve certainly had colleagues wanting data to substantiate the story, but it should be the other way around…

Q6) If you, for example, take the Scottish Referendum and the General Election and you see journalists so used to watching their dashboard and getting real time feedback, they use them for the stories rather than doing any real statistical analysis.

A6 – CB) That’s part of the reason for reading different papers and different reporters covering a topic – and you are expected to have an angle as a journalist.

A6 – EH) There’s nothing wrong with an angle or a hunch but you also need to use the expertise of colleagues and experts to check your own work and biases.

A6 – RS) There is a lot more to understanding how the data has come about, and people often use the data set as a ground truth, and that needs more thinking about. It’s somewhat taught in schools, but not enough.

A6 – JM) That makes me think of a data set called GDELT (?), which captures media reporting and enables event detection etc. I’ve seen stories of a journalist looking at that data as a canonical source for all that has happened – and that’s a misunderstanding of how that data set has been collected. It’s close to a canonical source for reporting, but that is different. So you certainly need to understand how the data has come about.

Comment – FOD) So, you are saying that we can think we are in the business of reporting fact rather than opinion but it isn’t that simple at all.

Q7) We have data science, is there scope for story science? A science and engineering of generating stories…

A7 – CB) I think we need a teamwork sort of approach to story telling… With coders, with analysts looking for the story… The reporters doing field reporting, and the data vis people making it all attractive and sexy. That’s an ideal scenario…

A7 – RS) There are companies doing automatic story generation – like Narrative Science etc. already, e.g. on Little League matches…

Q7 – comment) Is that good?

A7 – RS) Not necessarily… But it is happening…

A7 – JM) Maybe not, but it enables story telling at scale, and maybe that has some usefulness really.

Q8/Comment) There was a question about ethics, with the comment that nothing specific is needed there, and the comment about legibility – and I think there is a conflict there around statistical databases. These infer missing data from the data you have; the inferences can be valid but could shock people because they are not actually in the data (e.g. salary prediction). This reminded me of issues such as source protection, where you may not explicitly identify the source but that source could be inferred. So you need a complex understanding of statistics to understand that risk, and to do that practice appropriately.

A8 – CB) You do need to engage with the social sciences, and to properly understand what you are doing in terms of your statistical analysis, your p-values etc. There is more training taking place but still more to do.

Q9 – FOD) I wanted to end by coming back to Howard’s introduction. How could ATI and Edinburgh help journalism?

A9 – JM) I think there are huge opportunities to help journalists make sense of large data sets, whether that is tools for reporting or analysis. There is one called Detector.io that lets you map reporting, for instance, but it is shutting down and I don’t know why. There are some real opportunities for new tools.

A9 – RS) I think there are areas in terms of curriculum, on design, ethics, privacy, bias… Softer areas not always emphasised in conventional academic programmes but are at least as important as scientific and engineering sides.

A9 – EH) I think generating data for areas where we don’t have it. At The Economist we look at China, Asia, Africa, where data is either deliberately obscured or they don’t have the infrastructure to collect it. So tools to generate that would be brilliant.

A9 – CB) Understand what you are doing; push for data being made available; and ask us and push us to be accountable, and it will open up…

Q10) What about the readers? You’ve been saying the journalists have to understand their stats… But what about the readers, who know how to read the difference between the Daily Mail and the Independent, say, but don’t have the data literacy to understand the data visualisation etc.?

A10 – JM) It’s a data literacy problem in general…

A10 – EH) Data scientists have the skills to find the information and raise awareness

A10 – CB) I do see more analytical reporting in the US than in Europe. But data isn’t there to obscure anything. But you have to explain what you have done in clear language.

Comment – FOD) It was once the case that data was scarce, and reporting was very much on the ground and on foot. But we are no longer hunter gatherers in the same way… Data is abundant and we have to know how we can understand, process, and find the stories within that data. We don’t have clear ethical codes yet. And we need a better understanding of what is being produced. And most of the media most people consume is local media – city and regional papers – and they can’t yet afford to get into data journalism in a big way. Relevance is a really important quality. So my personal challenge to the ATI is: how do we make data journalism pay?

And we are back from lunch and some excellent demos… 

Ericsson, Broadcast & Media Services – Keynote from Steve Plunkett, CTO

Jon Oberlander is introducing Steve Plunkett who has a rich history of work in the media. 

I’m going to talk about data and audience research, and trends in audience data – we collect, aggregate and analyse lots of data – and where many of the opportunities are…

There are 24,000 people in R&D, very much focused on telecoms. But within R&D there is a group for broadcast and media services, and I joined as part of a buy out of Red Bee Media. One part of these services is a metadata team who create synopses for EPGs across Europe (2700 channels). We are also the biggest subtitlers in Europe. And we also do media management – with many hundreds of thousands of hours of audio and TV, and that’s also an asset we can analyse (the inventory as well as the programmes). And we operate TV channels – for the BBC, C4, C5, UKTV, in France, the Netherlands, and in the US – and our scheduling work is also a source of data. And we also run recommendation engines embedded in TV guides and systems.

Now, before I talk about the trends I want to talk about the audience. Part of the challenge is understanding who the audience is… And audiences change, and the rate of change is accelerating. So I’ll show some trends in self-reported data from audiences on what they are watching. Before that, a quote from Reed Hastings of Netflix: “TV had a great 50 year run, but now its time is over”. TV is still where most impact and viewing hours are, but there are real changes now.

So, the Ericsson ConsumerLab Annual Report – participants across the world, 1000 consumers across 20 countries. It is based on in-home interviews, understanding their viewing context, what they are watching and what their preferences are. Of course self-reported behaviour isn’t the same as real data, but we can compare and understand that.

So, the role of services varies between generations. The go-to services are very different between older generations and younger generations. For older viewers it’s linear TV, then DVR, then play/catch-up, then YouTube etc. For younger generations SVOD services are the top viewing services – things like Netflix, Amazon Prime etc.

In terms of daily media habits we see again a real difference between use of scheduled linear TV vs. streamed and recorded TV. Younger people again much more likely to use streaming, older using scheduled much more. And we are seeing YouTube growing in importance – generally viewing over 3 hrs per day has increased hugely in the last 4 years, and it is used as a go to space to learn new things (e.g. how to fix the dishwasher).

In terms of news, the importance of broadcast news increases with age – it is still much more important to older consumers. And programming wise, 45% of streamed on demand viewing of long content is TV series. Many watch box sets, for instance. As broadcasters we have to respect that pattern of use, not all are linear scheduled viewers. And you see this in tweeting trends – peaks of tweets showing how quickly a newly released online series has been completed.

There is also a shift from fixed to mobile devices. TV screens and desktop PCs have seen a reduction in viewing hours and use compared to mobile, tablet and laptop use. That’s a trend over time. And that’s again following generational lines… Younger people are more likely to use mobile. Now again, this is self-reported and can vary between countries. So in our broadcast planning, understanding content – length of content, degree of investment in High Def etc. – should be informed by those changes. On mobile, user generated content – including YouTube but also things like Periscope – is still dominant.

In terms of discovering and remembering content, it is still the case that friends, reviews, trailers etc. matter. But recommendation engines are important and viewers are satisfied with them. For the last two years we’ve asked the study group about those recommendation engines: their accuracy; their uncanniness and data and privacy concerns; and the issue of shared devices. So there is still much more to be done. The scale of Netflix’s library is such that recommendations are essential to help users navigate.

So, that was self-reported. What about data we create and collect?

We have subtitle coverage, often doing the blanket subtitle coverage for broadcasters. We used to use transcribers and transcription machines. We then invested in respeaking technologies, and that’s what we use now: respeakers clean up grammar etc., and the technology is trained for their voices. That process of logging subtitles includes very specific timestamps… That gives us rich new data, and also creates a transcript that can sit alongside the subtitles and programme. But it can take 6-7 hours to do subtitling as a whole process, including colour coding speakers etc. And we are looking to see what else subtitlers could add – mood perhaps? etc. – as part of this process.

We have a database of about 8.5 million records that include our programme summaries, images on an episode level, etc. And we are working on the system we use to manage this, to improve it.

I mentioned media management, and we do things like automated transcription – it wouldn’t be good enough for use in broadcast but…

Mediaroom – 60 telecom operators use it for IPTV and it collects very granular data from TV viewing – all collected with consent. Similar for OTT. And similar platforms for EPG. Search queries. Recommendations and whether they were acted upon. And we also have mobile network data – to understand drop off rates, what’s viewed for a particular item etc.

We sit in the middle between the broadcaster and the audience, so our work feeds into broadcasters’ work. For insights like segmentation, commissioning, marketing, scheduling, sales… For personalisation – content recommendations, personalised channels that are unique to you, targeted advertising, search, content navigation, contextual awareness. Some of the worst feedback comments we see are about delivery quality, so there we apply our data to network optimisation etc.

In terms of the challenges we face they include: consumer choice; data volumes – and growing fast so finding value matters; data diversity – very different in structure and form so complex task; expertise – there is a lack of skills embedded in these businesses to understand our data; timeliness – personal channels need fast decisions etc. real time processing is a challenge; privacy – one of the biggest ones here, and the industry needs to know how to do that and our feedback on recommendation engines is such that we need to explain where data is coming from, to make that trusted.

In terms of opportunities: we are seeing evolving technology; cloud resources are changing this fast; investment – huge in this area at the moment; consumer appetite for this stuff; and we are in an innovation white space right now – we are in early days…

And finally… an experimental application. We took Made in Chelsea and added a graph to the viewing player that shows tweet volumes and peaks… and provided that as a navigation system based on tweets shared. And on the right hand side there is navigation by character, so you can follow their journey. We created some semantic visualisation tools for e.g. happy, sad, funny moments. Navigation that focuses on the viewer’s interests.

Audience Engagement Panel Session – Jon Oberlander (Moderator), University of Edinburgh

Jon is introducing his own interests in data science, design informatics, and linguistics, with a particular mention for LitLong; similarly, a colleague in Politics is analysing public opinion on the UK and EU, but also reaction to political messages. And finally the Harmonium project at the Edinburgh International Festival – using music and data on musical performers to create a new music and visualisation project, with a 20k in-person audience and researchers monitoring and researching that audience on the night too…

Pedro Cosa – Data Insights and Analytics Lead, Channel 4

I’m here to talk a bit about the story of Channel 4 and data. Channel 4 is a real pioneer in using data in the UK, and in Europe. You’ve all heard Steve’s presentation on changing trends – and these are very relevant for Channel 4, as we are a public service broadcaster but also because our audience is particularly young and affluent. They are changing their habits quickly and that matters both as an audience issue and an advertising issue for us. Senior management was really pushing for change in the channel. Our CEO has said publicly that data is the new oil of the TV industry and he has invested in data insights for Channel 4. The challenge is to capture as much data as possible, and feed that back to the business. So we use registration data from All4 (previously 4OD), and to use that site you have to register. We have 13 million people registered that way, so that’s already capturing details on half our target audience in the UK. And that moves us from one-to-many to one-to-one. And we can use that for targeted advertising, which comes with a premium paid by advertisers, and to really personalise the experience. So that’s what we are doing at the moment.

Hew Bruce-Gardyne – Chief Technology Officer, TV Squared

We are a small company working on data analytics for use by advertisers, which in turn feeds back into content. My personal background is as an engineer; the big data side of number crunching is where I come from. From where I am sitting, audience engagement is a really interesting problem… If you see a really big, engaging programme, that seems to kill the advertising; so replays, catch-up, and seeing the opportunities there are, for us, gold dust.

Paul Gilooly – Director of Emerging Products, MTG (Modern Times Group)

MTG are a Scandinavian pan-European broadcaster; we have the main sports and Hollywood rights as well as major free-to-air channels in the Scandinavian countries. And we run a thing called Viaplay, which is an SVOD service like (and predating) Netflix. The Nordics are interesting as we have high speed internet, affluent viewers, and markets where Apple TV is significant, disproportionately compared to the rest of Europe. So when I think of TV I think of subscribing audiences, and Pay TV. And my concern is churn – a more engaged customer is more likely to stick around. So any way to increase engagement is of interest, and data is a key part of that. Just as Channel 4 are looking at authentication as a data starting point, so are we. And we also want to encourage behaviours like recommendations of products and sharing. And there are some behaviours to discourage. And data is also the tool to help you understand behaviours you want to discourage.

For us we want to increase transactions with viewers, to think more like a merchandiser, to improve personalisation… So back to the role of data – it is a way to give us a competitive advantage over competitors, can drive business models for different types of consumer. It’s a way to understand user experience, quality of user experience, and the building of personalised experiences. And the big challenge for me is that in the Nordics we compete with Netflix, with HBO (has direct to air offering there). But we are also competing with Microsoft, Google, etc. We are up against a whole new range of competitors who really understand data, and what you can do with data.

Steve Plunkett – CTO, Broadcast & Media Services, Ericsson

No intro… as we’ve just heard from you… 

Q&A

Q1 – JO) Why are recommendations in this sector so poor compared to e.g. Amazon?

A1 – SP) The problem is different. Amazon has this huge inventory, and collective recommendation works well. Our content is very different. We have large content libraries, and collective recommendation works differently. We used to have human curators programming content; they introduced serendipity, and recommendation engines are less good at that. We’ve just embarked on a 12 month project with three broadcasters to look at this. There is loads of research on public top 10s. One of the big issues is that if you get a bad recommendation it’s hard to say “I don’t like this” or “not now” – they just sit there and the feedback is poor… So important to solve. Netflix invested a great deal of money in recommendations. They offered $1 million for a recommender that would beat their own by 10%, and that took a long time. Data science is aligned with that of course.

A1 – PC) Recommendations are core for us too. But TV recommendations are so much more complex than retail… You need to look at the data and analyse it… You have to promote cleverly, to encourage discovery, to find new topics or areas of debate, things you want to surface in a relevant way. It’s an area C4 and also the BBC are looking to develop.

A1 – HBG) There is a real difference between retail and broadcast – about what you do but also about the range of content available… So even if you take a recommendation, it may not reflect true interest and buy-in to a product. That adds a layer of complexity and cloudiness…

A1 – SP) Tracking recommendations in a multi device, multi platform space is a real challenge… Often a one way exchange. Closing loop between recommendation and action is hard…

Q2 – JO) Of course you could ask active questions… Or you could be mining other streams… How noisy is that, how useful is that? Does it bridge the gap?

A2 – SP) TV has really taken off on Twitter, but there is disproportionate noise based on a particular audience and demographic. It’s a useful tool though… You can track engagement with a show, at a point of time within a show… But not necessarily the recommendations for that viewer at that time… It is one of many data sets to use…

Q3 – JO) Are users engaging with your systems aware of how you use their data, are they comfortable with it?

A3 – PC) At C4 we have made a clear D-Word promise – with a great video from Alan Carr that explains that data promise. You can understand how your data is used, can delete your own data, can change your settings, and if you don’t use the platform for 2 years then we delete your data. A very clear way to tell the user that they are in control.

A3 – SP) We had a comment from someone in a study group who said they had been categorised by a big platform as a fan of 1980s supernatural horror, and didn’t want to be categorised in that way, or for others to see this. So there is a real interest in transparency there.

A3 – PG) We aren’t as far ahead as Channel 4, they are leading the way on data and data privacy.

Q4 – JO) Who is leading the way here?

A4 – PG) I think David Abraham (C4) deserves great credit here; the CEO understands the importance of data science and its role in the core business model. And that the competitors for revenue are Facebook, Google and so forth.

Q5 – JO) So, the trend is to video on demand… Are people also watching more?

A5 – SP) It has increased but much more fragmented across broadcast, SVOD, UGC etc. and every type of media has to define its space. So YouTube etc. is eating into scheduled programming. For my 9 year old child the streaming video, YouTube etc. is her television. We are competing with a different set of producers.

A5 – PG) The issue isn’t that linear channels do not allow you to collect data. If you have to log in to access content (i.e. Pay TV) then you can track all of that sort of data. So DR1, the Danish TV channel and producer of The Killing etc., is recording a huge drop in linear viewing by young people, but linear still has a role for live events, sport etc.

A5 – HBG) We do see trends that are changing… Bingeathons are happening and that indicates not a shortness of attention but a genuine change. Watching a full box set is the very best audience engagement. But if you are at a kitchen table, on a device, that’s not what you’ll be watching… It will be short videos, YouTube etc.

To come back to the privacy piece I was at a conference talking about the push to ID cards and the large move to restrict what people can know about us… We may lose some of the benefits of what can be done. And on some data – e.g. Medical Informatics – there is real value that can be extracted there. We know that Google knows all about us… But if our TV knows all about us that’s somehow culturally different.

Q6) Piracy is very high, especially in younger age ranges, so what analysis have you done on that?

A6) Not a huge amount on that, and this is self-reported. But we know piracy drops where catch-up and longer catch-up windows are available – it seems content is viewed legitimately when it can be.

Q6 – follow up) Piracy seems essentially like product failure, and how do you win back your viewers and consumers.

A6 – HBG) A while back I saw a YouTube clip of the user experience of pirated film versus DVD… In that case the pirated film was easier, versus the trailers, reminders not to pirate etc. on the DVD. That’s your product problem. But as we move to subscription channels etc. When you make it easy, that’s a lot better. If you try to put barriers up, people try to find a way around it….

A6 – PG) Sweden has a large piracy issue. The way you compete is to deliver a great product and user experience and couple that with content unique to your channel – premium sports for example – so piracy can’t meet all the needs of the consumer. But also be realistic with the price point.

A6 – HBG) There is a subtle difference between what you consume – e.g. film versus TV. But from music we know that pirating in the music industry is not a threat – that those are also purchasing consumers. And when content creators work with that, and allow some of that to happen, that creates engagement that helps. Most successful brand owners let others play with their brand.

A6 – PC) Piracy is an issue… But we even use piracy data sources for data analysis – using BitTorrent data to understand the popularity of shows in other places, to predict how popular they will be in the UK.

Comment – JO) So, pirates are data producers?

A6 – PC) Yes, and for scheduling too.

Q7) How are you dealing with cross channel or cross platform data – to work with Google or Amazon say. I don’t see much of that with linear TV. Maybe a bit with SVOD. How are mainstream broadcasters challenging that?

A7 – PC) Cross platform can mean different things. It may be Video On Demand as well as broadcast on their TV. We can’t assume they are different, and should look to understand what the connections are there… We are so conscious and cautious of using third party data… But we can do some content matching – e.g. advertiser customer base, and much more personalised. A real link between publisher and advertiser.

Q7 follow up) Would the customer know that is taking place?

A7 – PC) It is an option at sign up. Many say “yes” to that question.

A7 – PG) We still have a lot to do to track the consumer across platforms, so a viewer can pick up consuming content from one platform to another. This technology is pretty immature, an issue with recommendation engines too.

A7 – SP) We do have relationships with third party data companies that augment what we collect – different from what a broadcaster would do. For this it tends to be non-identifiable… But you have to trust the analyst to have combined data appropriately. You have to understand their method and process, but usually they have to infer from data anyway as they usually don’t have the source.

Q8 – JO) We were talking about unreliable technologies and opportunities… So, where do you see wearable technologies perhaps?

A8 – SP) We did some work using facial recognition to understand the usefulness of recommendations. That was interesting but deploying that comes with a lot of privacy issues. And devices etc. also would raise those issues.

A8 – PC) We aren’t looking at that sort of data… But data like weather matters for this industry – local events, traffic information – as context for consumption etc. That is all being considered as context for analysis. But we also share our data science with creative colleagues – technology will tell you how content performs when it is shown, but there is a subjective human aspect that they want to see, to dissect elements of content so the machine can really learn… So is there sex involved… Who is the director, who is the actress… So many things you can put into the system to find this stuff out. Forecasting really is important in this industry.

A8 – HBG) The human element is interesting. Serendipity is interesting. From a neuroscience point of view I always worry about the act of measurement… We see all the time that you can have the same audience, the same demographic, watching the same content and reacting totally differently at different times of day etc. And live vs catch up, say. My fear, and a great challenge, is how to make a neuroscience experiment valid in that context.

Q9 – from me) What happens if the data is not there in terms of content, or recommendation engines – if the data you have tells you there is a need for something you don’t currently have available. Are you using data science to inform production or content creation, or for advertising?

A9 – SP) The research we are currently doing is looking at ways to get much better data from viewers – trying things like a Tinder-like playful interface to really get a better understanding of what users want. But whenever there are searches we also capture not only what is available on that platform but also what is in demand but not yet available, and provide details of those searches to commissioning teams to inform what they do.

A9 – PG) There are some interesting questions about what is most valuable… So, you see Amazon Prime deciding on the value of Jeremy Clarkson and the Top Gear team… And I think you will increasingly see purchasing based on data. And when it comes to commissioning we are looking to understand gaps in our portfolio.

A9 – PC) We are definitely interested in that. VOD is a proactive thing… You choose as a viewer… So we have an idea of micro genres that are specific to you… So we have, say, Sex/Pervert corner; we have teenage American comedy; etc., and you can see how micro genres are panning out… And you can then tell commissioners what is happening on the video on demand side… But that’s different to commissioning for TV, and convincing them…

A9 – HBG) I think that you’ve asked the single greatest question at a data science conference: what do you do if the data is not there? Sometimes you have to take a big leap to do something you can’t predict… And that happens when you have to go beyond the possibilities of the data, and just get out there and do it.

A9 – SP) The concern is such that the data may start to reduce those leaps and big risks, and that could be a concern.

JO) And that’s a great point to finish on: that no matter how good the data science, we have to look beyond the data.

And after a break we are back… 

BBC – Keynote from Michael Satterthwaite, Senior Product Manager

I am senior project manager on a project called BBC Rewind. We have three projects looking at opportunities, especially around speech to text, from BBC Monitoring, BBC Rewind, and BBC News Labs. BBC Rewind is about maximising value from the BBC archive. But what does “value” mean? Well it can be about money, but I’m much more interested in the other options around value… Can we tell stories, can we use our content to improve people’s health… These are high level aims but we are working with the NHS, Dementia organisations, and running a hack event in Glasgow later this month with NHS, Dementia UK, Dementia Scotland etc. We are wondering if there is any way that we can make someone’s life better…

So, how valued is the BBC’s Archive? I’m told it’s immeasurable, but what does that mean? We have content in a range of physical locations, some managed by us, some by partners. But is that all valuable if it’s just locked away? What we’ve decided to do, to ensure we do get value, is to see how we can extract that value.

So, my young niece, before she was 2, had worked out how to get into her mum’s iPad… And her dad works a lot in China, and has an iPhone. In an important meeting he’d gotten loads of alerts… It turns out she’d worked out how to take photos of the ceiling and send them to him… How does this relate? Well, my brother-in-law didn’t delete those pictures… And how many of us do delete our photos? [quick poll of the room: very very few delete/curate their digital images]

Storage has gotten so cheap that we have no need to delete. But at the BBC we used to record over content because of the costs of maintaining that content. That reflected the high price of storage – the episodes of Doctor Who taped over to use for other things. That was a decision for an editor. But the price of storage has dropped so far that we can, in theory, keep everything from programmes to scripts and script notes, transcripts etc. That’s hard to look through now. Traditionally the solution is humans generating metadata about the content. But as we are now cash strapped and there is so much content… is that sustainable?

So, what about machines – and here’s my Early Learning Centre bit on Machine Learning… It involves a lot of pictures of pandas and a very confused room… to demonstrate a Panda and Not a Panda. When I do this presentation to colleagues in production they see shiny demos of software but don’t understand what the realistic expectations of that machine are. Humans are great at new things and intelligence, new problems and things like that…

Now part two of the demo… some complex maths… Computers are great at scale, at big problems. There is an Alan Turing quote here that seems pertinent, about it not being machines or humans, it’s finding ways for both to work together. And that means thinking about what machines are good at – things like initial classification, scale, etc. What are humans good at? Things like classifying the most emotional moment in a talk. And we also need to think about how best we can use machines to complement humans.

But we also need to think about how good is good enough. If you are doing transcripts of an hour-long programme, you want 100% accuracy or close to it, and you finish with humans. But if you are finding a moment in a piece of spoken word, you just need to find the appropriate words for that search. That means your transcript might be very iffy, but that’s fine as long as it’s good enough to find those key entities. We can spend loads of time and money getting something perfect, when there is much more value in getting work to a level of good enough to do something useful and productive.

This brings me to BBC Rewind. The goal of this project is to maximise the value from the BBC Archives. We already have a lot of digitised content for lots of reasons – often to do with tape formats dying out and the need to build new proxies. And we are doing more digitising of selected parts of the BBC Archives. And we are using a mixture of innovative human and computer approaches to enrichment. And looking at new ways to use archives in our storytelling for audiences.

One idea we’ve tried is BBC Your Story which creates a biography based on your own life story, through BBC Archive content. It is incredibly successful as a prototype but we are looking at how we can put that into production, and make that more personalised.

We’ve also done some work on Timeline, and we wanted to try out semantic connections etc. but we don’t have all our content marked up as we would need so we did some hand mark up to try the idea out. My vision is that we want to reach a time when we can search for:

“Vladimir Putin unhappily shaking hands with Western Leaders in the rain at the G8, whilst expressing his happiness.” 

So we could break that into many parts requiring lots of complex mark up of content to locate suitable content.

At the moment BBC Rewind includes speech-to-text in English based on the Kaldi toolset – it’s maybe 45% accurate off the shelf, but that’s 45% more of the words than you had before, plus a confidence value; speech-to-text in the Welsh language; voice identification; speaker segmentation – speech recognition that identifies speakers is nice, but we don’t need that just yet, and even if we did we don’t need that person to be named (a human can tag that easily) and then train algorithms off that; face recognition – it is good but hard to scale, and we’ve been doing some work with Oxford University in that area. And then we get to context… Brian Cox versus (Dr) Brian Cox can be disentangled with some basic contextual information.
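
[The Kaldi pipeline itself wasn’t shown, but here is a minimal sketch of the “good enough” idea above: keep only the words the recogniser is reasonably confident about and use those to find key entities. The (word, confidence) structure and the threshold are my own assumptions, not the BBC Rewind format.]

```python
# Sketch: an imperfect transcript with per-word confidence scores is still
# useful for finding key entities, even if it is nowhere near 100% accurate.
transcript = [
    ("vladimir", 0.91), ("putin", 0.88), ("shaking", 0.42),
    ("hands", 0.77), ("western", 0.83), ("leaders", 0.80), ("rain", 0.35),
]

CONFIDENCE_THRESHOLD = 0.6
entities_of_interest = {"putin", "western", "leaders", "g8"}

# Keep only words the recogniser was reasonably sure about...
usable = [word for word, conf in transcript if conf >= CONFIDENCE_THRESHOLD]

# ...and check whether the clip mentions any of the entities we care about.
hits = entities_of_interest.intersection(usable)
print("Entities found:", hits or "none")
```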

Finally, we have an exciting announcement. We have BBC Monitoring – a great example of how we can use machines to help human beings in their monitoring of media. So we will be creating tools to enable monitoring of media. In this project the BBC is partnering with University of Edinburgh, UCL, Deutsche Welle and others in an EU-funded Horizon 2020 project called SUMMA – this project has four workstreams and we are keen to make new partnerships.

The BBC now runs tech hack events which have resulted in new collaborations – including SUMMA – and more hack events are coming soon, so contact Susanne Weber, Language Technology Producer in BBC News Labs. The first SUMMA hack event will be at the end of next year and will focus on the automated monitoring of multi-media sources: audio-visual, text etc.

Let’s try stuff faster and work out what works – and what doesn’t – more quickly!

Unlocking Value from Media Panel Session – Moderator: Simon King, University of Edinburgh

Our panel is…

Michael Satterthwaite – Senior Product Manager, BBC
Adam Farquhar – Head of Digital Scholarship, British Library
Gary Kazantsev – R&D Machine Learning Group, Bloomberg
Richard Callison – brightsolid (DC Thomson and Scottish Power joint initiative)

Q1 – SK) Let’s start with that question of what value might be, if not financial?

A1 – GK) Market transparency, business information – there are quantitative measures for some of these things. But it’s a very hard problem in general.

A1 – AF) We do a lot of work on value in the UK, and economic impact, but we also did some work a few years back sharing digitised resources onto Flickr and that generated huge excitement and interest. That’s a great example of where you can create value by being open, rather than monetising early on.

A1 – MS) Understanding value is really interesting. Getty uses search to aid discovery – they have learned that you can use the data you are capturing to ensure users access what they want, and want to buy, quickly. For us, with limited resources, the best way to understand value and impact is to try things out a bit, to see what works and what happens.

A1 – AF) Putting stuff out there without much metadata can give you some really great crowd data. With a million images we shared, our crowd identified maps from those materials. And that work was followed up by georeferencing those maps on the globe. So, even if you think there couldn’t possibly be enough of a community interested in doing this stuff, you can find that there really is that interest, from people who want to help…

A1 – MS) And you can use that to prioritise what you do next, what you digitise next, etc.

Q2 – SK) Which of the various formats of media are most difficult to do?

A2 – MS) Images are relatively straightforward but video is essentially 25 pictures per second… That’s a lot of content… That means sampling content, or else we’d crash even Amazon with the scale of work we have. And that sampling allows you to understand time, an aspect that makes video so tricky.
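
[A small illustration of the sampling idea Mike describes – my own sketch, assuming OpenCV, an illustrative file name, and keeping roughly one frame per second rather than all 25.]

```python
# Minimal frame-sampling sketch: rather than analysing every frame of video,
# keep roughly one frame per second and pass those on for further analysis.
import cv2

capture = cv2.VideoCapture("archive_clip.mp4")       # illustrative file name
fps = capture.get(cv2.CAP_PROP_FPS) or 25             # fall back to 25 fps if metadata is missing
step = int(fps)                                        # ~1 sampled frame per second

sampled = []
frame_index = 0
while True:
    ok, frame = capture.read()
    if not ok:
        break
    if frame_index % step == 0:
        sampled.append(frame)  # hand these on to classification, face recognition etc.
    frame_index += 1

capture.release()
print(f"Kept {len(sampled)} frames for analysis")
```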

Q3 – SK) Is there a big difference between archive and current data…

A3 – RC) For me the value of content is often about extracting value from very local context, and it leads back to several things said earlier, about perhaps taking a leap of faith into areas the data doesn’t show, and which could be useful in the future… So we’ve worked with handwritten data – the only Census that was all handwritten, 32m rows of records on England and Wales – and we had to turn that into text… We just went offshore, outsourced it to a BPO… That was just a commercial project as we knew there was historical and genealogical interest… But there are not so many data sets like that around.

But working with the British Library we’ve done digitisation of newspapers, both from originals and microfilm. OCR isn’t perfect but it gets it out there… The increase we have in multimedia online is triggered by broadcast – Who Do You Think You Are? triggers huge interest in these services and we were in the right place at the right time to make that work.

A3 – GK) We are in an interesting position as Bloomberg creates its own data but we also ingest more than 1 million news documents in 30 languages from 120k sources. The Bloomberg newsroom started in 1990 and they had the foresight to collect clean, clear digital data from the beginning of our work. That’s great for accessing, but extracting data is different… There are issues like semantic mark-up and entity disambiguation… And huge issues of point-in-time correctness – named entities changing meanings over time. And unless someone encoded that into the information, then it is very difficult to disambiguate. And the value of this data, its role in trading etc., needs to be reliable.

I kind of don’t recognise Mike’s comments on video as there is object recognition available as an option… But I think we get more value out of text than most people, and we get real value from audience. Transcription and beyond… Entity recognition, dialogue structure, event extraction… A fairly long NLP pipeline there…

A3 – AF) The description of what you want to identify – those are very similar desires to those we have in the humanities, and it has additional benefit to journalists too. Is text search enough? Not really. It is an interesting way in… But text isn’t the best way to understand historical images in a range of books, and it also isn’t that useful in the context of the UK Web Archive and the images in that. Much of what may be of interest is not the text, but perhaps better reduced to a series of shapes etc.

Q4) There has been a mention of crowd sourcing already and I was wondering about that experience, what worked and did not work, and thinking back to Mike’s presentation about what might work better?

A4 – AF) We found that smaller batches worked better… People love to see progress, like to have a sense of accomplishment. We found rewards were nice – we offered lunch with the head of maps at the British Library and that was important. Also mix it up – so it’s not always the same super hard problems all the time.

A4 – MS) I was going to give the BL example of your games machine… A mix of crowdsourcing and gamification.

A4 – AF) It’s very experimental – a bit like the Tinder-like app mentioned in the earlier panel session. So we’ve worked with Adam Crymble to build an arcade game to do image classification and we are interested to see if people will use their time differently with this device. Will they classify images, help us build up our training sets? But the idea is that it’s engagement away from desktops or laptops…

A4 – RC) We have tried crowdsourcing for corrections. Our services tend to be subscriptions and Pay as You Go. But people still see value in contributing. And you can incentivise that stuff. And you see examples across the world where central and government websites are using crowdsourcing for transcription.

A4 – GK) You could argue that we were innovators in crowdsourcing at Bloomberg, through blogs etc. And through tagging of entities. What we have learned from crowdsourcing is that it isn’t good for everything. It is hard when specialist knowledge is needed, or specific languages are needed – it’s hard to get people to tag in Japanese. We aren’t opposed to paying for contributions but you have to set it up effectively. We found you have to define tasks very specifically, for instance.

Q5) Talking about transposing to text implies that that is really possible. If we can’t do image descriptions effectively with text then what else should we be doing… I was wondering what the panel thought in terms of modalities of data…

A5 – MS) Whatever we do to mark up content is only as good as our current tools, understanding, modalities. And we’d want to go back and mark it up differently. In Google you can search for an image with an image… It’s changed over time… Now it uses text on the page to gather context and present that as well as the image back to you… If you can store a fingerprint to compare to others… We are doing visual searches, searches that are not text-based. Some of these things already exist and they will get better and better. And the ability to scale and respond will be where the money is.
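
[One common way to “store a fingerprint to compare to others” is perceptual hashing – this is my own sketch of the idea, assuming the Pillow and imagehash libraries and illustrative file names, not Getty’s or the BBC’s actual system.]

```python
# Perceptual-hash "fingerprints" for simple visual similarity search.
from PIL import Image
import imagehash

query_hash = imagehash.average_hash(Image.open("query.jpg"))

catalogue = {
    "archive_001.jpg": imagehash.average_hash(Image.open("archive_001.jpg")),
    "archive_002.jpg": imagehash.average_hash(Image.open("archive_002.jpg")),
}

# Smaller Hamming distance between hashes means visually more similar images.
matches = sorted(catalogue.items(), key=lambda item: query_hash - item[1])
for name, h in matches[:3]:
    print(name, "distance:", query_hash - h)
```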

Q6) The discussion is quite interesting as at the moment it’s about value you define… But you could see the BBC as some form of commons… It could be useful for local value, for decision making, etc. where you are not in a position to declare the value… And there are lots of types of values out there, particularly in a global market.

A6 – MS) The BBC has various rules and regulations about publishing media, one of which is that humans always have to check content, and that is a real restriction on scale, particularly as we are looking to reduce staff. We ran an initiative called MCB with University of Edinburgh that opened up some of these ideas. But ideally we would have every single minute of broadcast TV and radio in the public domain… But we don’t have the rights to everything… In many cases we acquired content before digital, which means that you need to renegotiate content licences etc. before digitising.

A6 – AF) Licences can be an issue, privacy and data protection can be an issue. But we also have the challenge of how we meet user needs, and actually listening to those needs. Sometimes we have to feel comfortable providing a lower-level service, one that may require higher skills (e.g. coding) to use… That can be something wonderful – it doesn’t always have to be a super polished service – but it has to be a service that is useful and valuable. And things will change in terms of what is useful, what is possible, etc.

A6 – GK) For us it’s an interesting question. Our users won’t say what they want, so you have to reverse engineer, then do rapid product development… So we do what you (Michael) suggest – building rapid prototypes to try ideas out. But this isn’t just a volatile time, it’s a volatile decade, or more!

Q7) Can you tell us anything about how you manage the funnel for production, and how context is baked in during the content creation process…

A7 – GK) There is a whole toolset for creating and encoding metadata, and doing so in a way meaningful to people beyond the organisation… But I could talk about that for an hour so better to talk about this later I think.

Q8 – SK) How multilingual do you actually need to be in your work?

A8 – GK) We currently ingest content in 34 languages, but 10 languages cover the majority – though things change quickly. It used to be that 90% of content ingested was in English, now it’s 70-80%. That’s a shift… We have not yet seen the case where suddenly lots of data appears in a language where there was previously none. Instead we see particularly well resourced languages. Japanese is a large, well resourced language with many resources in place, but very tricky from a computational perspective. And that can mean you still need humans.

A8 – MS) I probably have a different perspective on languages… We have BBC Research working in Africa with communities just going online for the first time. There are hundreds of new languages in Africa, but none will be a huge language… There are a few approaches… You can either translate directly, or you can convert into English, then translate from there. Some use speech-to-text – with a Stephen Hawking-type voice to provide continuity.

A8 – AF) Our collections cover all languages at all times… an increasingly difficult challenge.

Comment – Susanne, BBC) I wanted to comment on speed of access to different languages. All it takes is a catastrophe like an Ebola outbreak… Or disaster in Ukraine, or in Turkey… And you suddenly have the use case for ASR and machine translation. And you see audience expectations there.

A8 – MS) And you could put £1M into many languages and make little impact… But if you put that into one key language, e.g. Pashtu you might have more impact… We need to consider that in our funding and prioritisation.

A8 – GK) Yes, one disaster or event can make a big difference… If you provide the tools for them to access information and add their own typing of their language… In the case of, say, Ebola you needed doctors speaking the language of the patient… But I’m not sure there is a technological solution. Similarly a case on the Amazon… Technology cannot always help here.

Q9) Do you have concerns that translations might be interpreted in different contexts and be misinterpreted? And the potential to get things massively wrong in another language. Do you have systems (human or machine) to deal with that?

A9 – AF) I won’t quite answer your question but a related thing… In some sense that’s the problem of data… Unless we make it accessible, cite it, explain how it came about… data simply becomes authoritative. So we have large data collections being made available – BBC, BL etc. – and they can be examined in a huge set of new ways… They require different habits, tools and approaches than many of us are used to using, and different tools than, say, academics in the humanities are used to. And we need to emphasise the importance of proper citing, sharing, describing etc.

A9 – MS) I’d absolutely agree about transparency. Another of Susanne’s projects, Babel, is giving a rough translation that can then be amended. But an understanding of the context is so important.

A9 – GK) We had a query last week, in German, for something from Der Spiegel… It got translated to The Mirror… But there is a news source called The Mirror… So translating makes sense… Except you need that outside data to be able to make sense of this stuff… It’s really an open question about where that should happen and how you would do it.

Q10 – SK) So, a final question: What should ATI do in this space?

A10 – RC) For us we’d like to see what can be done on an SME level, and some product to go to market…

A10 – GK) I think that there are quite a lot of things that the ATI can do… I think there is a lot of stuff the industry won’t beat you to – the world is changing too rapidly for that. I think the University, the ATI, should be better connected to industry – and I’ll talk about that tomorrow.

A10 – AF) As a national institution we have a lot of data and content, but the question is how we can make sense of it… that large collection of data and content. The second issue is skills – there is a lot to learn about data and working with large data collections. And thirdly there is convening… data and content, technologists, and researchers with questions to ask of the data – I think ATI can be really effective in bringing those people together.

A10 – MS) We were at an ideas hack day at the British Library a few weeks back and that was a great opportunity to bring together the people who create data, who research etc. And I think ATI should be the holder of best practice, connecting the holders of content, academia, etc. to work together to add value. For me, trying to independently add value where it counts really makes a difference. For instance we are doing some Welsh speech-to-text work which is work I’m keen to share with others in some way…

SK: Is there anything else that anyone here wants to add to the ATI to-do list?

Comment: I want to see us get so much better at multilingual support, the babelfish for all spoken languages ideally!

 

Closing Remarks – Steve Renals, Informatics, University of Edinburgh

I think today is something of a kick off for building relationships and we’ve seen some great opportunities today. And there will be more opportunity to do this over drinks as we finish for today.

And with that we are basically done, save for a request to hand in our badges in exchange for a mug – emblazoned with an Eduardo Paolozzi design inspired by a biography of Alan Turing – in honour of Turing’s unusual attachment to his mug (which used to be chained to the radiator!).

Apr 12 2013
 

Today is the third and final day of the Lancaster Twitter & Microblogging conference. For more on the event see my Day One and Day Two liveblogs. Today there are only a few sessions over a half day so this will be a rather shorter post.

Firstly it’s off to:

Cracking the code: Towards a semiotic understanding of Twitter and its use by media fans by Rhiannon Bury

Since Twitter came on the scene in 2006 it has seen rapid growth, particularly since 2009. There are thought to currently (2013) be 500 million registered users, 200 million of whom are classified as monthly users. Smith counts over a billion monthly users of Facebook and Pew found 65% of US adults using Facebook and 8% using Twitter in 2011. Latest figures were 67% use of Facebook, 16% use of Twitter. Huge growth but still niche use. Twitter is just ahead of Pinterest which didn’t even exist in 2011. But it is being used by fans, so Nielsen has found – not sure of the process though – 20 million TV viewers tweeted whilst watching. I did some work on Television 2.0 research a couple of years ago and found most don’t tweet but that younger fans use Twitter more frequently than older fans (Pearson -1.26); more female fans (39.3%) using Twitter than male fans (30.9%), but that reflects the fact that women are more likely to take part in this sort of participative media.
I don’t have time to go through all of the literature but there is a lot on self-presentation here – goth identity etc. Ruth’s work is really useful here, Goffman is trotted out a lot. I wanted to conceptualize Twitter as a structure for communication, a socially orientated approach from Roland Barthes, and John Fiske’s work in television studies. I won’t have time to cover everything here but I want to start with the semiotics of Twitter, the framework for my work. So there are two types of relations: syntagmatic relations (horizontal axis) – this is about linearity; combination; addition; deletion (going back to Barthes here). Paradigmatic relations (vertical axis) – this is about selection and substitution. See Daniel Chandler 2002 etc. The difference is about the presence or absence of signifiers in the text. So we have ideas of denotation (first order of signification – Saussure) and connotation (second order of signification – Barthes).
So the tweet is about using or breaking the rules here. I will look at this as a visual and temporal paradigm. There are so many versions here but I’m sticking to the web version of Twitter. So, an example of a tweet here. One can be very bounded, with no relationship to others. But authorship is important, e.g. quotations. And the profile image, name, username and the importance of temporality – that date stamp. And we have that @ symbol which is built into the system to allow exchange (added to meet user needs/demands). We can see intertextuality here, communicative structures. Honeycutt and Herring talk about classifying tweets; they were very interested in the @ sign and conversation. But I think we can look at structure beyond that. The role of RTs, the time issue. I want to understand that text box as tweets. I want to understand them as a way to understand the secondary texts – the TV studies and media studies perspective. And that other type of secondary texts – the other fan writing etc. As Henry Jenkins points out there is always a set of relations of affect. There is an emotional connection to the text for the viewer. I want to look at some of those aspects of fan tweets here.
Microblogging and aggregation (it’s complicated). We talk about this stuff as microblogging but is it? This is not a paradigmatic relationship. There is a shared syntagm – all it really has is newest to oldest. We do that with email but Twitter changes how we interact and experience that – using it on mobiles and tablets that structure has far more impact, the software emphasises nowness and currency. All the aggregation tools emphasise these. And there is minimal intratextuality – not always connections, often loose. It’s a bit like images on your digital camera – there may be a connection but it’s not necessarily present or obvious.
So when we think about the Twitterverse… It’s central to the idea of microblogging but we can’t clap eyes on it. We only ever see a partial view. So Twitter has this idea of aggregation as connotation – but that is that partial and incomplete view. Twitter’s structure emphasises that… the homepage is that partial personal view. And that stream is on the right hand side, a very western left-to-right thing. But without us creating these posts there is no content, so that drafting button is prominent. There’s a quote here from Nancy Baym (@nancybaym) about how you can possibly pay attention to large numbers of followers. So, understanding these texts is about understanding context, ideology, etc.
I have done over 60 interviews and wanted to pull some quotes out. My fans emphasised consumption over production. So one comments on following everything posted by two people they are fans of / with. Another talks about multiple areas of fandom and engaging in that traditional fandom through Twitter. But there is discussion about the direct connection between fans and stars – I’m not suggesting this is all the time, but my fans really emphasise the importance of a really direct connection to key stars, to these representations of them. But we see hints of syntagmatic failure, a fan complaining that the structure limits understanding of news to the “now” so they use other media to find out what is actually happening. Going back to the Honeycutt and Herring 2009 idea of the potential of addressivity: that possibility of direct comment to the creators. That being able to address comments directly matters, whether or not there is a reply, and those replies have huge impact on those fans.
I do want to look at areas such as hashtags, reciprocity etc. For myself I found it’s giving more of a rigour to my own analysis. Twitter is important, not just describing its usage.
Q&A
Q) Could you further explain the idea of syntagmatic failure?
A) The main rule of Twitter is no more than 140 characters. So that really limits what you can put in there. That’s why Twitter conversations can be trivial. People get around it, they use abbreviations from IM etc. But what the fan is saying here is that for a fan who wants to know more about what’s happening with, in this case, Nathan Fillion, their tweets are not enough. That is a failure of what they say and do to some extent.
Comment) But is that a failure? Maybe that switching is a success?
A) Absolutely. I’m not saying that 140 characters is failure. But it’s that idea of needing to supplement in this case. But in larger picture it’s really value added.
Q) What about dialogue between fans and dotted line back to original object, adding things in?
Comment) Yes, bulk of what I see on Twitter is commentary, addition, etc.
A) Absolutely, this is one piece from user aggregation riff. But yes, we need to get into that, we need to think about user exclusivity. And that issue of viewing conversation. It’s absolutely central.
Comment) It makes it really complicated as time then splits. Not a single temporal line but multiple branches. And in terms of asynchronicity what about retweets? Some people retweeted a year-old tweet when I was presenting the other day.
A) Sure, the retweet does show the age of the tweet, but it shows it up in the way that you would read it as the perceiver of the retweet. Hypertext theory and intertextuality are really essential for understanding Twitter.
Comment) Top Tweets also warps the idea of age of tweets etc.
A) Yup and that’s a whole big ball of wax but I’m trying to factor that in.
Next up will be:

Use of Weibo by UK universities and Chinese students: A study of intercultural micro-blogging by Nick Pearce, Durham University (and Yimei Zhu, University of Manchester)

Nick teaches a lot of international students and works with recruitment areas of the university as part of his teaching work around social anthropology. It occurred to me that we could be using Weibo to market what we do, but I’m also interested in that idea of engaging with social media in a language you don’t speak but where you can sort of see what’s going on. I have been working with Yimei Zhu and she does speak Mandarin, so whilst I talk about not understanding what is being said, she does, and her research is very much around analysing exchange on Twitter.

So today I’m going to talk about this in the context of UK Universities and some exploratory work.

UK Universities and social media – the focus here is on marketing/recruitment. It’s a cheap/low cost means of interacting with multiple audiences. So when I came to Durham I set up a Facebook page for our courses and we get maybe an enquiry a week through Facebook. Not sure if they might have contacted us another way but they are able to. And similar idea behind our Weibo page. Although this would be an institution-wide Weibo page. That multiple audiences aspect does matter though. You can see a top five chart of Universities on Twitter – ranked by followers and retweets (http://craig-russell.co.uk/demos/uk_uni_twitter/index.html). Not a surprise that OU is biggest – it’s the biggest university in the UK so that makes sense. They have an active Facebook page as well – although that was not all good news. You might get 1 in 10 bad comments, but 9 out of ten challenging that comment. You lose some of that control but your “product” (though I hate that word) gets out there. The chart of followers drops off fast… For Durham our Twitter profile is busier than our Facebook page.

There are issues in follower numbers. There is the issue of fake/zombie followers (zombie is a Weibo term but it’s a lovely term). There was a hoohah when Yale joined Weibo, and they got a huge number of followers instantly. But analysis showed that there were only a few hundred followers that were actually active. There’s a suggestion that Weibo may have had something to do with those fake followers – Yale certainly didn’t pay for them. It’s easy to gather raw figures but they are not really a good measure. For me Facebook comments and enquiries count far more than “likes”. Even more so for the OU perhaps. Interactions matter here. There are alternative analytics – e.g. Klout, PeerIndex… but not much better than raw figures. It’s important to be careful and critical of numbers.

So what about China? Chinese students are a significant part of the international cohort – 79k in 2012. Facebook and Twitter are restricted in China (you can access them but only through dubious means, and they do get use). What makes that interesting is that those using Facebook in China tend to be more political – to go to that effort you need that. Weibo was set up in August 2009, has over 360 million users, it has restrictions but no one is being forced to use it. It is clearly popular and clearly became popular very quickly… regardless of whether we think many of those users are fake/zombies.

So looking at my Weibo profile it is quite Twitter-like. I follow people… but I’m not sure how I came to follow them. You can post in English here. And you have animated gifs – which I’d love to see in Twitter! But the interface is all in Mandarin. My colleague speaks Mandarin but uses no social media. I use social media all the time but don’t speak Mandarin… when she asked how to log out I found it faster – that’s a reflection of how like other sites Weibo looks. Other big similarities here. There are asymmetrical follower relations (like Twitter, unlike Facebook) and there is a 140 character limit. But in Mandarin that’s a lot! Ai Weiwei says that that’s a short story. But some differences here. There is ID verification – it is supposed to be your passport number but there’s a suggestion that that’s not a rigorous checking process. Censorship is rather opaque, you only really encounter it by gaps and absences. But remember that Twitter is moderated – you get taken down or reported for some actions. Thinking back to Lee Salter’s plenary we saw people jailed for tweets. There are differences there but also more similarities than first apparent maybe. And you have other differences: animated gifs but also gamification. So I am “level 1”. In Twitter it’s maybe about follower counts, in Weibo you get to new levels and you get a special patch/badge. So Weibo may have started as a clone but I think it’s gone beyond Twitter in some features.

So I went to my boss, the head of the University. I wanted to look at UK universities on Weibo. There is no table of these. So Yimei did a manual search using the HESA list of universities and various search terms. I talked about ID verification but there is also verification of pages, a whole other level. 58% had presences; 43% had verified presences. Posts are mainly in Mandarin, some in English. So, we did set up a Weibo page but verification is tricky from the UK. I got sent a very tricky Mandarin form, with no indication of who should sign it. They wanted an official stamp, and that had to be in red. It takes weeks… So we have a page, tweeting going on – mostly retweeting comments about Durham.

Looking at and understanding Weibo when you don’t speak the language… it’s odd. Twitter isn’t global. Weibo isn’t global either, although expanding and just launched a Thai version. No reason that Weibo couldn’t launch an English language interface and have that take off. People are happy to sign up to Apple and hand over power and choice to some extent. I’m not saying that will happen with Weibo but seeing that other cultural context lets you look at these things in a new light.

Q&A

Q) The counts on your profiles are different on Twitter and Weibo. Any of those numbers can be normalised in some way. APIs for both will show you more detail of that data.

A) Yimei has been looking at content and interactions and she’s been noting changes, interactions and the role of time.

Q) Your comparison of censorship – I don’t think Twitter is that harsh.

Comment) There was an article this morning, in Hong Kong there are removed Tweets and censored Tweets. Also papers on censorship of Weibo, based on large data set.

A) The Chinese Communist Party controls the broadcast media. In their response to social media, there are some who fear social media… You can censor afterwards but you can’t stop people tweeting. One of the founders of Weibo, a private company, was making a democracy point.

Me) Different types of censorship: political in China, commercial on Twitter – much more about brands etc. Now that may have different impact and ethical implications but those are both forms of censorship.

Comment) Yes, I think so. In China censorship really isn’t a line here though, it’s never clear what is/is not censored. Sometimes things appear to have been censored relatively at random.

Comment) Weibo functions because the government lets it and it works within its mandate. Twitter chooses to censor.

Me) Yes, but Twitter is not just making choices, it also comes under government pressure to censor – they have censored tweets in the Middle East after pressure from some governments, and they were also pressured to censor and restrict during the London riots.

A) I find that feature of social media very odd: people like Pinterest ignore copyright law and sort of reset the rules in how they run, having ignored those rules. Would YouTube have taken off if all non-cleared video content had been removed/censored?

And with that we have to finish a really lively wee Q&A.

Plenary: Saying ‘sorry’: Corporate apologies posted on Twitter by Ruth Page

Johnny is introducing Dr Ruth Page and mentioned her book Stories in Social Media, and her article on Self Branding and Celebrity in Social Media. Last year Ruth organised an event at Leicester on social media which was a particular inspiration for this week’s event. Ruth is also Chair of a new special interest group in Linguistics. David Barton, Johnny Unger and Ruth are currently coauthoring a book on researching social media.

I am a little ambivalent about going last but at least I should get the last word (ish). So my talk is on “saying sorry”. I’m going to start by making some opening remarks on Twitter, why it is significant to corporations, a bit about the data set. And I need to make an apology myself – my section on corporate “talk” is not about apologies per se but contextualising apologies. Then I’ll look at approaches to apologies, characteristics of corporate apologies, and the application of linguistics. There are some interesting potential approaches coming from our highly varied backgrounds and disciplines here.

Twitter is a public, participatory environment (Jenkins 2006); a virtual marketplace (Bourdieu 1977); driven by the value of attention and visibility (Marwick 2010). There is direct access there – complaining to Starbucks or an airline, say – but that isn’t evenly distributed power. In 2012 I argued that it works like a virtual marketplace and that that is around attention and visibility – reach of tweets, scale of followers, influence, etc. But that’s not the only way that attention and visibility show up. They also show up in linguistic choices on Twitter, how people shape interactions to those affordances, and it’s an opportunity to see how those inequalities and hierarchies work.

How is Twitter used? It’s electronic word of mouth (Jansen 2009) – and that matters to companies, they mine that all the time. 51% of users follow users/companies (Edison Research 2010). But there are different types of Twitter accounts, there are corporate and/or personal accounts. But I’m interested in corporate and branded accounts. You also see distinct accounts for specific purposes, e.g. customer care accounts.

So, where did I start to get interested? My data isn’t designed in response to a specific research question; the research has evolved organically from my work over the last few years. My data is based on around 180k tweets harvested from 100 publicly available accounts, using custom Python code pulling data from named accounts. Firstly I was comparing company use with “ordinary” use – although “ordinary” isn’t really the right word here. People said “you haven’t looked at hashtags”, and so I did. So I started to look at the very different corporate use of hashtags. I had 40 companies, 30 celebrities, 30 “ordinary” accounts. I gathered data in 2010 and 2012. And today I’m talking about around 1200 tweets with apologies in them.
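
[Ruth doesn’t show her harvesting code, but for the curious, here is a minimal sketch of pulling tweets from named accounts in Python – my own illustration, assuming the tweepy library and the Twitter API of the time; the credentials, account names and limits are placeholders and API versions will vary.]

```python
# Minimal sketch of harvesting tweets from named public accounts into a CSV file.
import csv
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

accounts = ["example_company", "example_celebrity", "example_user"]  # hypothetical names

with open("tweets.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["account", "created_at", "text"])
    for name in accounts:
        # Page back through each account's timeline, as far as the API allows.
        for status in tweepy.Cursor(api.user_timeline, screen_name=name, count=200).items(1000):
            writer.writerow([name, status.created_at, status.text])
```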

When I harvested the data I wanted to distinguish between updates, things that were public but with an @username, and the RTs. It doesn’t take into account quoting or MTs etc. as those are newer practices. So I was interested in the distribution of those types of tweets. In 2010 all types of users favoured the one-to-many broadcast pattern (the update) – what does that say about identities and how individuals manage their interactions with others?

So how do companies use Twitter updates? There are interactions initiated by the company – pushing things out; broadcasting the brand – through hashtags; broadcasting across platforms – link analysis; broadcasting conversational snippets – modified RTs (with less occurrence in new-style RTs, which aren’t covered here). So looking at the occurrence of hashtags we see that hashtags occur in the updates, they are in the one-to-many not the @reply one-to-one posts. And the use of hashtags is increasing. But this is odd, hashtags started off as a folksonomic phenomenon to allow your topic to be promoted and found. Twitter changed their search algorithm so that you can find topics easily BUT the use of hashtags is increasing over time. The most frequent hashtag across all accounts is #FF. But digging further, corporate hashtags tend to highlight products, corporate positioning, or making the companies searchable as producers. Whereas ordinary people’s hashtags seem to reflect the community – mainstream media, consumer interests. Yes, Twitter gives ordinary people a voice, but they are still positioned as audience, as consumers when you look at those tweets.

I also looked for links in tweets; corporate accounts use many more links and their use is on the rise. There is a rise of the amplified talk. Originally in 2010 I saw links as ways to signify authority, as recommenders, as endorsers. But different things are happening now. The general trend in links – Twitter is more multimodal – photos and videos are increasingly important. And Twitter is increasingly multi-platform – Facebook groups, Google Plus, Pinterest, Tumblr, Instagram, DailyBooth, VintageCam, YFrog, WhoSay, mobile apps. Posts to multiple sites or connecting sites. Images are used to indicate products, what they are selling. But that’s not all that’s happening. In 2010 ordinary users tended to share clear links to articles in their field, their own blog – you could tell what their profession was. Corporations point to their own websites, promotional offers. A real collapse of professional/personal is now taking place. Now ordinary users point to some professional identity links, but also general life, photos, interests (e.g. fashion etc). Corporate use is a little different, some corporate professional links, but also sharing of images by customers/users, of experience images etc.

The last sort of tweets being shared are modified retweets. You see that for celebrities their use is declining. Their use is slightly declining for ordinary users. But their use is increasing by corporations. And how does that happen? It’s about sharing compliments, feedback, things that promote their brand. They show they are engaging but in a very specific and careful way.

When I look at the distribution of tweet types in 2012 there seemed to be few changes, but corporate use is radically different – many more addressed messages. Why is this happening? Up to this point I had looked mostly at updates, so it was time to explore addressed messages. I started with concordance techniques from corpus linguistics. I looked for the words that appear much more in just those addressed messages compared to all the other messages. I used the remaining dataset as the reference dataset, addressed messages as the sample. And certain words appear much more often, such as “hi”, “thanks” etc. They don’t just occur often, they often occur together. For instance “@username Hi [name], sorry for your frustration. Please follow/DM us additional details regarding this and we can try to help. Thanks.” So we are seeing the rise of customer care here. But it’s not just corporates who apologise…
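
[A rough sketch of the concordance-style comparison Ruth describes: which words are over-represented in addressed messages relative to the rest of the corpus? This uses a simple smoothed frequency ratio on toy data, not necessarily the keyness statistic she used.]

```python
# Which words are over-represented in addressed (@username) messages
# relative to a reference set of all other tweets?
from collections import Counter

addressed = [
    "@user hi there sorry for the delay thanks",
    "@user hi sorry to hear that please dm us thanks",
]
reference = [
    "new offer on our website today",
    "great day at the festival",
]

def relative_freqs(tweets):
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

sample = relative_freqs(addressed)
ref = relative_freqs(reference)

# Words with the highest ratio are the "keywords" of the addressed messages.
keyness = {w: sample[w] / (ref.get(w, 0) + 1e-6) for w in sample}
for word, score in sorted(keyness.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(word, round(score, 1))
```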

Difficulty of apologising…

  • Reluctant apologies – [cue Big Bang Theory clip of Penny reluctantly apologising to Howard. And being told to get over herself by him. Then him bursting into tears].
  • Punk apologies – [cue music video]
  • Politicians apologise – [cue Nick Clegg Apology Song video]

Even cats say sorry… even Whales say sorry when Twitter is over capacity…

Apology as a “post event speech act” (Spencer-Oatey 2008) – I’m following this understanding. This is recognition of something going wrong, acknowledging that, and reconciling parties. It enables future interaction and restoration of equilibrium (Ogiermann 2009). But the research literature in linguistics tends to be about private apologies; there is a need for more work on public apologies, and on apologies in large corpora.

There is huge use of “sorry/apologise” here – aggregated data for both (and Americanised spellings) shows huge use of these terms in corporate tweets. Semantic components of an apology are based on Blum-Kulka(?) in 1989. Semantic components:

  • Illocutionary Force Indicating Device (IFID) e.g. we’re really sorry
  • Taking responsibility
  • Explanation or account
  • Offers of repair
  • Promise of forbearance – not to make same mistake again.

In the 1200 tweets there was only one case of taking responsibility, and only one case of promise of forbearance, both were in ordinary accounts. Maybe commonsense reasons – liability, appropriateness, responsibility or role of person speaking.

You see lots of Illocutionary Force Indicating Devices. But there are very different approaches. Companies avoid restating the problem in 66% of apologies. The reverse happens in ordinary accounts – 58% of their apologies. My favourite of the apologies was “I’m sorry for the slugs in your strawberries”. It is good to acknowledge what you are apologising for but that is very risky, you risk raising the profile of the issue or validating etc.

10% of corporate apologies give an explanation, 27% of ordinary users’ did. When companies did explain their apology they shifted blame: denied the offence – telling the user they got it wrong; placed blame with a third party; or pointed to factors beyond the company’s control (e.g. legal requirements, weather, etc). And on the rare occasions companies do accept responsibility they do that in a very specific way. They use linguistic constructions that make it very hard to see responsibility, e.g. “sorry for the ongoing issues caused by the Booking Office closure, there is a staff shortage in the area and we are working on it”. You need to show yourself in the best light is the theme here, and a good way to do that is to make offers of repair. When a company does that it happens in a very specific way, with an awareness of the multi-party nature of the interactions. The offer of repair tends to be something monetary or tangible – but it’s not the tweeter doing that.

So these apologies are embedded in wider interaction. You see this in the way that questions occur in corporate apologies. 22% of corporate apologies and 13% of ordinary apologies include a question. Another aspect is the use of the imperative: they are telling the customer or giving a command. It happened in 33% of corporate apologies, not at all in ordinary apologies. So often further contact is initiated by the company – e.g. “standby for a message” – or further contact is required by the customer – e.g. drop us an email. That latter type is often hedged. But these are risky, they don’t close the loop, they risk the customer not responding, following up etc.

Openings and closings tend to be quite specific. Companies tend to use “Hi” and end in “Thanks” and a signature. Ordinary people do not. 37% of companies include a signature; none of the ordinary accounts do. More interestingly perhaps, 19% of apologies posted by companies include greetings, that “hi”, but again no ordinary accounts do. They seem to be trying to build rapport, but that marks them out as different from ordinary users. So companies using this strategy mark their social distance, and show structures derived from email, not from conversation. But you do see alternative openings. Discourse markers (5% in company accounts, 27%(?) in ordinary accounts) – several flavours, so they can, for example, be associated with expressiveness, e.g. “Oh, I’m sorry” etc. Emoticons are also used to intensify negative sentiment or to upgrade positive sentiment – for offers of repair say. But sometimes it doesn’t match well. Some mismatch of negative responses – a smiley to mitigate a negative response. And sometimes it’s about promoting rapport (especially in line with future interactions).

So, what does this mean? The reason companies apologise in this way is to avoid face-threatening damage to reputation – e.g. avoiding restating the problem. Mitigate face-threatening damage… [sorry, couldn’t keep up there]. So the implications? Apologising is an important strategy in use as part of customer care. There are repeated, distinctive patterns that suggest a particular genre shaped by the purpose of interactions and the positions and roles of participants. So, the application here? Well it’s interesting in its own right, you see the patterns you may not otherwise see. But the challenge for myself is how do we use the work we do as linguists in a way that helps other people. Well, one of the things I’m doing is talking to a London company who are creating social media analytics software, to create customer care software to make this work better. Perhaps not always pushing interactions into other spaces, closing loops, showing responsiveness etc.

Final thoughts – there are obvious limitations here. I haven’t taken into account participant perspectives; I haven’t looked at whole interactions, just the apologies – it’s not the whole interaction; and a small number of accounts were considered, not necessarily noting location and cultural differences between the UK and US, say. Possibilities though – what do you want to do with your work? Where can other perspectives be useful?

Unfortunately we finish without time for questions. I wanted to ask whether the rise of hashtags didn’t reflect the adoption and maturity of use by companies, or the use of hashtag campaigns. Hashtags also create links, unlike search terms, so have added value. And I wanted to ask about the issue of collapse – there has been a Twitter corporate strategy to boost use by media, by celebrities, as part of advertising campaigns, all of which encourage collapse. Wider use and adoption of Twitter beyond professional spheres also has a big impact on collapse here, on the types of interactions, on the merging of followers etc.

Apr 11 2013
 

It’s day two at the Lancaster Twitter and Microblogging Conference. As I did yesterday I will be live blogging the parallel sessions I attend, the plenaries, etc. But much of the conversation around this event is to be found on Twitter on the #lutwit hashtag.

Factors influencing academics’ use of microblogging tools in teaching and learning by Nordiana Ahmad Kharman Shah

Nordiana’s research looks at the use of microblogging and the factors in that use and the complex issues around adoption and use.

My work specifically looks at the use of Twitter and particularly how Twitter can increase or enhance teaching and learning (Dunlap and Lowenthal 2009); classroom use (Junco et al 2011); student use (Wakefield et al 2010) etc. A qualitative study of academic tweets (G. Veletsianos 2011) found that scholars engaged in education and sharing best practices, used it as an information resource and for media sharing, to have an online presence etc…

But Selwyn (2011) highlights some issues around social media use: the assumptions about omnipresence/ubiquity of internet access and the digital divide; the belief in the “digital student”, that students will explore and use these tools, but not all will, and some may struggle with use; and there is unclear discussion in terms of social media in relation to learning and teaching contexts.

The literature here is growing but evidence still lacking. See Reuben 2008 for the potential in education for Facebook and YouTube but Twitter hasn’t yet found the right niche.

My first research question here is around the different ways that academics use Twitter. This will be investigated through a qualitative approach to obtain detailed understanding of the Twitter use of academics. The research sample will be academics in HE with a Twitter profile and regular microblogging activity (defined here as once a week). There will be a thematic analysis using the Twitter API. The theoretical framework for this work is practice theory – mapping academics’ ongoing interactions as revealed through recurrent practices, the concept of the “practice lens”; and academics’ practices on Twitter – which may be conceived as a continuum in which activities dynamically change influences. The research will begin with interviews, transcribing and analysing three of these; then observation of Twitter (300 tweets); and observation of Twitter also using sentiment analysis (of those posting).

Interviews will have thematic analysis (inductive and deductive). The observation of Twitter posting will have qualitative content analysis, a deductive approach. Both will be combined to gain a fuller understanding of use and factors. Sentiment analysis will also be applied (see Pang and Lee 2008).
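
[Nordiana cites Pang and Lee (2008) but doesn’t name a tool, so purely as a stand-in here is a minimal sentiment-scoring sketch using NLTK’s VADER lexicon on made-up example tweets – my own illustration, not part of the study.]

```python
# Minimal sentiment-scoring sketch using NLTK's VADER lexicon.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyser = SentimentIntensityAnalyzer()

tweets = [  # illustrative examples, not real data from the study
    "Really enjoyed teaching with Twitter in today's seminar!",
    "Frustrated that the conference wifi keeps dropping.",
]

for tweet in tweets:
    scores = analyser.polarity_scores(tweet)  # returns neg/neu/pos/compound scores
    print(scores["compound"], tweet)
```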

There are some categories of activities which academics claim they use Twitter for. Conversation, for instance, has been selected by them; I have defined sub-themes of: question and response; for opinion; for update activity. Information and sharing breaks down into research/publication; quick information; links to information/news (retweet). Engagement breaks down into student; research community; public. Connecting and networking breaks down into: professional; social; community; research collaboration; seek opportunity. Identity breaks down into: professionalism/profile; online presence; self-presentation. Learning and Self-promotion are also sub-categorised here. I have also categorised tweets into status; conversation; sharing resources; social.

So the discussion is about so many differing roles and activities. There are many faces of the “new” academic – a real balance to be struck around all areas of role, public engagement and impact agenda, and of course teaching, research, and opportunities.

I have created a research model of factors influencing academic use of microblogging – academic identity; technology affordances; public engagement, etc.

Q&A

Q) Monica Lalanda, doctor in US: was it hard to find that number of academics using Twitter?

A) At this point – at the beginning of the research – it was hard to find academics using Twitter. But this summer it became clear that there were far more academics on Twitter. I think this is a good time for this research. I’m not sure about the percentage of usage at the moment. Some use it mainly for teaching. Some use it for publishing only. Academics are starting to engage on Twitter. There is lots of training and promotion around use of these tools.

Chair) There are some studies in the UK of use of Twitter by academics – see LSE Impact Blog.

Comment) I’m sure there are a lot of prejudices, many don’t see the potential, are concerned about the timing.

A) One of my interviewees is a doctor who is very happy to be able to update colleagues and patients on what he does. He has gained patients through his use of Twitter. But I was quite surprised at his usage. At the beginning I was quite surprised at this use, but he is engaged in the community and the research community, and he found organisations and media have asked him for views because of his presence.

Q) Sentiment analysis – can you say a bit more?

A) This is a new area in some ways. Analysis tends to be on content rather than sentiment. I want to explore what they are saying on Twitter and how that relates to what they actually feel, what that relationship is. Many of the academics I interviewed don’t want to enter arguments on Twitter, they don’t want to impact their own or their institution’s reputation.

And we had a diversion there about the backchannel and tone… (not appropriate to amplify but challenging) back to the questions though:

Q) Are there particular characteristics of how academics tweet compared to how others tweet?

A) Interesting question. There are real contrasts between different academic Twitter users.

Chair) Offering to share some sentiment analysis work on corporations.

Whose piper and whose tune? Discursive practices in informal learning events on Twitter by Peter Evans

This piece of work looked at the phenomenon of tweet chats on particular professions and interests, usually regular and they vary a lot. e.g. lrnchat; innochat; edchat; PhDchat. The topic is usually selected in advance and the actual tweeting is within a time limit, usually 60 to 90 minutes. All organised around the hashtag. There are some people who always attend at the core here. Some teachers described these as their main professional learning activity.

I wanted to explore how professional practices are being “talked into being” in discussion events held in an open online environment; I particularly looked at Human Resources professionals. I looked at three events on hashtags here over three months. These vary in how many participants attend (between 54 and 72); some had 10 tweets per minute, some less. But this stuff isn’t easily constrained. Schneider and Foot (2005) describe this issue of web spheres – the bringing in of other resources, chats, tweets, etc. So looking at a blog about the event provides additional context. One participant decided to recontextualise their Twitter contributions in a blog post. And you see comments that there are people outside of the chat who follow up, ask questions, and a blog post has been used to address that. And there is an example here of someone correcting themselves – for their own happiness as much as for those who may see the tweets.

So these events are hugely expansive, you have to cut the cloth as best you can. I focused on the event archive and then started to explore outward until I reached the point where time no longer allowed. I approached the data in terms of discourse analysis, using a division that Heracleous (2006) came up with. This splits discourse into communicative action; discursive structure – this is the use of shorthand etc.; and generating common meaning – a shared understanding of human resources here which was required to join in, to be part of this event. A particular problem here was “what is human resource development?”. It’s really not well understood at all, with a real split and shorthand in academia around the US approach (v. corporate and profit orientated) vs the European (self-empowering ideas). But that doesn’t work; McGoldrick suggests a hologram metaphor here. But as a professional body of practice there is no standard approach, it’s fluid and contested. There is no manual. But early stage researchers, professionals, customers etc. expect there to be a standard and professional approach.

It did prove very difficult to make sense of this all. I decided to draw on ANT and translation, different perspectives on the hologram of practice, to try to develop networks of people who agreed on translation. Those network assemblages reinforce that idea of professional structure.

I actually started off using the structure of these chats, using a structure designed for unstructured group work in classrooms. So here we see an initiation, then a string of suggestions or propositions to get conversation going. One suggestion dies out fast, another gets limited interaction, another gets little, another becomes complex and connected… it becomes hard to trace. And things migrate off into discussions entirely unrelated to the chat but using the same hashtag. Indeed you sometimes see tweets asking for help in dealing with what’s going on, what that structure is. Simpson (2005/6) talks about conversational floors. A retranslation of a topic allows the discussion to flow. Many of these suggestions and propositions, with different levels of success in the example I’ve shown, actually came from a single individual. But capture of the conversational floor has to be accepted by the audience; it’s a two-way thing, not just a projection of power in itself.

There is some thematic structure that comes out of these discursive networks. And some interesting behaviours. A couple of examples here of participants dismissing topics. Kirkpatrick is a widely used model in this field – 98% of businesses use this model – and yet these Twitter users mock it or treat it as a drinking game. Similarly classroom off-the-job training is treated as an irrelevant, old-fashioned, dehumanising practice. It is made to look ridiculous through the sharing of images of Victorian classrooms.

So I came up with these three areas of discussion here: change – and change as being in deficit because of management, “the business”, and Human Resource Development. But this mix was constantly reassembled and changed depending on who was speaking and what the topic was. So, the symmetry thing. The people organising and engaged in the chat recommend the use of a third-party app, like TweetDeck. So you get a multi-column view and the same tweet can reoccur in separate places. For participants these chats are seen as part of a much wider community of practice which they are involved with. So they have a single column for the event but eyes on other things… and that all starts to merge as they take part in the event. So the technology changes how that event is consumed. It appears potentially to have some impact on how that discussion is shaped, which utterances are privileged by which users.

So, the piper and the tune… it changes all the time. It gets redefined during the events and over multiple events. The hologram is both restrictive and expansive – you have to see it from one perspective, but it has to be seen from multiple perspectives.

Q&A

Q) Norreen Dunnett: Did you get a sense around these hashtags that they felt like a community or was it more permeable than that?

A) I think there is a core group that sees themselves as a community; they see each other as different from others, e.g. in introductions they present themselves differently, and they try to capture conversations more often. Others are drawn in quite a bit. The core group are very similar and there is overlap between several hashtags (we looked at several) and different dynamics on different days and at different times – some opportunity for social network analysis here. A slightly weird mix of community and permeability.

Q) I was really interested in your comments around the difficulty of sense making, of constraining data collected, of knowing how to deal with archives etc. But I don’t quite get the hologram metaphor – can you say a bit more?

A) McGoldrick talks about it as a reconciliation of contradictions around this. The idea of HRD as a management and business discipline, about growing or improving the business, but at the same time it has the aspect of development, a discourse about developing employees, about learning and the empowerment of individuals. The hologram idea recognises all of these elements. Whereas someone like Monica Lee says that we just shouldn’t even try to reconcile those factors. The metaphor is about looking in and seeing what you want to see, but also having the capacity to see other things – that by shifting your gaze you can see those elements.

Q) Karin: Could I ask you a little bit more about the use of ANT and what it added in terms of your conclusions?

A) It’s a good question. I’m not sure what it adds to be honest, other than giving a loose framework to hang ideas off. The idea of network assemblage works well, the idea of entry criteria (the Kirkpatrick issues), but also the symmetry of the material. I did some work on community managers and the relationship between a Twitter chat event and a presentation – they moved from a cynical position to enthusiasm through the chatter. But this work is part of a wider piece of work where ANT will be used to some extent.

Plenary: Working and playing on science Twitter by Greg Myers

Julia is introducing Greg and referring to his book, The Discourse of Blogs and Wikis – it visibly perks up her Understanding Media students apparently!

There is, btw, a real life handout! https://pbs.twimg.com/media/BHkEFg_CEAAG2Gu.jpg:large

I want to look at 8 tweets. This first is an April Fools’ joke about Twitter: the Mars Rover behaving like a celebrity on Twitter, sulking off Twitter. So I want to talk about different kinds of Twitter feeds, different communities, different behaviours etc. Much of the quantitative research we do and cite looks at one big stream of data without any reference to differences. But I know many of you here are focusing on qualitative aspects. Back when I wrote the book that Julia mentioned, blogs were all being treated the same, and that seemed so lacking in understanding of their distinctions. So I want to basically ask… is Twitter really one genre?

Looking across the programme we have use by the Lords, tech companies, the EU, academics, cricket fans etc…. can that be one genre? These communities use Twitter for different activities. They do develop different kinds of texts, for instance more or fewer RTs, URLs, replies, even if they use some of the same features such as conversational particles. What are these differences, and do they matter in our research?

So, why study Science Twitter? Well it’s one community: research scientists (there are other people who tweet about science of course). They are a great community as they have been good at networking since the 1660s. But they are an odd community for Twitter as they work in a system that rewards formal publication, so there is a divergence there. Networks of texts, but also samples, people, skills, equipment, methods, money (see ANT). And scientometrics: links made by citations – impact comes from the idea that science is measurable in these ways.

There are two themes from Science and Technology Studies I’d like to bring in. Firstly the heterogeneity of scientific networks – texts, materials, equipment, skills, publics, money (ANT; Callon, Latour, Law). And secondly the rhetorical tensions around the empiricist repertoire – the impersonal and timeless claims of the formal literature <missed ref>.

But there is a huge volume of prior work on scientific texts, and those working in science and technology studies would say that what I’m saying about Twitter is really not new. So if we look at a quote from Henry de la Beche to Adam Sedgewick in 1834, it’s a very tweet-like exchange… BUT it took place in letters and only became public when Rudwick published it in 1985. And that is the difference.

So in my handout you will see sample tweets. I have a corpus of tweets from scientists, and a comparison corpus of other tweets. See thelanguageofblogs.typepad.com. You might run concordance and keyword analyses on such a corpus for topics – and here you will find keywords around science, research papers etc. But there are also other keywords that are telling (a rough sketch of this kind of keyness comparison follows the list below), e.g.

  • But also “I” (less elision) – subjectivity indicated, compared to other tweets
  • “of” (more complex NPs) – more like other science writing
  • “but” (concession structures) – perhaps this is what makes an academic an academic!
  • “may”, “maybe”, “some” (hedged statements)
  • and a negative keyword: “love” – they just don’t use those sorts of terms, they evaluate in different ways.
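
Purely as an illustration of the kind of keyness comparison described above – not Greg’s actual method or data – here is a minimal Python sketch using Dunning’s log-likelihood, assuming two plain-text corpus files whose names are hypothetical:

```python
# Sketch: keyword ("keyness") comparison between a scientists' tweet corpus and a
# reference corpus, using Dunning's log-likelihood. File names are hypothetical.
import math
import re
from collections import Counter

def word_counts(path):
    with open(path, encoding="utf-8") as f:
        return Counter(re.findall(r"[a-z']+", f.read().lower()))

sci = word_counts("scientist_tweets.txt")   # study corpus (hypothetical file)
ref = word_counts("reference_tweets.txt")   # reference corpus (hypothetical file)
n_sci, n_ref = sum(sci.values()), sum(ref.values())

def log_likelihood(a, b):
    # a = frequency in study corpus, b = frequency in reference corpus
    e1 = n_sci * (a + b) / (n_sci + n_ref)
    e2 = n_ref * (a + b) / (n_sci + n_ref)
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll

# Rank words that differ most between the two corpora ("I", "of", "but", "may"...)
keywords = sorted(sci, key=lambda w: log_likelihood(sci[w], ref.get(w, 0)), reverse=True)
print(keywords[:20])
```

Ranking by log-likelihood surfaces the distinctive items; comparing relative frequencies as well would separate positive keywords from negative ones such as “love”.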

So there are empirical grounds for seeing this community as a distinct community, but they also present themselves as a distinct community as well.

So, onto Tweet two! These tweets are at 10.30 at night – a “solidarity check in” to ask if anyone else is still working. Gets very playful responses. So this is “phatic communication” – that is communication solely for the sake of contact, not sharing information:

  • “still” implies this is late
  • “#GoTeam” parallels this to other shared projects (Merkhofer, Zappavigna)
  • “#ThursdayNightScience” invented for their shared activity
  • “Woot” – online gaming term
  • But where is “here”? Well, they are not in the same geographic place. The obvious inference is that everyone is online, but the less obvious inference is “in the lab” – and this idea of solidarity.

And now to turn to Tweet Three. This is a series of tweets about a fieldtrip. There are two kinds of time in these tweets. From inside: “headed out” (current action), “my spring break” (current period defined in terms of work), projected future contacts. And from outside: “early to mid Pleistocene” etc. Once he goes out he has limited mobile reception. We see a few images shared, but of his campsite at dawn – not his work, but a sense of that experience.

Now to his return… tweets about a late flight. So more complex time. Present moment looking back to immediate past (nap) and forward to immediate future (this week’s lecture) and with very few verbs.

So, why do time references matter? Well, they present themselves as a community sharing norms, focusing on work; they have the shared ideas of terms, of routine, of publications. This is what everyone else does on Twitter, but these references to time are a different representation of science work from that in journal articles or popularisations.

But there are more distinctive things here. Science tweets link A LOT and comment on those links a lot. So here we have an example of something of a takedown of @drphilhammond’s tweet about children and screen usage. A response comes back with a citation. Another response comes from an experimental psychologist criticising the sampling basis, with a link to a blog post. And others join in and say it was a mistake that that original cited article was published.

So the first post takes for granted that stating those two facts will be uncontroversial. A response uses irony to criticise. Another response gives a textual citation (not a link). A further response questions the cited work using a link. So reference does not settle the matter (unlike Wikipedia). And there is a real sense of reaction to this media person representing medical science (representation regress? – Collins).

Onto tweet 7, a chat about bafflement. It’s a kind of criticism. Sort of self-criticism of her own bafflement with technical term and time of day. But suggests something else to criticise there. See also LOLCAT on term “Thermodynamics” also representing confusion.

And our final tweet, to the tag #chemophobia. This is about expertise. So, an urban ecologist and science outreach person in Ohio. She doesn’t know about organic chemistry but has read something on a food blog about chemicals. She used the #chemophobia tag – used for fear of chemistry going wrong etc. rather than for toxicology – and asked for someone to fact check. There’s a delayed response but eventually she gets a reply on whether this reaction occurs, does it exist, etc. It comes back with a paper. Then she responds saying “but both chemicals are in apples, naturally occurring”. So I’m interested both in the intersection here between science and non-science, but also between science and science in an area other than your own. So the original questioner is not a chemist and raises a question that acknowledges that. A (non-addressed) chemist responds with a brief unsupported evaluation, but she comes back with her own analysis of that.

So… two kinds of evaluation across these samples. There is personal stance (e.g. “Yup”, “7am”, “wildly wrong”). And there is impersonal reference to shared norms of methods, citations, rhetoric, publication – the idea that critiquing an article that breaches norms settles the issue. And also hierarchies of authority worked out for present purposes in the exchanges – this is where specific mentions become so interesting, the idea of certain individuals as authoritative sources.

The implications for science studies cover two themes: the embedding of science in everyday life; and the rhetorical application of norms of evaluation to texts. Non-scientists would get a useful (and entertaining) view of science by reading these tweets. It contextualises science in everyday life and work. But that message probably doesn’t get out. I was struck yesterday by the idea of a bubble around the European Commission. I think there is some of the same here: the tweets tend to go to scientists, science communicators and science educators. There is a large and somewhat sealed-off world here. Almost no replies from outside. And many of the tweets are concerned with boundary work. They both open science up but also maintain its distinctness, its inaccessibility.

But other questions here… If you do see this as a community, how does it compare to other communities studied here? This is a community that thinks of itself as a community. How distinct is the genre? Do they use affordances differently to others? Do they have different practices, or simply a different register – different words because they talk about the same things? And how does that relate to other kinds of practices? There is no point studying this in detail unless it relates back to other things. The people I look at are teaching, writing articles, doing outreach. Their time patterns fit that. Other communities have different time cycles. In my sample I chose widely followed people, between 3000 and 10,000 followers. For most of us that sounds pretty good, particularly if you are tweeting about obscure aspects of astrophysics. Their authority on Twitter is about what they do there. Some are authoritative in their field but many are not as influential outside of Twitter. And finally, how permeable are the boundaries of the community? Not very, perhaps. Some other communities may reach out more, particularly in terms of followers and retweets.

And finally a picture… a fluffy toy and his genetic biologist lookalike.

Q&A

Q) Ruth: My question is simple and small. You spoke about the corpus, what was your reference corpus?

A) I chose a reference corpus from another specific group, rather than all of Twitter. I chose 10 scientists, and a reference corpus of 10 others tweeting on very specific topics (similar numbers of followers, but all tweeting on differing topics). Roughly comparable. For the most recent samples I used the same people.

Q) Me: You talked about the exclusivity…

A) For some they are. One is into women in science and feminism. Another is into hip hop (and science), so lots on music. So they tweet on other topics but they seem to have an exclusive type of engagement and response on their science tweets. I suspect that many of them have lots of followers because of those other interests but that mixture of interest isn’t represented by crossover of audiences responding.

Q) How do you define “community” if at all?

A) These individuals refer to it as a community, e.g. “ThursdayNightScience”. Now at conferences they are very specific, but the community they refer to here is all about belonging to this giant scientific community – which doesn’t really exist elsewhere and doesn’t respect traditional hierarchies. Quantitatively you could look at the links between them to see a nice graph.
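
As a hedged illustration of that quantitative point – turning mentions into a graph – here is a small sketch; the tweet data structure and user names are invented, not from the study:

```python
# Sketch: build a mention network from tweets and look at who is most connected.
# The tweet data here is hypothetical, for illustration only.
import re
import networkx as nx

tweets = [
    {"author": "user_a", "text": "Great point @user_b! #ThursdayNightScience"},
    {"author": "user_b", "text": "@user_a @user_c agreed, still in the lab..."},
]

G = nx.DiGraph()
for t in tweets:
    for mentioned in re.findall(r"@(\w+)", t["text"]):
        G.add_edge(t["author"], mentioned)  # edge = author mentions user

# Degree centrality gives a rough view of who sits at the centre of the community
print(sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1]))
```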

Q) Sean, Lancaster: You talked about the boundary work around expertise. But could you see these as breaking a boundary, reaching beyond expertise to others, as with TV shows and spoilers in Jenkins’ Convergence Culture?

A) Maybe it’s not about hierarchy. It’s not about this person having a right to speak and this one not. Quite often people ask for help from anyone on Twitter, and get a fast response/advice. You are calling this a boundary hybridity – that’s probably fair enough. It’s the sort of thing you see in ANT all the time: a boundary object that means different things to different actors, but both can use it in an interesting way.

Q) Monica: I belong to a scientist group which is more active than most. I wanted to ask your opinion about what is happening in Britain: the GMC has new social media guidelines for doctors which do not allow them to have anonymous accounts because of the privacy of the information they hold. There is protest around that, a petition about that – and anonymity seems reasonable for commentary about economic cuts etc.

A) I haven’t studied doctors so not aware of professional structures but I think it’s a shame not to have the possibility of anonymity. Only one of those I am looking at is anonymous @scicurious but that allows for lots of playfulness. There have been a lot of articles where Twitter has acted as critique of peer review process, taken down research in hours. No less than Dorothy Fisher has said, ok Twitter may have a role here. Anonymity does seem to have a useful role here.

And now for lunch…!

And we’re back…

Twitter as professional practice: A case study of cricket journalism: @aggerscricket by Julia Gillen

I want to introduce a cricket journalist called Jonathan Agnew (@Aggerscricket). My work draws on David Barton’s work on “technobiography”, a very socio-cultural, dynamic view. Agnew is migrating from the role of journalist to public person. It is framed as a personal Twitter account, views his own, but his website is the BBC and there are lots of pointers that this isn’t a regular Twitter user.

I’m taking a media ecology view (Barton [1994]2007 – ecological view of literacy) here but I’m quite critical of it at the same time. Postman 1970 sees it as a sort of moral issue. Nystrom 1973 sees this as a study of complex communications systems and environments, interested in interactions, technology, technique and human process of emotion. And <another key ref missed>.

In 2006 the UK Parliament decided that television coverage of international cricket should be on free-to-air TV. But the rights don’t always go that way: Sky have the rights to much of the live TV coverage, leaving the BBC with other means, predominantly radio. A key thing to say about cricket: I am talking only about Test cricket, which takes place over five days, frequently ends in a draw, and much of the time very little is happening and there are unscheduled breaks. And that could mean dead air… so there is a great tradition of literary coverage of cricket; it’s about much more than the sport.

The cricket media ecology… starting with The Economist and a comment expressing surprise that Test Match cricket on the radio was still running, and that it is now on the web. But I wanted to dig deeper, to see what attitudes Agnew displays towards Twitter, including its relations to other communications. I’ll be talking about my methodology as well.

So I have taken a sensitive, ethnographic type of approach. I started in March 2010. There were interactions on the website. There were some sample tweets. And then in 2011 Agnew wrote a book and was also enthusiastic about Twitter, so I analysed his media coverage. So I did some intense data collection. Starting on 10th August 2011 I collected all his tweets in real time, who he replied to etc., preserved in a Word document. I also looked across other media, and some other samples here and there. In part my approach was flexible and did change as I collected. For example… Agnew tweeted “15 mins to live chat” so I went and joined that and recorded it to see what it was like. And one of the things that was interesting there was seeing attitudes towards Twitter expressed in other media, and how those media related to it. So he receives a direct question about Twitter. He says he enjoys it, he likes comments during his radio commentary. But there are also implicit references for those also on Twitter. He makes a comment about his dog in the sign-off… if you only interact in other media it’s not noteworthy. But if you follow him on Twitter you know he plays with his dog on Twitter – interacting with an account someone else set up for his dog.

So I’ve coded all of Agnew’s mentions of media in the book. It’s ostensibly about cricket but he is very interested in media. Radio gets the most mentions but, notably, Twitter gets the second most. At the time “Aggers’ Ashes” was promoted via what he called a “Twitter tour”. Related to that, the @theashes follower situation arises and he melds that offline persona to the online persona of that tweeter. He also playfully recommends following another Twitter user who does a better job than him. He doesn’t do that. When I followed Agnew on Twitter in real time he went from 7am to 8pm (he barely stopped for lunch). So I followed him on 10th August 2011, the first day of a Test match but the day after the riots. One of his followers tweets about the weather, and also about the riots. His response *just* addresses the weather. Listening to the radio commentary it becomes clear they have been told not to mention the riots. But they allude to it – you wouldn’t understand comments like “I can see smoke in the distance”, for instance, without that context. And you see him respond in an authoritative way about the ECB confirming the Test match goes ahead as planned (meaning both weather and riots).

He does engage in argument sometimes but, for instance, shuts down someone who tweets what seems to be a homophobic insult. Bourdieu (1999) talks about “difficult spots”: “difficult to describe and think about”. Only by looking at a much larger quantity of data did I realise that this hadn’t been a homophobic comment but a reference to a co-commentator’s criticism of another team.

There are many ways Agnew involves others in his construction of stories. So, pulling out the key tweets around “Moussaka Special”: he comments “treating the wife to Moussaka Surprise. Theory being devastation in the kitchen means I won’t be asked to cook again for at least 6 months”. “The wife” is a term he doesn’t usually use, so that’s a reference to a specific type of comedy. The narrative builds, tension mounts… then two surprise tweets: the moussaka is good, and she tweets too and joins that narrative!

So really interesting narrative construction; and merging of online and offline.

Agnew did leave Twitter for a while. He gets lots of abuse, but an attack by another author seems to be what drove him off Twitter for a while. I’ve used this idea of the “change agent” (Mullins, Kozlowski, Schmitt and Howell 2008) as he is quite influential in this rather traditional cricket world, and he has the trust to be that change agent, something you see more widely in Twitter adoption. And you see these realms of onstage and offstage personae and performances on Twitter, and the intersections between them.

Q&A

Q) Did you tell him you were doing this?

A) I did, he didn’t acknowledge it – I’ve said so a few times but I’m there in a huge (200k) audience. I’m still planning to send the fully written-up version to him for comment.

Q) You talked about Agnew moving from journalist to public figure. As you went through data collection, did you see a marked difference in the tone of tweets as follower numbers grew?

A) I’d say no, actually. But by the beginning of 2010 he was already well on the way; his role attracts a certain amount of attention anyway. He was already a skilled user of Twitter – building stories, orientating to audiences, some moments of tension, the abandonment. The interplay is generally there throughout. But I will think about that more carefully.

Q) One of the things I thought distinguished celebrity and “real people’s” use of Twitter was how much they retweet others. One thing you could look at is how much he retweets – the distribution phenomenon. Also terms of address vary between celebrities and others: celebrities address a collective group of followers.

A) I’m not sure he is really a celebrity here. He’s more a personality.

Comment) I’d say microcelebrity, e.g. Alice Marwick

A) Certainly he has a lot of interactions, requests to retweet, lots of interaction. Your comment does give me an angle. But I’m not sure celebrity is the right word.

Comment) Perhaps about tipping point here, not about celebrity status but number of followers.

A) I think Agnew is about driving change etc. He did start encouraging web 2.0 use because you could speak to everyone but actually he has become someone who extends the broadcast models.

Comment) He is followed by far far more people than he follows, classic broadcast/transmit model. Reach is the wrong term or concept but that broadcast behaviour is something that is different.

A) yes, I do make that comment in my conclusions but that’s a fair observation.

And now it’s me, so the liveblogging stops!

An analysis of professional exchange and community dynamics on Twitter around the #OR2012 conference hashtag by Nicola Osborne and Clare Llewellyn

The Prezi is here and I highly welcome comments!

But the in-person version sparked lots of questions so I think it went well and hope some attendees pass on their own use cases to the Twitter Workbench team (I’m happy to pass comments on!).

Authenticating leadership ‘like a boss’ by Tom Van Hout

Like a Boss is a parody rap. Everything is followed by “Like a Boss”. Massively popular, huge numbers of tweets about this – a tweet evergreen if you like. It refers to finesse or authority, or just doing things their own way. People, animals, objects. E.g. many presenters have presented Like a Boss at this conference!

Leadership discourse. From management and business studies we know that leadership is about meaning. There is the transformational leader, the charismatic leader, the visionary leader. Often quite gendered as well. There is a lovely paper by Baxter in the Journal of Sociolinguistics on how female leaders shift in and out of various “role traps”. The Like a Boss phenomenon ties into these aspects.

Vernacular writing is an area of huge interest; it is interpersonal in nature, spontaneous and unplanned, dialogical by default. The Like a Boss hashtag and tweets display some of these features. So an example here – a micromeme. We see lots of sharing of quite mundane achievements. Also identity-as-heritage discourses (around gold star stickers). Also performances of being a boss, and judgements of enoughness. Now vernacular writing, why are we drawn to it? Well the rationale is pretty straightforward. More people write than ever before. They write more. Digital media enables that. And as we all saw yesterday during the weird Q&A session, writing goes public. And finally my interest here is about identity practices. Here we see really diverse identity practices and the use of “emblematic resources” <ref?>, and an enormous range of features that could see you doing things “Like a Boss” – from buying shoes, to getting out of bed, to high-fiving a shark. But the meme does cohere, and what makes that happen is that these are forms of self-presentation and performance: identity-as-heritage (cultural capital – like the star stickers), and identity-as-creative-play – about subverting the norms, creative play. Cue discussion of a dog on a wingback chair – why is it funny? It’s anthropomorphised, it’s a superior expression somehow, it pokes fun at self-styled ways. But I’m really interested in how far that meta-commentary ranges. Some of these self-performances don’t work. What is the benchmark for being or doing something Like a Boss? It’s about “enoughness” – the resources recognised as successful. Not fully developed but… online, on YouTube, views and comments etc. index what it takes to be a Boss. That measure is policed online constantly. Ideally we want an indexical range of how you perform this successfully.

In conclusion: we approach digital communication from an ethnographic perspective. We look at vernacular cultures and cultural politics. We look at leadership discourse – the management of meaning and the language game. We look at vernacular writing – networked writing, scale. And we look at identity practices – enoughness, self-presentation.

Q&A

Q) Isn’t tagging yourself “Like a Boss” a form of index? So the measure is whether others accept it as such?

A) Yes, the tweets identify successes in being “Like a Boss”. There are wannabes. In the data we have… the notion of enoughness divides those who can (e.g. get retweets) and those who want to.

Q) In what way is Like a Boss different from Like a Pro? In terms of semantic features.

A) Like a Boss calls on more cultural toughness etc.

Q) Can I ask about “enoughness”… if you use rankings, popularity, as a measure…?

A) As one measure…

Q cont.) So how do you measure that they really like it? And secondly, what other measures?

A) No answer yet.

Q) RTs as a measure of goodness for a tweet – what about variation between users? Follower counts are theoretically scale-free on Twitter. 30 RTs would be the best day of my life for me; for some that’s every day. How do you account for that?

A) We don’t want to look at celebrities. Only normal folk.

Q cont) But a continuum there. Some random people have 20k followers you know?

A) I imagine we will, as others have done, look only at a particular range of followership.

Me) You could compare RTs for users’ other tweets – to see whether the Like a Boss stuff does better.

Q) The vernacular use of “Boss” is very different, isn’t it? Gaming connotations? Slang connotations? Not just leadership here? Maybe that’s why it’s stickier as a meme.

A) Sure, those come in.

Comment) Like a Pro is just not as grabbing; those other uses include “that’s Boss!”, a very 70s and 80s style.

Discussion breaking out around whether “Like a Boss” is a valid tag for some tweets in Tom’s data, and then on Pro vs Boss – Boss being more contentious, declaring power over others, superiority, dominance etc. And that others don’t care how they’re seen – cockiness, single-mindedness, self-belief… and now into the “great man” theory of leadership. Also discussion of ironic use.

Tom: I have been able to collect different things here. Initially tweets, but that is harder with the API. Collected some…

 

The personal in political tweets: The use of Twitter during the 2010 British and Dutch General Elections by Todd Graham

There are three players in this Twitter research: politicians, media and citizens. I’m going to specifically look at how politicians behave in social media. Today I’ll look at the UK General Election, but we’ve also done work on the Dutch elections and on non-election periods.

We took a sample from a two-week period (April 26th to polling day, May 6th 2010). We focused on the three main parties, and on any candidate who tweeted in this time: 19% of Conservative candidates; 22% of Labour candidates; 26% of Lib Dem candidates. And a large set of tweets was collected, distributed unevenly across candidates. We did content analysis with a team of 6 trained coders. The unit of analysis was the individual tweet. The context unit of analysis was the Twitter page – the conversation. You needed that context to code the tweets correctly; that was crucial. We had a dozen or so categories but I will focus on 4 main ones: type of tweet; interaction with; tweet function; tweet topic. See Graham, Broersma, Hazelhoff and van ’t Haar (2013) for the statistical analysis. I’ll be talking about the qualitative analysis.

Firstly, to say something of the three Prime Ministerial candidates, looking at frequency of mentions: 22nd April and 29th April see a big spike. Those are the last two TV debates; they had a substantial influence on tweeting. There is also an increase in tweets towards polling day (less sharp).

In terms of tweet types this was basic: normal post (48.2%); reply (?%); retweet (?%); retweet with comment (?%). There was a huge difference between the Conservatives and the Lib Dems and Labour – the latter two parties interacted far more. Who were they interacting with? The public largely, politicians (mainly their own party, but some debate from Labour candidates), party activists (hardly ever the Conservatives), media, etc.

We looked at tweet topics. 80% of all tweets were about campaign and party affairs; very minimal policy talk by comparison, and some “other” chat as well. The function varied more broadly. Lots of tweets about the campaign trail (23.1%); campaign promotion (20.9%); criticism and arguing (22.9%); acknowledgement (9.7%); other functions less substantial. So, a typology here. We saw Broadcasting with 5 behaviours: updating; promoting; party stance; etc.

Updating accounted for 23.1% of tweets, e.g. those shown from @Jeremy_Hunt and @andrew4mk. Perhaps this isn’t an unusual thing to track in the media. But some politicians did this in a more novel way. Some gave a sense of closeness, of being part of the campaign, of knocking on doors with them…

Promoting was around 20.9% of tweets. This was largely about promoting the party or the politician, their success, their performance etc. But the Lib Dems were quite interesting: they promoted the most, but they were also really creative. @CllrDaisyBenson called out for non-celebrity endorsements and got great personal endorsements and responses.

Critiquing (17%) tended to be very superficial – about style, performance, rarely any substantive critique of policies. This created a polarised Twittersphere, really party orientated, so particularly polarised during the debates for instance. Politicians complain about this BUT they do it, and a number of followers and members of the public called them on it. Stephen Coleman and Dan Jackson’s work finds that this sort of discourse puts citizens off politics.

Interacting also had 5 behaviours, a range here:

Attacking/Debating (9.9% of tweets): these were again superficial attacks, typically one-off exchanges. Extended debate was rare – a yell each way but no further. Ironic, as the debates – which triggered many of these – were supposed to open up debate, but it was rather shut down here.

Mobilising/Organising (3.7% of tweets): Labour and the Lib Dems really led this. @DrEvanHarris tweeted about 1400 tweets in this time period. He would chat up new followers and then ask for their help in campaigning. Candidates also shared behind-the-scenes type tweets.

Advice giving, helping and consulting (3% of tweets): about connecting with the citizen. Consulting with the public was about 1%. The Conservatives were not tweeting in this way. There are maybe 7 or 8 candidates represented in these advice and consulting tweets.

In terms of tweeting about their personal lives: we had a code that just marked tweets as “personal”, e.g. Louise Mensch tweeting about running. But we are recoding those, as the political and the personal often overlap, and we also want to code the personal. Looking at some samples we saw use of the personal in combination with updating – combining the campaign trail with personal life notes (cats and chocolate + envelope stuffing). Using the personal to promote – very common with those with young children; Tom Watson tweets about his 2 year old. Using the personal to attack and critique – using kids or pets to make funny attacks. Using life experience to draw attention to a particular issue – e.g. “My aunty tells me”… or use of personal experience to support an argument over several tweets, e.g. on trains.

This work is still progressing. Analysis ongoing. And looking at that mix of personal and political.

Q&A

Q) What did you do about tweets with links, e.g. to policy documents etc.?

A) I have a whole codebook to deal with these. So, lots of politicians tweeted links to newspapers with an attacking headline – we coded that as attack. Or links to policy on a blog – coded for that function. We basically clicked on the links and coded them up.

Q) Two questions. How did you decide the “interaction with” category? People act in different roles. Also, for the function categories – did they come from the literature/theory or your own coding?

A) The coding scheme is a combination of factors: reading the literature, inductive coding, and Darren’s work on politicians on Twitter influenced our coding. And we had four steps in the process of checking the role of the people being interacted with. Context helped – the coders clicked through to profiles, sometimes Googling the person was required, and we made some rules and procedures for that.

Q) Did the party train/constrain/orchestrate these for them? If not, it seems really interesting that these sit outside the party machine in a way that other publications don’t.

A) Labour and the Lib Dems had a campaign coordinator for social media, and they had keen early adopters amongst their candidates. The Conservatives not so much – we found suggestions they didn’t have much.

Me) The Conservatives did hire someone for social media some time back. I’ll see if I can dig that out.

A) There was a piece in Wired… maybe the same one…

Q) What about images and coding?

A) We did code images in tweets but there were not many of them.

Q cont) Often they mess up that personal thing…

A) We coded them as critiques on the whole (one critiquing their own party leader!)

Q) Political tweets are domain specific data. Can you develop something automatic based on your work?

A) We are doing that now; we have a multidisciplinary team and are working with people in linguistics. We are taking 60k coded tweets and having the programme learn from them, to try and reuse it for the next election. But context would be the tricky thing here – it was a big part of the coding process.

Comment) Programmes are great for explicit texts; it’s much harder to deal with irony. But it’s machine learning: you need a small hand-coded training set, then the software learns from that and can be applied automatically to the wider data set.
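
To illustrate the kind of supervised pipeline being described – a sketch only, not the team’s actual system – here is a minimal example in Python/scikit-learn, assuming a CSV of hand-coded tweets with hypothetical “text” and “function” columns:

```python
# A minimal sketch of training a tweet-function classifier from hand-coded data.
# Assumes a file "coded_tweets.csv" with columns "text" and "function" (hypothetical).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("coded_tweets.csv")  # hand-coded training data
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["function"], test_size=0.2, random_state=42)

# Bag-of-words features plus a linear classifier; irony and conversational context
# (the hard parts mentioned above) are exactly what this simple approach misses.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```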

Professional Twitter Panel

This is our final session of the day and is about Twitter use by those with “between 1000 and 10,000” followers on Twitter. Pre conference tweets suggest that’s a slightly controversial grouping but we shall see shortly…

Participation was sought in advance for this session and two tweeters have come forward: one a Gothic Literature academic, the other @scicurious, who is a prominent science blogger.

So this will work in two streams. Already have discussion on #lutwitrc up here, we’ll also keep an eye on #lutwit tweets.

I started by asking about when you find time to tweet. If you don’t tweet much it seems there is never time; if you tweet a lot it seems like a non-issue. @scicurious just tweets regularly as part of the day – it runs in the background. Johnny comments that he tweets intensively at conferences like this, but less so at other times, though he does retweet then etc. It varies a lot: when reading or writing he tends to tweet less than when doing things like marking and wanting a diversion. Several other comments. Me: I tend to have it in the background but with real morning/evening rhythms. Comment that Twitter now lets you download an archive of your own tweets, so you can see clear patterns there. Greg: I asked Dorothy Bishop, Professor of Psychology, about why so many tweets are at 7am and she said “well academics are filling in forms the rest of the day!”. Penny comments on extreme unevenness over the course of a day and over the longer term. Comment: it is also about how long you have been on Twitter – a count has different meanings, and a measure of total tweets varies radically. Maybe come back to that, as counts mean something different for very active tweeters. Julia: we are assuming one person is one account but you may be tweeting to many accounts, and time intersects with identity, not necessarily 1:1. David: I think Penny was suggesting that we treat Twitter like other communication – we don’t measure how many chats we have in a day. Me: but the count is there, so it’s tempting. And presence is inferred from tweeting as well, so not tweeting suggests not being present.
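
As a small aside illustrating the tweet-archive point above (nothing shown in the session itself), here is a sketch assuming a CSV export with a parseable “timestamp” column – the file and column names are hypothetical, and archive formats vary:

```python
# Sketch: count your own tweets by hour of day from a downloaded archive.
# Assumes "tweets.csv" with a parseable "timestamp" column (hypothetical; formats vary).
import pandas as pd

tweets = pd.read_csv("tweets.csv", parse_dates=["timestamp"])
by_hour = tweets["timestamp"].dt.hour.value_counts().sort_index()
print(by_hour)  # e.g. a spike at 7am would match Dorothy Bishop's quip above
```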

Discussion of the time it takes to write a tweet. Rebekkah Kills says it takes no time. Comment that reading takes time, not tweeting – it takes time but not much. Greg: interested in issues of timing… e.g. cultural norms around eating and talking at the same time. Me: it’s about context here, what the expectations are. For a personal tweet I may have hours to reply, but a professional tweet or an enquiry to a service Twitter account comes with an expectation of fast replies. Greg: I am constantly surprised, but shouldn’t be, about the speed of student email replies. So fast. Four hours can be slow. Comment: my girlfriend tweets and uses Facebook and kind of sits down to “do Facebook”, and I think people do do that…

Greg: so that takes us into other media use. Johnny, do you have the same posts going to Facebook and Twitter? Johnny: no, I used to, then used to unlink, now totally unlinked… Comment: I’m wondering about Facebook and Twitter and Putnam’s idea of strong and weak ties. Greg: your practices change depending on speed of tweeting but also on how many people you follow. Julia: I do find the temporal aspect changes. Facebook brings different time zones in at different times. Synchronicity on Twitter puts me out of touch with those in other timezones. Greg: I take part in Flickr365 and we do a daily update, but we all tend to post at the same time – it’s a daily rhythm to keep up with, different to always-on. Me on ties: I think strong and weak ties exist in both Facebook and Twitter, it’s not a clear distinction. And timewise there are timezones of interest/habit beyond location – e.g. those staying up late in the UK vs daytime in the US.

Onto anonymity and pseudonymity. Me: recommending Violet Blue’s writing on Google+. Greg: Bitch PhD is a blog I read and it is very successful, but it wouldn’t be possible with real names, as the personal comments could damage tenure chances. Any other opinions? We are all academics pretty much – do we assume freedom with our employers here? Comment: if you Google me you find an old account, which is an issue. Johnny: I feel ambivalent about adding “views are my own” on my profile; I think there is no need for there to be an issue there. Karen: that “all views are my own” thing doesn’t cover you legally. The law hasn’t caught up. Look at UCU’s advice on social media: there have been cases no matter the disclaimers. Tony: social media is used as brand management and reputation management, and there is tricksiness around that. It’s problematic. See the link tweeted to my study. So a disclaimer is not a bad idea from a personal perspective, to distance yourself from an institutional or departmental position – I use it on my Twitter account. Me: it’s about setting expectations, perceptions. Comment: I’m from China and think pseudonyms are safer. Greg: my research showed that people like to see a face, a person, not a blank or object avatar.

So, moving onto impact… I’ve thought about number of followers and of retweets. Is that impact? Is it something else? Me: well, Klout may be silly but the idea that followers, activity, replies, retweets and the presence of influential people in your network and their engagement with you matter has some merit – though Klout have an interest in numbers for marketing reasons… David: Why would you do that? The same concerns as impact in any area of academia. Greg: I have a practical reason, for sampling. For journals you would check the impact factor in your choices. But maybe the issue of impact and influence being perceived as about marketing is an issue for any sort of measure here and how it could be perceived. Todd: we’ve been interviewing Dutch politicians, and for them getting tweets picked up by news media is key. For political leaders and celebrities the pick-up by mainstream media might be a measure. Comment: opinion leaders matter here too, it’s not about counts. Greg: indeed, the concepts move beyond Twitter, to concepts from before Twitter around influence and impact. Comment: impact in my work is about reaching sources, spreading news etc. in crisis organisations. Organisations mobilise resources on the ground through very influential presences and accounts.

Final topic… We had a question from, I think, outside the institution: am I OK collecting Twitter data without institutional approval? Apart from all the Terms of Service and copyright, what are the ethics of it? Let’s start with gut ethical issues and then move onto the legal issues. David: yes, it’s public but designed for Twitter. It’s the same as saying… would you have an ethical issue with using a voxpop in a newspaper and then reprinting it? Greg: I don’t see a problem with that. David: an ethical issue… it’s recontextualising it. You need ethical approval for closed Facebook posts, but the media does use and attribute that. Greg: do we all agree that Facebook is clearly private and needs ethical approval? Tony: the issue is expectation. Facebook has an expectation of privacy. The issue for Twitter is whether the people quoted assume the audience is public. People get it wrong. It can be ticklish. My feeling is that hashtags somehow signal active participation in a conversation. David: on some sites you might feel you are posting to that site but you are reposting to Twitter. A similar issue to a local newspaper vs being picked up more widely – it could feel very different. So it’s a continuation of the same ethical issue. Comment: there’s a really good article about this by danah boyd about levels of publicness in social media, which applies here. Looking at hashtag corpora, some feel private, single posts; some are massively public hashtags with lots of responses. I tried to anonymise usernames but it’s near impossible to anonymise the tweets themselves. You have to be a savvy consumer, but can we expect that of everyone? Julia: ethics matter in social media, just as anywhere else. And getting to know the Terms of Service matters. Looking through Twitter’s, not all ToS are as clear as they could be, but you can communicate and ask questions. There are really different attitudes to text made public and to screenshots. Many layers here: you as researcher in an institution, the wider space and world expectation, and the legal side of things. We can be tempted to think we are insulated from these kinds of things, but it’s a world you should be just as careful in. Johnny: the Twitter ToS directly conflict with the idea of anonymity. Twitter makes it hard for researchers. Comment: we need to separate ethical and legal issues here. We should consider the ethical issues for ourselves; the ToS is about the brand, not users. Researching – taking data for analysis – and then publishing are two different things. And publishing may be at a conference (temporary, closed), but on Slideshare, say, that’s different, or in a book or paper that’s different again. Comment: can we really separate legal and ethical issues here? Issues of good faith, of relationships with data providers etc. Greg: I deliberately left this to the end as I think it would be concerning if, because it is easier ethically, all analysis was quantitative, but there are many issues to discuss… and those discussions will hopefully carry on at #lutwitRC and #lutwit.

And with that – and my apologies if the notes are a little hard to follow, I’ll mull a better format and may update accordingly – the formal sessions ended and we moved into the evening with the launch of Language Online: Investigating Digital Texts and Practices by David Barton and Carmen Lee, followed by a lovely conference dinner for continuing those (and other) discussions… More from the final day of the conference will appear on the blog tomorrow.

Apr 10 2013
 

For the next three days I will be blogging from the Lancaster University Twitter and Microblogging Conference, where Clare Llewellyn (UoE Informatics) and I will be presenting our work on analysing #OR2012 tweets tomorrow.

Keep an eye on this post for notes on today’s talks (programme here) once things get going at 11am. The Twitter hashtag, #lutwit, will probably also be very busy!

Introduction to Twitter and Microblogging: Political, Professional and Personal Practices – Julia Gillen and Johnny Unger

Julia is introducing the conference by outlining the interest in this topic at Lancaster: Greg Myers’ work on blogs and wikis, Nathan Jurgenson’s work on digital dualism, and Julia’s own interest in political discourses. For Julia some of the motivation for today came from attending the AAAI conference on weblogs and social media in Dublin in June. In some ways a super experience, cross-disciplinary and with people from Twitter there. She went to each of the Twitter sessions and listened to what they said. Some real overlap with the University of Maryland – academics embedded in Twitter. Lots of people at the conference were analysing large data sets in social media, and there were journalists using social media. But I was uneasy at the event. Why were the Twitter, LinkedIn and IBM people there? To analyse the data for profit. Academics had some research interests there. Lots of the research was linguistics research and corpus linguistics, but those presenting never used the terms. So there was a need to really examine Twitter from different perspectives, from different linguistics perspectives, real-world problems and issues. There are many motivations, academic and otherwise, and a really good mix over the next few days.

Johnny Unger is now giving a brief practical overview of the conference and the Lanyrd site which acts as the authoritative programme. He is also introducing this afternoon’s Twitter Q&A with Nathan Jurgenson who will be joining the room from Twitter and video feed and suggesting we read two of Nathan’s papers ahead of that:

We will be able to ask questions ahead of or during the session (and I’m sure they are welcome from outside the room #njqa) via Twitter or in the room via Johnny.
And now a comment from the chair of tomorrow evening’s Professional Twitter plenary session, which involves Twitter users with between 1000 and 10,000 followers – discussion in the room here with tweets in the background on #lutwitrc (rc for “Reality Check”). This is more about the practicalities of Twitter and tweeting than the academic side of Twitter. Again I’m sure input from others will be welcome.
A quick comment again from Johnny: please do tweet. #lutwit is the general hashtag for the sessions. Johnny suggests adding GF4 for lecture theatre 4 etc. If there’s a better system etc. that’s fine. Some special hashtags already mentioned. You can tweet any questions about the event to @lutwit13 (or ask in person).

Plenary: Online freedom and repressive law: The paradox of digital journalism by Lee Salter, University of West of England

Lee’s research looks at interactions between new media and traditional media. Julia came to know his work through the book he co-authored with Janet Jones, Digital Journalism (Jones & Salter 2011). Lee is also in town as his film is playing in Lancaster tonight.

I wanted to start by pointing to issues we may be talking about in the rest of the conference. My interest in social media is both how it relates to traditional media and the paradoxes social media can lead to. I want to focus on some of those issues through some of the more controversial areas where social media has been used.

So as we know, Twitter is lots of different things to different people: it’s an integral part of the modern journalistic toolkit, a ranting space, a means of sharing links and photos, and it’s a campaigning and protest tool – which I’m particularly interested in, along with the discourses around this.

In the book I don’t go for hyperbole or doom and gloom. However there are real extremes in the coverage and discourses around social media. In the coverage of the Mumbai massacre, Tom Sutcliffe, writing in The Independent, rails against the coverage on Twitter. Of course months later the paper were up and tweeting. But the discourses of hyperbole do need to be modified, reined in. Twitter grows out of the use of UGC by journalists. Anar Thorson argues that the Moldovan and Iranian elections saw Twitter being used to generate news on the ground, “a hub for first hand accounts”.

More recently we see journalists in a hybrid environment – breaking stories on blogs and Twitter before they appear in broadcast or print. Thorson sees the coverage of the election on blogs getting greater viewership than the mainstream press in some cases. The criticisms of Twitter describe it as nonsense, as repetition. Thorson moved on to work on the “Arab Spring” and he cites NPR’s Andy Carvin as one of the paragon examples of really good tweeting. He tweeted so rapidly that Twitter tried to delete his account as they thought he might be a spammer. He used Twitter to collect first hand accounts, to verify rumours and discussion, to gather fast-moving information. Benjamin Doherty criticises Carvin’s work, saying that he couldn’t see any other NPR journalist working with activists and protestors against the Israeli state in that sort of way and keeping their job.

I want to talk now about whether Twitter really has changed corporate communications, how journalism has changed, what the power relations are.

It’s notable that journalists, as gatekeepers of information, tend to reflect existing power relations. The suggestion is that Twitter and social media have changed those power relations. Thorson argues that journalists do have incentives to protect the traditional press role but that use of social media is changing those power relations. Chadwick argues that the relations between elite sources and elite journalists are changing, particularly in terms of temporality. The routine news day is based around routine deadlines. Powerful sources of information can sync and interface with those times of day in order to feed the news system. Social media do not follow those patterns so closely, so the news routine is interrupted. But I’m not actually sure that interruption is occurring. We know that participation and influence are limited by the resource base that people have. The range of people that use Twitter is limited, and those that use it effectively is even smaller. Pew’s research in the US finds only 15% of internet users using Twitter, and only 8% doing so daily. Only a quarter of Twitter users have ever tweeted. In the UK, research shows Twitter users as being higher income. But in the US Twitter users tend to be younger, female, urban, with more black and Hispanic users than might be expected. There isn’t one profile here.

Bruns and Burgess et al. (2011) [thanks to @mdanganh for the clarification and URL] found Twitter use in the Australian elections being about traditional power and professional relations, with social media amplifying those central conversations. But activists are now engaged in loosely coupled relationships with newsmakers; the majority of those interactions have little impact, and only a small number of Twitter users make a real difference, push the conversations, etc.

There has been little work on tweets’ integration into mainstream media. Thorson writes not about Twitter but about “Have Your Say” contributions and the BBC during elections. The use of these tended to represent the middle ground. Wiki News did things a little differently: they required a neutral point of view. But what does “neutral” mean, especially in controversial areas? Twitter is obviously different to these two spaces. It isn’t censored, it isn’t moderated. A colleague who writes for the Guardian was complaining about Twitter as a chaotic medium: they integrate it into a conventional linear narrative, but it doesn’t fit with the discursive structures of journalism.

Now when I mentioned the Arab Spring: a lot of the attention of journalists and scholars looking at the Arab Spring has focused on particular conflicts in the Middle East and North Africa. If Twitter had really penetrated the mainstream I think we would have seen different discourses. There has been a real role for Twitter in Syria, Libya, Egypt and Tunisia, and we have seen that covered and rather overstated, but we don’t see coverage of Bahrain or Saudi Arabia, say. There are various assumptions we can make about the focuses… social media tends to be used to illustrate what is going on by particular, very media-savvy protest groups with a need to communicate to the outside world via social media, which then bounces back via mainstream media (particularly in Egypt, for example). So the discourses of social media here fitted with the mainstream news agenda, and that’s why it had leverage in some areas much more so than others.

We know that Twitter power relations reflect mainstream roles – mainstream journalists have far more followers than others – because social media has been used within the mainstream news frames. We can compare how protests in the Middle East and in the UK at the same time were reported by mainstream media. I’m going to play a clip of reporting of conflict in Egypt – the images, the placement of the camera, the emphasis, etc. (from World News Today). The interesting thing here is the way that the conflict between the protestors and the police is represented: the alignment of the journalist with the protestors, the reference to petrol bombs as defensive, the slight giggle in the discussion of rock throwing. Compare that with a clip from the student protests in the UK at the same time (BBC News); this time the footage includes a dismissive nod towards attacks on students by police. Social media told a very different story to the news coverage.

This notion of communicative freedom that takes place… when Tunisia and Egypt shut down Twitter and Facebook there is outrage. In the UK, David Cameron talking of shutting down BlackBerry, Facebook etc. raises far less media questioning. But the question…

Richard Barbrook and Andy Cameron call this the “Californian Ideology”: the internet is just there, it’s natural, as John Perry Barlow says. Now that is not a reality – it was set up by government, by institutions etc. Lawrence Lessig (in Code and Other Laws of Cyberspace) talks about political regulation and economic regulation, and norms or hegemony. And I think these may explain why Twitter regulation isn’t necessary. The traditional public/private dichotomy is disappearing, it’s broken down, but the law hasn’t caught up with a paradoxical medium like Twitter. Nor have the users. Take Paris Brown for instance, who became a Police Youth Commissioner at 17; people found tweets from when she was 14 or 15 that included crude comments about sex and drugs. Do we really expect 14 year olds to understand that dichotomy? She stood down yesterday. Andrew Brown wrote some nasty comments on Facebook and was jailed for 12 weeks. An 18 year old was arrested for making comments about Tom Daley’s deceased father. And of course there is the case of Peter? whose off-the-cuff comment on Twitter was taken to be a terrorist threat. People aren’t very aware of this dichotomy, and the outdated law is the major problem here. Many of these cases date from the Communications Act 2003, which allows prosecution for “improper use of a public communications network” – comments deemed to be offensive, obscene, etc. Many comedians do not get taken to court or to prison for comments of that type (e.g. Frankie Boyle or Richard Littlejohn), whilst the same types of comments on Twitter and social media seem to be treated very differently. Which is particularly odd given there are no specific press freedom laws here.

Mitchell Stancombe was jailed for three years for a tweet asking when the riots would start in Southampton. The idea of incitement to riot predates these media, and the punishment has been disproportionate. And dissenting discourses did not make it into the mainstream media. Tweets such as “This is what happens when you consistently opress [sic] the youth, have some of your own medicine #londonriots” did not make the headlines. Rob Procter’s work on Twitter and the riots showed the most powerful tweets dominating and dissenting voices being shut down; the clean-up operation after the riots dwarfed the riot tweets. He argues that those who were rioting did not use Twitter – they knew they would be caught. BlackBerry Messenger was far more popular for that reason.

Now, another reason not to close down these spaces hasn’t really got anything to do with communicative styles; it’s to do with the modes of British policing. We can talk about “permissive transgression”: the police allow the law to be broken to a certain extent – “policing by consent” – before cracking down, the idea being to minimise conflict. The police also lack intelligence, and social media allows huge gathering of intelligence. The other function of social media during protests and riots is quite amusing: the police try to contact and engage those that they think may be protesting. Since 2010 there really hasn’t been a central starting point, an organiser, etc. They try to reach out to a leader in the protest, but there is no leader. When I looked at protests in Bristol, different students from different colleges would arrange four or five different protests, and which one got popular was really down to luck; no one was in charge.

Def (?) and his colleagues have looked in depth at how the police use Twitter in conflict: to try to calm situations, to try to engage, to provide information, and to try to name and shame individuals. On occasion some of those individuals were acquitted… one of them had his house burned down because he had been named as possibly involved in protest. There is a growing sense of identifiable data, of the need to be private in case you are seen/named/photographed etc.

And I want to end on the opposite take on this. Anonymous, as a group, is full of paradoxes. On the one hand the group is anonymous in terms of membership, representation etc., but they are also Anonymous, essentially a brand. And they are desperate for their actions and hacks to play a propaganda role, so they have to play to the mainstream. They say “Twitter is their link to the world”. Looking at @YourAnonNews, this is “one of many” Anonymous accounts; you cannot verify it, and there is no hierarchy or leadership here. They say “we are legion” and that’s a very deliberate term. Anonymous attacked Israel over the weekend. I asked about a previous operation: I spoke to “Commander X” and asked him about interfaces with mainstream media. He advised we watch the “We Are Legion” documentary, and they started posting quotations from the documentary at the same moments they appeared in the documentary – that wasn’t a coincidence, he had control of my machine. So it’s a very weird group to interact with and talk to.

Commander X called Twitter the frontline, the key medium for publicity; it is designed to reach mainstream journalists. They send out huge numbers of press releases as well. The operations are designed for shock and awe, to be too powerful for mainstream media to miss; Commander X says that is why full-blown, complete take-downs can force the media to discuss them. With around 1 million followers on Twitter, this is an important organisation. Anonymous going to war online with Israel seems major, but all of the reporting says the same thing, drawn from an AP release, stating that it hasn’t actually caused any real damage – the Israeli government was the main source for the article. It illustrates those tricky power relations. I did ask why Twitter doesn’t reject Anonymous, or why neither hacks the other. They say they know they are being watched, that they are better at hacking than Twitter, and that “we haven’t done anything to offend Twitter”. It’s a paradoxical relationship.

Power relations do not seem to have been addressed by Twitter, though there are clear opportunities for transformation. There is something there, but it is not transforming mainstream media: the hegemonic status hasn’t changed. The integration of Twitter into news discourses doesn’t cause disruption; it follows the existing patterns, it is normalised. And the law fails to address these media – Leveson was an opportunity to address that, but it didn’t happen. The paradox, the tension here, is the issue of Twitter as a sphere of public communication and as a sphere of surveillance.

Q&A

Q) In that use of Twitter by mainstream media, how does the Guardian fit in here?

A) The Guardian did become different in its riots coverage but had previously been similar to other traditional media outlets. A truly different approach would have been to frame those involved in the riots as “primary definers” – that would have been a radical alternative discourse. I argue that the riots were political, but I have many debates with intelligent people who argue otherwise; in Egypt the protest wasn’t that focused either, which muddies the narrative.

Q) Can Twitter be considered unmoderated?

A) Of course it is mediated and moderated – take-downs occur – but it depends whether you mean pre-publication selection or post-publication take-downs. BBC comments are moderated pre-publication; on Twitter moderation tends to happen after the fact. So it is relatively unmoderated.

Q) My name is Ty Graham; I’ve been studying how Dutch journalists use Twitter. How might that selection process change journalists?

A) Nick Davies wrote a great book called Flat Earth News about the role of PR in news. Throughout the 80s and 90s there were fewer and fewer journalists covering more and more, and they became reliant on press releases and PR. Journalism already has its problems. The choice isn’t between contacting a source directly versus using Twitter; it’s more between press releases and Twitter, and Twitter looks better there. But one of the problems is journalists selecting tweets that fit their own discursive angle. The Daily Mail won’t highlight tweets against its position, for instance.

Q) To what extent does reporting of conflict reflect regulation – so in the case of Egypt the UK press is not regulated by the state, in the London Riots they are, their reporting of Tweets may inevitably be related to their own relationship with the state, with the politics of that protest, with regulation (e.g. license fee negotiation).

A) There’s certainly something there, at least in terms of TV coverage, as the BBC, ITV, C4 etc. are very much regulated by the state. But newspapers are not regulated in the same way and yet they elected not to frame their coverage differently. People were effectively executed around the riots and that was little covered, just as deaths in police custody aren’t reflected in reporting. There is a much wider issue with journalism: it wants to reflect and frame its coverage in ways that its readers/consumers understand, are not unsettled by, are comfortable with, and that really prevents alternative discourses, radical reframings etc.

Q) Follow up: surely in a rolling news era TV sets the agenda here though? And surely the agenda is also set by whatever has the best pictures?

A) Well perhaps but news organisations in print also produce video; TV news producers use social media; there is such blending and merging here. And there is a real concern around the use of video. In the London Riots all of the major broadcasters handed over footage to the police and there is real concern that in the next protest those journalists may well be finding themselves targets for that reason.

We’ve just had a good lunch break, now moving to parallel strands and I have selected:

 

Building rapport in conference live tweets by Giorgia Riboni, Milan

Giorgia and her colleague C. Degano analysed conference tweeting to understand its discourse. Most participants seem to be addressing tweets to academic peers, not a more public audience. They seem to represent communications within a particular community – an efficient self-promotional tool perhaps? The main question I want to ask is “If scholars’ conference tweets are targeted at an audience of peers, how do they build rapport while sharing information?”, and I wonder how conference tweeters’ strategies vary from or overlap with speakers’ strategies, drawing on Erving Goffman’s essay The Lecture. I am calling on Halliday 1985 (SFG, the interpersonal function); Goffman 1981 (frame analysis); Hyland 19?? (??).

The corpus was around 2,000 tweets – a hashtag-based collection of live tweets from 8 applied linguistics conferences – with a reference corpus of conference scripts, the John Swales Conference Corpus (2009): 23 lectures and around 77k words. I will be sharing the findings of my qualitative analysis rather than my quantitative analysis, although the qualitative results are based on it. In my data 81.5% of tweets were original tweets, 12.8% were retweets, and only 5.7% were replies. It doesn’t seem that the tweeters are reluctant to engage; rather, they want tweets to reach all or most of their audience instead.
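
For readers curious how such a breakdown can be computed, here is a minimal sketch of one way to do it. This is not Riboni’s actual pipeline: the surface-convention rules (an “RT @” prefix read as a retweet, a leading “@” as a reply) and the sample tweets are my own invented illustration.

    # Sketch: classify tweets as original / retweet / reply by surface convention
    # and report the proportions. The sample tweets are invented for illustration.
    from collections import Counter

    def classify(tweet_text):
        """'RT @' prefix = retweet, leading '@' = reply, anything else = original."""
        text = tweet_text.strip()
        if text.startswith("RT @"):
            return "retweet"
        if text.startswith("@"):
            return "reply"
        return "original"

    corpus = [
        "Great point about engagement markers in this talk #linguconf",
        "RT @colleague: slides from the morning session now online #linguconf",
        "@speaker thanks - could you share that reference? #linguconf",
    ]

    counts = Counter(classify(t) for t in corpus)
    total = sum(counts.values())
    for category, n in counts.most_common():
        print(f"{category}: {n}/{total} ({n / total:.1%})")

On a real corpus you would of course feed in the full hashtag collection rather than three invented tweets.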

Looking at Goffman’s typology of audiences, it doesn’t quite apply here. Marwick and boyd (2010) describe the effect of the “networked audience” – the idea that tweets might reach a global audience of peers, whereas replies only reach a small audience. Hyland (2001) talks about metadiscursive elements and engagement markers. On Twitter there are both engagement language devices (deixis – in person this is “I’m pleased to be here” etc., in tweets it works a little differently; questions and imperatives; conversational elements; evaluative elements – these tend to be a matter of positioning yourself in the academic community, paying compliments, etc.) and engagement markers (e.g. @, RT, etc.). Looking at markers: hashtags enact a social relationship. The @ sign is a deictic marker of addressivity at the beginning of a tweet, and a mention when it appears mid-tweet. RTs are also about mentioning another user and representing them in your own stream – “User X has posted the following” (Zappavigna 2012). But in conferences there is a driver for participants’ own personality and perspective to emerge, probably hence the low level of RTs.
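
That positional distinction – a leading @ as addressivity versus a mid-tweet @ as a mention – is easy to operationalise. A small sketch (my own illustration, not part of the study):

    # Sketch: label each @username by position - a leading '@' is read as
    # addressivity (the tweet is directed at that user), a mid-tweet '@' as a mention.
    import re

    def at_sign_roles(tweet_text):
        roles = []
        for match in re.finditer(r"@\w+", tweet_text):
            role = "addressivity" if match.start() == 0 else "mention"
            roles.append((match.group(), role))
        return roles

    print(at_sign_roles("@speaker great talk!"))            # [('@speaker', 'addressivity')]
    print(at_sign_roles("Agreeing with @speaker on this"))  # [('@speaker', 'mention')]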

A conference live tweeter can act as an “animator”, but there are hybrid or ambiguous roles here – sometimes animator and principal, sometimes author or principal. This hybrid ambiguity amplifies the tweeter’s voice in relation to the speaker. And this relates to the scholarly voice, the role of the speaker (Monacelli 2009), which may represent their role, organisation, etc., and the textual self of the speaker (Goffman 1981). Goffman talks about the speaker as a performer, a method of transmitting content, rather than the textual self projected in the course of the lecture’s delivery. But for me the textual self of the speaker is equally important because, thanks to Twitter’s affordances, the public and private spheres are collapsing and colliding together, and thus the construction of the textual self includes both the scholarly voice and the personal voice. There is a backstage frenzy. That doesn’t mean that distance-altering mechanisms are not there: you see irony and humour, parenthetical remarks, and text brackets as well.

Conclusions: speakers establish a connection with attendees by engaging their rhetorical audience, discursively constructing their textual self and altering their alignment with the public while transmitting their talk. But the tweeters also perform, often using the same strategies as the speaker. Conference live tweeters can send notifications, transmit their personal and public self, etc. Contact Giorgia @Giorgia_Riboni.

Q&A

Q) David Matthews, Lancaster: One of the things you mentioned was the low number of retweets, and one explanation was that people want their own voice to emerge. I’m wondering whether you’d think about asking them – interviews with those you examine – as there may be other reasons. For example, people using Twitter at a conference may not retweet because they know their peers are present at the conference as well, so that information will already be available.

A) I agree that ethnographic approach would benefit these studies. As a live tweeter I have my own opinions and have been trying to verify those using my corpus.

Q) Mark: Have you engaged with the set-up, e.g. a Twitter wall? This morning the wall ran after the talks but not during them, and there is a Twitter wall in the foyer/lunch area. What is the impact of that display on live tweeting? What is your experience of that?

A) A good comment. I didn’t do this as part of the research; I think I should have. What we are experiencing today shows that what we see of the tweets can really shape tweets and interactions.

Q) Anthony McNeill, Kingston University: Looking at research on blogging, by Susan Herring for instance, monologue was the trend but dialogue did occur in bursts – some continuity between blogging and microblogging there perhaps. And 140 characters means a retweet really limits what I can say… does that have an impact?

A) Yes, initially I think that was an issue; the new RT button changes that somewhat. Tweeters feel the need to credit the original tweet, so that may be a challenge. The way that they conserve textuality is also interesting: you consider your own tweet but you also consider it in the wider context of the corpus of tweets.

Q) Is it normal that live tweeters are present?

A) Yes, but the audience may not be.

 

Polyphony of discourse on and about Twitter: Analysis of Twitter uses in the European Parliament, Commission and Council by Sandrine Roginsky (@enirdans)

My work started with an article I saw saying that Twitter was useful for Commissioners to communicate. I saw a tweet from Viviane Reding, Vice-President of the European Commission, on the election of president Elles in 2012 – it certainly won’t be a Commission view. Twitter allows some form of free expression for actors who are also part of institutions, although professionally there are guidelines around speeches etc. Maingueneau says that “institutions mask the conflict”, but on Twitter we still see that conflict. So the research question for me here is whether Twitter makes it possible to combine various registers of communication or genres of discourse, and what the role is of “neutral speech” versus the “truth of speech” etc. – you cannot have both. My hypothesis is that the communication of the institutions, as public organisations, on Twitter leads to a blurring of the personal and the private, and that…

This draws on French discourse analysis – Caroline Ollivier-Yaniv and Claire Oger – from the perspective of political sociology. See also Witschge (2008): the potential of the internet for opening up public discourse cannot be evaluated without properly understanding the context. My method included participant observation, interviews, etc.; I had ready access to the institutions for this work. I looked at the European Commission, the European Council, and the European Parliament. Today I will really focus on the European Commission.

In terms of the context, you should note that there is a general injunction for institutions and staff of the Commission to use social media (see the EP Strategic Plan of Communication 2011-2014) as it offers a “unique cost-efficient opportunity for interactivity with citizens”. There is often a perception of a deficit around EU communication, which is worth bearing in mind when considering the Commission’s use of others’ tools.

Who is in charge here? At the Commission there are 6 people dedicated (only) to SNS. It is important to know that the Commission has a very decentralised communications team: the social media team is quite small – and on short-term contracts – but Twitter use is spread across the Commission. The European Council has one social media manager and does not really communicate as much in general. At the European Parliament there is a team of 30 people dedicated to both social media and the website of the institution. The average age of these staff is around 25-35.

Ollivier-Yaniv and Oger found that institutional discourses are those officially produced by an individual or organisation, and there is some evolution here of our understanding of the institution on Twitter. You have speakers who are Commissioners AND politicians, and you have Commissioners and staffers both communicating on Twitter – many voices for this one institution. Too many accounts for the institution, in fact: I did this full time for three months and I couldn’t track and follow them all, and there are even more since! There was one main institutional account, one for each Directorate General, 16 personal Commissioner accounts, 16 for spokespeople of Commissioners, more for specific programmes and services, 27 nationalities etc. Very difficult to keep track of them.

The Commission has pushed for their staff to be on Twitter BUT not everyone is allowed to speak on behalf of the institution – see Bourdieu (2004) on the role of authorised (and unauthorised) speakers. However, even for a researcher working on tweets of the Commission it is near impossible to work out who is authorised to speak on behalf of the Commission. There are no clear guidelines here, even when staff were asked. I was told by one person that it would be obvious from indications on the Twitter account, but the same person said that her Commissioner’s account was “her personal account, she writes whatever she wants”. So it is very unclear whether these are personal or Commission views.

Moving to discourses: institutional communication (Pasquier 2011); political communication (Gerstlé 1992); personalised communication (Jacobson 1994?). One comms officer told me there are two types of tweets: political tweets, “which are more interesting”, and more personal ones. If we take the account of José Manuel Barroso (@BarrosoEU) – it actually changed names three times – the rebrand made the account both more personal and more political. There are some tweeters who do speak on a more personal basis: Viviane Reding again seems to be much more personal, and László Andor does similarly but words those tweets in far more personal and informal ways.

However, there are a great many more professional-type tweets than wholly personal tweets. I found three types of tweets. But I am interested in the paradoxes of enunciation on Twitter and the contradiction between the discourse on and about Twitter: words like objectivity, impartiality, loyalty, discretion, circumspection versus “the best information is that which has not been communicated yet”. There is a tension between the rules and the aim of participation on Twitter – really a huge contradiction. Interviewees all said the argumentative, very political tweets were the most interesting, but at the same time they said that you really can’t do that. Most of what is tweeted is much more institutional communication – probably with rather less personal/political content than one interviewee’s guesstimate of 70% institutional to 30% personal and political.

So there is some validation of the hypothesis here: a smoothing of political discourse on Twitter.

Most interviewees, in their discourses about Twitter, said it was a good way to reach “citizens” or the “man on the street”. But when we looked at who followed and interacted with them it was predominantly media and others from the Brussels e-bubble. So it is not a good tool to reach the “man on the street”, but it is a very useful way to reach that “Brussels e-bubble”.

Remaining questions: how well are the validation processes and the control of speech within and outside the organisation suited to Twitter?

Q&A

Q) Something about identity, and the need to have a certain professional character. It seems that in order to be “professional” they have to tweet a news feed; they are locked within a bubble of who they are and what they are representing. Do they belong in a medium like Twitter?

A) A good question, and I have the same question myself; I don’t have an answer exactly. There is the discourse they have about Twitter – especially Twitter rather than Facebook, as they see Twitter as most useful – but I’m not sure at all. A researcher (geugeugis?) found that there is a competition between civil servants and politicians. The civil servants are winning, but Twitter is subversive to that, a way to produce your own communication. Yet you can see the touch of civil servants in that prevalence of institutional tweets, in the very administrative form and tone. So many accounts though! There was also a debate around ACTA (?), around digital rights – strong debates – and none of the EU institutions took part in that debate. They said they don’t know how to act, how to participate, how to be part of the conversation and perhaps take sides in a way.

Q) Simon van Houts: covering the EU as a journalist is notoriously limited, “mediated mediation” – are the Twitter channels another part of this?

A) I think so. I have started to look at journalists and their interactions with EU accounts and I think I would agree there.

And after a biscuit we are back!

Uses and risks of microblogging in organisations by Soureh Latif Shabgahi

Soureh is particularly looking at the use of social media tools in SMEs, and the use of microblogging in particular. Two of the relevant tools here are Twitter and Yammer. In the previous literature the majority of research into microblogging has taken place in larger organisations; generally uptake has been in large organisations, which have adopted tools after early uptake and trials by some of their staff.

Yammer is an enterprise-orientated microblogging tool without a restriction on the number of characters (Álvaro et al 2010; Günther 2009; Giles 2010). The messages shared are private to the organisation, and it appears that over 70k organisations are using Yammer. However, there is little research on enterprise microblogging – around 30 papers, starting in 2009, most of them from the US and Europe. Key authors include Kai Riemer and Dejin Zhao: Riemer focuses on microblogging’s influence on communication, Zhao more on awareness.

I have categorised themes in usage. The main themes are around coordination; reputation management; forming relationships; awareness and a sense of connectedness; recording information for future reference; sharing knowledge/information; and discussion. Personal dimensions also come in here: for instance, using Twitter for finding work-related updates I have mapped to awareness and a sense of connectedness. There are also face-to-face discussions that come out of microblogging discussions, covering areas such as work-related stress etc.

In considering these tools I have also categorised the risks associated with enterprise microblogging. These include the restriction on messages, that limitation to 140 characters; difficulties of using the system; distraction – particularly the time cost and the noise-to-value ratio; the privacy of employees; and the security of the organisation.

Data collection was via semi-structured interviews with 20 SMEs in the South Yorkshire area, most of which were IT-based companies, but results have also been gathered from SMEs in other areas, e.g. educational companies and sports companies. Alongside the interviews there were observations of some companies – of messages posted by employees during the interviews or at a later date. Questionnaires were also sent to SMEs, with over 100 responses received – some from companies that hadn’t been contacted directly but had been passed the questionnaire. All interviewees were asked to complete the questionnaires. Analysis has so far been carried out on a sample of 4 interviews, as the basis for thematic analysis of the broader collection. In one SME case two staff members were interviewed: the first a manager who had introduced Twitter and Yammer, the second a manager who was using Twitter for their organisation. Some of the uses and some of the risks involved are pulled out here. Interviewee 1 said that Yammer allowed sharing at the “speed of thought”, making it easy to use, and saw Twitter very positively as a way to market to and engage with customers – Yammer as an internal tool, Twitter as a tool for reaching an audience. Interviewee 2 liked the perpetual connection – 24/7 mobile access – and talked about network effects, use for marketing etc. and the possibility of attracting new customers.

In terms of risks, Interviewee 1 said there was a risk of upsetting colleagues via Yammer – you can type quickly and end up with an upset colleague. On Twitter you can phrase things badly – that had happened – and there was a reputational impact. Interviewee 2 also talked about the risk of hacking – Yammer includes very confidential discussion so there is a real risk there – and of the distraction of Twitter.

Turning to Interviewee 3, speaking specifically about Twitter, they found the scale very positive and liked the “follow” concept; for their business that was really important. There was also real value in the restricted message length: short messages allow people to decide quickly whether they want the information, without having to read a lot. But they also identified “accidentally just mentioning something” as a risk.

These first two organisations were very positive in tone. However, Interviewee 4 was much more negative about the use of Twitter. They saw Twitter as very famous, good for marketing and attention, and probably needed for mass communication. However, in their eyes it’s mainly a “social media tool”, in that it is effective for communicating with friends, not with professional colleagues. Because some private information had been leaked they were particularly aware of that risk. Spelling mistakes are a real risk. And they felt that most users of social media were younger and perhaps not mature in dealing with these issues.

As this research goes forward I will add further risks and uses to my diagrams. My study will focus on the risks and will look at policies and guidelines and how to handle risks when they arise. So for instance Reputation (e.g. spelling mistakes) and Upset/Offend Others will be added to the risks.

Q) Me: Risks – have legal risks been incorporated? And were SMEs that aren’t yet using these tools included?

A) Risks: I looked at the risks of microblogging and of regular blogging tools. There are lots of similarities, some risks specific to blogging and probably vice versa, and policies and guidelines will probably be brought to bear on those risks. In terms of SMEs: the first question I ask is whether they use microblogging or not – sometimes I have to explain what microblogging is. Almost all companies I contact are using Twitter; lots of IT companies use it. Some are still in the process of adoption, still debating internally; I did interview them as well, as I wanted to capture that process of adoption. So if they were aware of the tools and considering them, I did include those SMEs.

Q) You picked companies with high IT knowledge. How would that map to companies with less IT knowledge perhaps? e.g. Health and social care.

A) My focus is mainly in the area of IT, but I have interviewed companies working in education, sports, and some charities. It will be interesting… 80% are IT so there should be some scope for comparison of them with other companies.

Q) Peter Evans, UoE: The study is about the use of microblogging by companies. Did you look at the use of microblogging by employees, not just by companies – and regardless of company policies – particularly around knowledge sharing?

A) It was difficult to find companies in the first place; I spoke to at most 2 people per company and focused on the manager who introduced it plus one other. One of my questions was about whether a personal or business account was being used. One of the risks which the majority of companies using personal accounts identified was the sharing of business information on personal accounts – they felt that spelled trouble for the whole company. Some had specifically introduced policies and guidelines as a result, e.g. do not mention the business at all.

Q) So I hadn’t heard of Yammer before…?

A) It was hard to find Yammer companies because that is private communication but I met some of the Yammer staff and was able to find some companies through them.

Comment) Yammer is used a lot in local government.

More than just passing notes in class? Twitter backchannels as new literacy practice by Tony McNeill, Principal Lecturer in Educational Technology

Tony wants to start by linking his presentation to the main conference themes, on the many diverse and creative ways Twitter is being used… or maybe misused. So I love this Onion headline, “Twitter creator on Iran: I Never Intended for Twitter to be useful”, which is quite fun, not as funny as some others in the Onion, but it raises a few things: that negative, trivial, vacuous perception of Twitter. The other side is that all technology use is really a misuse of technology – putting it to purposes for which it was never intended (e.g. using a screwdriver to open a can of paint). Like Harold Bloom saying all reading is misreading: all technology use is misuse.

So in an image of a lecture we see a lecturer in the room and two participants in the backchannel. It was a minority practice a few years ago; now it’s commonplace, and participation starts weeks or months in advance. Signing up for a conference means the schedule, the people, but also the hashtag. See e.g. a tweet about conferences sounding like the tic tic tic of typing – maybe out of date in the era of smartphones.

Tony’s core question is whether this backchannel is new or just a new take on an old/existing practice. To answer it we need descriptions of practice and theorisations of practice, so I want to start by thinking of ways to theorise the backchannel. Tony defines it as “the digital communications space used for primarily textual interactions alongside live spoken presentations generally delivered in a physical environment”. Sometimes there are images and multimodal elements in the backchannel too.

You can see Nathan Jurgenson’s critique of the term “there will not be separate online and offline conferences happening, […] Twitter isn’t a backchannel it’s the session at the front.”. My own take is that there are backchannels and that front channels are not automatically physical, they can be digital. Back channels go way back – looking at this painting we see chatter at the back of the room. But there are roles in this room.

So I want to stick with the notion of backchannel. My theoretical frameworks are New Literacy Studies and the New Mobilities Paradigm, both of which have their origins in Lancashire. Firstly, New Literacy Studies is an approach to reading and writing practices which sees literacy as plural, socially embedded, and about identity and power. A sub-area of this, New Literacies, asks whether new digital technologies really lead to new literacy practices. Is there a new ethos in Twitter backchannels, or does the technology just enable what we’ve already done?

Types of backchannel tweets:

  • minute-by-minute/live tweeting
  • note taking/resource sharing – with co-construction and collaborative aspects
  • personal commentary
  • dialogue – some dialogue, but more monologue than dialogue
  • fun/playfulness

There are also new conventions in the backchannel, competences required to be a participant:

  • use of event specific hashtags
  • @ messages
  • retweets (RTs)
  • inviting others to mute/unfollow you when backchannelling

All of these are implicit knowledge we need to understand.

New Literacy Studies is interested in context and in power. I was at a conference a few years ago on podcasting and there was a hastily improvised hashtag. We had a speaker from Apple and we were interested to hear ideas etc., but we got a real sales pitch. So one comment here was “I really *want* to like this talk but it’s not playing to me…” and others followed suit. We are shown a promotional video and the tweets get angrier and more defensive about being sold to. It was interesting to see that very tense mismatch between audience giggling and the speaker’s presentation. What was going on here? Was it just silliness, boredom, irritation? I think it was something else: academics were going beyond that academic identity but at the same time reinforcing the importance of the conference space for sharing and discussion, a reaffirmation against the sales pitch.

My second theoretical framework is the New Mobilities Paradigm (Sheller and Urry 2006), which is about social life no longer being about physical proximity but more about moving, with technologies enabling remote connections – having intense but meaningful contact at a distance. A book I’ve been reading lately, by Daniel Miller, on parenting and migrant workers from the Philippines and how that is facilitated by Skype, seems to fit with these theoretical frameworks. And Sheller and Urry argue against sedentarism.

So, with a new mobilities perspective, how might we see the conference? The “sedentarist” conference is about face-to-face encounter; bounded in time and space; impermeable (delegates only); backchannels a “distraction”.

Backchannels potentially challenge those ideas in many ways:

  • virtual/physical (e.g. I tweeted various links days ago using a tweet scheduler that you should see now)
  • digital/analogue
  • then/now
  • not here/here
  • interloper/delegate

There are all sorts of bizarre things. The digital is present in physical spaces – Twitter walls appear at a few conferences. danah boyd had a real Twitter wall fail a few years back, when very uncomplimentary tweets appeared on the wall behind her – not visible to her and thus not addressable.

An early ethnographic perspective: Miller and Slater (2000) wrote about needing to treat the internet as embedded in, and continuous with, daily life.

On the backchannel theme… Starbucks sponsored an ice skating rink at the Natural History Museum: tweet to their tag #spreadthecheer and your tweet appeared at the ice rink. Starbucks weren’t stupid – they moderated it when they saw the tweets appearing – but not fast enough: a skater had photographed them. The virtual was made physical in the space, then photographed, tweeted and shared over 1,000 times back in the digital realm.

A tentative conclusion: backchannels constitute a new literacy practice and structure (plus two sub-points).

Q&A

Q) A question here… in a previous role I looked at digital scholarship… given the cost and environmental reasons will there always be physical conferences?

A) I think that we are between two ways of doing things. We are doing some video streaming of some of the plenaries, and there are live bloggers here… we are at a mid-point between a physical and an online conference. I’ve done some technology conferences where that participation is online and that’s comfortable; there will be more use of that in the future…

Comment) Actually I think the real back channel is in the pub so I don’t agree…

Q) I went to a conference called Gin and Joy Division specifically about that… I wanted to ask about the backchannel and the thing… it’s often very multilayered. Some events I’ve been to have huge numbers of parallel conversations, which sort of gives you more of a sense of response to what is presented. Have you thought about that at all? What does the back channel say in response to the paper? Is it always sniping and jokes?

A) I always enjoy that… if someone is looking at their laptop when I’m talking they may be doing something broadly positive – looking something up, following up interesting themes. For this conference I’m not sure, but a colleague who has done more quantitative work found that at some conferences with successful and busy hashtags it actually comes down to half a dozen or a dozen people producing most of the tweets. But I see it as generally positive, ranging from the banal (wifi, dinner, etc.) upwards. I have seen some real dialogue and co-production of knowledge going on – a lot of people talking about automatic transcription etc., real dialogue. One post at that conference said the participant was struggling but the backchannel was keeping them engaged. But it can be snarky too.

Q) Rhianon: I have to admit the description of backchannels as snarky… maybe a British thing… in fan communities especially the backchannel can be about informing those beyond the room who cannot be physically present – not about criticising the speaker but about dissemination of information, about an augmented conference.

A) The same happens here. But snarkiness comes from US blogs, and IRC backchat really. But still broadly positive.

And it’s back to the main room for our second plenary…

Twitter Q&A with Nathan Jurgenson (@nathanjurgenson) by Nathan Jurgenson and Johnny Unger

This session is taking place in a more novel way. Nathan is joining us via Google+ Hangout on video, and also via Twitter. Johnny is chairing in the room. We have been asked to read Nathan’s work on the IRL fetish and to think about his work on digital dualism. Do we as academic researchers critique these things, are we guilty of these things… ?

Johnny: I will start with a question from my own experience: I often talk with colleagues about students using technology in the lecture room – computers, smartphones, tablets etc. Some find that quite concerning and threatening.

Nathan: That is one of the issues of digital dualism. A lot of the time these issues are confused with access and presence, and we are trying to come to terms with that. With digital dualism we often overestimate the role of the online in the offline and, on the other side of that, underestimate the role of the offline in the online. The assumption can be that students are checking out of the room. They may be, but they may not. And it’s not about whether to make that compromise or not, it’s not about online or offline, it’s about how you augment the online and offline with each other. They are different spaces but they are enmeshed, and we should neither under- nor over-estimate that enmeshment. Personally I am terrified of MOOCs – the idea of no face-to-face contact scares me. The conference I run, Theorizing the Web, is certainly enmeshed though: if you aren’t on Twitter you only see half the conference.

Digital dualism is the fallacy of looking at the online and offline as disconnected and unrelated – Facebook is real, of course. And the IRL Fetish is about value judgements about those spaces: seeing people on phones, using technologies etc. as antisocial, and seeing ourselves as more real or authentic because we are offline. When Sherry Turkle talks about walks on the beach she isn’t disconnected; she’s connected, but her social space is the NYT op-ed pages rather than Facebook.

And, with that Nathan is offscreen and turning to questions on Twitter… the room has fallen silent and migrated to #njqa. And I shall move with it but return to liveblogging when Nathan appears onscreen again shortly.

And… after a weird half hour…

Johnny asks how that went. Nathan says we agree too much! He has lots of stuff to read and look at after the questions; I think there was a lot of digital thinking going on there – I am joking, but there were some long reads and some discussion online. Feedback from the audience is “too loud”, which is very much an ironic comment as it’s been super quiet. I’ve commented that we’ve privileged the backchannel over the frontchannel… very little backchat in the room. So two lessons: maybe Twitter isn’t the backchannel, maybe you need music… or more person-to-person chat… maybe it was too quiet. But we are in a lecture theatre and the physical layout certainly doesn’t encourage chatter. Nathan: the architectural affordances here are a one-to-many structure; Twitter is a many-to-many format, so there is a disjunction there, which is conceptually interesting. Comment from the room: it was quite creative, real creation of content in the room. Johnny: when people are asked to create content they do that rather than chat. Nathan: indeed, it may not be the best format for a conference where networking is so important, but it was a really great experiment, test-running this novel way – really interesting.

Our final plenary for the day took place after dinner and was, it must be said, pretty much unbloggable. In:

Facebook is like Disco and Twitter is like Punk by Rebekka Kill

We were treated to something between performance art, a great retro DJ set, and a really thoughtful musing on the cultures and practices of social media. I can’t sum it up adequately here and without Kill’s unique DJing persona, but her slides and some of the music can be found in this blog post from her performance at the Shift Happens conference in 2012.

And with that Day One is truly over. Day two sees my own presentation taking place so expect a few gaps in the blogging!