Oct 13 2015
 
Michael Dewar, Data Scientist at The New York Times, presenting at the Data Science for Media Summit held by the Alan Turing Institute and University of Edinburgh, 14th October 2015.

Today I am at the “Data Science for Media Summit” hosted by The Alan Turing Institute & University of Edinburgh and taking place at the Informatics Forum in Edinburgh. This promises to be an event exploring data science opportunities within the media sector, and the attendees are already proving to be a diverse mix of media, researchers, and others interested in media collaborations. I’ll be liveblogging all day – the usual caveats apply – but you can also follow the tweets on #TuringSummit.

Introduction – Steve Renals, Informatics

I’m very happy to welcome you all to this data science for media summit, and I just wanted to explain that idea of a “summit”. This is one of a series of events from the Alan Turing Institute, taking place across the UK, to spark new ideas, new collaborations, and build connections. So researchers can understand areas of interest for the media industry, and the media industry can understand what’s possible in research. This is a big week for data science in Edinburgh, as we also have our doctoral training centre, so you’ll see displays in the forum from our doctoral students.

So, I’d now like to hand over to Howard Covington, Chair, Alan Turing Institute

Introduction to the Alan Turing Institute (ATI) – Howard Covington, Chair, ATI

To introduce ATI I’m just going to cut to our mission: to make the UK the world leader in data science and data systems.

ATI came about from a government announcement in March 2014; a bidding process then led to the partner universities being chosen in Jan 2015; the joint venture agreement between the partners (Cambridge, Edinburgh, Oxford, UCL, Warwick) followed in March 2015; and Andrew Blake, the institute’s director, takes up his post this week. He was previously the head of research for Microsoft’s R&D in the UK.

Those partners already have about 600 data scientists working for them and we expect ATI to be an organisation of around 700 data scientists as students etc. come in. And the idea of the data summits – there are about 10 around the UK – is for you to tell us your concerns, your interests. We are also hosting academic research sessions for researchers to propose their ideas.

Now, I’ve worked in a few start ups in my time and this is going at pretty much as fast a pace as you can go.

We will be building our own building, behind the British Library opposite the Francis Crick Institute. There will be space at that HQ for 150 people. There is £67m of committed funding for the first 5 years, from companies and organisations with a deep interest who are committing time and resources to the institute.

The Institute sits in a wider ecosystem that includes: Lloyd’s Register – our first partner, who see huge amounts of data coming from sensors on large structures; GCHQ – working with them on the open stuff they do, and using their knowledge in keeping data safe and secure; EPSRC – a shareholder and partner in the work. We also expect other partners to come in from various areas, including the media.

So, how will we go forward with the Institute? Well, we want to do both theory and impact. So we want major theoretical advances, but we will devote time equally to practical, impactful work. Maths and Computer Science are both core, but we want to be a broad organisation across the full range of data science, reflecting that we are a national centre. But we will have to take a specific interest in particular areas. There will be an ecosystem of partners. And we will have a huge training programme with around 40 PhD students per year, and we want those people to go out into the world to take data science forward.

Now, the main task of our new director is working out our science and innovation strategy. He’s starting by understanding where our talents and expertise already sit across our partners. We are also looking at the needs of our strategic partners, and then the needs emerging from the data summits, and the academic workshops. We should then soon have our strategy in place. But this will be additive over time.

When you ask someone what data science is, the definition is ever changing and variable. So I have a slide here that rather breaks the rules of slide presentations, in that it’s very busy… But data science is very busy. So we will be looking at work in this space, and going into more depth, for instance on financial sector credit scoring; predictive models in precision agriculture; etc. Underpinning all of these are similarities that cross many fields. Security and privacy is one such area – we can only go as far as it is appropriate to go with people’s data, an issue both for ATI and for individuals.

I don’t know if you think that’s exciting, but I think it’s remarkably exciting!

We have about 10 employees now, we’ll have about 150 this time next year, and I hope we’ll have opportunity to work with all of you on what is just about the most exciting project going on in the UK at the moment.

And now to our first speaker…

New York Times Labs – Keynote from Mike Dewar, Data Scientist

I’m going to talk a bit about values, and about the importance of understanding the context of what it is we do. And how we embed what we think is important into the code that we write, the systems that we design and the work that we do.

Now, the last time I was in Edinburgh, in 2009, I was doing a postdoc working on modelling biological data, based on video of flies. There was loads of data, a mix of disciplines, and we were market focused – the project became a data analytics company. And, like much other data science, it was really rather invasive – I knew huge amounts about the sex life of fruit flies, far more than one should need to! We were predicting behaviours, understanding correlations between environment and behaviour.

I now work at New York Times R&D and our task is to look 3-5 years ahead of current NYT practice. We have several technologists there, but also colleagues who are really designers. That has forced me to adapt a bit… I am a classically trained engineer – trained to go out into the world, find the problem, and then solve it by finding some solution, some algorithm to minimise the cost function. But it turns out in media, where we see decreasing ad revenue and increasing subscription revenue, that we need to do more than minimise the cost function… That basically leads to click bait. So I’m going to talk about three values that I think we should be thinking about, and projects within each area. So, I shall start with Trust…

Trust

It can be easy to forget that much of what we do in journalism is essentially surveillance, so it is crucial that we do our work in a trustworthy way.

So the first thing I want to talk about is a tool called Curriculum, a Chrome browser plug-in that observes everything I read online at work. It takes chunks of text, aggregates them with what others are reading, and projects that onto a screen in the office. So, firstly, the negative… I am very aware I’m being observed – it’s very invasive – and that layer of privacy is gone, which shapes what I do (and it ruins Christmas!). But it also shares what everyone is doing, a sense of what collectively we are working on… It is built in such a way as to make it inherently trustworthy in four ways: it’s open source, so I can see the code that controls this project; it is fantastically clearly written and clearly architected, so reading the code is actually easy – it’s well commented, and I’m able to read it; it respects existing boundaries on the web – it does not read https (so my email is fine) and respects incognito mode; and I know how to turn it off – also very important.

In contrast to that I want to talk about Editor. This is a text editor like any other… Except whatever you type is sent to a series of micro services which look for similarity against NYT keyword corpora, and then send that back to the editor – enabling a tight mark-up of the text. The issue is that the writer is used to writing alone, then sending to production. Here we are asking the writer to share their work in progress and send it to central AI services at the NYT, so making that trustworthy is a huge challenge, and we need to work out how best to do this.
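[Aside from me: as I understand the Editor pipeline, one of those micro services might do something like the sketch below – scan the draft text against a keyword corpus and send back spans to mark up. The corpus, tags and matching logic here are my own illustrative assumptions, not the NYT’s actual implementation.]

```python
# Hypothetical sketch only: match a draft against a small keyword corpus and
# return character spans an editor UI could highlight.
import re

KEYWORD_CORPUS = {
    "Alan Turing Institute": "ORG",   # invented example entries
    "Edinburgh": "PLACE",
}

def annotate(draft):
    """Return (start, end, keyword, tag) spans for every corpus match in the draft."""
    spans = []
    for keyword, tag in KEYWORD_CORPUS.items():
        for m in re.finditer(re.escape(keyword), draft, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), keyword, tag))
    return sorted(spans)

print(annotate("The Alan Turing Institute hosted a summit in Edinburgh."))
```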

Legibility

Data scientists have a tendency towards the complex. I’m no different – show me a new tool and I’ll want to play with it; I enjoy a new toy. And we love complex algorithms, especially if we spent years learning about them in grad school. But those can render any data illegible.

So we have [NAME?], an infinite scrolling browser – when you scroll you can continue on. And at the end of each article an algorithm offers 3 different recommendation strands… It’s like a choose your own adventure experience. So we have three recommended articles, based on very simple recommendation engines, which renders them legible. These are the “style graph” – things that are similar in style; the “collaborative filter” – readers like you also read; and the “topic graph” – similar in topic. These are all based on the nodes and edges of the connections between articles. They are simple, legible concepts, and easy to run, so we can use them across the whole NYT corpus. They are understandable, so they have a much better chance of resonating with our colleagues.
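[Aside from me: a toy sketch of the kind of legible, graph-based recommendation being described – articles as nodes, edges weighted by shared topics, and recommendations read straight off the heaviest edges. The articles and topics are invented for illustration; this is not the NYT’s actual engine.]

```python
# Hypothetical sketch only: build a tiny "topic graph" and recommend neighbours.
from collections import defaultdict

articles = {
    "a1": {"politics", "europe"},     # invented article ids and topic sets
    "a2": {"politics", "economy"},
    "a3": {"europe", "travel"},
}

def topic_graph(arts):
    """Weight an edge between two articles by the number of topics they share."""
    edges = defaultdict(dict)
    ids = list(arts)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            shared = len(arts[a] & arts[b])
            if shared:
                edges[a][b] = shared
                edges[b][a] = shared
    return edges

def recommend(edges, article_id, n=3):
    """Return up to n neighbours, heaviest edges first – simple and legible."""
    neighbours = edges.get(article_id, {})
    return sorted(neighbours, key=neighbours.get, reverse=True)[:n]

graph = topic_graph(articles)
print(recommend(graph, "a1"))  # e.g. ['a2', 'a3']
```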

As a counter point we were tasked with looking at behavioural segmentation – to see how we can build different products for different segments. Typically segmentation is done with demography. We were interested, instead, in using just the data we had, the behavioural data. We arranged all of our pageviews into sessions (from arriving at a page through to leaving the site). So, for each session we represented the data as a transition matrix, to understand the probability of moving from one page to the next… So we can perform clustering of behaviours… Looking at this we can see that there are some clusters that we already know about… We have the “one and dones” – read one article then move on. We found the “homepage watcher”, who sits on the homepage and uses that as a launching point. The rest, though, the NYT didn’t have names for… So we now have the “homepage bouncer” – going back and forth from the front page – and the “section page starter” as well, for instance.

This is a simple k-means (?) model and clustering – very simple, but dynamic and effective. However, this is very, very radical at the NYT amongst non data scientists. It’s hard to make it resonate enough to drive any behaviour or design in the building. We have a lot of work to do to make this legible and meaningful for our colleagues.
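[Aside from me: to make the segmentation idea a little more concrete, here is a minimal sketch – each session becomes a row-normalised transition matrix over page types, and the flattened matrices are clustered with k-means. The page types, toy sessions and number of clusters are my own illustrative assumptions, not the NYT’s actual features or parameters.]

```python
# Hypothetical sketch only: cluster sessions by their page-to-page transition patterns.
import numpy as np
from sklearn.cluster import KMeans

PAGE_TYPES = ["homepage", "section", "article"]   # assumed page categories
IDX = {p: i for i, p in enumerate(PAGE_TYPES)}

def transition_matrix(session):
    """Turn an ordered list of page types into a row-normalised transition matrix."""
    m = np.zeros((len(PAGE_TYPES), len(PAGE_TYPES)))
    for src, dst in zip(session, session[1:]):
        m[IDX[src], IDX[dst]] += 1
    row_sums = m.sum(axis=1, keepdims=True)
    return np.divide(m, row_sums, out=np.zeros_like(m), where=row_sums > 0)

sessions = [
    ["article"],                                         # a "one and done"
    ["homepage", "article", "homepage", "article"],      # a "homepage bouncer"
    ["section", "article", "article"],                   # a "section page starter"
]

X = np.array([transition_matrix(s).ravel() for s in sessions])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels)  # one behavioural cluster label per session
```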

The final section I want to talk about is Live…

Live

In news we have to be live, we have to work in the timescales of seconds to a minute. In the lab that has been expressed as streams of data – never ending sequences of data arriving at our machines as quickly as possible.

So, one of our projects, Delta, produces a live visualisation of every single page view of the NYT – a pixel per person, starting on the globe, then pushing outwards. If you’ve visited the NYT in the last year or so, you’ve generated a pixel on the globe in the lab. We use this to visualise the work of the lab. We think the fact that this is live is very visceral. We always start with the globe… But then we show a second view, using the same pixels in the context of sections, of the structure of the NYT content itself. And that can be explored with an Xbox controller. Being live makes it relevant and timely, to understand current interests and content. It ties people to the audience, and encourages other parts of the NYT to build some of these live experiences… But one of the tricky things is that it is hard to use live streams of data, hence…

Streamtools, a tool for managing live streams of data. It should be reminiscent of Simulink or LabVIEW etc. [when chatting to Mike earlier I suggested it was a super-pimped, realtime Yahoo Pipes and he seemed to agree with that description too]. It’s now on its third incarnation and you can come and explore a demo throughout today.

Now, I’ve been a data scientist for a while, and when we bring our systems to the table we need to be aware that what we build embodies our own values. And I think that for data science in media we should be building trustworthy systems, tools which are legible to others, and those that are live.

Find out more at nytlabs.com.

Q&A

Q1) I wanted to ask about expectations. In a new field it can be hard to manage expectations. What are your users expectations for your group and how do you manage that?

A1) In R&D we have one data scientist and a bunch of designers. We make speculative futures, build prototypes, bring them to the NYT, to the present, to help them make decisions about the future. In terms of data science in general at the NYT… Sometimes things look magic and look lovely but we don’t understand how they work; in other places it’s much simpler, e.g. counting algorithms. But there’s no risk of a data science winter, we’re being encouraged to do more.

Q2) NYT is a paper of record, how do you manage risk?

A2) Our work is informed by a very well worded privacy statement that we respect and build our work on. But the other areas of ethics etc. is still to be looked at.

Q3) Much of what you are doing is very interactive and much of data science is about processing large sets of data… So can you give any tips for someone working with terabytes of data for working with designers?

A3) I think a data scientist essentially is creating a palette of colours for your designer to work with. And forcing you to explain that to the designer is useful, and enables those colours to be used. And we encourage the idea that there isn’t just one solution, we need to try many. That can be painful as a data scientist as some of your algorithms won’t get used, but it gives some great space to experiment and find new solutions.

Data Journalism Panel Session moderated by Frank O’Donnell, Managing Editor of The Scotsman, Edinburgh Evening News and Scotland on Sunday

We’re going to start with some ideas of what data journalism is…

Crina Boros, Data Journalist, Greenpeace

I am a precision journalist, and I have just joined Greenpeace having worked at Thomson Reuters, BBC Newsnight etc. And I am not a data scientist, or a journalist. I am a precision journalist working with data. At Greenpeace data is being used for investigative journalism purposes, areas no longer or rarely picked up by mainstream media, to find conflicts of interest, and to establish facts and figures for use in journalism, in campaigning. And it is a way to protect human sources and enable journalists in their work. I have, in my role, both used data that exists, and created data when it does not exist. And I’ve sometimes worked with data that was never supposed to see the light of day.

Evan Hensleigh, Visual Data Journalist, The Economist

I was originally a designer and therefore came into information visualisation and data journalism by a fairly convoluted route. The Economist has been running since the 1840s and we like to say that we’ve been doing data science since we started. We were founded at the time of the Corn Laws, in opposition to those proposals, and visualised the impact of those laws as part of that.

The way we now tend to use data is to illustrate a story we are already working on. For instance working on articles on migration in Europe, and looking at fortifications and border walls that have been built over the last 20 to 30 years lets you see the trends over time – really bringing to life the bigger story. It’s one thing to report current changes, but to see that in context is powerful.

Another way that we use data is to investigate changes – a colleague was looking at changes in ridership on the Tube, and the rise of the rush hour – and then use that to trigger new articles.

Rachel Schutt, Chief Data Scientist, Newscorp

I am not a journalist but I am the Chief Data Scientist at Newscorp, and I’m based in New York. My background is a PhD in statistics, and I used to work at Google in R&D and algorithms. And I became fascinated by data science so started teaching an introductory course at Columbia, and wrote a book on this topic. And what I now do at Newscorp is to use data as a strategic asset. So that’s about using data to generate value – around subscriptions, advertising etc. But we also have data journalism so I increasingly create opportunities for data scientists, engineers, journalists, and in many cases a designer so that they can build stories with data at the core.

We have both data scientists and data engineers – hybrid skills around engineering, statistical analysis, etc. – and sometimes an individual’s skills cross those borders, sometimes it’s different people too. And we also have those working more in design and data visualisation. So, for instance, we are now getting data dumps – the Clinton emails, transcripts from Ferguson etc. – and we know those are coming so can build tools to explore them.

A quote I like is that data scientists should think like journalists (from DJ Patil) – in any industry. At Newscorp we also get to learn from journalists, which is very exciting. But the idea is that you have to be investigative, be able to tell a story…

Emily Bell says “all algorithms are editorial” – because value judgements are embedded in those algorithms, and you need to understand the initial decisions that go with that.

Jacqui Maher, Interactive Journalist, BBC News Labs

I was previously at the NYT, mainly at the Interactive News desk in the newsroom. An area crossing news, visualisation, data etc. – so much of what has already been said. And I would absolutely agree with Rachel about the big data dumps and looking for the story – the last dump of emails I had to work with was from Sarah Palin, for instance.

At the BBC my work lately has been on a concept called “Structured Journalism” – when we report on a story we put together all these different entities into a very unstructured set of data – audio, video etc. Many data scientists will try to extract that structure back out of the corpus… So we are looking at how we might retain the structure that is in a journalist’s head as they are writing the story. So digital tools that will help journalists during the investigative process. And ways to retain connections, structures etc. And then what can we do with that… What can make it more relevant to readers/viewers – context pieces, ways of adding context in a video (a tough challenge).

If you look at work going on elsewhere, for instance the Washington Post’s work on IS, they are looking at how to similarly add context, how they can leverage previous reporting without having to start from scratch.

Q&A/Discussion

Q1 – FOD) At a time when we have to cut staff in media, in newspapers in particular, how do we justify investing in data science, or how do we use data science?

A1 – EH) Many of the people I know came out of design backgrounds. You can get pretty far just using available tools. There are a lot of useful tools out there that can help your work.

A1 – CB) I think this stuff is just journalism, and these are just another set of tools. But there is a misunderstanding that you can press a button and get a story – you don’t. You have to understand that it takes time; there’s a reason that it is called precision journalism. And sometimes the issue is that the data is just not available.

A1 – RS) Part of the challenge is about traditional academic training and what is and isn’t included there… But there are more academic programmes on data journalism now. It’s a skillset issue. I’m not sure, on a pay basis, whether data journalists should get paid more than other journalists…

A1 – FOD) I have to say in many newsrooms journalists are not that numerate. Give them statistics, even percentages and that can be a challenge. It’s almost a badge of honour as wordsmiths…

A1 – JM) I think most newsrooms have an issue of silos. You also touched on the whole “math is hard” thing. But to do data journalism you don’t need to be a data scientist. You don’t have to be an expert on maths, stats, visualisation etc. At my former employer I worked with Mike – who you’ve already heard from – who could enable me to cross that barrier. I didn’t need to understand the algorithms, but I had that support. You do see more journalists, designers and data scientists working together. I think eventually we’ll see all of those people as journalists though, as you are just trying to tell the story using the available tools.

Q2) I wanted to ask about the ethics of data journalism. Do you think there is a developing field of ethics in data journalism?

A1 – JM) I think that’s a really good question in journalism… But I don’t think that’s specific to data journalism. When I was working at the NYT we were working on the Wikileaks data dumps, and there were huge ethical issues there around the information that was included, in terms of names, in terms of risk. And in the end the methods you might take – whether blocking part of a document out – the technology might vary but the ethical issues are the same.

Q2 follow up FOD) And how were those ethical issues worked out?

A1 – JM) Having a good editor is also essential.

A1 – CB) When I was at Thomson Reuters I was involved in running women’s rights surveys to collate data, and when you do that you need to apply research ethics, with advice from those appropriately positioned to give it.

A1 – RS) There is an issue that traditionally journalists are trained in ethics but data scientists are not. We have policies in terms of data privacy… But there is much more to do. And it comes down to the person who is building a data model – and you have to be aware of the possible impact and implications of that model. And risks also of things like the Filter Bubble (Pariser 2011).

Q3 – JO) One thing that came through listening to ? and Jacqui: it’s become clear that data is a core part of journalism… You can’t get the story without the data. So, is there a competitive advantage to being able to extract that meaning from the data – is there a data science arms race here?

A3 – RS) I certainly look out to the NYT and other papers and admire what they do, but of course the reality is messier than the final product… But there is some of this…

A3 – JM) I think that if you don’t engage with data then you aren’t keeping up with the field, you are doing yourself a professional disservice.

A3 – EH) There is a need to keep up. We are a relatively large group, but nothing like the scale of the NYT… So we need to find ways to tell stories that they won’t tell, or to have a real sense of what an Economist data story looks like. Our team is about 12 or 14, that’s a pretty good size.

A3 – RS) Across all of our businesses there are 100s in data science roles, of whom only a dozen or so are on data journalism side.

A3 – JM) At the BBC there are about 40 or 50 people on the visual journalism team. But there are many more in data science in other roles, people at the World Service. But we have maybe a dozen people in the lab at any given moment.

Q4) I was struck by the comment about legibility, and a little bit related to transparency in data. Data is already telling a story, there is an editorial dimension, and that is added to in the presentation of the data… And I wonder how you can do that to improve transparency.

A4 – JM) There are many ways to do that… To show your process, to share your data (if appropriate). Many share code on GitHub. And there is a question there though – if someone finds something in the data set, what’s the feedback loop.

A4 – CB) In the past where I’ve worked we’ve shared a document on the step by step process used. I’m not a fan of sharing on GitHub, I think you need to hand hold the reader through the data story etc.

Q5) Given that journalism is about holding companies to account… In a world where the likes of Google are the new power brokers, who will hold them to account? I think data journalism needs a merger of journalism, data science, and design… Sometimes that can be in one person… And what do you think about journalism playing a role in holding new power brokers to account?

A5 – EH) There is a lot of potential. These companies publish a lot of data and/or make their data available. There was some great work on FiveThirtyEight about Uber, based on a Freedom of Information request, to essentially fact check Uber’s own statistics and reporting of activities.

Q6) Over the years we’ve (Robert Gordon University) worked with journalists from various organisations. I’ve noticed that there is an issue, not yet raised, that journalists are always looking for a particular angle in data as they work with it… It can be hard to get an understanding from the data, rather than using the data to reinforce bias etc.

A6 – RS) If there is an issue of taking a data dump from e.g. Twitter to find a story… Well, dealing with that bias does come back to training. But yes, there is a risk of journalists getting excited, wanting to tell a novel story, without being checked by colleagues correcting the analysis.

A6 – CB) I’ve certainly had colleagues wanting data to substantiate the story, but it should be the other way around…

Q6) If you, for example, take the Scottish Referendum and the General Election and you see journalists so used to watching their dashboard and getting real time feedback, they use them for the stories rather than doing any real statistical analysis.

A6 – CB) That’s part of the reason for reading different papers and different reporters covering a topic – and you are expected to have an angle as a journalist.

A6 – EH) There’s nothing wrong with an angle or a hunch but you also need to use the expertise of colleagues and experts to check your own work and biases.

A6 – RS) There is a lot more to understanding how the data has come about, and people often use the data set as a ground truth, and that needs more thinking about. It’s somewhat taught in schools, but not enough.

A6 – JM) That makes me think of a data set called GDELT (?), which captures media reporting and enables event detection etc. I’ve seen stories of journalists looking at that data as a canonical source for all that has happened – and that’s a misunderstanding of how that data set has been collected. It’s close to a canonical source for reporting, but that is different. So you certainly need to understand how the data has come about.

Comment – FOD) So, you are saying that we can think we are in the business of reporting fact rather than opinion but it isn’t that simple at all.

Q7) We have data science, is there scope for story science? A science and engineering of generating stories…

A7 – CB) I think we need a teamwork sort of approach to story telling… With coders, with analysts looking for the story… The reporters doing field reporting, and the data vis people making it all attractive and sexy. That’s an ideal scenario…

A7 – RS) There are companies doing automatic story generation – like Narrative Science etc. already, e.g. on Little League matches…

Q7 – comment) Is that good?

A7 – RS) Not necessarily… But it is happening…

A7 – JM) Maybe not, but it enables story telling at scale, and maybe that has some usefulness really.

Q8/Comment) There was a question about ethics, and the comment that nothing specific is needed there, and the comment about legibility. And I think there is a conflict there. Statistical databases infer missing data from the data you have, to make valid inferences, but those could shock people because the values are not actually in the data (e.g. salary prediction). This reminded me of issues such as source protection, where you may not explicitly identify the source but that source could be inferred. So you need a complex understanding of statistics to understand that risk, and to do that practice appropriately.

A8 – CB) You do need to engage with the social sciences, and to properly understand what you are doing in terms of your statistical analysis, your P values etc. There is more training taking place, but still more to do.

Q9 – FOD) I wanted to end by coming back to Howard’s introduction. How could ATI and Edinburgh help journalism?

A9 – JM) I think there are huge opportunities to help journalists make sense of large data sets. Whether that is tools for reporting or analysis. There is one, called Detector.io, that lets you map reporting, for instance; it is shutting down and I don’t know why. There are some real opportunities for new tools.

A9 – RS) I think there are areas in terms of curriculum, on design, ethics, privacy, bias… Softer areas not always emphasised in conventional academic programmes but are at least as important as scientific and engineering sides.

A9 – EH) I think generating data from areas where we don’t have it. At the Economist we look at China, Asia, Africa, where data is either deliberately obscured or they don’t have the infrastructure to collect it. So tools to generate that would be brilliant.

A9 – CB) Understand what you are doing; push for data being available; and ask us and push us to be accountable, and it will open up…

Q10) What about the readers. You’ve been saying the journalists have to understand their stats… But what about the readers who know how to understand the difference between reading the Daily Mail and the Independent, say, but don’t have the data literacy to understand the data visualisation etc.

A10 – JM) It’s a data literacy problem in general…

A10 – EH) Data scientists have the skills to find the information and raise awareness

A10 – CB) I do see more analytical reporting in the US than in Europe. But data isn’t there to obscure anything. But you have to explain what you have done in clear language.

Comment – FOD) It was once the case that data was scarce, and reporting was very much on the ground and on foot. But we are no longer hunter gatherers in the same way… Data is abundant and we have to know how we can understand, process, and find the stories from that data. We don’t have clear ethical codes yet. And we need to have a better understanding of what is being produced. And most of the media most people consume is local media – city and regional papers – and they can’t yet afford to get into data journalism in a big way. Relevance is a really important quality. So my personal challenge to the ATI is: how do we make data journalism pay?

And we are back from lunch and some excellent demos… 

Ericsson, Broadcast & Media Services – Keynote from Steve Plunkett, CTO

Jon Oberlander is introducing Steve Plunkett who has a rich history of work in the media. 

I’m going to talk about data and audience research, and trends in audience data. We collect, aggregate and analyse lots of data, and I’ll talk about where many of the opportunities are…

There are 24,000 people in Ericsson R&D, very much focused on telecoms. But within R&D there is a group for broadcast and media services, and I joined as part of the buy out of Red Bee Media. One part of these services is a metadata team who create synopses for EPGs across Europe (2700 channels). We are also the biggest subtitlers in Europe. And we also do media management – with many hundreds of thousands of hours of audio and TV, and that’s also an asset we can analyse (the inventory as well as the programme). And we operate TV channels – all the BBC channels, C4, C5, UKTV, channels in France, the Netherlands, and in the US – and our scheduling work is also a source of data. And we also run recommendation engines embedded in TV guides and systems.

Now, before I talk about the trends I want to talk about the audience. Part of the challenge is understanding who the audience is… And audiences change, and the rate of change is accelerating. So I’ll show some trends in self-reported data from audiences on what they are watching. Before that, a quote from Reed Hastings, Netflix: “TV had a great 50 year run, but now its time is over”. TV is still where most impact and viewing hours are, but there are real changes now.

So, the Ericsson ConsumerLab Annual Report – participants across the world – 1000 consumers across 20 countries. In-home interviews to understand their viewing context, what they are watching and what their preferences are. Of course self-reported behaviour isn’t the same as real data, but we can compare and understand that.

So, the role of services varies between generations. The go-to services are very different between older generations and the younger generation. For older viewers it’s linear TV, then DVR, then play/catch-up, then YouTube etc. For younger generations SVOD is the top viewing service – that’s things like Netflix, Amazon Prime etc.

In terms of daily media habits we see again a real difference between use of scheduled linear TV vs. streamed and recorded TV. Younger people again much more likely to use streaming, older using scheduled much more. And we are seeing YouTube growing in importance – generally viewing over 3 hrs per day has increased hugely in the last 4 years, and it is used as a go to space to learn new things (e.g. how to fix the dishwasher).

In terms of news, the importance of broadcast news increases with age – it is still much more important to older consumers. And programming wise, 45% of streamed on-demand viewing of long content is TV series. Many watch box sets for instance. As broadcasters we have to respect that pattern of use, not all are linear scheduled viewers. And you see this in trends of tweeting, and peaks of tweets showing how quickly a newly released online series has been completed.

There is also a shift from fixed to mobile devices. TV screens and desktop PCs have seen a reduction in viewing hours and use compared to mobile, tablet and laptop use. That’s a trend over time. And that’s again following generational lines… Younger people are more likely to use mobile. Now again, this is self-reported and can vary between countries. So in our broadcast planning, understanding content – length of content, degree of investment in High Def etc. – should be informed by those changes. On mobile, user generated content – including YouTube but also things like Periscope – is still dominant.

In terms of discovering and remembering content it is still the case that friends, reviews, trailers etc. matter. But recommendation engines are important and viewers are satisfied with them. For the last two years we’ve asked the study group about those recommendation engines: their accuracy; their uncanniness and data and privacy concerns; and an issue of shared devices. So still much more to be done. The scale of Netflix’s library is such that recommendations are essential to help users navigate.

So, that was self-reported. What about data we create and collect?

We have subtitle coverage, often doing the blanket subtitle coverage for broadcasters. We used to use transcribers and transcription machines. We invested in respeaking technologies. And that’s what we use now; the respeakers clean up grammar etc. and the technology is trained for their voice. That process of logging subtitles includes very specific timestamps… That gives us rich new data, and also creates a transcript that can sit alongside the subtitles and programme. But it can take 6-7 hours to do subtitling as a whole process, including colour coding speakers etc. And we are looking to see what else subtitlers could add – mood perhaps? etc. – as part of this process.
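[Aside from me: a small sketch of how those timestamped subtitle cues could double as a searchable transcript sitting alongside the programme. The cue structure is a simplified, SRT-like format I’ve assumed for illustration, not Ericsson’s internal format.]

```python
# Hypothetical sketch only: flatten timestamped subtitle cues into transcript lines.
cues = [
    {"start": "00:00:05.200", "speaker": "A", "text": "Welcome back to the programme."},
    {"start": "00:00:08.400", "speaker": "B", "text": "Thanks, good to be here."},
]

def to_transcript(subtitle_cues):
    """Produce '[start] SPEAKER: text' lines that can be indexed and searched."""
    return "\n".join(
        f"[{c['start']}] {c['speaker']}: {c['text']}" for c in subtitle_cues
    )

print(to_transcript(cues))
```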

We have a database of about 8.5 million records that include our programme summaries, images on an episode level, etc. And we are working on the system we use to manage this, to improve it.

I mentioned Media Management, and we do things like automated transcription – it wouldn’t be good enough for use in broadcast but…

Mediaroom (?) – 60 telecom operators use it for IPTV and it collects very granular data from TV viewing – all collected with consent. Similar for OTT. And similar platforms for EPG. Search queries. Recommendations and whether they are acted upon. And we also have mobile network data – to understand drop off rates, what’s viewed for a particular item etc.

We sit in the middle between the broadcaster and the audience, so our work feeds into broadcasters’ work. For insight like segmentation, commissioning, marketing, scheduling, sales… For personalisation – content recommendations, personalised channels that are unique to you, targeted advertising, search, content navigation, contextual awareness. One of the worst feedback comments we see is about delivery quality, so we apply our data to network optimisation etc.

In terms of the challenges we face they include: consumer choice; data volumes – and growing fast so finding value matters; data diversity – very different in structure and form so complex task; expertise – there is a lack of skills embedded in these businesses to understand our data; timeliness – personal channels need fast decisions etc. real time processing is a challenge; privacy – one of the biggest ones here, and the industry needs to know how to do that and our feedback on recommendation engines is such that we need to explain where data is coming from, to make that trusted.

In terms of opportunities: we are seeing evolving technology; cloud resources are changing this fast; investment – huge in this area at the moment; consumer appetite for this stuff; and we are in an innovation white space right now – we are in early days…

And finally… An experimental application. We took Made in Chelsea and added a graph to the viewing timeline that shows tweets and peaks… and provides a navigation system based on tweets shared. And on the right hand side, navigation by character, to follow their journey. We created some semantic visualisation tools for e.g. happy, sad, funny moments. Navigation that focuses on the viewer’s interest.

Audience Engagement Panel Session – Jon Oberlander (Moderator), University of Edinburgh

Jon is introducing his own interest in data science, in design informatics, and linguistics and data science, with a particular mention for LitLong, similarly a colleague in Politics is analysing the public interest in the UK and EU, but also reaction to political messages. And finally on the Harmonium project at the Edinburgh International Festival – using music and data on musical performers to create a new music and visualisation project, with 20k in person audience and researchers monitoring and researching that audience on the night too…

Pedro Cosa – Data Insights and Analytics Lead, Channel 4

I’m here to talk a bit about the story of Channel 4 and data. Channel 4 is a real pioneer in using data in the UK, and in Europe. You’ve all heard Steve’s presentation on changing trends – and these are very relevant for Channel 4, as we are a public service broadcaster but also because our audience is particularly young and affluent. They are changing their habits quickly, and that matters both from an audience and an advertising perspective for us. Senior management was really pushing for change in the channel. Our CEO has said publicly that data is the new oil of the TV industry and he has invested in data insights for Channel 4. The challenge is to capture as much data as possible, and feed that back to the business. So we use registration data from All 4 (formerly 4OD), and to use that site you have to register. We have 13 million people registered that way, so that’s already capturing details on half our target audience in the UK. And that moves us from one-to-many, to one-to-one. And we can use that for targeted advertising – which comes with a premium paid by advertisers – and to really personalise the experience. So that’s what we are doing at the moment.

Hew Bruce-Gardyne – Chief Technology Officer, TV Squared

We are a small company working on data analytics for use by advertisers, which in turn feeds back into content. My personal background is as an engineer; the big data side of number crunching is where I come from. From where I am sitting, audience engagement is a really interesting problem… If you have a really big, engaging programme, that seems to kill the advertising, so replays, catch-up and the opportunities there are, for us, gold dust.

Paul Gilooly – Director of Emerging Products, MTG (Modern Times Group)

MTG are a Scandinavian pan-European broadcaster; we have the main sports and Hollywood rights as well as major free-to-air channels in the Scandinavian countries. And we run a thing called Viaplay, which is an SVOD service like (and predating) Netflix. The Nordics are interesting as we have high speed internet, affluent viewers, and markets where Apple TV is significant, disproportionately so compared to the rest of Europe. So when I think of TV I think of a subscribing audience, and Pay TV. And my concern is churn – and a more engaged customer is more likely to stick around. So any way to increase engagement is of interest, and data is a key part of that. Just as Channel 4 are looking at authentication as a data starting point, so are we. And we also want to encourage behaviours like recommendations of products and sharing. And some behaviours to discourage. And data is also the tool to help you understand the behaviours you want to discourage.

For us we want to increase transactions with viewers, to think more like a merchandiser, to improve personalisation… So back to the role of data – it is a way to give us a competitive advantage over competitors, can drive business models for different types of consumer. It’s a way to understand user experience, quality of user experience, and the building of personalised experiences. And the big challenge for me is that in the Nordics we compete with Netflix, with HBO (has direct to air offering there). But we are also competing with Microsoft, Google, etc. We are up against a whole new range of competitors who really understand data, and what you can do with data.

Steve Plunkett – CTO, Broadcast & Media Services, Ericsson

No intro… as we’ve just heard from you… 

Q&A

Q1 – JO) Why are recommendations in this sector so poor compared to e.g. Amazon?

A1 – SP) The problem is different. Amazon has this huge inventory, and collective recommendation works well. Our content is very different. We have large content libraries, and collective recommendation works differently. We used to have human curators programming content; they introduced serendipity, and recommendation engines are less good at that. We’ve just embarked on a 12 month project with three broadcasters to look at this. There is loads of research on public top 10s. One of the big issues is that if you get a bad recommendation it’s hard to say “I don’t like this” or “not now” – they just sit there and the feedback is poor… So important to solve. Netflix invested a great deal of money in recommendations. They invested $1 million for a recommender that would beat their own by 10%, and that took a long time. Data science is aligned with that of course.

A1 – PC) Recommendations are core for us too. But TV recommendations are so much more complex than retail… You need to look at the data and analyse it… You have to promote cleverly, to encourage discovery, to find new topics or areas of debate, things you want to surface in a relevant way. It’s an area C4 and also the BBC are looking to develop.

A1 – HBG) There is a real difference between retail and broadcast – about what you do but also about the range of content available… So even if you take a recommendation, it may not reflect true interest and buy in to a product. Adds a layer of complexity and cloudiness…

A1 – SP) Tracking recommendations in a multi device, multi platform space is a real challenge… Often a one way exchange. Closing loop between recommendation and action is hard…

Q2 – JO) Of course you could ask active questions… Or could be mining other streams… How noisy is that, how useful is that? Does it bridge a gap.

A2 – SP) TV has really taken off on Twitter, but there is disproportionate noise based on a particular audience and demographic. That’s a useful tool though… You can track engagement with a show, at a point of time within a show… But not necessarily the recommendations of that viewer at that time… But it is one of many data sets to use…

Q3 – JO) Are users engaging with your systems aware of how you use their data, are they comfortable with it?

A3 – PC) At C4 we have made a clear D-Word promise – with a great video from Alan Carr that explains that data promise. You can understand how your data is used, can delete your own data, can change your settings, and if you don’t use the platform for 2 years then we delete your data. A very clear way to tell the user that you are in control.

A3 – SP) We had a comment from someone in a study group who said they had been categorised by a big platform as a fan of 1980s supernatural horror, and didn’t want to be categorised in that way, or for others to see this. So a real interest in transparency there.

A3 – PG) We aren’t as far ahead as Channel 4, they are leading the way on data and data privacy.

Q4 – JO) Who is leading the way here?

A4 – PG) I think David Abraham (C4) deserves great credit here; the CEO understands the importance of data science and its role in the core business model. And that competitors for revenue are Facebook, Google and so forth.

Q5 – JO) So, trend is to video on demand… Is it also people watching more?

A5 – SP) It has increased but much more fragmented across broadcast, SVOD, UGC etc. and every type of media has to define its space. So YouTube etc. is eating into scheduled programming. For my 9 year old child the streaming video, YouTube etc. is her television. We are competing with a different set of producers.

A5 – PG) The issue isn’t that linear channels do not allow you to collect data. If you have to login to access content (i.e. Pay TV) then you can track all of that sort of data. So DR1, Danish TV channel and producer of The Killing etc. is recording a huge drop in linear viewing by young people, but linear still has a role for live events, sport etc.

A5 – HBG) We do see trends that are changing… Bingeathons are happening and that indicates not a shortness of attention but a genuine change. Watching a full box set is the very best audience engagement. But if you are at a kitchen table, on a device, that’s not what you’ll be watching… It will be short videos, YouTube etc.

To come back to the privacy piece I was at a conference talking about the push to ID cards and the large move to restrict what people can know about us… We may lose some of the benefits of what can be done. And on some data – e.g. Medical Informatics – there is real value that can be extracted there. We know that Google knows all about us… But if our TV knows all about us that’s somehow culturally different.

Q6) Piracy is very high, especially at younger age ranges, so what analysis have you done on that?

A6) Not a huge amount on that, and this is self-reported. But we know piracy drops where catch-up and longer catch-up windows are available – if content can be viewed legitimately, it seems that it is when available.

Q6 – follow up) Piracy seems essentially like product failure, and how do you win back your viewers and consumers.

A6 – HBG) A while back I saw a YouTube clip of the user experience of a pirated film versus the DVD… In that case the pirated film was easier, versus the trailers, reminders not to pirate etc. on the DVD. That’s your product problem. But as we move to subscription channels etc., when you make it easy, that’s a lot better. If you try to put barriers up, people will try to find a way around them…

A6 – PG) Sweden has a large piracy issue. The way you compete is to deliver a great product and user experience, and couple that with content unique to your channel. So for instance premium sports – so pirates can’t meet all the needs of the consumer. But also be realistic with the price point.

A6 – HBG) There is a subtle difference between what you consume – e.g. film versus TV. But from music we know that pirating in the music industry is not a threat – that those are also purchasing consumers. And when content creators work with that, and allow some of that to happen, that creates engagement that helps. Most successful brand owners let others play with their brand.

A6 – PC) Piracy is an issue… But we even use piracy data sources for data analysis. Using BitTorrent to understand the popularity of shows in other places, to predict how popular they will be in the UK.

Comment – JO) So, pirates are data producers?

A6 – PC) Yes, and for scheduling too.

Q7) How are you dealing with cross channel or cross platform data – to work with Google or Amazon say. I don’t see much of that with linear TV. Maybe a bit with SVOD. How are mainstream broadcasters challenging that?

A7 – PC) Cross platform can mean different things. It may be Video On Demand as well as broadcast on their TV. We can’t assume they are different, and should look to understand what the connections are there… We are so conscious and cautious of using third party data… But we can do some content matching – e.g. advertiser customer base, and much more personalised. A real link between publisher and advertiser.

Q7 follow up) Would customer know that is taking place?

A7 – PC) It is an option at sign up. Many say “yes” to that question.

A7 – PG) We still have a lot to do to track the consumer across platforms, so a viewer can pick up consuming content from one platform to another. This technology is pretty immature, an issue with recommendation engines too.

A7 – SP) We do have relationships with third party data companies that augment what we collect – different from what a broadcaster would do. For this it tends to be non-identifiable… But you have to trust the analyst to have combined data appropriately. You have to understand their method and process, but usually they have to infer from data anyway as they usually don’t have the source.

Q8 – JO) We were talking about unreliable technologies and opportunities… So, where do you see wearable technologies perhaps?

A8 – SP) We did some work using facial recognition to understand the usefulness of recommendations. That was interesting but deploying that comes with a lot of privacy issues. And devices etc. also would raise those issues.

A8 – PC) We aren’t looking at that sort of data… But data like weather matters for this industry, local events, traffic information – as context for consumption etc. That is all being considered as context for analysis. But we also share our data science with creative colleagues – technology will tell you, say, when content is performed or shown. There is a subjective human aspect that they want to see, to dissect elements of content so the machine can really learn… So is there sex involved… Who is the director, who is the actress… So many things you can put in the system to find this stuff out. Forecasting really is important in this industry.

A8 – HBG) The human element is interesting. Serendipity is interesting. From a neuroscience point of view I always worry about the act of measuring… We see all the time that you can have the same audience, same demographic, watching the same content and reacting totally differently at different times of day etc. And live vs catch-up, say. My fear, and a great challenge, is how to get a neuroscience experiment valid in that context.

Q9 – from me) What happens if the data is not there in terms of content, or recommendation engines – if the data you have tells you there is a need for something you don’t currently have available. Are you using data science to inform production or content creation, or for advertising?

A9 – SP) The research we are currently doing is looking at ways to get much better data from viewers – trying things like a Tinder-like playful interface to really get a better understanding of what users want. But whenever there are searches etc. we also capture not only what is available on that platform but also what is in demand but not yet available, and provide details of those searches to commissioning teams to inform what they do.

A9 – PG) There are some interesting questions about what is most valuable… So, you see Amazon Prime deciding on the value of Jeremy Clarkson and the Top Gear team… And I think you will increasingly see purchasing based on data. And when it comes to commissioning we are looking to understand gaps in our portfolio.

A9 – PC) We are definitely interested in that. VOD is a proactive thing… You choose as a viewer… So we have an idea of micro genres that are specific to you… So we have, say, Sex/Pervert corner; we have teenage American comedy; etc. and you can see how those micro genres are panning out… And you can then tell commissioners what is happening on the video on demand side… But that’s different to commissioning for TV, and convincing them of that…

A9 – HBG) I think that you’ve asked the single greatest question at a data science conference: what do you do if the data is not there? And sometimes you have to take a big leap to do something you can’t predict… And that happens when you have to go beyond the possibilities of the data, and just get out there and do it.

A9 – SP) The concern is such that the data may start to reduce those leaps and big risks, and that could be a concern.

JO) And that’s a great point to finish on: that no matter how good the data science, we have to look beyond the data.

And after a break we are back… 

BBC – Keynote from Michael Satterthwaite, Senior Product Manager

I am senior project manager on a project called BBC Rewind. We have three projects looking at opportunities, especially around speech to text, from BBC Monitoring, BBC Rewind, and BBC News Labs. BBC Rewind is about maximising value from the BBC archive. But what does “value” mean? Well it can be about money, but I’m much more interested in the other options around value… Can we tell stories, can we use our content to improve people’s health… These are high level aims but we are working with the NHS, Dementia organisations, and running a hack event in Glasgow later this month with NHS, Dementia UK, Dementia Scotland etc. We are wondering if there is any way that we can make someone’s life better…

So, how valued is the BBC’s Archive? I’m told it’s immeasurable but what does that mean? We have content in a range of physical locations some managed by us, some by partners. But is that all valuable if it’s just locked away? What we’ve decided to do to ensure we do get value, is to see how we can extract that value.

So, my young niece, before she was 2, had worked out how to get into her mum’s iPad… And her dad works a lot in China, and has an iPhone. In an important meeting he’d gotten loads of alerts… Turns out she’d worked out how to take photos of the ceiling and send them to him… How does this relate? Well, my brother-in-law didn’t delete those pictures… And how many of us do delete our photos? [quick poll of the room: very very few delete/curate their digital images]

Storage has gotten so cheap that we have no need to delete. But at the BBC we used to record over content because of the costs of maintaining that content. That reflected the high price of storage – the episodes of Doctor Who taped over to use for other things. That’s a decision for an editor. But the price of storage has dropped so far that we can, in theory, keep everything from programmes to scripts and script notes, transcripts etc. That’s hard to look through now. Traditionally the solution is humans generating metadata about the content. But as we are now cash strapped and there is so much content… is that sustainable?

So, what about machines – and here’s my Early Learning Centre bit on Machine Learning… It involves a lot of pictures of pandas and a very confused room… to demonstrate a Panda and Not a Panda. When I do this presentation to colleagues in production they see shiny demos of software but don’t understand what the realistic expectations of that machine are. Humans are great at new things and intelligence, new problems and things like that…

Now part two of the demo… some complex maths… Computers are great at scale, at big problems. There is an Alan Turing quote here that seems pertinent, about it not being machines or humans, it’s finding ways for both to work together. And that means thinking about what machines are good at – things like initial classification, scale, etc. What are humans good at? Things like classifying the most emotional moment in a talk. And we also need to think about how best we can use machines to complement humans.

But we also need to think about how good is good enough? If you are doing transcripts of an hour long programme, you want 100% or close enough and finish with humans. But if finding a moment in a piece of spoken word, you need to find the appropriate words for that search. That means your transcript might be very iffy but as long as it’s good enough to find those key entities. We can spend loads of time and money getting something perfect, when there is much more value in getting work to a level of good enough to do something useful and productive.

This brings me to BBC Rewind. The goal of this project is to maximise the value from the BBC Archives. We already have a lot of digitised content for lots of reasons – often to do with tape formats dying out and the need to build new proxies. And we are doing more digitising of selected parts of the BBC Archives. We are using a mixture of innovative human and computer approaches to enrichment. And we are looking at new ways to use archives in our storytelling for audiences.

One idea we’ve tried is BBC Your Story which creates a biography based on your own life story, through BBC Archive content. It is incredibly successful as a prototype but we are looking at how we can put that into production, and make that more personalised.

We’ve also done some work on Timeline, and we wanted to try out semantic connections etc. but we don’t have all our content marked up as we would need so we did some hand mark up to try the idea out. My vision is that we want to reach a time when we can search for:

“Vladimir Putin unhappily shaking hands with Western Leaders in the rain at the G8, whilst expressing his happiness.” 

So we could break that into many parts requiring lots of complex mark up of content to locate suitable content.

At the moment BBC Rewind includes:

  • Speech-to-text in English based on the Kaldi toolset – it’s maybe 45% accurate off the shelf, but that’s 45% more of the words than you had before, plus a confidence value.
  • Speech-to-text in the Welsh language.
  • Voice identification and speaker segmentation – speech recognition that identifies speakers is nice, but we don’t need that just yet. And even if we did, we don’t need that person to be named (a human can tag that easily) and then train algorithms off that.
  • Face recognition – good but hard to scale; we’ve been doing some work with Oxford University in that area.
  • And then we get to context… Brian Cox versus (Dr) Brian Cox can be disentangled with some basic contextual information.
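For a sense of what speech-to-text with word-level confidence values can look like in practice, here is a minimal sketch using Vosk, an off-the-shelf Kaldi-based recogniser – this is my own illustration rather than the Rewind pipeline, and the model directory and audio file names are placeholders:

```python
# Sketch: word-level transcript with confidence values from a Kaldi-based
# recogniser (Vosk). Model directory and WAV file are placeholders.
import json
import wave
from vosk import Model, KaldiRecognizer

wf = wave.open("programme_clip.wav", "rb")                 # assumes 16kHz mono PCM WAV
rec = KaldiRecognizer(Model("model"), wf.getframerate())   # "model" = downloaded model dir
rec.SetWords(True)                                         # ask for per-word results

words = []
while True:
    data = wf.readframes(4000)
    if not data:
        break
    if rec.AcceptWaveform(data):                           # a full utterance was recognised
        words.extend(json.loads(rec.Result()).get("result", []))
words.extend(json.loads(rec.FinalResult()).get("result", []))

for w in words:
    print(w["word"], round(w["conf"], 2))                  # imperfect words, each with a confidence
```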

Finally, we have an exciting announcement. We have BBC Monitoring – a great example of how we can use machines to help human beings in their monitoring of media. So we will be creating tools to enable monitoring of media. In this project the BBC is partnering with University of Edinburgh, UCL, Deutsche Welle and others in an EU-funded Horizon 2020 project called SUMMA – this project has four workstreams and we are keen to make new partnerships.

The BBC now runs tech hack events which have resulted in new collaborations – including SUMMA – and more hack events are coming soon, so contact Susanne Weber, Language Technology Producer in BBC News Labs. The first SUMMA hack event will be at the end of next year and will focus on the automated monitoring of multimedia sources: audio-visual, text etc.

Let’s try stuff faster and work out what works – and what doesn’t – more quickly!

Unlocking Value from Media Panel Session – Moderator: Simon King, University of Edinburgh

Our panel is…

Michael Satterthwaite – Senior Product Manager, BBC
Adam Farquhar – Head of Digital Scholarship, British Library
Gary Kazantsev – R&D Machine Learning Group, Bloomberg
Richard Callison – brightsolid (DC Thomson and Scottish Power joint initiative)

Q1 – SK) Lets start with that question of what value might be, if not financial?

A1 – GK) Market transparency, business information – there are quantitative measures for some of these things. But it’s a very hard problem in general.

A1 – AF) We do a lot of work on value and economic impact in the UK, but we also did some work a few years back sharing digitised resources onto Flickr and that generated huge excitement and interest. That’s a great example of where you can create value by being open, rather than monetising early on.

A1 – MS) Understanding value is really interesting. Getty uses search to aid discovery, and they have learned that you can use the data you are capturing to ensure users access what they want, and want to buy, quickly. For us, with limited resources, the best way to understand value and impact is to try things out a bit, to see what works and what happens.

A1 – AF) Putting stuff out there without much metadata can give you some really great crowd data. With a million images we shared, our crowd identified maps from those materials. And that work was followed up with georeferencing those maps on the globe. So, even if you think there couldn’t possibly be enough of a community interested in doing this stuff, you can find that there really is that interest, and people who want to help…

A1 – MS) And you can use that to prioritise what you do next, what you digitise next, etc.

Q2 – SK) Which of the various formats of media are most difficult to do?

A2 – MS) Images are relatively straightforward but video is essentially 25 pictures per second… That’s a lot of content… That means sampling content, or else we’d crash even Amazon with the scale of work we have. And that sampling allows you to understand time, an aspect that makes video so tricky.
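To illustrate the sampling point, here is a minimal sketch with OpenCV that keeps roughly one frame per second from a 25fps video rather than all 25 – the file name is a placeholder:

```python
# Sketch: sampling ~1 frame per second from a 25fps video rather than every frame.
import cv2

cap = cv2.VideoCapture("programme.mp4")     # placeholder file name
fps = cap.get(cv2.CAP_PROP_FPS) or 25       # fall back to 25fps if the container lies
frame_index, sampled = 0, []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_index % int(round(fps)) == 0:  # keep roughly one frame per second
        sampled.append(frame)
    frame_index += 1

cap.release()
print(f"kept {len(sampled)} frames for analysis")
```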

Q3 – SK) Is there a big difference between archive and current data…

A3 – RC) For me the value of content is often about extracting value from very local context, and it leads back to several things said earlier, about perhaps taking a leap of faith into areas the data doesn’t show, and which could be useful in the future… So we’ve done handwritten data – the only Census that was all handwritten – 32m rows of records on England and Wales that had to be translated to text… We just went offshore and outsourced it to a BPO… That was just a commercial project as we knew there was historical and genealogical interest… But there are not many data sets like that around.

But working with the British Library we’ve done digitisation of newspapers both from originals and microfilm. OCR isn’t perfect but it gets it out there… The increase we’ve seen in multimedia online is triggered by broadcast – Who Do You Think You Are? triggers huge interest in these services and we were in the right place at the right time to make that work.
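(As an aside: the kind of OCR step described here can be sketched in a few lines with the Tesseract engine via pytesseract – the scan file name is a placeholder, the Tesseract binary has to be installed separately, and, as noted, the output is imperfect but searchable.)

```python
# Sketch: OCR of a scanned newspaper page with Tesseract via pytesseract.
# Requires the Tesseract engine to be installed on the system.
from PIL import Image
import pytesseract

page = Image.open("newspaper_scan.png")      # placeholder scan
text = pytesseract.image_to_string(page)
print(text[:500])                            # raw OCR text: noisy, but good enough to index
```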

A3 – GK) We are in an interesting position as Bloomberg creates its own data but we also ingest more than 1 million news documents in 30 languages from 120k sources. The Bloomberg newsroom started in 1990 and they had the foresight to collect clean, clear digital data from the beginning of our work. That’s great for accessing, but extracting data is different… There are issues like semantic mark up and entity disambiguation… And huge issues of point-in-time correctness – named entities changing meanings over time. And unless someone encoded that into the information, it is very difficult to disambiguate. And the value of this data, its role in trading etc., means it needs to be reliable.

I kind of don’t recognise Mike’s comments on video as there is object recognition available as an option… But I think we get more value out of text than most people, and we get real value from audio. Transcription and beyond… Entity recognition, dialogue structure, event extraction… A fairly long NLP pipeline there…
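As an illustration of one stage in a pipeline like that, here is a minimal named entity recognition sketch with spaCy – spaCy is just a stand-in here, not what Bloomberg actually uses, and the sentence is borrowed from the Putin example earlier in the day:

```python
# Sketch: named entity recognition over a piece of transcript text with spaCy.
import spacy

nlp = spacy.load("en_core_web_sm")   # small English model, installed separately
doc = nlp("Vladimir Putin shook hands with Western leaders in the rain at the G8.")

for ent in doc.ents:
    print(ent.text, ent.label_)      # e.g. "Vladimir Putin" PERSON, "G8" as an ORG/EVENT
```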

A3 – AF) The description of what you want to identify – those are very similar desires to those we have in the humanities, and there is additional benefit to journalists too. Is text search enough? Not really. It is an interesting way in… But text isn’t the best way to understand historical images in a range of books, and it also isn’t that useful in the context of the UK Web Archive and the images in that. Much of what may be of interest is not the text, but is perhaps better reduced to a series of shapes etc.

Q4) There has been a mention of crowd sourcing already and I was wondering about that experience, what worked and did not work, and thinking back to Mike’s presentation about what might work better?

A4 – AF) We found that smaller batches worked better… People love to see progress, like to have a sense of accomplishment. We found rewards were nice – we offered lunch with the head of maps at the British Library and that was important. Also mix it up – so it’s not always the same super hard problems all the time.

A4 – MS) I was going to give the BL example of your games machine… A mix of crowdsourcing and gamification.

A4 – AF) It’s very experimental but, as mentioned in the earlier panel session about the Tinder-like app, we’ve worked with Adam Crimble to build an arcade game to do image classification and we are interested to see if people will use their time differently with this device. Will they classify images, help us build up our training sets? But the idea is that it’s engagement away from desktops or laptops…

A4 – RC) We have tried crowdsourcing for corrections. Our services tend to be subscriptions and Pay as You Go. But people still see value in contributing. And you can incentivise that stuff. And you see examples across the world where centrally or government websites are using crowd sourcing for transcription.

A4 – GK) You could argue that we were innovators in crowdsourcing at Bloomberg, through blogs etc. And through tagging of entities. What we have learned from crowdsourcing is that it isn’t good for everything. It is hard when specialist knowledge is needed, or specific languages are needed – it’s hard to get people to tag in Japanese. We aren’t opposed to paying for contributions but you have to set it up effectively. We found you have to define tasks very specifically, for instance.

Q5) Talking about transposing to text implies that that is really possible. If we can’t do image descriptions effectively with text then what else should we be doing… I was wondering what the panel thought in terms of modalities of data…

A5 – MS) Whatever we do to mark up content is only as good as our current tools, understanding, modalities. And we’d want to go back and mark it up differently. In Google you can search for an image with an image… It’s changed over time… Now it uses text on the page to gather context and present that, as well as the image, back to you… If you can store a fingerprint to compare to others… We are doing visual searches, searches that are not text based. Some of these things already exist and they will get better and better. And the ability to scale and respond will be where the money is.
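As an illustration of storing a fingerprint to compare against others, here is a minimal perceptual-hashing sketch using the imagehash library – the image file names are placeholders and this is my own example, not Getty’s or the BBC’s approach:

```python
# Sketch: a visual "fingerprint" search using perceptual hashing (imagehash).
from PIL import Image
import imagehash

query_hash = imagehash.phash(Image.open("query.jpg"))          # placeholder query image
archive = {
    "clip_001.jpg": imagehash.phash(Image.open("clip_001.jpg")),
    "clip_002.jpg": imagehash.phash(Image.open("clip_002.jpg")),
}

# Smaller Hamming distance between hashes = more visually similar images.
matches = sorted(archive.items(), key=lambda item: query_hash - item[1])
print("closest match:", matches[0])
```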

Q6) The discussion is quite interesting as at the moment it’s about value you define… But you could see the BBC as some form of commons… It could be useful for local value, for decision making, etc. where you are not in a position to declare the value… And there are lots of types of value out there, particularly in a global market.

A6 – MS) The BBC have various rules and regulations about publishing media, one of which is that humans always have to check content, and that is a real restriction on scale, particularly as we are looking to reduce staff. We ran an initiative called MCB with University of Edinburgh that opened up some of these ideas. But ideally we would have every single minute of broadcast TV and radio in the public domain… But we don’t have the rights to everything… In many cases we acquired content before digital, which means that you need to renegotiate content licenses etc. before digitising.

A6 – AF) Licenses can be an issue, privacy and data protection can be an issue. But we also have the challenge of how we meet user needs, and actually listening to those needs. Sometimes we have to feel comfortable providing a lower-level service, one that may require higher skills (e.g. coding) to use… That can be something wonderful – it doesn’t always have to be a super polished service – but it has to be useful and valuable. And things will change in terms of what is useful, what is possible, etc.

A6 – GK) For us it’s an interesting question. Our users won’t say what they want, so you have to reverse engineer, then do rapid product development… So we do what you (Michael) suggest – building rapid prototypes to try ideas out. But this isn’t just a volatile time, it’s a volatile decade, or more!

Q7) Can you tell us anything about how you manage the funnel for production, and how context is baked in in content creation process…

A7 – GK) There is a whole toolset for creating and encoding metadata, and doing so in a way meaningful to people beyond the organisation… But I could talk about that for an hour so better to talk about this later I think.

Q8 – SK) How multilingual do you actually need to be in your work?

A8 – GK) We currently ingest content in 34 languages, but 10 languages cover the majority – though things change quickly. It used to be that 90% of content ingested was in English, now it’s 70-80%. That’s a shift… We have not yet seen a case where lots of data suddenly appears in a language where there was previously none. Instead we see particularly well resourced languages. Japanese is a large, well resourced language with many resources in place, but very tricky from a computational perspective. And that can mean you still need humans.

A8 – MS) I probably have a different perspective on languages… We have BBC Research working in Africa with communities just going online for the first time. There are hundreds of languages in Africa newly coming online, but none will be a huge language… A few approaches… You can either translate directly, or you can convert into English and then translate from there. Some use speech-to-text – with a Stephen Hawking-type voice to provide continuity.

A8 – AF) Our collections cover all languages at all times… an increasingly difficult challenge.

Comment – Susanne, BBC) I wanted to comment on speed of access to different languages. All it takes is a catastrophe like an Ebola outbreak… Or disaster in Ukraine, or in Turkey… And you suddenly have the use case for ASR – machine translation. And you see audience expectations there.

A8 – MS) And you could put £1M into many languages and make little impact… But if you put that into one key language, e.g. Pashtu you might have more impact… We need to consider that in our funding and prioritisation.

A8 – GK) Yes, one disaster or event can make a big difference… If you provide the tools for them to access information and add their own text in their own language… In the case of, say, Ebola you needed doctors speaking the language of the patient… But I’m not sure there is a technological solution. Similarly a case in the Amazon… Technology cannot always help here.

Q9) Do you have concerns that translations might be interpreted in different contexts and be misinterpreted? And the potential to get things massively wrong in another language. Do you have systems (human or machine) to deal with that?

A9 – AF) I won’t quite answer your question but a related thing… In some sense that’s the problem with data… Data becomes authoritative, and unless we make it accessible, cite it, and explain how it came about, it becomes authoritative without scrutiny. So we have large data collections being made available – BBC, BL etc. – and they can be examined in a huge set of new ways… They require different habits, tools, and approaches than many of us are used to using – different tools than, e.g., academics in the humanities are used to. And we need to emphasise the importance of proper citing, sharing, describing etc.

A9 – MS) I’d absolutely agree about transparency. Another of Susanne’s projects, Babel, is giving a rough translation that can then be amended. But an understanding of the context is so important.

A9 – GK) We had a query last week, in German, for something from Der Spiegel… It got translated to The Mirror… But there is a news source called The Mirror… So translating makes sense… Except you need that outside data to be able to make sense of this stuff… It’s really an open question where that should sit and how you would do that.

Q10 – SK) So, a final question: What should ATI do in this space?

A10 – RC) For us we’d like to see what can be done on an SME level, and some product to go to market…

A10 – GK) I think that there are quite a lot of things that the ATI can do… I think there is a lot of stuff the industry won’t beat you to – the world is changing too rapidly for that. I think the University, the ATI, should be better connected to industry – and I’ll talk about that tomorrow.

A10 – AF) As a national institution we have a lot of data and content, but the question is how we can make sense of it… that large collection of data and content. The second issue is skills – there is a lot to learn about data and working with large data collections. And thirdly there is convening… data and content, technologists, and researchers with questions to ask of the data – I think ATI can be really effective in bringing those people together.

A10 – MS) We were at an ideas hack day at the British Library a few weeks back and that was a great opportunity to bring together those people who create data, who research, etc. And I think ATI should be the holder of best practice, connecting the holders of content, academia, etc. to work together to add value. For me, trying to independently add value where it counts really makes a difference. For instance we are doing some Welsh speech-to-text work which I’m keen to share with others in some way…

SK: Is there anything else that anyone here wants to add to the ATI to-do list?

Comment: I want to see us get so much better at multilingual support, the babelfish for all spoken languages ideally!

 

Closing Remarks – Steve Renals, Informatics, University of Edinburgh

I think today is something of a kick off for building relationships and we’ve seen some great opportunities today. And there will be more opportunity to do this over drinks as we finish for today.

And with that we are basically done, save for a request to hand in our badges in exchange for a mug – emblazoned with an Eduardo Paolozzi design inspired by a biography of Alan Turing – in honour of Turing’s unusual attachment to his mug (which used to be chained to the radiator!).

Aug 24 2012
 

Today I am dipping in and out (around other meetings) of Interactive Scotland@Turing, the conference day of the Turing Festival. It’s being live streamed over on the AmbITion Scotland website.

I have just caught the tail end of “Big data (and doing something with it)” – Dave Coplin, Director of Search and Futurologist, Microsoft. Designing big data, privacy, and security in from the beginning was the big message there. Also that your child’s first search engine query is a moment of huge pride.

Next up is “Information is beautiful – a picture paints a thousand words” – John Loudon, Head of Multimedia, BiP Solutions. He’s basically giving the big picture on infographics and reminding the audience that infographics are hundreds of years old. He’s running through several examples, looking at how effective they are and the power of graphics even if the text is in another language – a great illustration that infographics should make basic sense even before you begin labelling. He’s also reminding us of maps as infographics and innovations such as contour lines.

Infographics shouldn’t just be about clicks, they should be about learning, discovery. They should represent interesting research or data. They need to be engaging. BiP now use infographics for their own corporate management – looking at demand on their helpdesk to plan holidays, say. The company produce a magazine for the MoD and decided to start trialling infographics in this. We had the data, then we wanted to know what the key questions were for readers. Asking those questions of the data made for surprises. After checking and double checking it became clear that those surprises were correct representations of the data – and that’s really useful for us and for our readers.

Some key lessons for infographics:

  • Don’t just use templates.
  • Don’t be over complex – use a basic table if that is better.
  • Tell a story, have great content, and make it relevant and right for your audience.
  • Find data to compare and contrast; data on its own can be quite boring so you have to draw out the interesting aspects.

What are the benefits of data infographics? Well they should communicate information, they give you an opportunity to inform your audience and, yes, to share and be sociable.

 

And now… Richard Ayers, CEO, Seven League – “Ambassador, with this datatainment you are really spoiling us”

“I will be talking about how we use infographics in our work which is mainly in the sports and creative industries. I talk about something I call the “Foreign Office Strategy” for social media. The Foreign Office has made decisions about the relative merit of having a consulate in a particular area. Some run events, some do huge cultural work, some just stamp passports. Social media is like that – you have to make a decision over the merits of where and how much to engage in each area.

And data can be the same way. But you have to make sure you tell stories with that data, and I’m going to talk through some examples.

Film – BFI. The BFI is really the centre for film in Britain. They have a massive, beautiful archive. They don’t just have film. There are 20k unpublished film scripts, cinema ephemera, personal papers, all press photos from Doctor Who. Even kids’ drawings – although in the modern era these are not kept but destroyed for child safety reasons. Behind the scenes images of Star Wars are in the archive… huge amounts of materials. But only 90,000 films are digitised. But all will be done, and the metadata will be good! But not all of that metadata is known – shooting location etc. That’s a crowdsourcing/wiki opportunity. I’m going to run through some stuff that is coming and welcome more suggestions, especially if they make BFI, a charity, a bit of money. One of the things that’s already in the works is Time Capsule – capturing the nation’s memories of viewing film; alternative forms of navigation – mind maps that show connections between films, directors etc; we’ve got 100 greats? as well. Now I get sick of that concept but I’m interested in people’s 100 greatest and then overlaying BFI knowledge and understanding; film tracker – recommends upcoming films and connects to materials in the archive, Sight and Sound reviews etc. And we haven’t even got to the API for SID (our Systems Information Database) and that will create huge opportunities to change things. So that’s film…

Onto sports… Datatainment is a term I’ve blogged about a lot. I love sport; football isn’t my favourite but I love sport. And there is an incredible level of sports data. This is sprint speeds, jump heights, tackles, etc. There was a campaign for Heineken – a game, kind of like betting, on particular players. It was ok and interactive but not great. More exciting is the Major League Baseball app which lets you switch between footage, commentary, stats, alternative views, pitch images – really really nice. Imagine if we could do that with cricket? There are so many stats and if you are a fan you want all that! In the modern era at Lord’s they show scores from Teletext!

Let’s talk about LimeLight. They have been backroom guys. Sport England says that engagement in mainstream sport is kind of holding steady, but the areas in growth are 5ks, 10ks, triathlon, cycling etc. There is a tremendous involvement in personal sporting activity. I think that’s great. LimeLight organised the road race, time trial, marathons, and triathlon swim. They also run those kinds of events – with 30-40k participants – all through the year. And there are chips on all of these competitors… the things you can do. There is a tendency to look at participants as individuals but each runner is also a one-man team – with support structures; there are loads of people supporting these people. There is a huge opportunity here… LimeLight don’t put on a race for 30k people but a 9 month sports engagement campaign for 110k people.

MyPace is a great app here – it tracks the race and lets you replay dots on the map, picks out routes of winners etc. It uses simple data points. What’s interesting is that for the race coming up in October we are putting in maps and interactive boards and an app so that spectators can know what’s going on throughout, enabling better engagement and support. The BBC is starting to do datatainment – presenting it over the top of old school TV. We are seeing line graphs in the Olympics coverage. New to TV, but sports analysts have long been hugely into data; it’s just never been used to engage and interpret the sport for the audience. In football and rugby the stats as they are really don’t engage anyone but those who already know the sport. The Olympics have really presented a possibility to make this data accessible. We also saw virtual world record markers in the pool. We also saw Emoto – Twitter sentiment analysis of the Olympics. I’m not a huge fan of sentiment analysis but it gives you an interesting view into what’s happening. The BBC showed relative heights and weights of athletes – and your own place in the scheme of things. And you could download the data table for all athletes…

And that brings me to Open Data. We’ve been doing lots of data work and products that will launch. But in the meantime we’ve been working on Manchester City Performance data – for last year – and making that freely available to play with and that’s groundbreaking.

Perhaps opening up that data will open those sports to those who have not previously engaged with them: we’ll have established the right data and social embassies.”

And with that I leave the livestream for a meeting but will return…

OK, it’s nearly 4pm and I have returned from a series of meetings and lunch to catch the last two hours of the conference…

I’ve dipped in mid-talk about Brewster which aggregates multiple profiles [just been investigating reviews and it appears that Brewster is useful for those in the US but not the UK at present]. Gavin Dutch has just commented that he is most excited by devices combining with the real world. Like tap-a-note – a hacking project for Evernote using a chip that you placed around the house – say on your fridge – and if you tapped that note you’d see alerts associated with that chip/object. Another panel member has been musing over web apps vs cloud apps – a nice poll of the room (full of geeks) indicates that they have trouble understanding just what the cloud is, and what is and is not a cloud app. Finally a panel member says that “hardware is red hot”. And that session (the end of “Put bangers in your mobile mashup”) closes on a prediction of hoverboards…

The folks at Dynamic Earth have headed for coffee so a wee bit of a break for the webstreamers here but up next are:

 “Fragmentation & the State of Online Video” –  Luke Gaydon, VP Operations EMEA, Brightcove

Luke will be talking about online video from the perspective of Brightcove, a cloud-based content services provider. Our best known product is VideoCloud and I’ll be looking at this issue from this perspective. VideoCloud is everything you need to manage and distribute video, with around 4500 customers. These customers include big media and non-media brands such as EMI, Sky, STV etc. They are responsible for 800-900 million streams going through the VideoCloud platform.

“I wanted to dwell a little bit on some key moments in the history of online video… firstly a fashion show stream taken down by a denial of service attack, Steve Jobs, US Elections, iPlayer and NetFlix, Hulu launches – online legal video for free, Obama’s inauguration speech, launch of Apple TV, the Royal Wedding (2.9M concurrent streams), Olympics 2012 (hugely outperforming that royal wedding number). This year US viewers will stream more online video than DVD and Blu-Ray combined.

I used to work at Channel 4 and I used to sell Big Brother. We had clips, we put them onto a RealMedia player. It was a simple workflow. But now video is increasingly complicated. You have the video library of short, long, on demand, live video. The libraries increase in size exponentially. Then the places/platforms that publishers want to distribute to: mobile apps and the mobile web, the website, set-top boxes, connected TVs, gaming consoles, and via social media through Facebook etc. Suddenly the workflow is much more complicated. So, to put this into context: 100m households in the world own at least one connected TV, and there is huge uptake of broadband and smartphones. But all these devices are different. We’ve seen video standards like HTML5 and h.264, and that is great, but you will never get all device producers to agree.

When Warner Brothers released the Dark Knight they showed it via Facebook. That seemed bizarre to me but my young relatives sit there with Facebook open all day and it made sense to them. That’s just where they hang out online.

The big question around connected TVs is the extent within which people actually use their functionality. Apparently quite a lot of connected TVs don’t actually get connected… except in Germany where an onscreen prompt tells you to connect the TV up to the internet.”

And with that I had a wee interruption but note via Twitter that:

” vid viewing on social networks grew 40% at end of March 2012 -becoming the home of TV & film views”

Back to the stream:

“I think I wanted to give you a view of where we are at and then share thoughts/musings and key trends we’ve seen developing in the last few years and see developing in the future.

  1. the notion of a “media company” needs reviewing. It’s not just a TV or Film or Publishing firm these days. Increasingly the notion of a traditional media company has been turned on its head. I remember Red Bull being a fizzy drink company but Red Bull has a TV channel, they have a commissioning department which commissions content for them. I was talking last night to someone asking if it’s long until Red Bull commission a feature film… apparently they already have! Marks and Spencer are creating content. You can put these sites on TV but you have to log in to purchase – and logins via remotes are a nightmare.
  2. The rise of XBox as an entertainment platform, which goes hugely far beyond gaming.
  3. The iPhone and iPad are creating new expectations about how we interact with devices and consume content.
  4. Future of TV experiences: first screen to display and consume, second screen for searching, controlling, interacting.
  5. Finally… the most important thing is the content!”

“Social Media:  television’s best new friend?” – Anthony Rose, Co-Founder & CTO, Zeebox

“I am going to do some Fringe inspired improv here as my theme is very much storytelling. As a programme maker can you engage with your audience directly and own that engagement, rather than selling to a broadcaster and having them manage engagement? Broadcast is not just on TV but also on the web. Back when I headed up BBC iPlayer it was great that folk could watch content anywhere, anytime. But the best I was doing was watching the same TV, albeit time shifted. But how can we think of a new model? There are two takes here.

Well back in the 90s we made a real-time 3D graphics engine that allowed us to make interactive stories – we licensed content like Xena, Choose your own knightmare, etc., then writers created stories where you could choose your own ending. It’s maybe like what can be done with sports views. When we started we had fairly grotty looking graphics but they improved. But the scriptwriters and storywriters had real issues. It seems to be a left brain/right brain thing. Developers think brilliantly about logic but struggle with stories. I wanted storytellers to think about logic but it was super difficult. How do you bring interactivity to a medium we know and love well?

The other aspect is Microsoft’s foray into the TV space. In the 90s Microsoft were trying to sell an NT-based TV platform. They made a video of the house of the future – everything had a screen – like the fridge and the toaster – and you could touch the screen and agree or disagree. They built a house of the future with these flatscreen TVs – but it was the exact same picture. Ten years of the finest minds and it hadn’t moved on. There is a really big challenge here to think ahead. The cost is great – interactive TV is expensive to build for the future. And for the broadcaster, doing that work for the small part of their huge audience who can interact is expensive and tricky. But the phone and iPad are bringing something in here, interactivity through a second screen. And I think the future is on-demand TV with synchronous experiences.

Sky bought 10% of the company recently. We have a great product. Imagine you are sitting in front of your TV – you have your phone or tablet with you. One day you will never be without it! A few problems to solve with TV. You have to decide what to watch and you want a trusted authority to help you discover what’s on. And social comes in there soon. Zuckerberg reckons it’s your friends, Amazon would say that the recommendation engine will tell you, the BBC or Guardian would say a trusted authority. But wouldn’t it be great to have info delivered to your smartphone and tablet? These bring in interactions, transactional experiences. Advertisers spend millions trying to sell you things but it’s a convoluted path – a one-click experience works well.

Zeebox ingests live TV in the UK; we use speech analysis and the subtitles of that TV to extract meaning second by second. Our servers look for meaningful information – names, type of advert, etc. Wouldn’t it be great to see what is coming up soon? Well that’s the start screen for Zeebox in our newest version – it brings in tweets etc. You can also see featured aspects – what celebrities are watching, your friends, what is being seen on Freeview if that is what you are using, what else is being watched on Sky if you use that, etc. And you get privacy options there. Or you can just use an EPG that includes friends and what they are watching. Zeebox can also just act as a remote control with a range of TV devices.
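To illustrate the second-by-second idea, here is a tiny sketch of scanning a subtitle feed for trigger words that could drive second-screen content – the feed and the trigger list are invented for illustration, and this is not a description of Zeebox’s actual implementation:

```python
# Sketch: scanning a live subtitle feed for trigger words, one second at a time.
# The subtitle lines and trigger actions below are invented placeholders.
subtitle_feed = [
    (12, "and the home side have won a corner"),
    (13, "that's a brilliant save from the goalkeeper"),
    (14, "sponsored highlights coming up after the break"),
]
triggers = {"corner": "show pitch-map panel", "sponsored": "show sponsor card"}

for second, line in subtitle_feed:
    for word, action in triggers.items():
        if word in line.lower():
            print(f"{second}s: '{word}' spotted -> {action}")
```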

Zeebox can also be the next generation of “The Bar” – the Nielsen panel of 1000 viewers. Real time analytics changes how TV can be monitored and understood.

Once upon a time you had to watch TV when it was being broadcast. Now you can time shift. But you will also be able to watch with others – create viewing partners. And you can also watch with celebrities based upon what they are sharing at that time. And you can see new conversations, links, self-generating tags etc. Everything becomes clickable. You can also look via Amazon. This is the beginning. Advertisers may buy a broad ad on TV and start to use second screens to personalise that. That really disrupts the model of advertising. Advertisers could buy a word on TV – you could show the ad whenever a word arises and you click for more info! It’s a bit like old iPlayer days… it’s about bringing interactivity into the programme.

We have a platform – OpenBox – to enable that one framework for multiple programmes. And that will open up to Developers via an API at some point.  That’s based on HTML5. For example we did this for Eurovision – and could poll each entry, we could link to their albums for sale etc. For Sky we did Got To Dance where alternative commentaries were made available via presenters using Zeebox on iPad. Football is another great example here. An interactive way to monitor progress, find more info, and this was sponsored so also a way to generate income.

Finally this is all about analytics… broadcast isn’t stats rich but platforms allow a huge amount of data to be gathered, used, and understood. Data is the new oil.

A programme maker can’t be totally interactive with the audience yet but you can create programmes with interactivity built in, or build rich interactivity around a traditional programme. Second screen propositions are valuable, disruptive and leading to enormous innovation. If you are creating content, if you are a games developer or programmer think about what could be done here. I think this will lead to the fastest pace of change in TV in decades.”

“Second Screen – the new remote control” – Gareth Capon, Product Development Director, BSkyB

“I wanted to talk about second screen – or screens – in a rather different context. Luke gave a great sense of the growth of connected devices. Our view of where the market is – dominated by smart phones and game devices. When we look to 2015 we expect huge growth in connected devices. But not all will be connected immediately – we need to make sure customers do actually see the value in connecting, to show experiences that make this worthwhile. We’ve also talked about networks today. At least 15% of the UK are connected for fibre, particularly in larger metro areas. By 2015 we think 75% of the UK will be on up to 24Mbps – that is an important change for us and will change how consumers interact with our products. In terms of mobile we expect three times more mobile connectivity by 2015, and five times more wifi hotspots – and many will be free. We are building out The Cloud wifi service as a free service for customers. As consumers we have options. We can give more choice and also control the service from end to end and improve the service as a result. In terms of mobile… 4.3 million downloads of Sky content on smartphones in 2011, 25 million in 2012. It’s incredibly important for us as a platform.

So now a day in the life of the iPad… I read that more people use their phones as an alarm clock than as a phone now! We can prepare ourselves for the day… it could be current news, but actually lots of our experiences don’t fit with our day. Let’s look at Formula 1 – races are international and not always at convenient times of day. For that we’ve made an iPad app for F1 with 9 live feeds and information not on the TV feed…

And I’m afraid I got called away but I’m back to continue noting the panel session…

“Engaging the Audience – Shaping the future for our connected world” – Panel Chair : Mike Dicks, Senior Policy Executive, PACT with Luke Gaydon, Brightcove; Anthony Rose, Zeebox; Gareth Capon, BSkyB; Alistair Brown, CTO, STV Group; Dave Coplin, Microsoft.

Mike Dicks: “There is another festival in Edinburgh this weekend – most of my colleagues are over at the Edinburgh TV Festival at the moment which is obviously very connected to this session”. Last year we had Eric Schmidt in Edinburgh and he said “ignore the internet at your peril” and in her McTaggart lecture this year Elisabeth Murdoch said “ignore the digital native at your peril”…

My first question: we have distribution, broadcasters, platforms, investors, usurpers, disruptors… who is going to control the relationship with the audience? And who pays for the content?

Alastair Brown, STV: I think we’re seeing some interesting things. Apple talking to cable operators. Second screen experiences will be driven off linear viewing for the foreseeable future, but it will be an interesting few years.

Dave Coplin, Microsoft: we always knew that our consumers used XBox for other things but we saw a stat a few years back that said users were spending 40% of their XBox time not playing games, which really emphasised the impact of that. Families are having to negotiate which platform to access content on.

Anthony Rose, Zeebox: This stuff will get more confusing for the average family. TV manufacturers want to control the user experience, want to take you to apps, to make money. EPGs might be a paid for thing… the BBC won’t be top of the list. Think of a broadcaster as a content provider, a distributor (via TV/Freeview/etc. bandwidth), and a destination site. And broadcasters have to think about how to do all three, or whether to do all three. When I was on iPlayer it was problematic: iPlayer could be a great thing if you were working in digital teams etc. but a bad thing for BBC Vision, who control the channels. And programme makers saw it as good, but perhaps they’d get greater viewing figures if they could just share direct to YouTube… it’s never straightforward.

Luke Gaydon, BrightCove: There will be a lot more change. I love the idea that everyone comes together and collaborates to build amazing user experiences, but I think it’s unlikely. The problem, but also the opportunity, for the broadcasters is that they have the history of content production and commissioning. The lineage broadcasters have is a huge asset.

Mike Dicks, PACT: What can this audience do, rather than a TV audience, to enable interactions with the audience? And where should developers come to you with ideas around programming, content, etc.?

Alastair Brown, STV: we are hiring and looking for partners, ideas etc. Things like YouView, ZeeBox etc. help you get thinking. You could expose all sorts of added value content.

Dave Coplin, Microsoft: XBox smart glass – lets new types of interactions take place. Developers have to think about what platform to support.

Luke Gaydon, BrightCove: That choice of platforms is also a big opportunity though. From my Channel 4 days it was really rare to find people in commissioning who really got interactive content. My understanding is that there is a lot more cross-platform commissioning now though so the broadcasters are listening more now.

Mike Dicks, PACT: What are the threats, rather than the opportunities?

Gareth Capon, BSkyB: we are geeks, we can get totally over excited but it’s all still really early, there are no rules yet… most people are watching regular TV right now but we can do huge new things.

Dave Coplin, Microsoft: I love that Man City approach we saw earlier, the fans are the best people to build stuff with that data. Normal people are geeks too, anyone can code. Content providers will be rewarded if they open up their content to those that love it.

Questions:

Q: Darcie Tanner, LBi: Right now everything is using social and logins are tied to one person, but that gets messy on devices… what about household logins, or people using each other’s logins, shared devices – is anyone thinking about merging, recommendations etc.?

Mike Dicks, PACT: that’s the Amazon recommendations issue as well… is that a second screen thing?

Gareth Capon, BSkyB: that’s totally important. It’s not about “I”, it’s about “we” here… context is all. We have to cope with that as a business. We have to be able to react to a change in context. Recommendations have so far to go…

Anthony Rose, Zeebox: We do need to think about how to prioritise things here… the fringe is a good parallel again. We need to really create new watercooler moments, we need to raise awareness of small programmes etc…

Dave Coplin, Microsoft: this is about the issue of paying separately right? (to Darcie) you had two Netflix accounts?

Darcie: Yeah, there is no way to merge them, even if you paid for a more expensive subscription you have to pick one account – and lose data from the others, you can’t merge. I did contact NetFlix and suggested merging… I think they are missing a trick…

Dave Coplin, Microsoft: we do have household XBox deals there, Sky does this to an extent too, but yes, more opportunity there…

Q: Aaron Quigley, St Andrews University: I’ve been working with multiple screens, eye tracking together etc. and there are huge things you can do there but media doesn’t seem to be adopting these technologies…

Dave Coplin, Microsoft:  This is where the XBox Smart Glass comes in really…

Discussion: but this is also an issue of who owns experience vs. who creates content and hardware…

Q: Apparently the experience of football is better at home and ticket sales have been going down… Video killed the radio star – so what will the second screen kill ?

Discussion: that’s about making things better – whether in the in-person experience or at home. Music composers used to make money selling sheet music and feared performance sales… we should think about changing technology, not good nor bad but just changing.

Mike Dicks, PACT: TV is finding live events are huge for advertising…

Alastair Brown, STV: absolutely. Live football or X Factor finals get huge engagement. Different content gets different levels of engagement.

Mike Dicks, PACT: What will happen two years down the line?

Alastair Brown, STV: We are moving to a new phase as a digital company now…

Dave Coplin, Microsoft: SmartGlass will be on iOS and Android and that’s huge for us… that’s a time of huge change. We should do more of that…

Gareth Capon, BSkyB:  I didn’t talk about NowTV – our new internet delivery business. With that model we can take services and a whole range of business models, types of consumption, formats etc. to anyone in the UK as they want it.

Anthony Rose, Zeebox: we like to release something new each month… that will be 20 versions between now and two years time so hard to know where we’ll be. But we’re different to large broadcasters who know where they will be. We don’t know and can base our change on feedback and experience. We will move from being disruptive and instead partner, find new niches, etc. That will morph enormously over coming months for this new space.

Luke Gaydon, BrightCove: We will be everything, we’ll be the defacto digital distributor of video, we’ll buy Google!

And on that playful note that’s me finishing my slightly patchy live blogging of an interesting virtual Turing day. Off to enjoy a weekend in the very real, almost sunny Edinburgh for me.
