Nov 17 2017

Today I am at the Scottish Government for the Digital and Information Literacy Forum 2017.

Introduction from Jenny Foreman, Scottish Government: Co-chair of community of practice with Cleo Jones (who couldn’t be here today). Welcome to the 2017 Digital and Information Literacy Forum!

Scottish Government Digital Strategy – Cat Macaulay, Head of User Research and Service Design, Scottish Government

I am really excited to speak to you today. For me libraries have never just been about books, but about information and bringing people together. At high school our library was split between a 3rd and 4th year section and a 5th and 6th year section, and from the moment I got there I was desperate to get into the 5th and 6th year section! It was about place and people and knowledge. My PhD later on was on interaction design and soundscapes, but in the context of the library and seeking information… And that morphed into a project on how journalists use information at The Scotsman – and the role of the library and the librarian in their clippings library. In Goffman terms it was this backstage space for journalists to rehearse their performances. There was talk of the clippings library shutting down and I argued against that as it was more than just those clippings.

So, that’s the personal bit, but I’ll turn to the more formal bit here… I am looking forward to discussions later, particularly the panel on Fake News. Information is crucial to allowing people to meaningfully, equally and truly participate in democracy, and to be part of designing that. So digital literacy is crucial to participation in democracy. And for us in the digital directorate, it is a real priority – for reaching citizens and for librarians and information professionals to support that access to information and participation.

We first set out a digital strategy in 2011, but we have been refreshing our strategy and putting digital at the heart of what we do. Digital is not about technology, it’s a cultural issue. We moved before from an agrarian to an industrial society, and we are now in the process of moving from an industrial to a digital society. We aim to deliver inclusive economic growth, reform public services, tackle inequalities and empower communities, and prepare people for the future workplace. Digital and information literacy are core skills for understanding the world and the future.

So our first theme is the Digital Economy. We need to stimulate innovation and investment, we need to support the digital technologies industry, and we need to increase the digital maturity of all businesses. Scotland is so dependent on small businesses and SMEs that we need our librarians and information professionals to be able to support that maturity across all businesses.

Our second theme is Data and Innovation. For data we need to increase public trust in holding data securely and using/sharing it appropriately. I have a long term medical issue, and the time it takes to get appointments set up, to share information between people so geographically close to each other – just across the corridor – is striking. That lack of trust is core to why we still rely on letters and faxes in these contexts.

In terms of innovation, CivTech brings together public sector teams and tech start-ups to develop solutions to real problems, and to grow and expand services. We want to innovate and learn from the wider tech and social media context.

The third theme is Digital Public Services, and the potential to simplify and standardise ways of working: finding common technologies/platforms, built and procured once; and designing services with citizens to meet their needs. Information literacy skills and critical questioning are at the heart of this. You have to have that literacy to really understand the problems, to begin addressing them, and to co-design.

The fourth theme is Connectivity. Improving superfast broadband, improving coverage in rural areas, increasing the 4G coverage.

The fifth theme is Skills. We need to build a digitally skilled nation. I spent many years in academia – no matter how “digital native” we assume young people to be, we’ve essentially assumed that because someone can drive a car, they can build a car. We ALL need support for finding information, how to judge it and how to use it. We all need to learn and keep on learning. We also need to promote diversity – ensuring we have more disabled people, more BAME people, more women, working in these areas, building these solutions… We need to promote and enhance that, to ensure everyone’s needs are reflected. Friends working in the third sector in Dundee frequently talk about the importance of libraries to their service users; libraries are crucial to supporting people with differing needs.

The sixth theme is Participation. We need to enable everybody to share in the social, economic and democratic opportunities of digital. We need to promote inclusion and participation. That means everyone participating.

And our final theme (seven) is Cyber Security. That is about the global reputation for Scotland as a secure place to work, learn and do business. That’s about security, but it is also about trust and addressing some of those issues I talked about earlier.

So, in conclusion, this is a strategy for Scotland, not just Scottish Government. We want to be a country that uses digital to maximum effect, to enable inclusion, to build the economy, to positively deliver for society. It is a living document and can grow and develop. Collective action is needed to ensure nobody is left behind; we all remain safe, secure and confident about the future. We all need to promote that information and digital literacy.

Q&A
Q1) I have been involved in information literacy in schools – and I know in schools and colleges that there can be real inconsistency about how things are labeled as “information literacy”, “digital literacy”, and “digital skills”. I’m slightly concerned there is only one strand there – that digital skills can be about technology skills, not information literacy.

A1) I echo what you’ve just said. I spent a year in a Life Sciences lab in a Post Doc role studying their practice. We were working on a microscopy tool… And I found that the meaning of the word “image” was understood differently by Life Scientists and Data Scientists. Common terminology really matters. And indeed semantic technologies enable us to do that in new ways. But it absolutely matters.

Q2, Kate, SCVO) We are using a digital skills framework that has been developed, which I think is also really useful to frame that.

A2) I’m familiar with that work and I’d agree. Stripping away complexity and agreeing on common terms and approaches is a core focus of what we are doing.

Q3) We have been developing a digital skills framework for colleges and for the student lifecycle. I have been looking at the comprehensive strategy for schools and colleges by the Welsh Government… Are there plans for something similar?

A3) I know there has been work taking place but I will take that back.

Q4) I thought that the “Participation” element was most interesting here. Information literacy is key to enabling participation… Say what you like about Donald Trump but he has made the role of information literacy in democracy very vital and visible. Scotland is in a good place to support information literacy – there are many in this room who have done great work in this area – but it needs resourcing to support it.

A4) My team focuses on how we design digital tools and technologies so that people can use them. And we absolutely need to look at how best to support those that struggle. But it is not just about how you access digital services… How we describe these things, how we reach out to people… I remember being on a bus in Dundee and hearing a guy saying “Oh, I’ve got a Fairer Scotland Consultation leaflet… What the fuck is a Consultation?!”. I’ve had some awkward conversations with my teenage boys about Donald Trump, and Fake News. I will follow up with you afterwards – I really welcome a conversation about these issues. We are designing a whole new Social Security framework right now – not a thing most other governments have had to do – and so we really have to understand how to make that clear.

Health Literacy Action Plan Update – Blythe Robertson, Policy Lead, Scottish Government

The skills, confidence, knowledge and understanding to interact with the health system and maintain good health is essentially what we mean by Health Literacy. Right now there is a huge focus in health policy on “the conversation” – the conversation between policy makers, practitioners and people receiving health care. There is a model of health and care delivery called “More than Medicine” – a memorable house-shaped visual model that brings together organisational processes and arrangements, health and care professionals, etc. At the moment though the patient has to do at least as much as the medical professional, with hoops to jump through – as Cat talked about before…

Instructions can seem easy… But then we can all end up at different places [not blogged: an exercise with paper, folding, eyes closed].

Back when computers first emerged you needed to understand a lot more about computer languages, you had to understand how it all worked… It was complex, there was training… What happened? Well, rather than training everyone, instead they simplified access – with the emergence of the iPad for instance.

So, this is why we’ve been trying to address this with Making it easy: A health literacy action plan for Scotland. And there’s a lot of text… But really we have two images to sum this up… The first is a woman looking at a hurdle… We’ve tried to address this by creating a nation of hurdlers… But we think we should really remove those hurdles and let people walk through.

Some statistics for you: 43% of English working age adults will struggle to understand instructions to calculate a childhood paracetamol dose. There is a lot bound up here… Childhood health literacy is important. Another stat/fact: half of what a person is told is forgotten, and half of what is remembered is incorrect. [sources: several cited health studies which will be on Blythe’s slides]. At the heart of the issue is that a lot of information is transmitted… then you ask “Do you understand?” and of course you say “yes”, even if you don’t. So, instead, you need to check understanding… That can be as simple as rephrasing a question to e.g. “Just so I can check I’ve explained things clearly, can you tell me what you’ve understood?” or similar.
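[Aside: to make the dose example concrete, this is roughly the arithmetic such instructions expect of a parent – a purely illustrative Python sketch, where the 15 mg/kg rate and the “120 mg/5 ml” syrup strength are my assumptions of typical label values, emphatically not medical advice:]

```python
# Illustrative only - real dose rates and strengths come from the product label.
DOSE_MG_PER_KG = 15        # assumed paediatric paracetamol dose per kg of body weight
SYRUP_MG_PER_5_ML = 120    # assumed "120 mg / 5 ml" infant suspension strength

def syrup_ml(child_weight_kg: float) -> float:
    """How many ml of syrup deliver the weight-based dose?"""
    dose_mg = DOSE_MG_PER_KG * child_weight_kg
    return dose_mg / (SYRUP_MG_PER_5_ML / 5)

# A 10 kg child: 150 mg, which is 6.25 ml of syrup - several steps to stumble on.
print(f"{syrup_ml(10):.2f} ml")
```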

We did a demonstrator programme in NHS Tayside to test these ideas… So, for instance, if you wander into Ninewells Hospital you’ll see a huge board of signs… That board is blue and white text… There is one section with yellow and blue… That’s for Visual Impairment, because that contrast is easier to see. We have the solution but… People with visual impairment come to other areas of the hospital too. So why isn’t the whole board done in the same way, with high contrast lettering? We have the solution, why don’t we just provide it across the board? That same hospital sent out some appointment letters asking patients to comment on any points of confusion… And there were many. For instance if you need the children’s ward… You need to know to follow signs for Paediatrics first… There isn’t a consistency of naming… Or a consistency of colour. So, for instance, Maternity Triage is a sign in red… It looks scary! Colours have different implications, so that really matters. You will be anxious being in hospital – consistency can help reduce the levels of anxiety.

Letters are also confusing… They are long. Some instructions are in bold, some are small notes at the bottom… That can mean a clinic running 20 minutes late… Changing what you emphasise has a huge impact. It allows the health care provision to run more smoothly and effectively. We workshopped an example/mock up letter with the Scottish Conference for Learning Disability. They came up with clear information and images: very clear to see what is happening, including an image of where the appointment is taking place to help you navigate – with full address. The time is presented in several forms, including a clock face. And always offer support, even if some will not need it. Always offer that… Filling in forms and applications is scary… For all of us… There has to be contact information so that people can tell you things. When we looked at people not turning up to appointments, it was that they didn’t know how to contact anyone, they didn’t know that they could change the appointment, they wanted to make contact but didn’t want to make a phone call, or, because they were already in for treatment, they didn’t think they needed to explain why they weren’t at their outpatients appointment.

So, a new action plan is coming called “Making it easier”. That is about sharing the learning from Making it Easy across Scotland. To embed ways to improve health literacy in policy and practice. To develop more health literacy responsive organisations and communities. Design supports and services to better meet people’s health literacy levels. And that latter point is about making services more responsive and easier to understand – frankly I’d like to put myself out of a job!

So, one area I’d like to focus on is the idea of “Connectors” – the role of the human information intermediary is fundamental. So how can we take those competencies and roll them out across the system… In ways that people can understand… Put people in contact with digital skills, the digital skills framework… Promoting understanding. We need to signpost with confidence, and to have a sense that people can use this kind of information. Looking at librarians as a key source of information that can help support people’s confidence.

In terms of implementation… We have at (1) a product design and at (3) “Scaled up”. But what is at step (2)? How do we get there… Instead we need to think about the process differently… Starting with (1) a need identified, then (2) resources planned, structured and co-developed for success, and then (3) having it embedded in the system… I want to take the barriers out of the system.

And I’m going to finish with a poem: This is bad enough by Elspeth Murray, from the launch of the cancer information reference group of the South East Scotland Cancer Network 20 January 2016.

Q&A

Q1) I’m from Strathclyde, but also work with older people and was wondering how much health literacy is part of the health and social care integration?

A1) I think ultimately that integration will help, but with all that change it is challenging to signpost things clearly… But there is good commitment to work with that…

Q2) You talked about improving the information – the letters for instance – but is there work more fundamentally questioning the kind of information that goes out? It seems archaic and expensive that appointments are done through posted physical letters… Surely better to have an appointment that is in your diary, that includes the travel information/map….

A2) Absolutely, NHS Lothian are leading on some trial work in this area right now, but we are also improving those letters in the interim… It’s really about doing both things…

Cat) And we are certainly looking at online bookings, and making these processes easier, but we are working with older systems sometimes, and issues of trust as well, so there are multiple aspects to addressing that.

Q3) Some of those issues would be practically identical for educators… Teachers or lecturers, etc…

A3) I think that’s right. Research from Maastricht University mapped out the 21 areas across public and private sectors in which these skills should be embedded… And I think those three areas of work can be applied across those areas… We have to look at design around benefits; we have some hooks there.

Cat) Absolutely part of that design of future benefits for Scotland.

Panel Discussion – Fake News (Gillian Daly – chair; Lindsay McKrell (Strathclyde); Sean McNamara (CILIPS); Allan Lindsay (Young Scot))

Sean: CILIPS supports the library and information science community in Scotland, including professional development, skills and ethics. Some years ago “information literacy” would have been more about university libraries, but now it’s an issue for librarians across the board. Librarians are less gatekeepers of information and more enablers: helping those using their libraries to seek and understand information online, to understand fake news, and to understand the information they find even if they are digitally confident in using the tools they use to access it.

Allan: Young Scot is Scotland’s national information charity for young people. We work closely with young people to help them grow and develop, and influence us in this area. Fake News crops up a lot. A big piece of work we are involved in is the 5Rights project, which is about rights online – that isn’t just for young people but it is significantly about their needs. Digital literacy is key to that. We’ve also worked on digital skills – recently with the Carnegie Trust and the Prince’s Trust. As an information agency we reach people through our website – and we ensure young people are part of creating content in that space.

Lindsay: I’d like to talk about digital literacy as well as Fake News. Digital literacy is absolutely fundamental to supporting citizens to be all that they can be. Accessing information without censorship, and a range of news, research, citizenship test information… That is all part of public libraries’ service delivery and we need to promote that more. Public libraries are navigators for a huge and growing information resource, and we work with partners in government, in the third sector, etc. And our libraries reach outside of working hours and into remote areas (e.g. through mobile libraries) so we have unique value for policy makers through that range and volume of users. Libraries are also well placed to get people online – still around 20% of people are not online – and public libraries have the skills to support people to go online, gain access, and develop their digital literacy as well. We can help people find various sources of information, select between them, interpret information and compare information. We can grow that with our reading strategies, through study skills and after school sessions. Some libraries have run sessions on fake news, but I’m not sure how well supported these have been. We are used to displaying interesting books… But why aren’t our information resources similarly well designed and displayed – local filterable resources for instance… Maybe we should do some of this at national level, not just at local council level. SLIC have done some great work; what we need now is digital information with a twist that will really empower citizens and their information literacy…

Gillian Daly: I was wondering, Allan, how do you tackle the idea of the “Digital Native”? This idea of innate skills of young people?

Allan: It comes up all the time… This presumption that young people can just do things digitally… Some are great but many young people don’t have all the skills they need… There are misconceptions from young people themselves about what they can and cannot do… They are on social media, they have phones… But do they have an understanding of how to behave, how to respond when things go wrong… There is a responsibility on all of us to remember that just because young people use these things doesn’t mean they understand them all. Those misconceptions apply across the board though… Adults don’t always have this stuff sorted either. It’s dangerous to make assumptions about this stuff… Much as it’s dangerous to assume that those from lower income communities are less well informed about these things, which is often not correct at all.

Lindsay: Yes, we find the same… For instance… Young people are confident with social media… But can’t attach a document for instance…

Comment from HE org: Actually there can be learning in both directions at University. Young people come in with a totally different landscape to us… We have to have a dialogue of learning there…

Gillian: Dialogue is absolutely important… How is that being tackled here…

Sean: With school libraries, the skills that transfer from schools to higher education are crucial… But schools are lacking librarians and information professionals and that can be a barrier there… It’s not just about Fake News but wider misinformation on social media… It’s important that young people have those skills…

Comment: Fake News doesn’t happen by accident… It’s important to engage with the IFLA guide to spotting it… But I think we have to get into the territory of why Fake News is there, why it’s being done… And the idea of Media and Information Literacy – UNESCO brought those ideas together a few years ago. There is a vibrant GATNO organisation, which would benefit from more Scottish participation.

Allan: We run a Digital Modern Apprenticeship at Young Scot. We do work with apprentices to build skills, discernment and resilience, to understand issues of fake news and its origins. A few weeks back a young person commented on something they had seen on social media… At school for me “Media Studies” was derided… I think we are eating our words now… If only people had those skills and were equipped to understand that media and creation process. The wider media issues… Fake News isn’t in some box… We have to be able to discern mainstream news as well as “Fake News”. Those skills, confidence, and ability to ask difficult questions help navigate through these issues…

Gillian: I read a very interesting piece by a journalist recently, looking to analyse Fake News and the background to it, the context of media working practice, etc. Really interesting.

Cat: To follow that up… I distinctly remember in 1994 at The Scotsman the number of times journalists requested clippings that were actually wrong… Once something wrong gets published, it stays there and repopulates… Misquotations happen that way for instance. That sophisticated understanding isn’t about right and wrong so much as about the truthfulness of information. In some ways Trump is doing a favour here, and my kids are much more attuned to accuracy now…

Gillian: I think one of the scariest things is that once the myth is out, it is so hard to dispel or get rid of that…

Comment: Glasgow University has a Glasgow Media Group and they’ve looked at these things for years… One thing they published years ago, “Bad News”, looked at for instance the misrepresentation of Trade Unionists in news sources, for a multitude of complex reasons.

Sean: At a recent event we ran we had The Ferret present – fact checking organisations, and journalists in those roles, reflect exactly that.

Jenny: The Ferret has fact checking on a wonderful scale to reflect the level of fakeness…

Gillian: Maybe we need to recruit some journalists to the Digital and Information Literacy Forum.

And on that, with many nods of agreement, we are breaking for lunch.

Information Literacy & Syrian New Scots – Dr Konstantina Martzoukou, Postgraduate Programme Leader, Robert Gordon University

This project was supposed to be a scoping study of Syrian New Scots – Syrian refugees coming to Scotland. The background to this is the Syrian Civil War, ongoing since 2011, which has led to an enormous number of refugees, mainly in the near region. Most research has been on asylum seekers in the camps near Syria, on basic survival and human rights, on their needs and how to respond to them. The aim of this project was different: a scoping study to examine the information related experiences and information literacy practices of Syrian new Scots during their resettlement and integration. So this is quite different as the context is relatively settled, and is about that resettlement process.

In September 2015 the Prime Minister announced an expansion of the refugee programme to take up to 2000 Syrian Refugees. And the first place Syrian Refugees came was Glasgow. Now, there have been a lot of changes since then but there is the intent to resettle 2000 Syrian Refugees by 2020.

Primary research was done with 3 refugee resettlement officers, as well as focus groups with Syrian new Scots. These groups were in both urban (1 group) and rural (2 groups) locations, and included 38 people from across Syria, who had been in camps in Lebanon, Turkey, Iraq and Jordan. I didn’t know what to expect – these people had seen the worst horrors of war. In reality the focus groups were sometimes loud and animated, sometimes quiet and sad. And in this group they came from a huge range of professional backgrounds, though most of the women did not work.

So, the areas our work looked at included English language and community integration; information provision, cultural differences and previous experiences; and financial security. Today I want to focus on libraries and the role of libraries.

One of the most crucial aspects was language and sociocultural barriers. The refugees were given ESOL classes; a welcome pack with key information for finding the resources in their neighbourhood; a 24 hour Arabic hotline, set up with the mosque for emergencies so that families could receive help outside core working hours; and in-house translation services. But one of the challenges across the support given was literacy as a whole – not all of the refugees could read and write in any language. But it was also about understanding interchangeable words – “doctor” has a meaning but “GP” not so much. There was also a perception that learning English would be really difficult.

The refugees wanted to know how to learn English, and they were anxious about that. The support officers had different approaches. The ESOL classes were there, but some officers were really proactive, taking refugees to the train station, having mock job interviews… That was really really valuable. But some groups, even a year after arriving, weren’t speaking English. But sometimes that was about the families… Some were confident and really well travelled, but some had lived in one place, and not travelled, and found communication and networking much more difficult. So the language learning was very tied to socio-cultural background.

Many of these families have complex health needs – they were often hand picked to come here because of this – and that causes its own challenges. Some had no experience of recycling and of how to correctly put their bins out. Someone felt the open plan kitchen was difficult – that her child was burned because of it. One reported a neighbour telling him not to play with his son outside – the boundaries of danger and expectations of childhood were rather different from their new neighbours’. Doctors’ appointments were confusing. Making bus change was expensive – buying something unneeded because the buses don’t give change. Many wanted family reunion information and support.

Technology is used, but technology is not the key source of information. They used mobile phones with pay as you go SIM cards. They used WhatsApp and were sharing quite traumatic memories and news in this way.

The library is there… But actually libraries are perceived as being for books and many refugees don’t go there. Community classes, meals etc. may be better. Computer classes can be useful, especially when refugees can participate in a meaningful way. And there are real challenges here – computer classes in the library didn’t work for this group as there were too few computers and the internet connections were too slow.

For me the key thing is that we need to position the library as a key place for communication, learning and support for the families.

Q&A
Q1) Alamal(?) is running events in our libraries – we have an event with films and telling their story – and we have had huge interest in that.

A1) We really want to show case that to the wider community. There are some great examples from England, from other EU countries, but we want more Scottish examples so do please get in touch.

A User Study Investigating the Information Literacy of Scottish Teenagers – David Brazier, Research Assistant, Northumbria University

This is an ILG funded project looking at the Information Literacy of Scottish Teenagers. I’ll introduce the concepts, go through some related work, and then some of the methodology we’d like to implement. So, information literacy is about the ability to seek, understand and assess information. These skills are crucial to integrating with and participating in modern society. We need to empower students to learn, so they can integrate themselves into that society.

As the panel talked about earlier, the idea of the “Digital Native” is misleading. Young people have a poor understanding of their information needs. That leads to young people taking the top ranked documents/sites and citing those. And that needs to be counteracted early in their learning so that it doesn’t carry through and right into University (Rowlands 2008). In recent research (Brazier and Harvey 2017) ESOL postgraduates were unable to perceive their performance correctly, often judging their performance high when the opposite was true. In the “Not Without Me” report this inability to assess one’s own skills was also highlighted across a wider range of young people. These groups are highly educated, so they should be able to be more reflective on their own practice.

So, in our research, we are using a Mixed Methods approach to do a quantitative analysis of secondary school-aged children’s information gathering behaviour, triangulated with qualitative assessments of the participants’ own self-assessment. It is built around a simulated work task.

The search system is based on the TREC AQUAINT collection – a large set of over a million documents from three large news agencies, collected between 1996 and 2000 – with pre-defined search topics associated with the project. The initial 15 topics were reduced down to 4 topics selected by school representatives (a librarian and 2 teachers from Gracemount High School in Edinburgh).

So, we start with a pre-task questionnaire. The search task is “Tropical storms: What tropical storms (hurricanes and typhoons) have caused significant property damage and loss of life?”. They can then search through a Google-style search of the documents. They click on those sources that seem relevant. And then they get a questionnaire to reflect on what they’ve done.

A pilot was conducted in December 2016. Tasks were randomly selected, using a Latin Square design to ensure no 2 students had the same two tasks. In total 19 students were involved, from S3 (13-14 years old). The study was on PCs rather than handheld devices. No other demographic data was collected. The school representative did provide a (new) unique id to match the task and the questionnaires. The id was known only to the school rep. No further personal data was taken.
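[Aside: for anyone curious what a Latin square rotation looks like in practice, here’s a minimal Python sketch – the topic names and number of students are placeholders, not the study’s actual configuration:]

```python
# Placeholder topics standing in for the four selected TREC AQUAINT topics.
topics = ["tropical storms", "topic B", "topic C", "topic D"]

def latin_square(n: int) -> list[list[int]]:
    """Each row is the previous row rotated by one, so every topic
    appears exactly once in each position across the rows."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]

square = latin_square(len(topics))

# Cycle students through the rows so task order is balanced across participants.
for student in range(8):
    row = square[student % len(square)]
    print(f"student {student}: first task = {topics[row[0]]!r}, "
          f"second task = {topics[row[1]]!r}")
```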

We could then look at the queries each student submitted, and were able to ask why they did that and why they selected the article they did.

This is a work in progress… We are interested in how they engage with the study as a whole. We have used the findings of the pilot to adapt the study design and interface, including relocating the task description to a more prominent location, and adding a physical instruction sheet covering e.g. the browser page and how to interpret the interface.

The main study takes place next week, with 100 students (none of whom were part of the pilot). From this we want to get recommendations and guidelines for IL teaching; to inform professional practice; feedback to participants (pamphlet) for reflective purposes; academic publications in the field of information literacy, information retrieval, education and pedagogy.

Q&A

Q1) Why was such a controlled space selected? Presumably students would normally use other places to search, ask friends, etc.

A1) In the previous study we allowed students to look anywhere on the web… But it is much harder to judge relevance in that… These documents have already been judged for relevance… It’s a wide arc… It adds complexity to the whole process… And someone has to transcribe and mark that footage… For my previous study there were 29 students and it took 7 months. For 100 students that’s just too large. A test collection is also standardised and replicable.

The Digital Footprint MOOC – Nicola Osborne, Digital Education Manager, EDINA

This was me… No notes but slides to follow. 

Wikipedia & Information Literacy: the importance of reliable sources – Sara Thomas, Wikimedian in Residence, SLIC

Hi, I’m Wikimedian in Residence at SLIC. The role of a Wikimedian in residence is to work with cultural heritage organisations and Wikimedia and bring the two together. In this role we are working with three local libraries right now but we will be expanding it to a wider Scottish context.

I am passionate about open knowledge and open data. Open data and open knowledge leads to better society, it allows us to make better decisions – I am sick of us being asked to make big decisions about big issues without appropriate information.

Now, I want to introduce you to Bassel Khartabil who was an open source software developer and advocate for open data and knowledge. Knowledge is power… He was detained by the Syrian government and, before he was killed by the government, he wrote very movingly about the impact of open knowledge, that it is so important and a matter of life and death in some contexts.

I want to talk about production of knowledge and what that can teach us about information literacy. Jim Groom, at #OER16, said “Wikipedia is the single greatest Open Education Resource the world has ever known”, and he’s not wrong. Wikipedia is more accurate than you may think. There are groups who just edit and work on improving the quality of articles. Women in Red is a group dedicated to having more women’s biographies on Wikipedia. 17% of biographies are women now, that’s 2% more than was the case 2 years ago – and they also work on bringing those biographies up to “featured article” quality.

There is a quality and ratings scale. Vandalism is picked up quickly – by bots and by people. Wikipedia is neutral in its point of view. Nature, in 2005, found that Wikipedia was nearly as accurate as Britannica (2.92 errors per article for Britannica compared to 3.86 for Wikipedia). The Journal of Clinical Oncology, in 2010, found Wikipedia as accurate as Physician Data Query (a premium database). The medical information there is huge – 80% of medical students will use it; ~50% of GPs will use it as a first point in their search. It is the most popular health resource on the web.

Wikipedia is generally the seventh most popular site on the internet. And we have basic Notability guidance that means an article must be notable – there must be a reason for it being there. The information must be verifiable – it must come from credible, checkable sources. And we have to use reliable third party published sources with a reputation for fact checking and accuracy.

On the subject of media literacy… The Daily Mail didn’t like that Wikipedia doesn’t treat it as reliable – there is no ban but you will get a trigger to ask you if that’s the right source. Brilliantly, they got loads of errors in their own outraged article.

Manipulation is really obvious… The community spots when people are trying to whitewash their own biographies, to promote their company, to try to remove claims of misconduct. And Wikipedia gets it – there is an article on “Wikipedia is not a credible source” – we get it. We are a starting point, a jumping off and discovery point. And in fact we have Wiki Ed (https://wikiedu.org/) which works to combat fake news, to support information literacy. If you want to teach information literacy, wiki can help you. We have a Wiki Education Dashboard – mainly in the US, but lots in the UK. Our guides include: Instructor Basics and Case Studies for using Wikipedia in teaching. Some lovely projects here…

I did some work with Chris Harlow, at the University of Edinburgh, a few years ago… He found a medical term that wasn’t in Wikipedia, gave students guidance on how to create a Wikipedia page, taught them how to use a medical database, and sent them away to write a section in simple language… Then we showed them how to edit an article. It’s really really easy to edit an article now… The students write their section, put it in… And the page goes live… Five minutes later it’s on the front page of Google. It is gratifying to find work so immediately valued and used and useful.

Translation studies at UoE also use Wikipedia in the classroom. Queen Mary University of London use Wikipedia in their film classes – they trialled it, and it’s now a compulsory part of the programme. It’s a way to teach digital skills and information synthesis. Imperial College London are working to engage undergraduate students in synthesising and sharing knowledge. Greg Singh at Stirling University uses WikiBooks – a project that seeks to create collaboratively produced text books – having students produce a text book, or a chapter, on what they’ve been doing… It’s about developing collaboration, tracking it, and instilling it within a student…

So I have a video here of Aine Kavanagh from Reproductive Biology at the University of Edinburgh, who authored an article that has been read 20,000 times in the last year. Aine was looking for some extra work, and she wanted to develop her skills. She asked Chris (Harlow) what she could do… She wrote about one of the most common sorts of cancer, which there was very little information about. Being able to see the value of that, the impact of that work, has been hugely gratifying.

To conclude: open knowledge is important, open knowledge gives us a better society, not just being able to find this information but also be able to produce that knowledge is hugely powerful. And Wikipedia is more accurate than you think!

Closing

Gillian: I just want to thank all of our speakers, to thank all of you for coming, and to thank the Scottish Government for hosting us.

Oct 04 2017

This afternoon I’m at the Keynote Session for Information Security Awareness Week 2017 where I’ll be speaking about Managing Your Digital Footprint in the context of security. I’ll be liveblogging the other keynotes this afternoon.

The event has begun with a brief introduction from Alistair Fenemore, UoE’s Chief Information Security Officer, and from his colleague David Creighton Offord, the organiser for today’s event.

Talk by John Whitehouse, PWC Cyber Security Director Scotland covering the state of the nation and the changing face of Cyber Threat

I work at PWC, working with different firms who are dealing with information security and cyber security. In my previous life I was at Standard Life. I’ve seen all sorts of security issues so I’m going to talk about some of the things I’ve seen, trends, I’ll explain a few key concepts here.

So, what is cybersecurity… People imagine people in basements with balaclavas… But it’s not that at all…

I have a video here…

(this is a Jimmy Kimmel comedy segment on the Sony hack where they ask people for their passwords, to tell them if it’s strong enough… And how they construct them… And/or the personal information they use to construct that…)


We do a lot of introductions for boards… We talk about technical stuff… But they laugh at that video and then you point out that these could all be people working in their companies…

So, there is technical stuff here, but some of the security issues are simple.

We see huge growth due to technology, and that speaks to businesses. We are going to see 1 billion connected devices by 2020, and that could go really really wrong…

There is real concern about cyber security, and firms have concerns about areas including cloud computing. The Internet of Things is also a concern – there was a study that found that the average connected device has 25 security vulnerabilities. Dick Cheney had to have his pacemaker reprogrammed because it was vulnerable to hacking via Bluetooth. There was an NHS hospital in England that had to pause a heart surgery when the software restarted. We have hotel rooms accessible via phones – that will come to homes… There are vulnerabilities in connected pet feeders for instance.

Social media is used widely now… In the TalkTalk breach we found that news of the breach had leaked via speculation just 20 seconds after the breach occurred – that’s a big challenge to business continuity planning, where one used to plan on having perhaps a day’s window.

Big data is coming with regulations, threats… Equifax lost over 140 million records – and executives dumped significant stock before the news went public which brings a different sort of scrutiny.

Morrisons were sued by their employees for data leaked by an annoyed member of staff – I predict that big data loss could be the new PPI, as mass claims for data loss take place. Maybe £1,000 per customer per data breach… We run a threat intelligence service by looking on the dark net for breached data. And we already see interest in that type of PPI class action approach.

The cyber challenge extends beyond the enterprise – on shore, off shore; 1st through to 4th parties. We’ve done work digging into technology components and where they are from… It’s a nightmare to know who all your third parties are, and a real challenge to address.

So, who should you be worried about? Threat actors vary… We have accidental loss, malware that is not targeted, and hacker hobbyists at the lowest level of sophistication, through to state sponsored attacks at the highest level of sophistication. Sony were allegedly breached by North Korea – that firm spends astronomical amounts on security and that still isn’t totally robust. Target lost 100 million credit card details through a third party air conditioning firm, which a hacker used to get into the network, and that’s how the loss occurred. And when we talk organised crime we are talking about really organised crime… One of the Ukrainian organised crime groups was offering a Ferrari as their employee-of-the-month prize for malware. We are talking seriously organised, and serious financial gain. And it is extremely hard to trace that money once it’s gone. And we see breaches going on and on and on…

Equifax is a really interesting one. There are 23 class action suits already around that one and that’s the tip of the iceberg. There has been a lot of talk of big organisations going under because of cyber security, and when you see these numbers for different companies, that looks increasingly likely. Major attacks lead to real drops in share prices and real impacts on the economy. And there are tangible and intangible costs of any attack… From investigation and remediation through to CEOs and CTOs losing their jobs or facing prison time – at that level you can be personally liable in the event of an attack.

In terms of the trends… 99% of exploited vulnerabilities (in 2014) had been identified for more than a year, some as far back as 1999. Wannacry was one of these – firms had 2 months’ notice and the issues still weren’t addressed by many organisations.

When we go in after a breach, typically the breach has been taking place for 200 days already – and that’s the breaches we find. That means the attacker has had access and has been able to explore the system for that long. This is very real and firms are dealing with this well and really badly – some real variance.

One example, the most successful bank robbery of all time, was the attack on the Bangladesh Central Bank in Feb 2016 through the SWIFT network. The instructions totalled over US $900 million, mostly laundered through casinos in Macau. The analysis identified that the malware was tailored for the target organisation based on the printers they were using, and it scrubbed all entry and exit points in the bank. The US Secret Service found that there were three groups – two inside the bank, one outside executing the attack.

Cyber security concerns are being raised, but how can we address this as organisations? How do we invest in the right ways? What risk is acceptable? One challenge for banks is that they are being asked to use Fintechs and SMEs working in technology… But some of these startups are very small and that’s a real concern for heads of security in banks.

We do a global annual survey on security, across about 10,000 people. We ask about the source of compromise – current employees are the biggest by some distance. And current customer data, as well as IPR, tend to be the data that is at risk. We also see Health and Social Care adopting more technology, and having high concern, but spending very little to counter the risks. So, with Wannacry, the NHS were not well set up to cope and the press love the story… But they weren’t the target in any way.

A few Mythbusters for you…

Anti-Virus software… We create Malware to test our clients’ set up. We write malware that avoids AVs. Only 10-15% of malware will be caught with Anti-Virus software. There is an open source tool, Veil-Framework, that teaches you how to write that sort of Malware so that you can understand the risks. You should be using AV, but you have to be aware that malware goes beyond that (and impacts Macs too)… There is a malware SaaS business model on the darknet – as an attacker you’ll get a guarantee for your malware’s success and support to use it!

Myth 2: we still have time to react. Well, no, the lag from discovery to impacting you and your set up can be minutes.

Myth 3: well, it must have been a zero day that got us! True Zero Day exploits are extremely rare and valuable. An attacker won’t use one unless the target is very high value and they have no other option. They are hard to use. Even the NSA admits that persistence is key to successful compromise, not zero day exploits. The NSA created EternalBlue – a zero day exploit – and that was itself stolen and then deployed against the “good guys” in Wannacry.

Passwords… They are a thing of the past I think. 2-factor authentication is more where we are at. Passphrases and the strength of passphrases are key. So complex strings with a number and a site name at the end are recommended these days. Changing every 30 days isn’t that useful – a weak password is easy to bruteforce if the hashes are lost regardless – much better to have a really strong password in the first place.
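[Aside: a back-of-envelope way to see why long passphrases beat short complex strings – a Python sketch where the character-set sizes are my assumptions for the arithmetic:]

```python
import math

def entropy_bits(symbols: int, length: int) -> float:
    """Naive brute-force estimate: log2(symbols ** length)."""
    return length * math.log2(symbols)

# An 8-character password over ~72 symbols (upper, lower, digits, punctuation).
print(f"8-char complex password: {entropy_bits(72, 8):.0f} bits")

# A 4-word passphrase drawn from a 7,776-word Diceware-style list.
print(f"4-word passphrase:       {entropy_bits(7776, 4):.0f} bits")

# A 20-character lowercase passphrase with a number and site name tacked on
# is stronger still - length dominates the estimate.
print(f"20-char passphrase:      {entropy_bits(26, 20):.0f} bits")
```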

Phishing email is huge. We think about 80% of cyber attacks start that way. Beware spoofed addresses, or extremely small changes to email addresses.

We had a client that had an email from their “finance director” about urgently paying money to an account, which was only spotted because someone in finance noticed the phrasing… “the chief exec never says “Thanks”!”

Malware trends: our strong view is that you should never, ever pay up in a ransomware attack.

I have another video here…

(In this video we have people having their “mind read” for some TV show… It was uncanny… And included spending data… But it wasn’t psychic… It was data that they had looked up and discovered online… )


It’s not a nice video… This is absolutely real… This whole digital footprint. We do a service called Digital Footprinting for senior execs in companies, and you have to be careful about it as they can give so much away by what they and those around them post… It’s only getting worse and more pointed. There are threat groups going for higher value targets, they are looking for disruption. We think that the Internet of Things will open up the attack surface in whole new ways… And NATS – the Air Traffic people – are thinking about drones and the issues there around fences and airspace… How do you prepare for this? Take the connected home… These fridges are insecure: you can detect whether the fridge is being opened or not, and so whether the owners are at home or not… The nature of threats is changing so much…

In terms of trends the attacks are moving up the value chain… Retail bank clients aren’t interesting compared to banks’ finance systems, and those in turn less than exchanges or clearing houses. It’s about the value of data… Data is maybe $0.50 for email credentials; a driving license is maybe $25… and upwards the price goes depending on value to the attackers…

So, a checklist for you and your work: (missed this but delighted that digital footprint was item 1)

Finally, go have a look at your phone and how much data is being captured about you… Check your iPhone’s Frequent Locations. And on Android check Google Location History. The two biggest companies in the world, Google and Facebook, are free, and they are free because of all the data that they have about you… But the terms of service… Paypal’s are longer than Hamlet. If you have a voice control TV from Samsung and you agree to those terms, you agree to always-on listening that is shareable with third parties…

So, that’s me… Hopefully that gave you something to ponder!

Q&A

Q1) What does PWC think about Deloitte’s recent attack?

A1) Every firm faces these threats, and we are attacked all the time… We get everything thrown at us… And we try to control those but we are all at risk…

Q2) What’s your opinion on cyber security insurance?

A2) I think there is a massive misunderstanding in the market about what it is… Some policies just cover recovery, getting a response firm in… When you look at Equifax, what would that cover… That will put insurers out of business. I think we’ll see government backed insurance for things like that, with clarity about what is included, and what is out of scope. So, if, say, SQL Injection is the cause, that’s probably negligence and out of scope…

Q3) What role should government have in protecting private industry?

A3) The national cyber security centre is making some excellent progress on this. Backing for that is pretty positive. All of my clients are engaging and engaged with them. It has to be at that level. It’s too difficult now at lower levels… We do work with GCHQ sharing information on upcoming threats… Some of those are state sponsored… They even follow working hours in their source location… Essentially there are attack firms…

Q4) (I’m afraid I missed this question)

A4) I think Microsoft in the last year have transformed their view… My honest view is that clients should be on Windows 10 – it’s a gamechanger for security. Firms will do analysis on patches and service impacts… But they delayed that a bit long. I have worked at a firm with a massively complex infrastructure, and it sounds easy to patch but it can be quite difficult to do in practice, and it can put big operational systems at risk. As a multinational bank for instance you might be rolling out to huge numbers of machines and applications.

Talk by Kami Vaniea (University of Edinburgh) covering common misconceptions around Information Security and to avoid them

My research is on the usability of security and why some failings are happening from the point of view of an average citizen. I do talks to community groups – so this presentation is a mixture of that sort of content and proper security discussion.

I wanted to start with misconceptions we have as system administrators… So I have a graph here of where there is value in improving your password; then the range in which rate limits on password attempts protect you instead; and the small area of real benefit to the user. Without benefits you are in the dead zone.

OK, a quick question about URL construction… http://facebook.mobile.com? Is it Facebook’s website, Facebook’s mobile site, AT&T’s website, or Mobile’s website? It’s the last one by construction. It’s both of the last two if you know AT&T own mobile.com. But when you ask a big audience they mainly get it right. Only 8% can correctly differentiate http://facebook.profile.com vs http://profile.facebook.com. Many users tend to just pick a big company name regardless of its location in the URL. A few know how to correctly read subdomain URLs. We did this study on Amazon Mechanical Turk – so that’s a skewed sample of more technical people. And that URL understanding has huge problematic implications for phishing email.

We also tried http://twitter.com/facebook.com. Most people could tell that was Twitter (not Facebook). But if I used “@” instead of “/” people didn’t understand, thought it was an email…
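[Aside: you can check what a browser will actually treat as the host by parsing the URL – a quick Python sketch; note that properly telling facebook.mobile.com apart from a facebook.com subdomain needs the Public Suffix List (e.g. the tldextract package), which this skips:]

```python
from urllib.parse import urlsplit

examples = [
    "http://facebook.mobile.com",       # host is a subdomain of mobile.com
    "http://profile.facebook.com",      # host is a subdomain of facebook.com
    "http://twitter.com/facebook.com",  # host twitter.com; "facebook.com" is just the path
    "http://twitter.com@facebook.com",  # "@" makes twitter.com a username; host is facebook.com!
]

for url in examples:
    parts = urlsplit(url)
    print(f"{url:34} host={parts.hostname} path={parts.path or '-'} "
          f"user={parts.username or '-'}")
```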

On the topic of email… Can we trust the “from” field? No. Can we trust a “this email has been checked for viruses…” box? No. Can you trust the information on the source URL for a link in the email, that is shown in the bottom of the browser? Yes.

What about this email – a Security alert for your linked Google account email? Well this is legitimate… Because it’s coming from accounts.google.com. But you knew this was a trick question… Phishing is really tricky…

So, a shocking percentage of my students think that “from” address is legitimate… Tell your less informed friends how easily that can be spoofed…
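[Aside: to see how little the “from” field means, here is a minimal sketch with Python’s standard email library – the addresses are invented, and no sending code is included, deliberately. Any authentication (SPF/DKIM checks) happens later, at the receiving server, if at all:]

```python
from email.message import EmailMessage

msg = EmailMessage()
# Nothing validates this at composition time - it is just text the sender chooses.
msg["From"] = "Chief Executive <ceo@example-company.example>"
msg["To"] = "finance@example-company.example"
msg["Subject"] = "Urgent payment"
msg.set_content("Please pay the attached invoice today. Thanks!")

# A perfectly well-formed message with a completely forged From header.
print(msg)
```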

What about Google? Does Google know what you type as you type it and before you hit enter? Yes, it does… Most search engines send text to their servers as you write it. Which means you can do fun studies on what people commonly DON’T post to Facebook!

A very common misconception is that opening web pages, emails, pdfs, and docs is like reading physical paper… So why do they need patching?

Let’s look at an email example… I don’t typically get emails with “To protect your privacy, Thunderbird has blocked remote content in this message” from a student… This showed me that a 1 pixel invisible image had come with the email… which pinged the server if I opened it. I returned the email and said he had a virus. He said “no, I used to work in marketing and forgot that I had that plugin set up”.
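[Aside: the mechanics of that 1-pixel image are simple enough to sketch. Below, a hypothetical tracking server in Python’s standard library: the email embeds an invisible image whose URL carries a per-recipient id, and opening the email fetches it, logging the open. All names and the port are made up:]

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# What the email HTML would contain (hypothetical tracker URL):
#   <img src="http://tracker.example:8000/pixel.gif?id=recipient-42"
#        width="1" height="1">

# A 1x1 transparent GIF, byte for byte.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
         b"!\xf9\x04\x01\x00\x00\x00\x00"
         b",\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;")

class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The query string tells the sender exactly who opened the email, and when.
        print(f"email opened: {self.path}")  # e.g. /pixel.gif?id=recipient-42
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.end_headers()
        self.wfile.write(PIXEL)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), PixelHandler).serve_forever()
```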

Websites are made of many elements from many sources. Mainly dynamically… And there are loads of trackers across those sites. There is a tool called Lightbeam that will help you track the sites you go to on purpose, and all the other sites that track you. That’s obviously a privacy issue. But it is also a security problem. The previous speaker spoke about supply chains at Target, this is the web version of this… That supply chain gets huge when you visit, say, six websites.

So, a quiz question… I go to Yahoo, I hit reload… Am I running the same code as a moment ago? Well, it’s complicated… I had a student run a study on this, looking at how much changes… In a week about half of the top 200 sites had changed their javascript. I see trackers change between individual reloads… But it might change, it might not…
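[Aside: you can crudely replicate that measurement by hashing what a site serves you on two successive fetches – a Python sketch; the URL is a stand-in, and this only sees the top-level page, not the many third party scripts it pulls in:]

```python
import hashlib
import time
import urllib.request

URL = "https://www.example.com/"  # stand-in for any top-200 site

def fingerprint(url: str) -> str:
    """Fetch the page and return a short hash of its bytes."""
    with urllib.request.urlopen(url) as response:
        return hashlib.sha256(response.read()).hexdigest()[:16]

first = fingerprint(URL)
time.sleep(5)  # reload a few seconds later
second = fingerprint(URL)
print("same bytes" if first == second else "content changed between loads")
```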

So you as a user access a first party website, and then it accesses third party sites… It accesses ad servers, where that user impression is sold, and an ad is returned, with an image (sometimes with code). Maybe I bid it out to a company, which bids out again… This is huge as a supply chain and tracking issue…

So the Washington Post, for instance, covering the yahoo.com malware attack, showed that malicious payloads were being delivered to around 300k users per hour, but only about 9% (27k) of users per hour were affected – they were the ones that hadn’t updated their systems. How did that attack take place? Well, rather than attack the site, they just bought an ad and ran malware code.

There is a tool called Ghostery… It’s brilliant and useful… But it’s run by the ad industry and all the trackers are set the wrong way. Untick those all and then it’s fascinating… They tell you about page load and all the components involved in loading a page…

To change topic…

Cookies! Yes, they can be used to track you across web sites. But they can’t give you malware as is. So… I will be tackling the misconception that cookies are evil… And I’m going to try to convince you otherwise. Tracking can be evil… But cookies are kind of an early example of privacy by design…

It is 1994. The internet cannot remember anyone between page loads. You have an interaction with a web server that has absolutely no memory. Cookies help something remember between page loads and web pages… Somehow a server has to know who you are… But back in 1994 you just open a page and look at it, that’s the interaction point…

But companies wanted shopping baskets, and memory between two page reloads. There is an obvious technical solution… You just give every browser a unique identifier… Great! The server remembers you. But the problem is a privacy issue across different servers… So, Netscape implemented cookies – small text strings the server could ask the browser to remember and give back to it later…

Cookies have some awesome properties: they are client visible; third party tracking is client visible too; you can opt out (delete them) on a per-site basis; a cookie is only readable by the site that set it; and all of this allows for public discussion of tracking…

… Which is why Android/iOS both went with the unique ID option. And that’s how you can be tracked. As a design decision it’s very different…

Now to some of the research I work on… I believe in getting people to touch stuff, to interact with it… We can talk to each other, or mystify, but we need to actually have people understand this stuff. So we ran an outreach activity to build a website, create a cookie, and then read the cookie out… Then I gave them a second website… to let people try to understand how to change their name on one site but not the other… what happens when you view them in Incognito mode… and then to explore cookies across sites, and how that works…

Misconception: VPNs solve all privacy and security problems. Back at Indiana I taught students who couldn’t code… And that was interesting… They saw VPNs as magic fairy dust. And they had absorbed this idea that anyone can be hacked at any time… They got that… But it had resulted in “but what’s the point?”. That worries me… In the general population we see media coverage of attacks on major companies… and a narrative that attacks are inevitable… So you end up with this fatalism problem…

So, I want to talk about encryption, how it breaks, and what that means for VPNs. I’m not an encryption specialist; I care about how it works for the user.

In encryption we want (1) communication between you and the other party to be confidential and unchanged – no-one can read what you sent and no-one can change what you sent; and (2) to know who we are talking to. And that second part is where things can be messed up. You can make what you think is a secure connection to the right person, but it could be a secure connection to the wrong person – a man in the middle attack. A real world example… You go to a coffee shop and use the wifi to request the BBC news site, but you get a wifi login page instead. That’s essentially a man in the middle attack. That’s not necessarily harmful – it’s normal operating procedure… And VPNs basically work like this…
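For the technically curious, here is a small Python sketch (mine, not the speaker’s) of the “who am I actually talking to?” half of this: open a TLS connection and inspect the certificate the far end presents.

```python
# Sketch of the identity half of TLS: connect and inspect the
# certificate the far end presents. (Illustrative example.)
import socket
import ssl

host = "www.bbc.co.uk"  # the site you *think* you are talking to
ctx = ssl.create_default_context()  # verifies against trusted CAs

with socket.create_connection((host, 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        # If a captive portal or other man-in-the-middle had answered
        # instead, certificate verification above would normally fail
        # with an SSLCertVerificationError.
        cert = tls.getpeercert()
        print("subject:", dict(x[0] for x in cert["subject"]))
        print("issuer:", dict(x[0] for x in cert["issuer"]))
```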

So, an example of what really happened with a student… I set an exercise that just had them creating a very simple cookie page… I was expecting something simple… But one of them submitted a page with an extra bit of JavaScript in it… injected code. The student had logged in through AnchorFree – magic fairy dust – and it injects ad code into the pages you view… and that injected code is what I saw when they submitted the page in Blackboard Learn…

VPNs are not magic fairy dust. The University runs an excellent VPN – far better for coffee shops etc!

So, I like to end with some common advice:

  • Install an antivirus scanner. Don’t turn off the AV software Windows 8+ installs automatically… I ran a study where 50% of PhD students had switched off that software and their firewalls…
  • Keep your software updated – best way to stay safe
  • Select a strong password for the important things you use all the time
  • For less important things that you use rarely, use a password manager… Best to have different passwords between them…
  • Software I use:
    • Ad blockers – not just ads, reduce lots of extra content loading. The more websites you visit the more vulnerable you are
    • Ghostery and Privacy Badger
    • Lightbeam
    • Password managers (LastPass, 1Password and KeePass are the most recommended)
    • 2-factor like Yubikey – extra protection for e.g. Facebook.
    • If you are really serious: uMatrix and NoScript – BUT they will break lots of pages…

Q&A

Q1) It’s hard to get an average citizen to do everything… How do you get around that and just get the key stuff across…

A1) Probably it’s that common advice. The security community has gotten better at narrowing it down to around 10 key things. Google did a study with Blackhat infosec conference attendees about what they would do… and asked on Amazon Mechanical Turk what people would recommend to friends. About the only common answer amongst blackhats was “update your software”. But actually there is overlap… People know they should change passwords, and should use AV software… But AV software didn’t show up on the blackhat list… 2-factor and password managers did…

Q2) What do you think about passwords… long or complex or?

A2) We did a study maybe 8 years ago on mnemonic passwords… and found that “My name is Inigo Montoya, you killed my father, prepare to die” was by far the most common. The issue isn’t length… it’s entropy. I think we need to think server side about how many other users have used the same password (based on the encrypted version), and you need something that fewer than 3 people use…
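A toy sketch of that server-side idea in Python (my illustration of the suggestion, not a production scheme): count how many accounts share a password fingerprint and reject ones that are too popular.

```python
# Toy sketch of the idea above: reject a password if too many other
# accounts already use it (compared via hash, never plaintext).
import hashlib
from collections import Counter

MAX_SHARED = 2  # "something that fewer than 3 people use"
password_counts = Counter()  # fingerprint -> number of users

def fingerprint(password: str) -> str:
    # A real system would use salted, slow hashing for storage; an
    # unsalted digest is used here only to make shared use countable.
    return hashlib.sha256(password.encode()).hexdigest()

def accept_password(password: str) -> bool:
    h = fingerprint(password)
    if password_counts[h] >= MAX_SHARED:
        return False  # too popular across the user base
    password_counts[h] += 1
    return True

print(accept_password("correct horse battery staple"))  # True
print(accept_password("correct horse battery staple"))  # True
print(accept_password("correct horse battery staple"))  # False
```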

Q2) So more about inability to remember it…

A2) And it depends on threat type… If someone knows you, your dog, etc… then it’s easier… But if I can keep a password for a long time I might invest in it – whereas if you force people to change passwords they have to keep remembering new ones. There was a study finding that people who use passwords a lot often use affirmations, such as “I love God”… And again, it’s hard to know how you protect against that.

Q3) What about magic semantic email links instead of passwords…

A3) There is some lovely work on just how much data is in your email… That’s a poor man’s version of the OAuth idea of getting an identity provider to authenticate the user. It’s good for the user, but it makes that one login much higher stakes… And we see SMS also being a mixed bag, subject to attack… Ask a user though… “there’s nothing important in my email”.

Q4) How do you deal with people saying “I don’t have anything to hide”?

A4) Well I start with it not being about hiding… It’s more, why do you want to know? When I went to buy a car I didn’t dress like a professor, I dressed down… I wanted a good price… If I have a lot of time I will refer them to Daniel Solove’s Nothing to Hide.

Talk by Nicola Osborne (EDINA) covering Digital Footprints and how you can take control of your online self

And that will be me… So keep an eye out for tweets from others on the event hashtag: #UoEInfoSec.

And with a very brief summing up from Alastair Fenemore, the day came to a close. Thanks to the lovely University Information Security team for organising this really interesting event (and inviting me to speak) as part of their awesome Information Security Awareness Week programme.

Aug 022017
 

As we reach the end of the academic year, and I begin gearing up for the delightful chaos of the Edinburgh Fringe and my show, Is Your Online Reputation Hurting You?, I thought this would be a good time to look back on a busy recent few months of talks and projects (inspired partly by Lorna Campbell’s post along the same lines!).

This year the Managing Your Digital Footprint work has been continuing apace…

We began the year with funding from the Principal’s Teaching Award Scheme for a new project, led by Prof. Sian Bayne: “A Live Pulse”: Yik Yak for Teaching, Learning and Research at Edinburgh. Sian, Louise Connelly (PI for the original Digital Footprint research), and I have been working with the School of Informatics and a small team of fantastic undergraduate student research associates to look at Yik Yak and anonymity online. Yik Yak closed down this spring, which has made this even more interesting as a cutting edge research project. You can find out more on the project blog – including my recent post on addressing ethics of research in anonymous social media spaces; student RA Lilinaz’s excellent post giving her take on the project; and Sian’s fantastic keynote from #CALRG2017, giving an overview of the challenges and emerging findings from this work. Expect more presentations and publications to follow over the coming months.

Over the last year or so Louise Connelly and I have been busy developing a Digital Footprint MOOC, building on our previous research, training and best practice work to share this with the world. We designed a three week MOOC (Massive Open Online Course) that runs on a rolling basis on Coursera – a new session kicks off every month. The course launched this April and we were delighted to see it get some fantastic participant feedback and press coverage (including a really positive experience of being interviewed by The Sun).

The MOOC has been going well and building interest in the consultancy and training work around our Digital Footprint research. Last year I received ISG Innovation Fund support to pilot this service and the last few months have included great opportunities to share research-informed expertise and best practices through commissioned and invited presentations and sessions including those for Abertay University, University of Stirling/Peer Review Project Academic Publishing Routes to Success event, Edinburgh Napier University, Asthma UK’s Patient Involvement Fair, CILIPS Annual Conference, CIGS Web 2.0 & Metadata seminar, and ReCon 2017. You can find more details of all of these, and other presentations and workshops on the Presentations & Publications page.

In June an unexpected short notice invitation came my way to do a mini version of my Digital Footprint Cabaret of Dangerous Ideas show as part of the Edinburgh International Film Festival. I’ve always attended EIFF films, and also spent years reviewing films there, so it was lovely to perform as part of the official programme, working with our brilliant CODI compère Susan Morrison and my fellow mini-CODI performer, mental health specialist Professor Steven Lawrie. We had a really engaged audience with loads of questions – an excellent way to try out ideas ahead of this August’s show.

Also in June, Louise and I were absolutely delighted to find out that our article (in Vol. 11, No. 1, October 2015) for ALISS Quarterly, the journal of the Association of Librarians and Information Professionals in the Social Sciences, had been awarded Best Article of the Year. Huge thanks to the folks at ALISS – this was lovely recognition for our article, which you can read in full in the ALISS Quarterly archive.

In July I attended the European Conference on Social Media (#ecsm17) in Vilnius, Lithuania. In addition to co-chairing the Education Mini Track with the lovely Stephania Manca (Italian National Research Council), I was also there to present Louise and my Digital Footprint paper, “Exploring Risk, Privacy and the Impact of Social Media Usage with Undergraduates“, and to present a case study of the EDINA Digital Footprint consultancy and training service for the Social Media in Practice Excellence Awards 2017. I am delighted to say that our service was awarded 2nd place in those awards!


My Social Media in Practice Excellence Award 2017 2nd place certificate (still awaiting a frame).

You can read more about the awards – and my fab fellow finalists Adam and Lisa – in this EDINA news piece.

On my way back from Lithuania I had another exciting stop to make at the Palace of Westminster. The lovely folk at the Parliamentary Digital Service invited me to give a talk, “If I Googled you, what would I find? Managing your digital footprint” for their Cyber Security Week which is open to members, peers, and parliamentary staff. I’ll have a longer post on that presentation coming very soon here. For now I’d like to thank Salim and the PDS team for the invitation and an excellent experience.


The digital flyer for my CODI 2017 show (click to view a larger version) – huge thanks to the CODI interns for creating this.

The final big Digital Footprint project of the year is my forthcoming Edinburgh Fringe show, Is Your Online Reputation Hurting You? (book tickets here!). This year the Cabaret of Dangerous Ideas has a new venue – the New Town Theatre – and two strands of events: afternoon shows; and “Cabaret of Dangerous Ideas by Candlelight”. It’s a fantastic programme across the Fringe and I’m delighted to be part of the latter strand with a thrilling but challengingly competitive Friday night slot during peak fringe! However, that evening slot also means we can address some edgier questions so I will be talking about how an online reputation can contribute to fun, scary, weird, interesting experiences, risks, and opportunities – and what you can do about it.


Help spread the word about my CODI show by tweeting with #codi17 or sharing the associated Facebook event.

To promote the show I will be doing a live Q&A on YouTube on Saturday 5th August 2017, 10am. Please do add your questions via Twitter (#codi17digifoot) or via this anonymous survey and/or tune in on Saturday (the video below will be available on the day and after the event).

So, that’s been the Digital Footprint work this spring/summer… What else is there to share?

Well, throughout this year I’ve been working on a number of EDINA’s ISG Innovation Fund projects…

The Reference Rot in Theses: a HiberActive Pilot project has been looking at how to develop the fantastic prior work undertaken during the Andrew W. Mellon-funded Hiberlink project (a collaboration between EDINA, Los Alamos National Laboratory, and the University of Edinburgh School of Informatics), which investigated “reference rot” (where URLs cease to work) and “content drift” (where URLs work but the content changes over time) in scientific scholarly publishing.

For our follow up work the focus has shifted to web citations – websites, reports, etc. – something which has become a far more visible challenge for many web users since January. I’ve been managing this project, working with developer, design and user experience colleagues to develop a practical solution around the needs of PhD students, shaped by advice from Library and University Collections colleagues.

If you are familiar with the Memento standard, and/or follow Herbert Van de Sompel and Martin Klein’s work, you’ll be well aware of how widespread the challenge of web citations changing over time is, and of the seriousness of the implications. The Internet Archive might be preserving all the (non-R-rated) gifs from Geocities, but without preserving government reports, ephemeral content, social media etc. we would be missing a great deal of the cultural record and, in terms of where our project comes in, crucial resources and artefacts in many modern scholarly works. If you are new to the issue of web archiving I would recommend a browse of my notes from the IIPC Web Archiving Week 2017 and papers from the co-located RESAW 2017 conference.

A huge part of the HiberActive project has been working with five postgraduate student interns to undertake interviews and usability work with PhD students across the University. My personal and huge thanks to Clarissa, Juliet, Irene, Luke and Shiva!


A preview of the HiberActive gif featuring Library Cat.

You can see the results of this work at our demo site, http://hiberactive.edina.ac.uk/, and we would love your feedback on what we’ve done. You’ll find an introductory page on the project as well as three tools for archiving websites and obtaining the appropriate information to cite – hence adopting the name one of our interviewees suggested, Site2Cite. We are particularly excited to have a tool which enables you to upload a Word or PDF document, have all URLs detected, and which then returns a list of URLs and the archived citable versions (as a csv file).
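As a rough sketch of the general idea (my own illustration using the Internet Archive’s public availability API, not the actual Site2Cite code), extracting URLs from a text file and looking up archived copies to cite might look like this:

```python
# Sketch of the general idea behind Site2Cite (not the EDINA code):
# pull URLs from a text and look up archived copies to cite.
import csv
import json
import re
import sys
import urllib.parse
import urllib.request

URL_RE = re.compile(r"https?://[^\s)\"'>]+")

def archived_copy(url: str) -> str:
    # The Internet Archive's public availability API returns the
    # closest archived snapshot for a URL, if one exists.
    query = "https://archive.org/wayback/available?url=" + urllib.parse.quote(url)
    with urllib.request.urlopen(query) as resp:
        data = json.load(resp)
    snap = data.get("archived_snapshots", {}).get("closest", {})
    return snap.get("url", "")

# Usage: python cite_urls.py thesis.txt > citations.csv
text = open(sys.argv[1], encoding="utf-8", errors="replace").read()
writer = csv.writer(sys.stdout)
writer.writerow(["url", "archived_version"])
for url in sorted(set(URL_RE.findall(text))):
    writer.writerow([url, archived_copy(url)])
```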

Now that the project is complete, we are looking at what the next steps may be so if you’d find these tools useful for your own publications or teaching materials, we’d love to hear from you.  I’ll also be presenting this work at Repository Fringe 2017 later this week so, if you are there, I’ll see you in the 10×10 session on Thursday!

To bring HiberActive to life our students suggested something fun, and my colleague Jackie created an entertaining and informative gif featuring Library Cat, Edinburgh’s world famous sociable on-campus feline. Library Cat has also popped up in another EDINA ISG Innovation-funded project, Pixel This, which my colleagues James Reid and Tom Armitage have been working on. This project has been exploring how Pixel Sticks could be used around the University. To try them out properly I joined the team for a night of photography in George Square, with the Pixel Stick loaded with images of notable University of Edinburgh figures. One of my photos from that night, featuring the ghostly image of the much missed Library Cat (1.0), went a wee bit viral over on Facebook:

James Reid and I have also been experimenting with Tango-capable phone handsets in the (admittedly daftly named) Strictly Come Tango project. Tango creates impressive 3D scans of rooms and objects and we have been keen to find out what one might do with that data, how it could be used in buildings and georeferenced spaces. This was a small exploratory project but you can see a wee video on what we’ve been up to here.

In addition to these projects I’ve also been busy with continuing involvement in the Edinburgh Cityscope project, which I sit on the steering group for. Cityscope provided one of our busiest events for this spring’s excellent Data Fest – read more about EDINA’s participation in this exciting new event around big data, data analytics and data-driven innovation here.

I have also been working on two rather awesome Edinburgh-centric projects. Curious Edinburgh officially launched for Android, and released an updated iOS app, for this year’s Edinburgh International Science Festival in April. The app includes History of Science, Medicine, Geosciences and Physics tours, and a brand new Biotechnology tour that lets you explore Edinburgh’s fantastic scientific legacy. The current PTAS-funded project is led by Dr Niki Vermeulen (Science, Technology & Innovation Studies), with tours written by Dr Bill Jenkins, and will see the app used in teaching around 600 undergraduate students this autumn. If you are curious about the app (pun entirely intended!), visiting Edinburgh – or just want to take a long distance virtual tour – do download the app, rate and review it, and let us know what you think!


A preview of the new Curious Edinburgh History of Biotechnology and Genetics Tour.

The other Edinburgh project which has been progressing apace this year is LitLong: Word on the Street, an AHRC-funded project which builds on the prior LitLong project to develop new ways to engage with Edinburgh’s rich literary heritage. Edinburgh was the first city in the world to be awarded UNESCO City of Literature status (in 2004) and there are huge resources to draw upon. Prof. James Loxley (English Literature) is leading this project, which will be showcased in some fun and interesting ways at the Edinburgh International Book Festival this August. Keep an eye on litlong.org for updates or follow @litlong.

And finally… Regular readers here will be aware that I’m Convener for eLearning@ed (though my term is up and I’ll be passing the role onto a successor later this year – nominations welcomed!), a community of learning technologists and academic and support staff working with technologies in teaching and learning contexts. We held our big annual conference, eLearning@ed 2017: Playful Learning this June and I was invited to write about it on the ALTC Blog. You can explore a preview and click through to my full article below.

Playful Learning: the eLearning@ed Conference 2017

Phew! So, it has been a rather busy few months for me, which is why you may have seen slightly fewer blog posts and tweets from me of late…

In terms of the months ahead there are some exciting things brewing… But I’d also love to hear any ideas you may have for possible collaborations as my EDINA colleagues and I are always interested to work on new projects, develop joint proposals, and work in new innovative areas. Do get in touch!

And in the meantime, remember to book those tickets for my CODI 2017 show if you can make it along on 11th August!

Jul 042017
 

Today I am again at the Mykolo Romerio Universitetas in Vilnius, Lithuania, for the European Conference on Social Media 2017. As usual this is a liveblog so additions, corrections etc. all welcome… 

Keynote presentation: Daiva Lialytė, Integrity PR, Lithuania: Practical point of view: push or pull strategy works on social media 

I attended your presentations yesterday, and you are going so far into detail in social media. I am a practitioner and we can’t go into that same sort of depth because things are changing so fast. I have to confess that a colleague, a few years ago, suggested using social media and I thought “Oh, it’s all just cats” and I wasn’t sure. But it was a big success, we have six people working in this area now. And I’m now addicted to social media. In fact, how many times do you check your phone per day? (various guesses)…

Well, we are checking our smartphones 100-150 times per day. And some people would rather give up sex than smartphones! And we have this constant flood of updates and information – notifications that pop up all over the place… And there are a lot of people, organisations, brands, NGOs, etc. all want our attention on social media.

So, today, I want to introduce three main ideas here as a practitioner and marketer…

#1 Right Mindset

Brands want to control everything, absolutely everything… The colour, the font, the images, etc. But now social media says that you have to share your brand in other spaces, to lose some control. And I want to draw on Paul Holmes, a PR expert (see www.holmesreport.com), who says that when he fell in love with social media, there were four key aspects:

  • Brands (in)dependency
  • Possibilities of (non)control
  • Dialogue vs monologue
  • Dynamic 24×7

And I am going to give some examples here. So Gap, the US fashion brand, they looked at updating their brand. They spent a great deal of money to do this – not just the logo but all the paperwork, branded items, etc. They launched it, it went to the media… And it was a disaster. The Gap thought for a few days. They said “Thank you brand lover, we appreciate that you love our brand and we are going to stick with the old one”. And this raises the question of to whom a brand belongs… Shareholders or customers? Perhaps now we must think about customers as owning the brand.

Yesterday I saw a presentation from Syracuse on University traditions – and some of the restrictions of maintaining brand – but in social media that isn’t always possible. So, another example… Lagerhaus (like a smaller scale Ikea). They were launching a new online store and wanted to build community (see videos), so they targeted six interior design blogs and created “pop up online stores” – bloggers could select products from the store’s selection, and promote them as they liked. That gained media attention, and gained Facebook likes for the store’s Facebook page. Then there was an online store launch, with invitees approached by bloggers, and their pop up stores continue. So this is a great example of giving control to others, and building authentic interest in your brand.

In terms of dialogue vs monologue I’d quote from Michael Dell here, on the importance of engaging in honest, direct conversations with customers and stakeholders. This is all great… But the reality is that many who talk about this are never actually doing it… Indeed some just shut down spaces when they can’t engage properly. However, Dell has set up a social media listening and command centre. 22k+ posts are monitored daily, engaging 1000+ customers per week. This is tightly integrated with the @dellcares Twitter/Facebook team. And they have managed to convert “ranters” to “ravers” in 30% of cases. And there has been a decrease in negative commentary since engaging in this space. Posts need quick responses: a few minutes or hours is great; longer and it becomes less and less useful…

Similarly we’ve seen Scandinavian countries and banks engaging, even when they have been afraid of negative comments. And this is part of the thing about being part of social media – the ability to engage in dialogue, to be part of and react to the conversations.

Social media is really dynamic, 24×7. You have to move fast to take advantage. So, Lidl… They heard about a scandal in Lithuania about the army paying a fortune for spoons – some were €40 each. So Lidl ran a promotion for being able to get everything, including spoons there cheaper. It was funny, clever, creative and worked well.

Similarly Starbucks vowing to hire 10,000 refugees in the US (and now in EU) following Trump’s travel ban, that was also being dynamic, responding quickly.

#2 Bold Actions

When we first started doing social media… we faced challenges… Because the future is uncertain… So I want to talk about several social media apps here…

Google+ launched claiming to be bigger than Facebook, to do it all better. Meanwhile WhatsApp… Did great… But disappearing as a brand, at least in Lithuania. SnapChat has posts disappearing quickly… Young people love it. The owner has said that it won’t be sold to Facebook. Meanwhile Facebook is trying desperately to copy functionality. We have clients using SnapChat, fun but challenging to do well… Instagram has been a big success story… And it is starting to be bigger than Facebook in some demographics.

A little history here… If you look at a world map of social networks from December 2009, we see quite a lot of countries having their own social networks which are much more popular. By 2013, it’s much more Facebook, but there are still some national social media networks in Lithuania or Latvia. And then by 2017 we see in Africa uptake of Twitter and Instagram. Still a lot of Facebook. My point here is that things move really quickly. For instance young people love SnapChat, so we professionally need to be there too. You can learn new spaces quickly… But it doesn’t matter as you don’t have to retain that for long, everything changes fast. For instance in the US I have read that Facebook is banning posts by celebrities where they promote items… That is good, that means they are not sharing other content…

I want to go in depth on Facebook and Twitter. Of course the most eminent social media platform is Facebook. They are too big to be ignored: 2 billion monthly active Facebook users (June 2017); 1.28 billion people logging onto Facebook daily; 83 million fake profiles. At 29.7% of users, 25 to 34 year olds are the biggest age group. For many people Facebook is the first thing they check in the morning when they wake up. And 42% of marketers report that Facebook is very important to their business. We now have brands approaching us to set up a Facebook presence no matter what their area of work.

What Facebook does well is most precise targeting – the more precise the more you pay, but that’s ok. So that’s based on geolocation, demographic characteristic, social status, interests, even real time location. That works well but remember that there are 83 million fake profiles too.

So that’s push, what about pull? Well there are the posts, clicks, etc. And there is Canvas – which works for mobile users: story driven ads (a mini landing page), a creative story, generating better results and click through rates. (We are watching a Nespresso mobile canvas demo.) Another key tool is Livestream – free of charge, with notifications for your followers, and live discussion. But you need to be well prepared and tell a compelling story to make proper use of this. And you can do it from anywhere in the world. For instance I once saw the livestream of Barack Obama’s farewell – that only had 15k viewers though, so it’s free but you have to work to get engagement.

No matter which tool, “content is king!” (Bill Gates, 1996). Clients want us to create good stories here but it is hard to do… So what makes the difference? The Content Marketing Institute (US), 2015, suggest:

  1. Content
  2. Photos
  3. Newsletters
  4. Video
  5. Article
  6. Blogs
  7. Events
  8. Infographics
  9. Mobile applications
  10. Conferences and Livestreams

So, I will give some examples here… I’ll show you the recent winner of Cannes Lions 2017 for social media and digital category. This is “Project Graham” – a public driver safety campaign about how humans are not designed to survive a crash… Here is how we’d look if we were – this was promoted heavily in social media.

Help for push from Facebook – well the algorithms prioritise content that does well. And auctions to reach your audience mean that it is cheaper to run good content that really works for your audience.

And LinkedIn meanwhile is having a renaissance. It was quite dull, but they changed their interface significantly a few months back, and now we see influencers (in Lithuania) using LinkedIn and sharing content there. For instance lawyers have adopted the space. Some were predicting LinkedIn would die, but I am not so sure… It is the biggest professional social network – 467 million users in 200 countries. And it is the biggest network of professionals – a third have a LinkedIn profile. Users spend 17 minutes per day, 40% use it every day, and 28% of all internet users use LinkedIn. And it is really functioning as a public CV, for recruitment, and for ambassadorship – you can share richer information here.

I wanted to give a recent example – it is not a sexy looking case study – but it worked very well. This was work with Ruptela, a high tech company that provides fleet management based on GPS tracking and real-time vehicle monitoring and control. They needed to rapidly hire 15 new sales representatives via social media. That’s a challenge, as young people, especially in the IT sector, are leaving Lithuania or working in Lithuania-based expertise centres for UK, Danish, etc. brands.

So we ran a campaign, on a tiny budget (incomparable with headhunters for instance), around “get a job in 2 days” and successfully recruited 20 sales representatives. LinkedIn marketing is expensive, but very targeted and much cheaper than you’d otherwise pay.

#3 Right Skills

In terms of the skills for these spaces:

  • copywriter (for good storytelling)
  • visualist (graphics, photo, video)
  • community manager (to maintain appropriate contact) – the skills for that cannot be underestimated.
  • And… Something that I missed… 

You have to be like a one man band – good at everything. But then we have young people coming in with lots of those skills, and can develop them further…

So, I wanted to end on a nice story/campaign… an ad for Budweiser about not drinking and driving.

Q&A

Q1) Authenticity is the big thing right now… But do you think all that “authentic” advertising content may get old and less effective over time?

A1) People want to hear from their friends, from people like them, in their own words. Big brands want that authenticity… But they also want total control which doesn’t fit with that. The reality is probably that something between those two levels is what we need but that change will only happen as it becomes clear to big brands that their controlled content isn’t working anymore.

Q2) With that social media map… What age group was that? I didn’t see SnapChat there.

A2) I’m not sure, it was a map of dominant social media spaces…

Q3) I wanted to talk about the hierarchy of content… Written posts, visual content etc… What seemed to do best was sponsored video content that was subtitled.

A3) Facebook itself prioritises video content – it is cheaper to use this in your marketing. And if you do video, yes, you have to have subtitles, so that people can read rather than listen to the videos… And video, especially “authentic video”, will be heavily prioritised by Facebook. So we are doing a lot of video work.

Introduction to ECSM 2018 – Niall Corcoran, Limerick Institute of Technology, Ireland

I wanted to start by thanking our hosts this year; Vilnius has been excellent. Next year we’ll be a bit earlier in the year – late June – and we’ll be at the Limerick Institute of Technology, Ireland. We have campuses around the region with 7000 students and 650 staff, teaching from levels 6 to 10. The nearest airport is Shannon, or we’re within easy distance of Cork or Dublin airports.

In terms of social media we do research on the Social Media Interactive Learning Environment, the Limerick Interactive Storytelling Network, social media for teaching and research, and social media for cancer recovery.

In terms of Limerick itself, 80-90% of Europe’s contact lenses are manufactured there! There is a lot of manufacturing in Limerick, with many companies having their European headquarters there. So, here’s a short video made by one of our students to give you a sense of the town. And we hope to see you there next year!

Social Media Competition Update

The top three placed entries are: Developing Social Paleontology – Lisa Lundgren; EDINA Digital Footprint Consulting and Training Service – Nicola Osborne (yay!); Traditions Mobile App – Adam Peruta.

Stream A: Mini track on Ethical use of social media data – Chair: Dragana Calic

The Benefits and Complications of Facebook Memorials – White Michelle, University of Hawai’i at Manoa, USA

I wanted to look at who people imagine the audience is for these memorials. This came about because a death made me look at these pages, and I decided to look into them in more depth.

So, I’m using danah boyd’s definition of social networking here. We are talking Facebook, Twitter, SnapChat etc. So, a Facebook Memorial is a group that is created specifically to mark the death of a friend or family member – or of a public figure (e.g. Michael Jackson).

Robert Zebruck and Brubecker talk about the imagined audience and the flattening of realities. So, right now I can see the people in the room, I can see who you are, how you react, and I can modify my tone or style to meet you, to respond to you. But it is hard to do that on social media. We see context collapse. And we can be sat there alone at our computer and not have that sense of being public. Sometimes with memorials we will say things for that audience, but in other cases perhaps it is sharing memories of drinking together, or smoking weed with someone… Memories that may jar with others.

It was a long road to get to this research. My review board were concerned about emotional distress of interviewees. I agreed in the end to interview via Skype or Facebook and to check everything was ok after every question, to make it easier to see and review their state of mind. I had to wait over a year to interview people, the death had to not be by suicide, and the participants had to be over 18 years old. So I did conduct qualitative research over Skype and Facebook… And I found interviewees by looking at memorial pages that are out there – there are loads there, not all labelled as memorials.

So, my data… I began by asking who people thought they were talking to… Many hadn’t thought about it. They talked about family members, friends… Even in a very controlled group you can have trolls and haters who get in… But often people assumed that other people were like them. A lot of people would write to the deceased – as if visiting a grave, say. I asked if they thought the person could hear or understand… But they hadn’t really thought about it; it felt like the right thing to do… And they wanted family and friends to hear from them. They felt likes, shares, etc. were validating and therapeutic, and that sense of connection was therapeutic. Some even made friends through going out drinking, or family gatherings… with friends of friends who they hadn’t met before…

This inability to really think about or understand the imagined audience leads to context collapse. Usually family is in charge of these pages… And that can be challenging… For instance an up and coming football star died suddenly, and then it became evident that it was the result of a drug overdose… And that was distressing for the family, who tried to remove that content. There is an idea of alternative narratives. Fake news or alternative facts has a particular meaning right now… But we are all used to presenting ourselves in particular ways to different friends, etc. On one memorial site the deceased had owed money to a friend, and that friend still felt owed the money and was posting about it – like a fight at the funeral… It’s very hard to monitor ourselves and other people…

And there was fighting about who owned the person… Some claiming that someone was their best friend, fights over who was more important or who was more influenced. It happens in real life… But not quite as visibly or with all involved…

So, in conclusion… There are a lot of benefits to Facebook Memorials. Psychologists talk of the benefit of connecting, grieving, not feeling alone, getting support. Death happens. We are usually sad when it happens… Social networking sites provide another way to engage and connect. So if I’m in Lithuania and there is a funeral in Hawaii that I can’t travel to, I can still connect. It is changing our social norms, and how we connect. But we can do more to make it work better – safety and security need improving. Facebook has now added the ability to will your page to someone. And now if someone dies you can notify Twitter – it changes the account slightly: birthday reminders no longer pop up, and it acts as a memorial. There are new affordances.

Personally, doing this research was very sad, and it’s not an area I want to continue looking at. It was emotionally distressing for me to do this work.

Q&A

Q1) I am old enough to remember LiveJournal and remember memorials there. They used to turn a page into a memorial, then were deleted… Do you think Facebook should sunset these memorials?

A1) I personally spoke to people who would stare at the page for a month, expecting posts… Maybe you go to a funeral, you mourn, you are sad… But that page sticking around feels like it extends that… But I bet Mark Zuckerberg has some money making plan for keeping those profiles there!

Q2) What is the motivation for such public sharing in this way?

A2) I think young people want to put it out there, to share their pain, to have it validated – “feel my pain with me”. One lady I spoke to, her boyfriend was killed in a mass shooting… Eventually she couldn’t look at it, it was all debate about gun control and she didn’t want to engage with that any more…

Q3) Why no suicides? I struggle to see why they are automatically more distressing than other upsetting deaths…

A3) I don’t know… But my review board thought it would be more distressing for people…

Q4) How do private memorials differ from celebrity memorials?

A4) I deliberately avoided celebrities, but also my IRB didn’t want me to look at any groups without permission from every member of that group…

Comment) I’ve done work with public Facebook groups, my IRB was fine with that.

A4) I think it was just this group really… But there was concern about publicly identifiable information.

Online Privacy: Present Need or Relic From the Past? – Aguirre-Jaramillo Lina Maria, Universidad Pontificia Bolivariana, Colombia

In their influential essay “The Right to Privacy” in the Harvard Law Review (1890), Warren and Brandeis defined privacy as “the right to be let alone”. But in the last ten years or so we now see sharing of information that not long ago would have been seen, and expected to be, private. Earl Warren was a famous US judge and he said: “The fantastic advances in the field of electronic communication constitute a greater danger to the privacy of the individual.”

We see privacy particularly threatened by systematic data collection. Mark Zuckerberg (2010) claimed that “privacy is no longer a social norm”. This has been used as evidence of disregard toward users’ rights and data – the manner in which data is stored, changed and used, and the associated threats. But we also see counter arguments, such as the American Library Association’s Privacy Revolution campaign.

So, this is the context for this work in Colombia. It is important to understand the literature in this area, particularly around data use, data combinations, and the connection between privacy concerns and behaviours online (Joinson et al 2008). And we also refer to the work of Sheehan (2002) on the characterisation of online users. Particularly we are interested in new privacy concerns and platforms, particularly Facebook. The impact of culture on online privacy has been studied by Cho, Rivera Sanchez and Lim (2009).

The State of the Internet from the Oxford Internet Institute found that Colombia had between 40 and 60% of people online. Internet uptake is, however, lower than in e.g. the US. And in Colombia 46% of the population is aged 25-54.

So, my study is currently online. A wider group is also engaging in personal and group interviews. Our analysis will focus on what background knowledge, risk and privacy awareness there is amongst participants; what self-efficacy level is revealed by participants – their knowledge and habits; and what interest and willingness there is to acquire more knowledge and gain more skills to manage privacy. At a later stage we will be building a prototype tool.

Our conclusions so far… Privacy is hard to define and we need to do more to define it. Privacy is not a concept articulated in one single, universally accepted definition. Different groups trade off privacy differently. Relevant concepts here include background knowledge, computer literacy, privacy risk, and self-efficacy.

And finally… Privacy is still important but often ignored as important in the wider culture. Privacy is not a relic but a changing necessity…

Q&A

Q1) Did age play a role in privacy? Do young people care as much as older people?

A1) They seem to care when they hear stories of peers being bullied or harassed, or hear stories of hacked Instagram accounts. But their idea of privacy is different. There is information that they do not want to have public or stolen. So we are looking more at that, and also at a need to understand how they want to engage with privacy. As my colleague Nicola Osborne from Edinburgh said in her presentation yesterday, we have to remember students already come in with a long internet/social media history and presence.

Q2) I was wondering about cultural aspect… Apps used and whether privacy is important… For instance SnapChat is very exhibitionist but also ephemeral…

A2) I don’t have full answers yet but… Young people share on SnapChat and Instagram to build popularity with peers… But almost none of them are interested in Twitter… At least that’s the case in Colombia. But they do know some content on Facebook may be more vulnerable than on SnapChat and Instagram… It may be that they have the idea of SnapChat as a space they can control, perhaps…

Q3) I often feel more liberal with what I share on Facebook, than students who are 10 or 15 years younger… I would have some privacy settings but don’t think about the long story of that… From my experience students are a lot more savvy in that way… When they first come in, they are very aware of that… Don’t want a bigger footprint there…

A3) That is not exactly true in Colombia. The idea of a digital footprint affecting their career is not a thing in the same way… They are just becoming aware of it… But that idea of exhibitionism… I have found that most of the students in Colombia seem quite happy to share lots of selfies and images of their feet… That became a trend in other countries about three years ago… They don’t want to write much… just to say “I’m here”… And there has been some interesting research in terms of the selfie generation and ideas of expressing yourself and showing yourself… It may be partly to do with other issues… In Colombia many young women have plastic surgery – this came out of the 1980s and 1990s… Many women, young women, have cosmetic surgery and want to share that… more on Instagram than Pinterest – Pinterest is for flowers and little girlie things…

Q4) You were talking about gender, how do privacy attitudes differ between males and females?

A4) The literature review suggests women tend to be more careful about what they publish online… They may be more careful selecting networks and where they share content… More willing to double check settings, and to delete content they might have difficulty explaining… Also more willing to discuss issues of privacy… Things may change over time… Suggestion that people will get to an age where they do care more… But we also need to see how the generation that have all of their images online, even from being a baby, will think about this… But generally seems to be slightly more concern or awareness from women…

Comment) I wanted to just follow up the Facebook comment and say that I think it may not be age but experience of prior use that may shape different habits there… Students typically arrive at our university with hundreds of friends having used Facebook since school, and so they see that page as a very public space – in our research some students commented specifically on that and their changing use and filtering back of Facebook contacts… For a lot of academics and mid career professionals Facebook is quite a private social space, Twitter plays more that public role. But it’s not age per se perhaps, it’s that baggage and experience.

Constructing Malleable Truth: Memes from the 2016 U.S. Presidential Campaign – Wiggins Bradley, Webster University, Vienna, Austria

Now, when I wrote this… Trump was “a candidate”. Then he was nominee. Then president elect… And now President. And that’s been… surprising… So that’s the context.

I look at various aspects in my research, including internet memes. In 2008 Obama’s campaign was great at using social media, at getting people out there sharing and campaigning for them on a voluntary and enthusiastic basis. 2016 was the meme election, I think. Now, people researching memes feel they must refer to Richard Dawkins talking about memes. He meant ideas… That’s not the same as internet memes… So what are the differences between Dawkins’ memes and internet memes? Well, honestly, they are totally different EXCEPT that they require attention and have to be reproducible…

Mikhail Bakhtin wrote about the Carnivalesque as something that subverts the dominant mode or perspective, it turns the world on its head… The king becomes the jester and the jester becomes the king. So the Trump tie memes… We need no text here, the absurd is made more absurd. It is very critical. It has that circus level laugh… He’s a clown or a buffoon… You know about it and how to reproduce this.

In terms of literature… There is work on memes, but I think when understanding memes with millennials, but also baby boomers, even people in their 70s and 80s… we have to go back to major theorists, concepts and perspectives – Henry Jenkins, Erving Goffman, etc. This is a new mode of communication I think; not a new language, but a new mode.

So, method wise… I wanted to do a rhetorical-critical analysis of selected internet memes from the Facebook page Bernie Sanders’ Dank Meme Stash, which had over 420k members when I wrote this slide – more now. It was founded by a college student in October 2015. And there are hundreds of thousands of memes there. People create and curate them.

Two months before and one month after the US election I took two sets of samples… memes that received 1000 or more likes/retweets, and memes that received at least 500 likes/reactions and at least 100 shares. As an unexpected side note, I found that I needed to define “media narrative”. There doesn’t seem to be a good definition. I spoke to Brooke Gladstone of WNYC, I spoke with colleagues in Vienna… We don’t usually take time to think about media narrative… For instance the shooting at Pulse nightclub has a narrative on the right around gun control; for others it’s around it being a sad and horrible event…

So, media narrative I am defining as:

  1. Malleable depending upon the ability to ask critical questions
  2. Able to shape opinion as well as perceptions of reality and a person’s decision-making process and…
  3. Linguistic and image-based simulations of real-world events which adhere and/or appeal to ontologically specific perspectives, which may include any intentional use of stereotyping, ideology, allegory, etc.

Some findings… The relational roles between image and text are interchangeable because of the relationship to popular culture. Barthes (1977) talks about the text loading the image, burdening it with culture, a moral, an imagination. And therefore the text in internet memes fluctuates depending on the intended message and the dependence on popular culture.

So, for instance, we have an image from Nightmare at 20,000 ft, a classic Twilight Zone image… You need to know nothing here, and if I replace the monster’s face with Donald Trump’s face… it’s instantly accessible and funny. But you can put any image there depending on the directionality of the intended meaning. So you have the idea that the mytheme, or function, of the monster/devil/etc. can be replaced by any other monster… It doesn’t matter; the reaction will depend on your audience.

Back to Barthes (1977) again – I find him incredibly salient to the work I’ve done here. One thing emerging from this, and from Russian memes work done before, is the idea of polysemic directionality: a meme has one direction and intentionality… no matter what version of the image you use…

So, here’s a quick clip from The Silence of the Lambs, with Buffalo Bill, who kills women and skins them… A very scary character… And we have him in memes as a disturbing advisor. If you get the reference it has more weight, but you don’t need to know it.

We have the image of Hillary as Two Face, we have Donald as The Joker… And a poster saying “Choose”. The vitriol directed at Clinton was far worse than that at Trump… Perhaps because Sanders supporters were disappointed at not getting the nomination.

We have intertextuality, and we also have inter-memetic references… For example the “Hillary deletes electoral colleges” meme, which plays on the “grandma on the internet” memes… You also have Superman vs Trump – particularly relevant to immigrant populations (Jenkins 2010).

So, conclusions… The construction of a meme is affected by and dependent on the media around it… That is crucial… We have heard about fake news, and we see memes in support of that fake news… And you may see that on all sides here. Intertextual references rely on popular culture, and inter-memetic references assume knowledge – a new form of communication. And I would argue that memes are a digital myth – I think Lévi-Strauss might agree with me on that…

And to close, for your viewing pleasure, the Trump Executive Order meme… The idea of a meme, an idea that can be infinitely replaced with anything really…

Q&A

Q1) This new sphere of memes… Do you think that Trump represents a new era of presidency… Do you think that this will pass? With Trump posting to his own Twitter account…

A1) I think that it will get more intense… And offline too… We see stickers in Austrian elections with meme-like images… These are tools for millennials. They are hugely popular in Turkey… Governments in Turkey, Iran and China are using memes as propaganda against other parties… I’m not sure it’s new, but we are certainly more aware of it… Trump is a reality TV star with the nuclear keys… That should scare us… But memes won’t go away…

Q2) In terms of memes in real life… What about bumper stickers… ? They were huge before… They are kind of IRL memes…

A2) I am working on a book at the moment… And one of the chapters is on pre-digital memes. In WWII soldiers used to write “Kilroy was here”. Is Magritte’s Ceci n’est pas une pipe a meme? There is definitely a legacy of that… So yes, but it depends on national and regional context…

Q3) So… In Egypt we saw memes about Trump… We were surprised at the election outcome… What happened?

A3) Firstly, there is that bias of the reinforcing narrative… If you looked at the Sanders meme page you might have had the idea that Clinton would not win because, for whatever reason, these people hated Hillary. Real rage and hatred towards her… And Trump as a clown Hitler… It won’t happen… Then it did… Then the rage against him went up… after the Muslim ban, the women’s march, etc…

Q4) There are some memes that seem to be everywhere – Charlie and the Chocolate Factory, Sean Bean, etc… Why are we picking those specific particular memes of all things?

A4) Like the Picard WTF meme… KnowYourMeme.com is a great resource… In the scene that Picard image is from, he’s reciting Shakespeare to get Lwaxana Troi away from the aliens… It doesn’t matter… But it just fits, it has a meaning.

Q5) Gender and memes: I wondered about the aspect of gender in memes, particularly thinking about Clinton – many of those reminded me of the Mary Beard memes and trolling… There are trolling memes – the frog for Trump… the semi-pornographic memes against women… Is there more to that than just (with all her baggage) Clinton herself?

A5) Lisa Silvestri from Gonzaga, Washington State, and Limor Shifman do work in that area. Shifman looks at online jokes of all types and has done some work on gender.

Q6) Who makes memes? Why?

A6) I taught a course on internet memes and cultures. It was one of the best attended courses ever. My students concluded that the author didn’t matter… But look at 4chan and Reddit or Know Your Meme… and you can sometimes tell who created it… But does that matter? It’s almost a public good. Who cares who created the Trump tie meme? With United Airlines you can see that video, it turned into a meme… and the company lost millions in stock value.

Stream B: Mini track on Enterprise Social Media – Chair: Paul Alpar

The Role of Social Media in Crowdfunding – Makina Daniel, University of South Africa, Pretoria, South Africa

My work seeks to find the connection between social media and finance, specifically crowdfunding. The paper introduces the phenomenon of crowdfunding, and how the theory of social networking underpins social media. The theory around social media is still developing… It is underpinned by theories of information systems and technology adoption, though what happens in social media has different characteristics.

So, a definition of crowdfunding. Crowdfunding is essentially an aspect of crowdsourcing, spurred by ubiquitous web 2.0 technologies. And “Crowdfunding refers to the efforts of entrepreneurial individuals and groups – cultural, social and for-profit – to fund their ventures by drawing on relatively small contributions from a relatively large number of individuals using the internet, without standard financial intermediaries” (Mollick 2014).

Since 2010 there have been growing amounts of money raised globally through crowdfunding. Forbes estimates $34 billion in 2015 (compared to $16 billion in 2014, and $880 million in 2010). The World Bank estimates that crowdfunding will raise $93 billion annually by 2025. This growth could not have been achieved in the absence of internet technology, and social media are critical in promoting this form of alternative finance.

Cheung and Lee (2010) examined social influence processes in determining collective social action in the context of online social networks. Their model shows intentional social action, with users considering themselves part of the social fabric. And they explain three processes of social influence: subjective norm – the self outside of any group; group norm – self awareness as a member of a group; and social identity – the self in context. Other authors explain social media popularity through a lack of trust in traditional media, with people wary of information that emanates from people they do not know personally. Kaplan and Haenlein (2010) define social media as “a group of internet-based applications that build on the ideological and technological foundations of web 2.0 applications that allow the creation and exchange of user generated content”. So it is a form of online interaction that enables people to create, comment, share and exchange content with other people.

So, how does social media facilitate finance, or crowdfunding? Since social media assists in maintaining social ties, this should in turn aid the facilitation of crowdfunding campaigns. Drawing on Linus’s Law – “given enough eyeballs, all bugs are shallow” – large groups are more adept at detecting potential flaws in a campaign than individuals (alone), thus preventing fraudulent campaigns from raising money for crowdfunding projects. Facebook, Twitter, etc. provide spaces for sharing and connection and are therefore suitable for crowdfunding campaigns. Studies have shown that 51% of Facebook users are more likely to buy a product after becoming a fan of the product’s Facebook page (Knudsen 2015).

Brossman (2015) views crowdfunding as existing in two phases: (i) brand awareness and (ii) targeting people to support/back one’s campaign. Crowdfunding sites such as Kickstarter and IndieGoGo allow project creators to publish pertinent information and updates, as well as to link to social media. Those connections help compensate for the relative lack of social networking functionality within the platforms themselves, where creators can publish project descriptions, draw on a community of users, and utilise web 2.0 features that allow users to comment on projects and attract money.

A study by Moisseyez (2013) of 100 Kickstarter projects found a connection between social media approval and success in funding. Mollick (2014) observed that crowdfunding success is associated with having a large number of friends in online social networks: a founder with ten Facebook friends would have a 9% chance of succeeding; one with 100 friends would have a 20% chance of success; one with 1000 friends would have a 40% chance of success. He cited a film industry example where more friends mapped to much higher potential success rates.
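As an aside from me: those three data points suggest a roughly log-linear relationship between friend count and success chance. Purely as an illustration (my own sketch, not from Mollick’s paper), you could interpolate between the reported figures like this:

```python
import math

# Mollick (2014), as reported above: Facebook friend count vs. chance of
# crowdfunding success. The interpolation is illustrative only - my own
# assumption of a piecewise log-linear relationship, not part of the paper.
points = [(10, 0.09), (100, 0.20), (1000, 0.40)]

def success_chance(friends):
    x = math.log10(friends)
    for (f0, p0), (f1, p1) in zip(points, points[1:]):
        x0, x1 = math.log10(f0), math.log10(f1)
        if x0 <= x <= x1:
            # linear interpolation in log10(friends) space
            return p0 + (p1 - p0) * (x - x0) / (x1 - x0)
    raise ValueError("friend count outside the reported 10-1000 range")

print(f"{success_chance(300):.0%}")  # ~30% for a founder with ~300 friends
```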

So, in conclusion, we don’t have many studies in this area yet. But social media is observed to aid crowdfunding campaigns through its ability to network disparate people through the internet. One notable feature is that although there are many forms of social media, crowdfunding utilises a limited number of spaces, primarily Facebook and Twitter. Furthermore, future research should examine how the expertise of the creator (the requestor of funds), project type, social network, and online presence influence motivations.

Q&A

Q1) I was wondering if you see any connection between the types of people who back crowdfunding campaigns, and why particular patterns of social media use, or popularity, are being found. For instance, anecdotally, the people who back lots of crowdfunding campaigns – not just one-off – tend to be young men in their 30s and 40s. So I was wondering about that profile of backers and what that looks like… And if that profile of backer is part of what makes those social media approaches work.

A1) The majority of people using social media are young people… But young people as sources of finance for, say, small businesses… They are mainly likely to be either studying or starting a professional career… Not accumulating money to give out… So we see a disconnect between who is on social media – on Twitter, Facebook, etc. – and raising finance… You succeed in raising funding from people who cannot give much… If people in mid career were using the most social media, one would expect more money coming from crowdfunding… But one aspect of crowdfunding is that you are asking for small amounts… And young people are able to spare that much…

Q2) So most people giving funding on crowdfunding sites are young people, and they give small amounts…

A2) Yes… And that data from Mollick… combined with evidence of people who are using Facebook…

Q2) What about other specialised crowdfunding networks… ?

A2) There is more work to be done. But even small crowdfunding networks will connect to supporters through social media…

Q3) Have you looked at the relative offerings of the crowdfunding campaigns?

A3) Yes, technology products are more successful on these platforms than other projects…

Using Enterprise Social Networks to Support Staff Knowledge Sharing in Higher Education – Corcoran Niall, Limerick Institute of Technology, Ireland and Aidan Duane, Waterford Institute of Technology, Ireland

This work is rooted in knowledge management; this is the basis for the whole study. So I wanted to start with Ikujiro Nonaka – “in an economy where the only certainty is uncertainty…” – and Lew Platt, former CEO of Hewlett-Packard, who said “If HP knew what HP knows it would be three times more productive”, highlighting the crucial role of knowledge sharing.

Organisations can gain competitive advantage through encouraging and promoting knowledge sharing – that’s the theory at least. It’s very important in knowledge-intensive organisations, such as public HEIs. HEIs need to compete in a global marketplace… We need to share knowledge… Do we do this?

And I want to think about this in the context of social media. We know that social media enable the creation, sharing or exchange of information, ideas and media in virtual communities and networks. And organisational applications are close to some of the ideals of knowledge management: supporting group interaction towards establishing communities; enabling the creation and sharing of content; helping improve collaboration and communication within organisations; offering distinct technological features ideally suited for knowledge sharing; representing a fundamental disruption in knowledge management – indeed social media is reinvigorating knowledge management as a field.

We do see Enterprise Social Networks (ESN). But if you just bring one into an organisation, people don’t necessarily just go and use it. People need a reason to share. So another aspect is communities of practice (Lave and Wenger 1991), an important knowledge management strategy, increasingly used. This is about groups of people who share a passion for something – loose and informal social structures, largely voluntary, and about sharing tacit knowledge. So Communities of Practice (CoP) tend to meet from time to time – in person or virtually.

ESN can be used to create virtual communities. This is particularly suitable for distributed communities – our university has multiple campuses for instance.

So, knowledge sharing in HEIs… Well, many don’t do it. A number of studies have shown that KM implementation and knowledge sharing in HEIs are at a low level. Why? Organisational culture, organisational structures, bureaucratic characteristics. And there is a well-documented divide/mistrust between faculty and support staff (silos) – particularly in work from Australia, the US and the UK. So, can CoP and ESN help? Well, in theory they can bypass structures that reinforce silos. That’s an ideal picture; whether we get there is a different thing.

So our research looked at what the antecedents for staff knowledge sharing are, and what the dominant problems in the implementation of ESN and CoP are. The contextual setting here is Limerick Institute of Technology. I used to work in IT services and this work came significantly from that interest. There is a significant practical aspect to the research, so action research seemed like the most appropriate approach.

So we had a three-cycle action research project. We looked at Yammer. It has all the features of social networking you’d expect – you can engage in conversations, content can be tagged and shared, you can upload content. It lends itself well to setting up virtual communities, with very flexible and powerful tools. We started from scratch and grew to 209 users.

Some key findings… We found culture and structure are major barriers to staff knowledge sharing. We theorised that and found it to be absolutely the case. The faculty-staff divide in HEIs exacerbates the problem. Management have an important role to play in shaping a knowledge sharing environment. The existence of CoP is essential to build a successful knowledge sharing environment, and community leaders and champions are required for the ESN. Motivation to participate is also crucial. If people feel motivated, and they see benefit, that can be very helpful. And those benefits can potentially lead to culture change, which then affects motivation…

We found that our organisation has a strong hierarchical model. Very bureaucratic and rigid. Geographic dispersal doesn’t help. To fix this we need to move away from a transactional culture. The current organisational structure contributes to the faculty-staff divide, and limits opportunities and motivations for staff and faculty to work together. But we also found examples where they were working well together. And in terms of the role of management, they have significant importance, and have to be involved to make this work.

Virtual communities as a Knowledge Management strategy have the potential to improve collaboration and interaction between staff, but participation has to be seen as valued, relevant, a valid work activity. Staff motivation-wise there are some highly motivated people, but not all. Management have to understand that.

So management need to understand the organisational culture; recognise the existence of structural and cultural problems; etc. Some of the challenges here are the public sector hierarchical structures – public accountability, media scrutiny, transactional culture, etc.

Q&A

Q1) On a technical level, which tools are most promising for tacit knowledge sharing…

A1) The whole ability to have a conversation. Email doesn’t work for that, you can’t branch threads… That is a distinctive feature of Yammer groups, along with the ability to like/view/be onlookers in a conversation. We encourage people to like something if they read it, to show that it is useful. But also the ability to have a proper conversation, and to organise meetings and conversations in real time.

Q2) What kind of things are they sharing?

A2) We’ve seen some communities that are large, and they have a real sense of collaboration. We’ve had research coming out of that, some really positive outcomes.

Q3) Have you seen any evidence of use in different countries… What are barriers across different regions, if known?

A3) I think the barriers are similar to the conceptual model (in the proceedings) – both personal and organisational barriers… People are largely afraid to share stuff… They are nervous of being judged… Also that engagement on this platform might make managers think that they are not working. Age is a limiting factor – economic issues mean we haven’t recruited new staff for almost 10 years, so we are older as a staff group.

Q3) It might be interesting to compare different cultures, with Asian cultures more closed, I think…

A3) Yes, that would be really interesting to do…

Q4) I am trying to think how and what I might share with my colleagues in professional services, technical staff, etc.

A4) The way this is constructed is in communities… We have staff interested in using Office 365 and Classroom Notebook, and so we set up a group to discuss that. We have champions who lead that group and guide it. So what is posted there would be quite specific… But in Yammer you can also share to all… We monitor and also train our users in how and where to post… You can sign up for groups or create new groups… And it is moderated. But it’s not limited to specifically work-related groups – sports and social groups are there too. And that helps grow the user base and helps people see benefits.

Q5) Have you looked at Slack at all? Or done any comparison there?

A5) We chose Yammer because of price… We have it in O365, a very practical reason for that… We have looked at Slack but done no direct comparison.

Finalists in the Social Media in Practice Excellence Competition present their Case Histories

EDINA Digital Footprint Consulting and Training Service – Nicola Osborne

No notes for this one…

Developing Social Paleantology – Lisa Lundgren;

This is work with a software development company, funded by the National Science Foundation. And this was a project to develop a community of practice around paleontology… People often think “dinosaur” but actually it’s about a much wider set of research and studies of fossils. For our fossil project to meet its goal – to develop and support that community – we needed to use social media. So we have a My Fossil community, which is a closed community, but also a Facebook group and Twitter presence. We wanted to use social media in an educative way to engage the community with our work.

We began with design studies which looked at what basic elements contribute to engagement with social media, and how to engage. We were able to assess practical contributions and build an educative and evidence-based social media plan. So we wanted to create daily posts using social paleontology, e.g. #TrilobiteTuesday; to design branded image-focused posts that are practice-specific, meet design principles, and often hyperlink to vetted paleontological websites; and to respond to members in ways that encourage chains of communication. There is a theoretical contribution here as well. And we think there are further opportunities to engage more with social paleontology and we are keen for feedback and further discussion. So, I’m here to chat!

 

Traditions Mobile App – Adam Peruta.

When new university students come to campus they have lots of concerns, like: what is this place, where do I fit in, how can I make new friends? That is particularly the case at small universities, which want to ensure students feel part of the community and stay around. This is where the Traditions Challenge app comes in – it provides challenges and activities to engage new students in university traditions and features. This was trialled at Ithaca College. So, for instance, we encourage students to go along to events, meet other new students, etc. We encourage students to meet their academic advisors outside of the classroom. To explore notable campus features. And to explore the local community more – like the farmers market. So we have a social feed – you can like, comment, there is an event calendar, a history of the school, etc. And the whole process is gamified: you gain points through challenges, you can go on the leaderboard so there are incentives to gain status… And there are prizes too.

Looking at the results this year… We had about 200 students who collectively completed over 1400 challenges; the person who completed the most (and won a shirt) completed 53 challenges. There are about 100 challenges in the app so it’s good they weren’t all done in one year. And we see over 50k screen views, so we know that the app is getting attention whether or not people engage in the challenges. Student focus groups raised themes of the enjoyment of the challenge list, motivation for participation (which varied), app design and user experience – if there’s one key takeaway, it is that this demographic has really high expectations for user interface, design and tone – and contribution to identity… Lots of academic research shows that the more students are engaged on campus, the more likely they are to remain at that university and to remain engaged through their studies and as alumni. So there is loads of potential here, and opportunity to do more with the data.

So, the digital experience is preferred, mobile development is expensive and time consuming, good UI/UX is imperative to success, universities are good at protecting their brands, and we learned that students really want to augment their on-campus academic experiences.

Conference organiser: Those were the finalists from yesterday, so we will award the prizes for first, second and third… and the PhD prize…

Third place is Lisa; Second place is me (yay!); First place is Adam and the Traditions mobile app.

I’m going to rely on others to tweet the PhD winners…

The best poster award went to IT Alignment through Artificial Intelligence – Amir – and this was partly based on Amir’s performance: his poster went missing, so he had to present from an A4 version of the poster, and he did a great job of presenting.

Thank you to our hosts here… And we hope you can join us in Limerick next year!

Thanks to all at ECSM 2017.

Jul 032017
 

Today I am at the Mykolo Romerio Universitetas in Vilnius, Lithuania, for the European Conference on Social Media 2017. As usual this is a liveblog so additions, corrections etc. all welcome… 

Welcome and Opening by the Conference and Programme Chairs: Aelita Skaržauskienė and Nomeda Gudelienė

Nomeda Gudelienė: I am head of research here and I want to welcome you to Lithuania. We are very honoured to have you here. Social media is very important for building connections and networking, but conferences are also really important still. And we are delighted to have you here in our beautiful Vilnius – I hope you will have time to explore our lovely city.

We were founded 25 years ago when our country gained independence from the Soviet Union. We focus on social studies – there was a gap for new public officials, for lawyers, etc. and so our university was founded.

Keynote presentation: Dr. Edgaras Leichteris, Lithuanian Robotics Association – Society in the cloud – what is the future of digitalization?

I wanted to give something of an overview of how trends in ICT are moving – I’m sure you’ve all heard that none of us will have jobs in 20 years because robots will have them all (cue laughter).

I wanted to start with this complex timeline of emerging science and technology that gives an overview of Digital, Green, Bio, Nano, Neuro. Digitalisation is the most important of these trends; it underpins this all. How many of us think digitalisation will save paper? Maybe not for universities or government, but young people are shifting to digital. But there are major energy implications of that – we are using a lot of power and heat to digitise our society. This takes us through some of those other areas… Can you imagine social networking when we have direct neural interfaces?

This brings me to the hype curve – where we see a great deal of excitement, the trough of disillusionment, and through to where the real work is. Gartner creates a hype cycle graph every year to illustrate technological trends. At the moment we can pick out areas like augmented reality, virtual reality, digital currency. When you look at business impact… Well, I thought that the areas that seem to be showing real change include the Internet of Things – in modern factories you see very few people now, they are just there for packaging, as we have sensors and devices everywhere. We have privacy-enhancing technologies, blockchain, brain computer interfaces, and virtual assistants. So we have technologies which are being genuinely disruptive.

Trends-wise we also see political focus here. Why is digital a key focus in the European Union? Well, we have captured only a small percentage of the potential. And when we look across the Digital Economy and Society Index we see this is about skills, about high quality public services – a real priority in Lithuania at the moment – not just about digitalisation for its own sake. Now, a few days ago the US press laughed at Jean-Claude Juncker admitting he still doesn’t have a smartphone, but at the same time, he and others leading the EU see that the future is digital.

Some months back I was asked at a training session “Close your eyes. You are now in 2050. What do you see?”. When I thought about that my view was rather dystopic, rather “Big Brother is watching you”, rather hierarchical. And then we were asked to throw out those ideas and focus instead on what can be done. In the Cimulact EU project we have been looking at citizens’ visions to inform a future EU research and innovation agenda. In general I note that among people from older European countries there was more optimism about green technologies, technology enabling societies… Whilst people from Eastern European countries tended to be more concerned with the technologies themselves, and with issues of safety and privacy. And we’ve been bringing these ideas together. For me the vision is technology in the service of people, enabling citizens, and creating systems for green and smart city development, and about personal freedom and responsibility. What unites all of these scenarios? The information was gathered offline. People wanted security, privacy, communication… They didn’t want the technologies per se.

Challenges here? I think that privacy and security are key for social media, along with a focus on the right tool, for the right audience, at the right time. If we listen to Tim Berners-Lee we note that the web is developing in a way divergent from the original vision. Lorrie Faith Cranor, Carnegie Mellon University, notes that privacy is possible in laboratory conditions, but in the reality of the real world it is hard to actually achieve. That’s why people such as Aral Balkan, self-styled Cyborg Rights Activist, are prominent here – he has founded a cross-Europe party just focusing on privacy issues. He says that the business model of mainstream technology under “surveillance capitalism” is “people farming” and that it is toxic to human rights and democracy. And he is trying to bring those issues into more prominence.

Another challenge is engagement. The use of, and time spent on, social media is increasing every year. But what does that mean? Mark Schaefer, Director of Schaefer Marketing Solutions, describes this as “content shock” – we don’t have the capacity to deal with and consume the amount of content we are now encountering. Jay Baer just wrote the book “Hug Your Haters”, making the differentiation between “offstage haters” vs. “onstage haters”. Offstage haters tend to be older, offline, and only go public if you do not respond. Onstage haters post to every social media network without thinking about the consequences. So his book is about how to respond to, and deal with, many forms of hate on the internet. And one company he recently consulted for has 150 people working to respond to that sort of “onstage” hate.

And then we have the issue of trolling. In Lithuania we have a government trying to limit alcohol consumption – you can just imagine how many people were being supported by alcohol companies to comment and post and respond to that.

We also need to think about engagement in something valuable. Here I wanted to highlight three initiatives; two are quite mature, the third is quite new. The first is “My Government”, or E citizens. This is about engaging citizens and asking them what they think – they post a question, and it provides a (simple) space for discussion. The one that I engaged with only had four respondents but it was really well done. Lithuania 2.0 was looking at ways to generate creative solutions at government level. That project ended up with a lot of nice features… Every time we took it out, they wanted new features… People engaged but then dropped off… What was being contributed didn’t seem to feed directly enough into government, and there was a need to feed back to commentators what had happened as a result of their posts. So, we have reviewed this work and are designing a new way to do this, which will be more focused around single topics or questions over a contained period of time, with direct routes to feed that into government.

And I wanted to talk about the right tools for the right audiences. I have a personal story here to do with the idea of whether you really need to be in every network. Colleagues asked why I was not on Twitter… There was lots of discussion, but only 2 people were using Twitter in the audience… So these people were trying to use a tool they didn’t understand to reach people who were not using those tools.

Thinking about different types of tools… You might know that last week in Vilnius we had huge rainfall and a flood… Here we have people sharing open data that allows us to track and understand that sort of local emergency.

And there is the issue of how to give users personalised tools, give opportunity for different opinions – going beyond your filter bubble – and earn profit. My favourite tool was called Personal Journal – it had just the right combination – until it was bought by Flipboard. Algorithmic tailoring can do this well, but there is that need to make it work, to expose people to wider views. There is a social responsibility aspect here.

So, the future seems to look like decentralisation – including safe silos that can connect to each other – and the right tools for the right audience. On decentralisation, blockchain, or technologies like it, are looking important. And we are starting to see possible uses of that in universities for credentialing. We can also talk about other uses for decentralisation like this.

We will also see new forms of engagement going mass market. Observation of “digital natives” who really don’t want to work in a factory… See those people going to get a coffee, needing money… So putting on their visor/glasses and managing a team in a factory somewhere – maybe Australia – only until that money is earned. We also see better artificial intelligence working on the side of the end users.

The future is ours – we define now, what will happen!

Q&A

Q1) I was wondering what you mean by blockchain, I haven’t heard of it before.

A1) It’s quite complicated to explain… I suggest you Google it – some lovely explanations out there. We have a distributed ledger…
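My notes cut off there, so a quick aside from me rather than the speaker: the usual short answer is that a blockchain is a distributed ledger in which each block carries the hash of the previous one, so past records can’t quietly be changed. A toy sketch of that chaining idea (mine, simplified to the point of caricature – no distribution or consensus here):

```python
import hashlib, json, time

def make_block(data, prev_hash):
    # Each block embeds the previous block's hash; altering an old block
    # would change its hash and break every later link in the chain.
    block = {"time": time.time(), "data": data, "prev_hash": prev_hash}
    serialised = json.dumps(block, sort_keys=True).encode()
    block["hash"] = hashlib.sha256(serialised).hexdigest()
    return block

chain = [make_block("genesis", "0" * 64)]
chain.append(make_block("credential: module X completed by student 42", chain[-1]["hash"]))

print(chain[1]["prev_hash"] == chain[0]["hash"])  # True - the chain is intact
```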

Q2) You spoke about the green issues around digitalisation, and I know blockchain comes with serious environmental challenges – how do we manage that tension between environmental impact and technological convenience?

A2) My wife and I have really different views of green… She thinks we go back to the yurt and the plants. I think differently… I think yes, we consume more… But we have to find the spots where we consume lots of energy and use technology to make them more sustainable. Last week I was at the LEGO factory in Denmark and they are working on how to make that sustainable… But that is challenging as their clients want trusted, robust, long-lasting materials. There are already some technologies, but we have to see how that will happen.

Q3) How do you see the role of artificial intelligence in privacy? Do you see it as a smart agent and intermediary between consumers and marketers?

A3) I am afraid of a future like the one Elon Musk describes, where artificial intelligence takes over. But what AI can do is help us interpret data for our decisions. And it can interpret patterns, filter information, help us make the best use of information. At the same time there is always a tension between advertisers and those who want to block advertisers. In Lithuanian media we see pop-ups requesting that we switch off ad blocking tools… At the same time we will see more ad blockers… So Google, Amazon, Facebook… They will use AI to target us better in different ways. I remember hearing from someone that you will always have advertising – but you’ll like it, as it will be tailored to your preferences.

Q4) Coming from a background of political sciences and public administration… You were talking about decentralisation… Wouldn’t it be useful to differentiate between developed and developing world, or countries in transition… In some of those contexts decentralisation can mean a lack of responsibility and accountability…

A4) We see real gaps already between cities and rural communities – increasingly cities are their own power and culture, with a lot of decisions taken like mini states. You talked about a possible scenario that is quite 1984-like, of centralisation for order. But personally I still believe in decentralisation. There is a need for responsibility and accountability, but you have more potential for human rights and…

Aelita Skaržauskienė: Thank you to Edgaras! I actually just spent a whole weekend reading about blockchain, as here in Lithuania we are becoming a hub for FinTech – financial innovation start-ups.

So, I just wanted to introduce today here. Social media is very important for my department. More than 33 researchers here look at social technologies. Social media is rising in popularity, but more growth lies ahead. More than 85% of internet users are engaging with social media BUT over 5 billion people in the world still lack regular access to the internet, so that number will increase. There have already been so many new collaborations made possible for and by social media.

Thank you so much for your attention in this exciting and challenging research topic!

Stream B: Mini track on Social Media in Education (Chair: Nicola Osborne and Stefania Manca)

As I’m chairing this session (as Stefania is presenting), my notes do not include Q&A I’m afraid. But you can be confident that interesting questions were asked and answered!

The use of on-line media at a Distance Education University – Martins Nico, University of South Africa, Pretoria, South Africa

The University of South Africa is an online-only university, so I will be talking about research we have been doing on the use of Twitter, WhatsApp, Messenger, Skype and Facebook by students. A number of researchers have also explored obstacles experienced in social media. Some identified obstacles will be discussed.

In terms of professional teaching dispositions, these are principles, commitments, values and professional ethics that influence the attitudes and behaviours of educators. I called on my background in organisational psychology and measuring instruments to explore different ideas of presence: virtual/technological; pedagogical; expert/cognitive; social. And these sit on a scale from behaviours that are easily changed to those that are difficult to change. I want to focus on the difficult-to-change area of incorporating technologies significantly into practice – the virtual/technological presence area.

Now, about our university… We have 350k students and +/- 100k non-formal students. African and international students from 130 countries. We are a distance education university. 60% are between 25 and 39 and 63.9% are female. At Unisa we think about “blended” learning, from posting materials (snail mail) through to online presence. In our open online distance learning context we are using tools including WhatsApp, BBM, Mxit, WeChat, Research Gate, Facebook, LinkedIn, intranet, Google drive and wiki spaces, multimedia etc. We use a huge range, but it is up to the lecturer exactly which of these they use. For all the modules online you can view course materials, video clips, articles, etc. For this module that I’m showing here, you have to work online, you can’t work offline, it’s a digital course.

So, the aim of our research was to understand how effectively the various teaching dispositions are using the available online media, and to what extent there is a relationship between disposition and technology used. Most respondents we had (40.5%) had 1 to 3 years of service. Most respondents (45.1%) were Baby Boomers. Most were female (61%), most respondents were lecturers and senior lecturers.

Looking at the results, the most used was WhatsApp, with instant messaging and social networking high. Microblogging and digital curation were amongst the least used.

Now, when we compare that to the dispositions, we see an interesting correlation between social presence dispositions and instant messaging, and between virtual presence dispositions and the use of research networking, cloud computing… The most significant relationships were between virtual presence and online tools. There was no significant correlation between pedagogical presence and any particular tools.

I just wanted to talk about the generations at play here: Baby Boomers, Gen X-ers, and Millennials. Looking at the ANOVA analysis for generations and gender, only for instant messaging and social networking was there any significant result. In both cases millennials use these most. In terms of gender, we see females using social networking and instant messaging more than males. The results show the younger generation, or millennials, and females use these two online media significantly more than other groups – for our university that has an implication: we need to ensure our staff understand the spaces our students use.
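As an aside for anyone unfamiliar with the method: a one-way ANOVA of this kind takes only a few lines. The scores below are invented for illustration – they are not the Unisa data:

```python
from scipy import stats

# Hypothetical social networking usage scores (e.g. on a 1-5 scale) per
# generation - invented numbers, purely to show the mechanics of the test.
baby_boomers = [2.1, 2.4, 1.9, 2.6, 2.2]
gen_x        = [3.0, 2.8, 3.3, 2.9, 3.1]
millennials  = [4.1, 3.8, 4.3, 4.0, 4.4]

f_stat, p_value = stats.f_oneway(baby_boomers, gen_x, millennials)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # a small p suggests real group differences
```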

The results confirmed that millennials are most inclined to use instant messaging and social networking. Females were using these the most.

So, my recommendation? To increase usage of online tools, the university will need to train academics in the usage of the various online tools. To arrange workshops on new technology, social media and mobile learning. We need to advise and guide academics to increase web self-efficacy, and compensate accordingly. And to determine the needs and preferences of students pertaining to the use of social media in an ODL environment, and focus…

Towards a Multilevel Framework for Analysing Academic Social Network Sites: A Network Socio-Technical Perspective – Manca Stefania, National Research Council of Italy and Juliana Elisa Raffaghelli, University of Florence, Italy

I work in the field of learning, distance education, distance learning, social media and social networking. I’m going to share with you some work I am doing with Juliana Elisa Raffaghelli on the use of social networking sites for academic purposes. I know there are lots of different definitions here; here I’m talking about the use of social media sites for scholarly communication. As we all know there are many different ways to communicate our work and what we do, including academic publications and conferences like this, but we have also seen a real increase in the use of social media for scholarly communication. And we have seen Academia.edu and ResearchGate in widest use of these, but others are out there.

The aim of my study was to investigate these kinds of sites, not only in terms of adoption and uptake, but also what kinds of actions people take in these sites. The study is a theoretical piece of work taking a socio-technical perspective. But before I talk more about this I wanted to define some of the terms and context here.

Digital Scholarship is the use of digital evidence, methods of inquiry, research, publication and preservation to achieve scholarly and research goals. It can encompass both scholarly communication using digital media and research on digital media. Martin Weller, one of the first to explore this area, describes digital scholarship as shorthand for an intersection of technology-related developments, namely: digital content; networked distribution; open practices. And the potential transformational quality of that intersection.

A recent update to this, by Greenhow and Gleason (2014), defines Social Scholarship as the means by which social media affordances and values evolve the ways scholarship is done in academia. And Veletsianos and Kimmons (2012) have talked about Networked Participatory Scholarship as a new form of scholarship arising from these new uses of technology and new types of practice.

There are lots of concerns and tensions that have been raised here… The blurring boundaries of personal and professional identities. The challenge of unreliable information online. Many say that ResearchGate and Academia.edu have a huge number of fake profiles, and that not all of what is there can be considered reliable. There is also a perception that these sites may not be useful – a social factor. There is the challenge of the time needed to curate different sites. And given the traditional idea of “publish or perish” there has been some concern over these sites.

The premise of this study is to look at popular academic sites like ResearchGate, like Academia.edu. Although these sites are increasingly transforming scholarly communication and academic identity, there is a need to understand these at a socio technical level, which is where this study comes in. Academic social network sites are networked socio-technical systems. These systems are determined by social forces and technological features. Design, implementation and use of such technologies sit in a wider cultural and social context (Hudson and Wolf 2003?).

I wanted to define these sites through a multilevel framework, with a socio-economic layer (ownership, governance, business model); techno-cultural layer (technology, user/usage, content); networked-scholar layer (networking, knowledge sharing, identity). Those first two layers come from a popular study of social networking usage, but we added that third level to capture those scholarly qualities. The first two levels refer to the structure and wider context.

We also wanted to bring in social capital theory/ies, encompassing the capacity of social networks to produce goods for mutual benefit (Bourdieu, 1986). This can take the form of useful information, personal relationships or group networks (Putnam 2000). We took this approach because the scholarly community can be viewed as knowledge sharing entities formed by trust, recognition, etc. I will move past an overview of social capital types here, and move to my conclusion…

This positions academic social network sites as networked socio-technical systems that afford social capital among scholars… And here we see structural and distributed scholarly capital.

So to finish a specific example: ResearchGate. The site was founded in 2008 by two physicists and a computer scientist. More than 12 million members distributed worldwide in 193 countries. The majority of members (60%) belong to scientific subject areas, and it is intended to open up science and enable new work and collaboration.

When we look at ResearchGate from the perspective of the socio-economic layer… Ownership is for-profit. Governance is largely through terms and conditions. The business model is largely based on a wide range of free-of-charge services, with some subscription aspects.

From the techno-cultural layer… The technology automatically signals who one may be interested in connecting with, provides news feeds, prompts endorsements and suggests new researchers to follow. Usage can be passive, or users can become active participants after making new connections. And content – it affords publication of diverse types of science outputs.

From the networked scholar layer: Networking – follow and recommend; Knowledge sharing – commenting, a questions feature, search, existing Q&As, expertise and skills; and Identity – through profile, score, reach and h-index.

On Linking Social Media, Learning Styles, and Augmented Reality in Education – Kurilovas Eugenijus, Julija Kurilova and Viktorija Dvareckiene, Vilnius University Institute of Mathematics and Informatics, Lithuania

Eugenijus: So, why augmented reality? Well, according to predictions it will be the main environment for education by 2020, and we need to think about linking it to students on the one hand, and to academia as well. So, the aim of this work is to present an original method to identify students who prefer to actively engage in social media and want to use augmented reality, and to relate this to learning styles.

Looking over the literature, we have seen tremendous development of social media, powered by innovative web technologies, web 2.0 and social networks. But there are so many different approaches here, and every student is different. The possibilities of AR seem almost endless. And the literature suggests AR may be more effective than traditional methods. Only one meta-analysis work directly addresses personalisation of AR-based systems/environments in education. The learning styles element of this work is about differences in student needs – existing work is not specifically focused on this.

Another aspect of AR can be cognitive overload – from the information, the technological devices, and the tasks students need to undertake. Few studies seem to look at the pedagogy of AR, rather than tests of AR.

So, our method… All learning processes, activities and scenarios should be personalised to student learning styles. We undertook a simple and convenient expert evaluation method based on the application of trapezoid fuzzy numbers, looking at suitability of use in e-learning. The questions given to the experts focus on the suitability of the learning activities of social media and AR in learning. After that, details explaining the Felder-Silverman learning styles model (4 different styles included) were provided to the experts.

After the experts completed the questionnaire it is easy to calculate the average values of suitability of the learning styles and learning activities for AR and social media. So we can now easily compute the average for learning styles… Every student could come in and answer a learning styles questionnaire, and get their own table of personal, individual learning styles. Then, combining that score with the expert ratings of AR and social media, we can calculate suitability indexes for all learning styles of particular students. The programme does this in, say, 20 seconds…

So, we asked 9 experts to share their opinions on particular learning styles… Here the experts see social media and AR as particularly suitable for visual and activist learning styles. We think that suitability indexes should be included in recommender systems – the main thing in a personalised learning system – and should be linked to particular students according to those suitability indexes. The higher the suitability index, the better the learning components fit particular students’ needs (a small sketch of the idea follows).
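As an aside, my own minimal sketch of the suitability-index idea as I understood it – the real method uses trapezoid fuzzy expert evaluation, and all the numbers below are invented for illustration:

```python
# Average expert ratings of AR + social media per Felder-Silverman style
# (hypothetical values on a 0-1 scale).
expert_suitability = {"visual": 0.90, "verbal": 0.30, "active": 0.85, "reflective": 0.35}

# One student's learning-style profile from the questionnaire
# (also hypothetical; the weights sum to 1).
student_profile = {"visual": 0.5, "verbal": 0.1, "active": 0.3, "reflective": 0.1}

# Suitability index: weight each style's expert rating by how strongly the
# student exhibits that style - higher means a better fit for AR/social media.
index = sum(expert_suitability[s] * w for s, w in student_profile.items())
print(f"Suitability index: {index:.2f}")  # 0.77 for this (visual, active) student
```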

So, expert evaluation, linking learning activities and students by suitability index, and recommender systems are the main intelligent technologies applied to personalise learning. An optimal learning scenario would make use of this to personalise learning. And as already noted, augmented reality and social media are most suitable for visual and activist learners, and most unsuitable for verbal and reflective learners… And that will be reflected in student happiness and outcomes. Visual and activist learners prefer to actively use learning scenarios based on the application of AR and social media.

According to Felder and Silverman most people of college age and older are visual. Visual learners remember best what they see rather than what they hear. Visual learners are better able to remember images rather than verbal or text information. For visual learners the optimal learning scenario should include a range of visual materials.

Active learners do not learn much in situations that require them to be passive. They feel more comfortable with or better at active experimentation than reflective observation. For active learners the optimal scenario should include doing something that relates to the wider outside world.

And some conclusions… Learning styles show how this can best be used/tweaked for learners. The influence of visual and social media has shifted student expectations, but many teaching organisations are still quite traditional…

We now have a short break for lunch. This afternoon my notes will be sparse – I’ll be presenting in the Education Mini Track and then, shortly after, in the Social Media Excellence Awards strand. Normal service will be resumed after this afternoon’s coffee break. 

Stream B: Mini track on Social Media in Education (Chair: Nicola Osborne and Stefania Manca)

Digital Badges on Education: Past, Present and Future – Araujo Inês, Carlos Santos, Luís Pedro, and João Batista, Aveiro University, Portugal

I’ve come a little late into Inês’ talk, but she has taken us through the history of badges as a certification, including from Roman times.

This was used like an honour, but also as a punishment, with badges and tattoos used to classify that experience. A pilgrim going to Compostello de Compagnario(?) had a badge, but there was a huge range of fake badges out there – the Pope eventually required you to come to Rome to get your badges. We also have badges like martial arts belts, for scouts… So… Badges have baggage.

With the beginning of the internet we started the beginnings of digital badges, as a way to recognise achievements and to recognise professional achievements. So, we have the person who receives the badge, the person/organisation who issues the badge, and the place where the badge can be displayed. And we have incentives to collect and share badges associated with various cities across the world.

Many platforms have badges. We have Open Badges infrastructures (Credly, BadgeOS, etc.) and we have places to display and share badges. In educational platforms we also have support for badges, including Moodle, Edmodo, Makewaves.es, SAPO Campus (at our speaker’s home institution), etc. But in our VLE we didn’t see badges being used as we expected, so we tried to look at how badges are being used worldwide (see badgetheworld.org)…

How are badges being used? Authority; award and motivation; sequential orientation – gain one, then the other…; research; recognition; identity; evidence of achievement; credentialing. The biggest use was around two major areas: motivation (for students, but also teachers and others), and credentialing. In fact some 10% of digital badges are used to motivate, reward, and recognise the skills of teachers. However the major use is with students, and that is split across award, credentialing, and evidence of achievement.

So, our final recommendation was for the integration of badges in education: choose a platform; show the advantage of using a repository (e.g. a backpack for digital badges); choose the type of badge – mission type and/or award type; and enjoy it.

Based on this information we began a MOOC: Badges: how to use it. And you can see a poster on the MOOC. And this was based on the investigation we did for this work.

Q&A

Q1) Have you had some feedback, or collected some information on students’ interest on badges… How do they react or care about getting those badges?

A1) Open Badges are not really known to everyone in Portugal. The first task I had was to explain them, and what the advantages were. Teachers like the idea… They feel that it is very important for their students and have tried it with their students. Most of the experiments show students enjoying the badges… But I’m not sure that they understand that they can use them again, showing them on social media, in the community… That is a task still to do. The first experiences I know about come from the teachers who were in the MOOC: they enjoyed it, they liked it, they asked for more badges.

Q2) I know about the concept here… Are there any issues with dual ways to assess students – grades and badges?

A2) Teachers can use them with grading, in parallel. Or, if they use them in sequence, students understand how to achieve that grade. The teacher has to decide how best to use them… whether to use them to credential or to motivate towards a better grade.

Q3) Thank you! I’m co-ordinating an EU open badge project so I’d like to invite you to publish. Is the MOOC only in Portuguese? My students are designing interactive modules – CC licensed – with best practice guidance. Maybe we can translate and reuse?

A3) It’s only in Portuguese at the moment. We have about 120 people engaged in the MOOC and it runs on SAPO Campus. They are working on a system of badges that can be used across all institutions so that teachers can share badges, a repository to choose from and use in their own teaching.

Comment) Some of that unification is really useful for having a shared understanding of the meaning and usage of badges.

Yes, but from what I could see teachers were not using badges because they hadn’t really seen examples of how to use them. And they get a badge at the end of the course!

Q4) What is the difference between digital badges and open badges?

A4) Open Badges is a specific standard designed by Mozilla. Digital badges can be created by everyone.

Comment) At my institution the badges are about transferable skills… They have to meet unit learning outcomes, graduate learning outcomes. Students can get prior learning certified through them as well, to reduce taught classes for masters students. But that requires that solid infrastructure.

We have infrastructure to issue badges: someone can make and create a badge, and issue it to a person. The badge has metadata – where it was issued, why, by whom… And it is then made available in a repository, e.g. the Mozilla Backpack.
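As an aside, the metadata being described here follows the Open Badges specification: an issued badge is an “assertion” pointing to a badge class and its issuer. Roughly (my sketch – the field names follow the v1-style spec, but all values are invented examples):

```python
# A rough sketch of an Open Badges assertion - field names follow the
# Mozilla Open Badges specification; URLs and values are invented.
assertion = {
    "uid": "abc123",
    "recipient": {"type": "email", "identity": "student@example.org", "hashed": False},
    "badge": "https://example.org/badges/digital-literacy.json",  # the BadgeClass: name, criteria, issuer
    "issuedOn": "2017-07-03",
    "verify": {"type": "hosted", "url": "https://example.org/assertions/abc123.json"},
    "evidence": "https://example.org/portfolio/abc123",  # why it was earned
}
```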

Exploring Risk, Privacy and the Impact of Social Media Usage with Undergraduates – Connelly Louise and Nicola Osborne, University of Edinburgh, UK

Thanks to all who came along! Find our abstract and (shortly after today) our preprint here.

And I’ve now moved on to the Best Practice Awards strand where I’ll be presenting shortly… I’ve come in to the questions for Lisa Lundgren (and J. Crippen Kent)’s presentation on using social media to develop social paleontology. From the questions I think I missed hearing about a really interesting project. 

EDINA Digital Footprint Consultancy & Training Service – Osborne Nicola, University of Edinburgh, UK 

Well, that was me. No notes here, but case study will be available soon. 

D-Move – Petrovic Otto, University of Graz, Austria

This is a method and software environment to anticipate “digital natives’” acceptance of technology innovations, looking particularly at how the academic sector has a long term impact on the private sector. And our students are digital natives; that’s important. So, to introduce me: I’m professor of information systems at the University of Graz, Austria. I have had a number of international roles, have had a strong role in bridging the connection between academia and government, and am a member of the regulatory authority for telecommunications for Austria. And I have started three companies.

So, what is the challenge? In 2020 more than half of all the people living in our world will have been born and raised with digital media and the internet – they are digital natives. And they are quite different regarding their values and norms, behaviours and attitudes. Consider the big changes in industries like media, commerce, banking, transport or the travel industry. They have more and more aversion to traditional surveys based on “imagine a situation where you use a technology like…”. Meanwhile surveys designed, executed and interpreted by traditional “experts” will result in traditional views – the real experts are the digital natives. The results should be gained through digital natives’ lives…

So the solution? It is an implemented method, based on the Delphi approach. Digital Natives are used as experts in a multi-round, structured group communication process. In each round they collect their own impressions regarding the Delphi issue. So, for instance, we have digital natives engaging in self-monitoring of their activities.

So, we recruited 4 groups of 5 digital natives; round one discussion as well as interviews with 130 digital natives; field experience embedded in everyday life; discussion; and analysis. We want to be part of the daily life of the digital native, but a big monolithic space won’t work – things change, and different groups use different spaces. We need social media and we need other types of interfaces… We don’t know them today. We have a data capturing layer for pictures, video, annotations. We also need data storage, data presentation and sharing, data tagging and organisation, access control and privacy, private spaces and personalisation… And access control is crucial, as individuals want to keep their data private until they want to share it (if at all).
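As an aside, a speculative sketch of how the layering described – capture, storage, tagging, and participant-controlled sharing – might look as a data model. All names here are my assumptions for illustration, not the actual D-Move implementation:

```python
from dataclasses import dataclass, field

@dataclass
class CapturedItem:
    # One item from the data capturing layer: a picture, video or annotation.
    owner: str
    media_type: str                     # "picture", "video", "annotation"
    content_ref: str                    # pointer into the storage layer
    tags: list = field(default_factory=list)        # tagging and organisation
    shared_with: set = field(default_factory=set)   # private until shared

    def share(self, researcher: str):
        # Access control: the participant decides when, and whether,
        # to release an item to the research team.
        self.shared_with.add(researcher)

item = CapturedItem("participant-07", "picture", "store://2017/07/03/img-42.jpg")
item.tags.append("travel")
item.share("research-team")
```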

D-Move gives insights into changes in digital natives’ views, experiences, self-monitoring, etc., and into understanding “why” digital natives behave as they do. The participants show high satisfaction with D-Move as a space for learning. D-Move has been implemented and used in different industries for many years – used for media, transport and logistics, the travel industry, health and fitness. It started with messaging-based social media, moving to social media platforms, and finally implementing social internet of things technologies. And we are currently working with one of the most prestigious hotels – with a customer base typically in their seventies… So we are using D-Move to better understand the luxury sector and what parts of technology they need to engage with. D-Move is part of digital natives’ “natural” communication behaviour. And an on-going cycle of scientific evaluation and further technical development.

In terms of the next steps, firstly the conceptual models will be applied to the whole process to better understand digital natives thinking, feeling and behaviour. Using different front ends focused on the internet of things technologies. And offering D-Move to different industries to book certain issues like using an omnibus survey. And D-Move is both a research environment and a teaching environment. We have two streams going in the same direction, including as a teaching instrument.

Q&A

Q1) Your digital native participants, how do you recruit them?

A1) It depends on the age group. It ranges from age 10 to nearer age 30. For our university we can reach 20-25 year olds; for 10 to 20 year olds we work with schools. 25 to 30 year olds are harder to recruit.

Q2) What about ethical issues? How do you get informed consent from 10 to 18 year olds?

A2) These issues are usually based on real issues in life, and this is why security and privacy is very important. And we have sophisticated ways of indicating what is and is not OK to share. This is partly through storing data in our storage. It is not a public system, the data is not accessible to others.

Q3) We’ve seen a few presentations on using data from participants. According to the POPI Act (based on EU GDPR), you can’t use data without consent… How do you get around that?

A3) It’s easier because it is not a public system, and we do not relate information in publications, only at an aggregated level.

At this point I feel it is important to note my usual “digital native” caveat: I don’t agree with the speaker on this term (or the generalisations around it), which has been disputed widely in the literature, including by Marc Prensky, its originator.

The Traditions Challenge mobile App – Peruta Adam, Syracuse University, New York, USA

I’ve been looking at how colleges and universities have been using social media in student recruitment, alumni engagement etc. And it has been getting harder and harder to get access to social media data over the years, so I decided to design my own thing.

So, think back to your first days at university. You probably had a lot of concerns. For instance, Ithaca College is in a town less than 7 miles wide, there isn’t a big sports programme, and it is hard to build community. So… The Traditions Challenge is a mobile app to foster engagement and community building for incoming university students – it works as a sort of bucket list of things to do and engage with. This launched at Ithaca in August 2016 with over 100 challenges. For instance FYRE, which already encourages engagement, is a challenge here. Faculty office hours are their own challenge – a way to get students to find out about these. And the fountains – a notable feature on campus – you can have your image taken there. And we encourage them to explore the town, for instance engaging with the farmers market.

So there is a list of challenges, and there is also a feed to see what else is happening on campus. And there is information on the school. And this is all gamified. Challenges earn points, and there is a leaderboard which gives students status. And there are some actual real-world rewards – stickers, a nice sweatshirt, etc. And this is all designed to get students more engaged, and engaged earlier on at university. There is a lot of academic research showing that students who are more involved and engaged are more likely to stay at their university.
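As an aside, the points-and-leaderboard mechanic described is simple to sketch – the challenge names and point values below are invented for illustration, not taken from the app:

```python
from collections import defaultdict

# Hypothetical point values per challenge.
CHALLENGE_POINTS = {"farmers_market": 10, "office_hours": 15, "fountain_photo": 5}

scores = defaultdict(int)

def complete_challenge(student, challenge):
    # Completing a challenge earns its points towards the leaderboard.
    scores[student] += CHALLENGE_POINTS[challenge]

complete_challenge("alex", "farmers_market")
complete_challenge("alex", "office_hours")
complete_challenge("sam", "fountain_photo")

leaderboard = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(leaderboard)  # [('alex', 25), ('sam', 5)]
```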

Traditions in universities are very important – there are over 4000 institutions in the US. And those traditions translate into a real sense of identity for students. There are materials on traditions, keepsake books for ticket stubs, images, etc., but these are not digital. And those are nice, but there is no way to track what is going on (plus, who takes pictures?). In fact Ithaca tried that approach on campus – a pack, whiteboards, etc. But this year, with the app, there are many more data points that can be quantified. This year we had around 200 sign-ups (4% of on-campus students). We didn’t roll out to everyone, but picked influencers and told them to invite friends, then them to invite their friends, etc. And those 200 sign-ups did over 1400 challenges, and 44 checked in for prizes. Of the top ten challenges, 70% of the most popular were off-campus, and 100% of those were non-academic experiences. There is a sense of students being most successful when they are involved in a lot of things, and have more activities going on. It is hard to compare the analogue with the app, but we know that at least 44 students checked in for prizes with the app, versus 8 checking in when we ran the analogue challenges.

In terms of students responding to the challenges, they enjoyed the combination of academic and non-academic activities. One student, who’d been enrolled for 3 years, found out about events on campus through the app that he had never heard about before. Some really responded to the game, to the competition. Others just enjoyed the check list, and a way to gather memories. Some just really want the prize! (Others were a lot less excited). Maybe more prizes could also help – we are trying that.

In terms of app design and UX, this cohort care hugely about the wording of things, the look of things… Their expectations are really, really high.

In terms of identity students reported feeling a real sense of connection to Ithaca – but it’s early days, we need some longitudinal data here.

We found that the digital experience is preferred. Mobile development is expensive and time consuming – I had an idea, tried to build a prototype, and applied for a grant to hire a designer, but everyone going down this path has to understand that you need developers, designers, and marketing staff at the university to be involved. And like I said, the expectations were really high. We ran workshops before making anything to make sure we understood those expectations.

I would also note that universities in the US are getting really protective of their brand – the use of logos, fonts, etc. They really trusted me, but it took several goes to get a logo we were all happy with.

And finally, data from the app, and from follow-up work, show that students really want to augment their experience with on-campus and off-campus activities… And active and involved students seem to lead to active and involved alumni – that would be great data to track. And that old book approach was lovely, as tangible things are good – but it's easy to automate some printing from the app…

So, what's happening now? Students are starting soon; they will see posters and postcards, and they will see targeted Facebook ads.

I think that this is a good example of how a digital experience can connect with a really tangible experience.

And finally: I'm from Syracuse University, and I'd like to thank Ithaca College, and NEAT, for their support.

Q&A

Q1) What is the quality of contribution like here?

A1) It looks quite a lot like an Instagram update – a photo, text, tagging – and you can edit it later.

Q2) And can you share to other social media?

A2) Yes, they can share to Facebook and Twitter.

Q3) I wanted to ask about the ethics of what happens when students take images of each other?

A3) Like other types of social media, that’s a social issue. But there is a way to flag images and admins can remove content as required.

Q4) Most of your data is from female participants?

A4) Yes, about 70% of people who took part were female.

Q5) How did you recruit users for your focus groups?

A5) We recruited our heaviest app users… We emailed them to invite them along. The other thing I wanted to note is that it wasn't me or colleagues running the focus groups – it was student facilitators, to make this peer to peer.

Q6) How reliable is the feedback? Aren’t they going to be easy to please here?

A6) Sure, they will be eager to please, so there may be some bias. I will eventually be doing some more rigorous research on these data points.

Q7) Any plans to expand to other universities?

A7) Yes, would love to compare the three different types of US universities in particular.

Q8) Is the app free to students?

A8) Yes. I suspect if I were to monetize this it would be for the university – a license-type set-up.

Mini track on Social Media in Education – Chair: Nicola Osborne and Stefania Manca

Evaluation of e-learning via Social Networking Website by full-time Students in Russia – Pivovarov Ivan, RANEPA, Russia

Why did I look at this area? Well the Russian Government is presently struggling with poor education service delivery. There is great variety in the efficiency and quality of higher education. So, the Russian Government is looking for ways to make significant improvements. And, in my opinion, social media can be effective in full time teaching. And that’s what my research was looking at.

So, I wanted to determine the best techniques for delivery of e-learning via social networking websites. I was looking at vk.com rather than Facebook. VK is by far the biggest social network in Russia. The second biggest is Instagram. There is strong competition there.

So I was looking at the views of students about educational usage of VK, targeting bachelor students from the Russian Presidential Academy of National Economy and Public Administration – an atypical institution focused specifically on public administration. A special interest group was created on VK and educational content was regularly uploaded there. We had hundreds of people in this group – I'm hoping for 1,000 in future. Material included assignments, educational contests, etc. And finally, after six months of using this space, I decided to run a questionnaire asking my students what they liked, what they disliked, what they disliked most, etc., and we had 100 responses. Age-wise, 82% were between 18 and 21 years old; 12% were 21-24; 6% were older than 24. This shows that users of social media are typically young; when they move on in life, have families, etc., they tend not to use social media. We also asked about Facebook: 53% had a Facebook account, 47% did not.

We asked what the advantages of VK are over Facebook. 52% said most of their friends were on VK. 13% said that VK had a more user-friendly interface than Facebook. 29% said VK has more interesting content – sharing of music, films, etc. – than Facebook. Looking at usage of VK for educational purposes: 35% use it weekly; 31% very seldom; 14% two or three times a week; 10% daily. Usage is generally heavier on weekdays; at the weekend it drops.

So, what motivated people to be a member of the special interest group on a social media website? Most (53%) said ease of access to information; 31% the dissemination of information; 4% the chance of interaction. And when asked what they wanted to improve, most (53%) wanted to increase teacher-student interaction – more teachers joining them on social media.

Students mostly preferred posts from teachers that were about administration of the unit (28%) and content (28%). When asked if they wanted to watch video lectures, 85% said yes. One year after this work I started to record video lectures – short (5-10 mins) – which become available prior to a lecture. Students can then look up new definitions, new terms, etc. in advance. And in the lecture we follow up and go into details – we can go straight into discussion. So this response inspired me to create this video content.

I also asked if students had taken an online class before: 52% had, 48% hadn't. And I asked students how they liked interaction on social media – 86% of students found it positive (but I only asked them after they'd been assessed, to avoid too much bias in the results).

Conclusions here… Well, I wanted to compare the Russian context to others. Students in Russia wanted more teacher-student interaction. The finding that "comments must be encouraged" did not emerge in our experiment, but did appear in research from Turkey.

Q&A

Q1) Is there an equivalent to YouTube in Russia?

A1) Yes, YouTube is big. There is an alternative called RuTube – maybe more the Russian Vimeo. There is no Russian equivalent of Twitter – Telegram is the nearest. And there is no Russian analogue of SnapChat, but that space is being taken over by Instagram Stories now, I think. WhatsApp is very popular, but I don't see the educational potential there. This semester I had students make live broadcasts of my lecture… with Instagram Stories… VK does try to copy features from other worldwide spaces – they have stories. But Instagram is most popular.

Q2) Among the takeaways is the need for more intense interaction between students and teaching staff. Are your teaching staff motivated to do this? I do this as a "hobby" in my institution. Is it formalised in your school? And you also spoke about the strength of VK versus Facebook – you noted that people using VK drives traffic… So where do you see opportunities for new platforms in Russia?

A2) Your second question, that’s hard to predict. Two or three years ago it was hard to predict Instagram Stories or Snapchat. But I guess probably social media associated with sport…

Q2) Won't that potential be hampered by attitudes in the population, which steer toward what they already know?

A2) I don’t think so… On the time usage front I think my peers probably share your concerns about time and engagement.

Comment) It depends on how it develops… We have a minimum standard. In our LMS there is a widget, and staff have to make videos per semester for them – that’s now a minimum practice. Although in the long run teaching isn’t really rewarded – it’s research that is typically rewarded… Do you have to answer to a manager on this in terms of restrictions on trying things out?

A2) No, I am lucky, I am free to experiment. I have a lot of freedom, I think.

Q3) Do you feel uncomfortable being in a social space with your students… To be appropriate in your profile picture… What is your dynamic?

A3) All my photos are clean anyway! Sports, conferences… But yes, as a University teacher you have to be sensible. You have to be careful with images etc… But still…

Comment) But that’s something people struggle with – whether to have one account or several…

A3) I'm a very public person… Open to everyone… So no embarrassing photos! On LMSs: my university has announced that we will have a new learning management system. But there is a question of whether students will like it or engage with it. There is Clayton Christensen's concept of disruptive innovation. This tool wasn't designed for education, but it can be used for it… Will an LMS be comfortable for students to use, though?

Comment) Our university is almost post-LMS… So maybe if you don’t have one already, you could jump somewhere else, to a web 2.0 delivery system…

A3) The system will be run and tested in Moscow, and then rolled out to the regions…

Q4) You ran this course for your students at your institution, but was the group open to others? And how does that work in terms of payments if some are students, some are not?

A4) Everyone can join the group. And when they finish, they don’t escape from the group, they stay, they engage, they like etc. Not everyone, but some. Including graduates. So the group is open and everyone can join it.

Developing Social Media Skills for Professional Online Reputation of Migrant Job-Seekers – Buchem Ilona, Beuth University of Applied Sciences Berlin, Germany

We have 12,800 students, many of whom have a migrant background, although the work I will present isn't actually for our students – it's for migrants seeking work.

Cue a short video on what it means to be a migrant moving across the world in search of a brighter future and a safe place to call home, noting the significant rise in migration, often because of conflict and uncertainty.

That was a United Nations video about refugees. Germany has accepted a huge number of refugees – over 1.2 million in 2015 and 2016. And, because of that, we have and need quite a complex structure of programmes and support for migrants making their home here. At the same time, Germany has shortages of skilled workers, so there is a need to match up skills and training. There is particular need for doctors, engineers, and experts in technology and ICT, for instance.

But it's not all good news. Unemployment in Germany is twice as high among people who have a migration background compared to those who do not. At the same time we have migrants with high skills and social capital, but it is hard if not impossible to certify and check that. Migrant academics, including refugees, are often faced with unemployment, underemployment or challenging work patterns.

In that video we saw a certificate… Germany is a really organised country, but that means that without certificates and credentials available, it is very hard to have your skills recognised. But we also see the idea of the connected migrant, with social media enabling that – for social gain but also to help find jobs and training.

So the project here is "BeuthBonus", a follow-on project. It is targeted at skilled migrant workers – this partly fills a gap in delivery, as training programmes for unskilled workers are more common. It was developed to help migrant academics to find appropriate work at the appropriate level. The project is funded by the German Federal Ministry of Education and Research and the German Federal Ministry of Labour, and we are also an Open Badge Network pilot for recognition of skills.

Our participants for 2015-16 are 28 in total (12 female, 16 male), from 61 applications. They come from various backgrounds, with 20 different degrees among them: 28% BA, 18% MA, 7% PhD. They are mainly aged 30-39 or 40-49, and they are typically from Tunisia, Afghanistan, Syria, etc.

So, the way this works is that we cooperate with different programmes – e.g. an engineer might take an engineering refresher/top-up. We also have a module on social media – just one module – to help participants understand social media, develop their skills, and demonstrate their skills to employers. This is also a good fit as job applications are now overwhelmingly digital. And recruiters' attitudes to digital CVs have moved from reserved to positive.

So, in terms of how companies in Germany are using social media in recruitment: Xing, a German-language-only tool like LinkedIn, is the biggest for recruitment advertising. In terms of active sourcing in social media, 45% of job seekers prefer to be approached. In fact 21% of job seekers would pay to be more visible in these spaces. And 40% of job openings are actively sourced – the figure is higher in the IT sector.

So we know that building an online professional reputation is important, and more highly skilled job hunters will particularly benefit from this. So, we have a particular way that we do this – a process for migrants to develop their professional online presence. They start by searching for themselves, then others comment on what was found. They are asked to reflect and think about their own strengths and the requirements of the labour market. Then they go in and look at how the spaces are used, how people brand themselves in these spaces. Then there is some framing around a theme; they plan what they will do, and then they set up a schedule for the next weeks and months… So they put it into action.

We then have instrumental ways to assess this – do they use social media, how do they use it, how often, how they connect with others, and how they express themselves online. We also take some culture specific and gender specific considerations into account in doing this.

And, to enhance online presence, we look at Open Badges: participants set goals and work towards them. I will not introduce Open Badges here, but I will talk about how we understand competencies. We have a tool called ProfilPASS – a way to capture experience as transferable skills that can be presented to the world. We designed badges accordingly. And we have BeuthBonus badges in the Open Badge Network; these are on Moodle and available in German and in English, to enable flexibility in applying for jobs. The badges span different levels; participants are issued badges at the appropriate levels and can share them on Xing or LinkedIn as appropriate. And we also encourage them to look at other sources of digital badges – from IBM developerWorks or the Women's Business Club, etc.

So, the results have been really good. Before the programme 7% were employed; after, 75% were employed (though this is a short-term measure). Before the programme 0% had a digital CV; after, 72% did. We see that 8% had an online profile before, but 86% now do. And that networking means they have contacts, and a better understanding of the labour market in Germany.

In our survey 83% felt Open Badges are useful for enhancing online reputation.

The Open Badge Network has initiatives across the world. We work on Output 4: Open Badges in Territories. And we work with employers on how best to articulate the names…

Q&A

Q1) In your refugee and migration terminology, do you have subcategories?

A1) We do have sub categories around e.g. language level, so can refer them to language programmes before they are coming to us. And there had been a change – it used to be that economic migrants were not entitled to education, but that has changed now. Migrants and refugees are the target group. It depends on the target group…

Q2) In terms of the employer, do you create a contact point?

A2) We have an advisory board drawn from industry, also our trainers are drawn from industry.

Q3) I was wondering about the cultural differences about online branding?

A3) I have observations only, as we have only small samples and from many countries. One difference is that some people are more reserved, and would not approach someone in a direct way… They would wave (only)… In Germany hierarchy is not important in terms of having conversations and making approaches, but that isn't the case in some other places. And sharing an image, a persona… that can be challenging. That personal/professional mix can be even trickier.

Q4) How are they able to manage those presences online?

A4) They do that searching in a group… And with coaches they have direct support, a space to discuss what is needed, etc.

Q5) Lets say you take a refugee from country x, what is needed?

A5) They have to have a degree, they have to have good German – a requirement of our funder – and they have to be located in Germany.

Comment) This seems like it is building so much capacity… I think what you are doing over there is fantastic and opening doors to lots of people.

Q6) In Germany, all natives have these skills already? Or do you do this for German people too? Maybe they should?

A6) For our students I tend to just provide guidance for this. But yes, maybe we need this for all our students too.

Jun 302017
 

Today I'm at ReCon 2017, giving a presentation later (flying the flag for the unconference sessions!) but also looking forward to a day full of interesting presentations on publishing for early career researchers.

I’ll be liveblogging (except for my session) and, as usual, comments, additions, corrections, etc. are welcomed. 

Jo Young, Director of the Scientific Editing Company, is introducing the day and thanking the various ReCon sponsors. She notes: ReCon started about five years ago (with a slightly different name). We’ve had really successful events – and you can explore them all online. We have had a really stellar list of speakers over the years! And on that note…

Graham Steel: We wanted to cover publishing at all stages: preparing for publication, submission, journals, open journals, metrics, altmetrics, etc. So our first speakers are really from the midpoint in that process.

SESSION ONE: Publishing’s future: Disruption and Evolution within the Industry

100% Open Access by 2020 or disrupting the present scholarly comms landscape: you can’t have both? A mid-way update – Pablo De Castro, Open Access Advocacy Librarian, University of Strathclyde

It is an honour to be at this well-attended event today. Thank you for the invitation. It's a long title, but I will be talking about how things are progressing towards this goal of full open access by 2020, and to what extent institutions, funders, etc. are able to introduce disruption into the industry…

So, a quick introduction to me. I am currently at the University of Strathclyde library, having joined in January. It's quite an old university (founded 1796) and a medium-sized one. Prior to that I was working in The Hague on the EC FP7 Post-Grant Open Access Pilot (OpenAIRE), providing funding to cover OA publishing fees for publications arising from completed FP7 projects. Maybe not the most popular topic in the UK right now, but… The main point of explaining my context is that the EU work gave me more of a funder's perspective, and now I'm able to compare that with more of an institutional perspective. As a result of this pilot there was a report commissioned by a British consultant: "Towards a competitive and sustainable open access publishing market in Europe".

One key element in this open access EU pilot was the OA policy guidelines, which acted as key drivers and made eligibility criteria very clear. Notable here: publications in hybrid journals would not be funded, only fully open access journals; and there was a cap of no more than €2,000 for research articles and €6,000 for monographs. That was an attempt to shape costs and ensure accessibility of research publications.

So, now I'm back at the institutional open access coalface. Lots has changed in two years. And it's great to be back in this space. It is allowing me to explore ways to better align institutional and funder positions on open access.

So, why open access? Well, in part this is about more exposure for your work, higher citation rates, and compliance with grant rules. But it's also about use and reuse: by researchers in developing countries, practitioners who can apply your work, policy makers, and the public and taxpayers who can access your work. In terms of the wider open access picture in Europe, there was a meeting in Brussels last May where European leaders called for immediate open access to all scientific papers by 2020. It's not easy to achieve that, but it does provide a major driver… However, EU member states have different levels of open access. The UK, Netherlands, Sweden and others prefer "gold" access, whilst Belgium, Cyprus, Denmark, Greece, etc. prefer "green" access, partly because the cost of gold open access is prohibitive.

Funders' policies are a really significant driver towards open access. Funders include Arthritis Research UK, Bloodwise, Cancer Research UK, Breast Cancer Now, British Heart Foundation, Parkinson's UK, Wellcome Trust, Research Councils UK, HEFCE, the European Commission, etc. Most support green and gold, and will pay APCs (Article Processing Charges), but it's fair to say that early career researchers are not always at the front of the queue for getting those paid. HEFCE in particular have a green open access policy: research outputs from any part of the university must be made open access or they will not be eligible for the REF (Research Excellence Framework). As a result, compliance levels are high – probably top of Europe at the moment. The European Commission supports green and gold open access, but typically green as this is more affordable.

So, there is a need for quick progress at the same time as ongoing pressure on library budgets – we pay both for subscriptions and for APCs. Offsetting agreements, which discount an institution's subscription bill by the APC charges its authors have paid to that publisher, could be a good solution. There are pros and cons here. In principle it allows quicker progress towards OA goals, but it will disproportionately benefit legacy publishers. It brings publishers into APC reporting – right now APCs are sometimes invisible to the library as they are paid by researchers, so this is a shift and a challenge. It's supposed to be a temporary stage towards full open access. And it's a very expensive intermediate stage: not every country can or will afford it.
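As a toy illustration of the offsetting arithmetic just described (all numbers invented; real agreements are more complicated and vary by publisher):

    # Toy illustration of an offsetting agreement; all numbers invented.
    subscription_fee = 100_000   # annual subscription with one publisher (EUR)
    apcs_paid = [2_000] * 15     # APCs paid by the institution's authors

    offset = sum(apcs_paid)      # 30,000 credited against the subscription
    net_bill = subscription_fee - offset
    print(net_bill)              # 70,000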

So how can disruption happen? Well, one way is through policies – for example, declining to fund hybrid journals (as done in OpenAIRE). And disruption is happening (legal or otherwise), as we can see in Sci-Hub usage, which comes from all around the world, not just developing countries. Legal routes are possible in licensing negotiations. In Germany there is Projekt DEAL being negotiated, and this follows similar negotiations by openaccess.nl. At the moment Elsevier is the only publisher not willing to include open access journals.

In terms of tools… The EU has just announced plans to launch its own platform for funded research to be published. And the Wellcome Trust already has a space like this.

So, some conclusions… Open access is unstoppable now, but it still needs sustainable and competitive implementation mechanisms. It is getting more complex and difficult to explain to researchers – that's a serious risk. Open access will happen via a combination of strategies and routes – internal fights (e.g. green vs gold) just aren't useful. The temporary stage towards full open access needs to benefit library budgets sooner rather than later. And the power here really lies with researchers, whom OA advocates aren't always able to keep informed. It is important that you know which journals are open and which are hybrid, and why that matters. And we need to think about whether informing authors on where it would make economic sense to publish is beyond the remit of institutional libraries.

To finish, some recommended reading:

  • “Early Career Researchers: the Harbingers of Change” – Final report from Ciber, August 2016
  • “My Top 9 Reasons to Publish Open Access” – a great set of slides.

Q&A

Q1) It was interesting to hear about offsetting. Are those agreements one-off? continuous? renewed?

A1) At the moment they are one-off and intended to be a temporary measure. But they will probably mostly get renewed… National governments and consortia want to understand how useful they are, how they work.

Q2) Can you explain green open access and gold open access and the difference?

A2) In gold open access, the author pays to make the paper open on the journal website. If that's a hybrid – i.e. subscription – journal, you essentially pay twice: once to subscribe, once to make the article open. Green open access means that your article goes into a repository (after any embargo), into the worldwide repository landscape (see: https://www.jisc.ac.uk/guides/an-introduction-to-open-access).

Q3) As much as I agree that choices of where to publish are for researchers, there are other factors – the REF pressures you to publish in particular ways. Where can you find more on the relationships between different types of open access and impact? I think that can help.

A3) There are quite a number of studies – for instance on whether APC level is related to impact factor. In terms of the REF, funders like Wellcome are desperate to move away from the impact factor. It is hard, but evolving.

Inputs, Outputs and emergent properties: The new Scientometrics – Phill Jones, Director of Publishing Innovation, Digital Science

Scientometrics is essentially the study of science metrics and evaluation of these. As Graham mentioned in his introduction, there is a whole complicated lifecycle and process of publishing. And what I will talk about spans that whole process.

But, to start, a bit about me and Digital Science. We were founded in 2011 and we are wholly owned by the Holtzbrinck Publishing Group, who also own the Nature group. Being privately funded we are able to invest in innovation by researchers, for researchers, trying to create change from the ground up. Things like Labguru – a lab notebook (like RSpace); Altmetric; Figshare; ReadCube; Peerwith; Transcriptic – an IoT company; etc.

So, I'm going to introduce a concept: the Evaluation Gap. This is the difference between the metrics and indicators currently or traditionally available, and the information that those evaluating your research might actually want to know. Who evaluates? Funders; tenure, hiring and promotion panels; universities – your institution, your office of research management; governments and policy organisations. All want to achieve something with your research…

So, how do we close the evaluation gap? Enter altmetrics. These add to academic impact other types of societal impact – policy documents, grey literature, mentions in blogs, peer review mentions, social media, etc. What else can you look at? Well, you can look at grants being awarded… When you see a grant awarded for a new idea, publications follow… then someone else picks up the idea and publishes… That can take a long time, so grants can tell us about emerging ideas before publications do. You can also look at patents – a measure of commercialisation and potential economic impact further down the line.

So you see an idea germinate in one place, work with collaborators at the institution, spreading out to researchers at other institutions, and gradually out into the big wide world… As that idea travels outward it gathers more metadata, more impact, more associated materials, ideas, etc.

And at Digital Science we have innovators working across that landscape, along that scholarly lifecycle… But there is no point having that much data if you can't understand and analyse it. You have to classify the data first. Historically that was done by subject area, but increasingly research is interdisciplinary; it crosses different fields. So single tags/subjects are not useful – you need a proper taxonomy. And there are various ways to do that. You need keywords and semantic modelling, and you can choose to:

  1. Use an existing taxonomy if available, e.g. MeSH (Medical Subject Headings).
  2. Consult with subject matter experts (the traditional way to do this; could be editors, researchers, faculty, librarians who you'd just ask "what are the keywords that describe computational social science?").
  3. Text mine abstracts or full-text articles (using the content to create a term list from your corpus with bag-of-words/word-frequency approaches, for instance, to help you cluster documents and find the ideas, with a taxonomy emerging) – see the sketch after this list.
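To make option 3 concrete, here is a minimal sketch of that bag-of-words clustering idea in Python. This is illustrative only – the sample abstracts and cluster count are invented, and this is not Digital Science's actual pipeline.

    # Minimal sketch: cluster abstracts with a bag-of-words model so that
    # candidate subject terms emerge from the corpus itself.
    # Requires scikit-learn; abstracts and cluster count are invented.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    abstracts = [
        "Gravitational wave detection with laser interferometry.",
        "Social media use in university student recruitment.",
        "River management and flood risk modelling.",
        # ...in practice, thousands of abstracts from your corpus
    ]

    # Bag-of-words with TF-IDF weighting; drop very common English words.
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    X = vectorizer.fit_transform(abstracts)

    # Each cluster approximates an emergent subject area.
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

    # The highest-weighted terms per cluster centroid are candidate
    # keywords for that area - raw material for an emergent taxonomy.
    terms = vectorizer.get_feature_names_out()
    for i, centroid in enumerate(km.cluster_centers_):
        top = [terms[j] for j in centroid.argsort()[-5:][::-1]]
        print(f"Cluster {i}: {top}")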

Now, we are taking that text mining approach. But to be of use the data needs to be cleaned and curated. So we hand-curated a list of institutions to go into GRID: the Global Research Identifier Database, to understand organisations and their relationships. Once you have that all mapped you can look at ISNI, CrossRef databases, etc. And when you have that organisational information you can include georeferences to visualise where organisations are…

An example that we built for HEFCE was the Digital Science BrainScan. The UK has a dual funding model: there is both direct funding and block funding, with the latter awarded by HEFCE and distributed according to the most impactful research as understood via the REF. So, for our BrainScan, we mapped research areas, connections, etc. to visualise subject areas, their impact, and clusters of strong collaboration, to see where there are good opportunities for funding…

Similarly, we visualised text-mined impact statements across the whole corpus. Each impact is captured as a coloured dot; clusters show similarity, and where things are far apart there is less similarity. That can highlight where there is a lot of work on, for instance, management of rivers and waterways – connections that weren't obvious before because they cut across disciplines…

Q&A

Q1) Who do you think benefits the most from this kind of information?

A1) In the consultancy we have clients across the spectrum. In the past we have mainly worked for funders and policy makers to track effectiveness. Increasingly we are talking to institutions wanting to understand strengths, to predict trends… And by publishers wanting to understand if journals should be split, consolidated, are there opportunities we are missing… Each can benefit enormously. And it makes the whole system more efficient.

Against capital – Stuart Lawson, Birkbeck University of London

So, my talk will be a bit different. The arguments I will be making are not in opposition to any of the other speakers here, but are about critically addressing the ways we currently work, and how publishing works. I have chosen to speak on this topic today as I think it is important to make visible the political positions that underlie our assumptions and the systems we have in place today. There are calls to become more efficient, but I disagree… Ownership and governance matter at least as much as the outcome.

I am an advocate for open access and I am currently undertaking a PhD looking at open access and how our discourse around it has been co-opted by neoliberal capitalism. And I believe these issues aren't technical but social; they reflect inequalities in our society, and any company claiming to benefit society while operating as a commercial enterprise should raise questions for us.

Neoliberalism is a political project to reshape all social relations to conform to the logic of capital (this is the only slide; apparently a written and referenced copy will be posted on Stuart's blog). This system turns us all into capital, entrepreneurs of ourselves – quantification and metricification, whether through tuition fees that put a price on education and turn students into consumers selecting on rational indicators of future income, or through pitting universities against each other rather than working collaboratively. It isn't just overtly commercial; it is about applying ideas of the market to all elements of our work – high impact factor journals, metrics, etc., in the service of proving our worth. If we do need metrics, they should be open and nuanced, but if we only do metrics for people's own careers, and perform for careers and promotion, then these play into neoliberal ideas of control. I fully understand the pressure to engage and play the game: it is easier to choose not to do this if you are in a position of privilege, and that reflects and maintains inequalities in our organisations.

Since power relations are often about labour and worth, this is inevitably part of work, and of the value of labour. When we hear about disruption in the context of Uber, it is about disrupting the rights of workers and labour unions; it ignores the needs of the people who do the work; it is a neoliberal idea. I would recommend seeing Audrey Watters' recent presentation for the University of Edinburgh on the "Uberisation of Education".

The power of capital in scholarly publishing, and neoliberal values in our scholarly processes… When disruptors align with the political forces that need to be dismantled, I don't see that as useful or properly disruptive. Open access is a good thing in terms of accessibility. But there are two main strands of policy… Research Councils have given over £80m to researchers to pay APCs. Publishing open access does not have to mean paying fees – there are OA journals funded in other ways. But if you want the high-end, visible journals, they are often hybrid journals, and 80% of that RCUK money has gone on hybrid journals. So work is being made open access, but right now this money flows from public funds to a small group of publishers – who take a 30-40% profit – and the system was set up to continue benefitting publishers. You can also share or publish to repositories… Those are free to deposit and use. The concern with OA policy is the connection to the REF: it constrains where you can publish and what that means, and everything must always be measured within this restricted structure. It can be seen as compliance rather than a progressive movement toward social justice. But open access is having a really positive impact on the accessibility of research.

If you are angry at Elsevier, then you should also be angry at Oxford University and Cambridge University, and others, for their relationships to the power elite. Harvard made a loud statement about journal pricing… It sounded good, and they have a progressive open access policy… But it is also bullshit – they have huge amounts of money… There are huge inequalities here, in academia and in its relationship to publishing.

And I would strongly recommend reading some history on the inequalities, and the racism and capitalism, that were inherent to the founding of higher education, so that we can critically reflect on what type of system we really want for discovering and sharing scholarly work. Things have evolved over time – somewhat inevitably – but we need to be more deliberative, so that universities are more accountable in their work.

To end on a more positive note, technology is enabling all sorts of new and inexpensive ways to publish and share, and we don't need to depend on venture capital. Collective and cooperative running of organisations in these spaces – such as cooperative centres for research… There are small-scale examples that show the principles, and that this can work. Writing, reviewing and editing is already being done by the academic community; let's build governance and process models to continue that, to make it work, to ensure work is rewarded but that the driver isn't commercial.

Q&A

Comment) That was awesome. A lot of us here will be here to learn how to play the game. But the game sucks. I am a professor; I get to do a lot of fun things now because I played the game… We need a way for people to do their work without that game. But we need something more specific than socialism… Libraries used to publish academic data… Lots of these metrics are there and useful… And I work with them… But I am conscious that we will be fucked by them. We need a way to react to that.

Redesigning Science for the Internet Generation – Gemma Milne, Co-Founder, Science Disrupt

Science Disrupt run regular podcasts, events, and a Slack channel for scientists, start-ups, VCs, etc. Check out our website. We talk about five focus areas of science. Today I wanted to talk about redesigning science for the internet age. My day job is in journalism, and I think a lot about start-ups, about how we can influence academia, and about how success manifests itself in the internet age.

So, what am I talking about? Things like Pavegen – power generating paving stones. They are all over the news! The press love them! BUT the science does not work, the physics does not work…

I don't know if you heard about Theranos, which promised all sorts of medical testing from one drop of blood, took millions in investment, and then it all fell apart. But its founder too had tons of coverage…

I really like science start-ups, and I like talking about science in a different way… But how can I convince the press, the wider audience, what is good stuff and what is just hype, not real? One of the problems we face is that people not engaged in research either can't access the science, or can't read it even if they can access it… This problem is really big, and it influences where money goes and what sort of stuff gets done!

So, how can we change this? There are amazing tools to help (Authorea, Overleaf, protocols.io, Figshare, Publons, LabWorm) and this is great and exciting. But I feel it is very short-term… Trying to change something that doesn't work anyway… Doing collaborative lab notes a bit better, publishing a bit faster… OK… But is it good for sharing science? Thinking about journalists and corporates – they don't care about academic publishing; it's not where they go for scientific information. How do we rethink that? What if we were to rethink how we share science?

AirBnB and Amazon are on my slide here to make the point of the difference between incremental change and real change. AirBnB addressed issues with hotels – hotels being samey… They didn't build a hotel; instead they thought about what people want when they travel, what matters to them… Similarly, Amazon didn't try to incrementally improve supermarkets – they did something different. They dug to the bottom of why something exists and rethought it…

Imagine science was “invented” today (ignore all the realities of why that’s impossible). But imagine we think of this thing, we have to design it… How do we start? How will I ask questions, find others who ask questions…

So, a bit of a thought experiment here… Maybe I'd post a question on Reddit, set up my own subreddit. I'd ask questions, ask why people are interested… Create a big thread. And if I have a lot of people, maybe I'll set up a Slack with various channels about all the facets around the question, invite people in… Use the group to project manage the work… OK, I have a team… Maybe I create a Meetup group for that same question… Get people to join… Maybe 200 people are now gathered and interested… You gather all these folk into one place. Now we want to analyse ideas. Maybe I share my question and initial code on GitHub, find collaborators… And share the code, make it open… Maybe it can be reused… It has been collaborative at every stage of the journey… Then maybe I want to build a microscope or something… I'd find the right people, and ask them to join my Autodesk 360 to collaboratively build engineering drawings for fabrication… So maybe we've answered our initial question… So maybe I blog that, and then I tweet it…

The point I’m trying to make is, there are so many tools out there for collaboration, for sharing… Why aren’t more researchers using these tools that are already there? Rather than designing new tools… These are all ways to engage and share what you do, rather than just publishing those articles in those journals…

So, maybe publishing isn't the way at all? I get the "game", but I am frustrated about how we properly engage, and really get your work out there – getting industry to understand what is going on. There are lots of people inventing in new ways… You can use stuff in papers that isn't being picked up… But see what else you can do!

So, what now? I know people are starved for time… But if you want to really make the impact you think your work merits… I understand there is a concern around scooping… But there are ways to deal with that… And if you want to know about all these tools, do come talk to me!

Q&A

Q1) I think you are spot on with the vision. We want faster, more collaborative production. But what is missing from those tools is that they are not designed for researchers, they are not designed for publishing. Those systems are ephemeral… They don't have DOIs and they aren't persistent. For me what's missing is a bench-to-web pipeline…

A1) Then why not create a persistent, archived URI – a webpage where all of a project's content is shared? 50% of all academic papers are only read by the person who published them… These stumbling blocks in the way of sharing… It is crazy… We shouldn't just stop and not share.

Q2) Thank you, that has given me a lot of food for thought. The issue of work not being read – I've been told about that by funders, so it is very relevant to me. So, how do we influence the professors… As a PhD student I haven't heard about many of those online things…

A2) My co-founder of Science Disrupt is a computational biologist and PhD student… My response would be about not asking, just doing… Find networks, find people doing what you want. Benefit from collaboration. Sign an NDA if needed. Find the opportunity, then come back…

Q3) I had a comment and a question. Code repositories like GitHub are persistent and you can find a great list of code repositories and meta-articles around those on the Journal of Open Research Software. My question was about AirBnB and Amazon… Those have made huge changes but I think the narrative they use now is different from where they started – and they started more as incremental change… And they stumbled on bigger things, which looks a lot like research… So… How do you make that case for the potential long term impact of your work in a really engaging way?

A3) It is the golden question. Need to find case studies, to find interesting examples… a way to showcase similar examples… and how that led to things… Forget big pictures, jump the hurdles… Show that bigger picture that’s there but reduce the friction of those hurdles. Sure those companies were somewhat incremental but I think there is genuinely a really different mindset there that matters.

And we now move to lunch. Coming up…

UNCONFERENCE SESSION 1: Best Footprint Forward – Nicola Osborne, EDINA

This will be me – talking about managing a digital footprint and how robust web links are part of that lasting digital legacy – so no post from me, but you can view my slides on Managing Your Digital Footprint and our Reference Rot in Theses: A HiberActive Pilot here.

SESSION TWO: The Early Career Researcher Perspective: Publishing & Research Communication

Getting recognition for all your research outputs – Michael Markie, F1000

I'm going to talk about things you do as researchers that you should get credit for, beyond traditional publications. This week, in fact, there was a very interesting article on the history of science publishing: "Is the staggeringly profitable business of scientific publishing bad for science?". Publishers came out of that poorly… And I think others are at fault here too, including the research community… We do have to take some blame.

There's no getting away from the fact that the journal is the coin of the realm – for career progression, institutional reporting, grant applications. For the REF, will there be impact factors? The REF says maybe not, but institutions will be tempted to use them to prioritise. Publishing is being judged by impact factor…

And it's not just where you publish. There are other things that you do in your work for which you should get more credit. Data; software/code – in bioinformatics there are new software tools that are part of the research: are they getting the recognition they should?; all results – not just the successes but also the negative results… Publishers want cool and sexy stuff, but realistically we are funded for all of it, and we should be able to publish and be recognised for it; peer review – there is no credit for it, yet peer reviews often improve articles and warrant credit; expertise – all the contributors who added expertise, including non-research staff: everyone should know who contributed what…

So I see research as being more than a journal article. Right now we just package it all up into one tidy thing, but we should be fitting into that bigger picture. So, I'm suggesting that we need to disrupt it a bit more and publish in a different way… Publishing introduces delays of up to a year. Journals don't really care about data… That's a real issue for reproducibility. And there is bias involved in publishing; there is a real lack of transparency in publishing decisions. All of the above means there is real research waste. At the same time there is demand for results, for quicker action, for wider access to work.

So, at F1000 we have been working on ways to address these issues. We launched Wellcome Open Research, and after launching that the Bill & Melinda Gates Foundation contacted us to build a similar platform. And we have also built an open research model for UCL Child Health (at Great Ormond Street).

The process involves sending a paper in, checking there is no plagiarism and that ethics are appropriate – but no other filtering. That can take up to 7 days. Then we ask for your data – no data, no publication. Once the publication and data deposition are made, the work is published and an open peer review and user commenting process begins; reviewers are named and credited, and they contribute to improving the article and to the article revision. Reviewers have three options: approved, approved with reservations, or not approved as it stands. To get into PMC and indexed in PubMed you need either two "approved" statuses, or two "approved with reservations" plus one "approved".
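As a toy sketch of that indexing rule as just described (an illustration of the logic only, not F1000's actual system):

    # Toy sketch of the indexing rule described above (not F1000's code):
    # indexed with two "approved" reviews, or two "approved with
    # reservations" plus one "approved".
    from collections import Counter

    def eligible_for_indexing(statuses):
        counts = Counter(statuses)
        return (counts["approved"] >= 2 or
                (counts["approved"] >= 1 and
                 counts["approved with reservations"] >= 2))

    print(eligible_for_indexing(["approved", "approved"]))          # True
    print(eligible_for_indexing(["approved",
                                 "approved with reservations",
                                 "approved with reservations"]))    # True
    print(eligible_for_indexing(["approved"]))                      # False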

So this connects to lots of stuff… For data, that's with DataCite, Figshare, Plotly, and the Resource Identification Initiative. For software/code we work with Code Ocean, Zenodo, GitHub. For all results we work with PubMed; you can publish other formats… etc.

Why are funders doing this? The Wellcome Trust spent £7m on APCs last year… So this platform is partly a service to stakeholders, with a complementary capacity for all research findings. We are testing a new approach to improve science and its impact – accelerating access to and sharing of findings and data; improving efficiency to reduce waste and support reproducibility; offering an alternative OA model; etc.

Make an impact, know your impact, show your impact – Anna Ritchie, Mendeley, Elsevier

A theme across the day is that there are increasing pressures and challenges for researchers. It has never been easier to get your work out – new technology, media, platforms. And yet it has never been harder to get your work seen: more researchers producing more outputs, dealing with competition. So how do you ensure you and your work make an impact? Options mean opportunities, but also choices. Traditional publishing is still important – but not enough. And there are both older and newer ways to help make your research stand out.

Publishing Campus is a big thing here. These are free resources to support you in publishing: online lectures, interactive training courses, and expert advice. And things happen live – webinars, online lectures (e.g. Top 10 Tips for Writing a Really Terrible Journal Article!), interactive courses. There are suites of materials around publishing and around developing your profile.

At some point you will want to look at choosing a journal. Metrics may be part of what you use to choose one – but use both quantitative and qualitative measures (e.g. ask colleagues and experts). You can also use Elsevier Journal Finder – you can search with your title, abstract and subject areas and it suggests journals to target. But always check the journal guidance before submitting.

There is also the opportunity for article enrichments which will be part of your research story – 2D radiological data viewer, R code Viewer, Virtual Microscope, Genome Viewer, Audioslides, etc.

There are also less traditional journals: Heliyon covers all disciplines, so you can report your original and technically sound results of primary research, regardless of perceived impact. MethodsX is entirely about methods work. Data in Brief allows you to describe your data to facilitate reproducibility, make it easier to cite, etc. And an alternative to a data article is to add datasets on Mendeley.

And you can also use Mendeley to understand your impact through Mendeley Stats. There is a very detailed dashboard for each publication – this is powered by Scopus, so it works for all articles indexed in Scopus. Stats like the number of Mendeley users with that article in their library, citations, related works… And you can see how your article is being shared. You can also show your impact on Mendeley, with a research profile that is as comprehensive as possible – not just your publications but wider impacts, press mentions… and enabling you to connect to other researchers, to other articles and opportunities. This is what we are trying to do to make Mendeley help you build your online profile as a researcher. We intend to grow those profiles to give a more comprehensive picture of you as a researcher.

And we want to hear from you. Every journal, platform, and product is co-developed with ongoing community input. So do get in touch!

How to share science with hard to reach groups and why you should bother – Becky Douglas

My background is in physics – high energy physics, gravitational waves, etc. As I was doing my PhD I got very involved in science engagement. Hopefully most of you think of science communication and public outreach as a good thing. It does seem to be something that arises in job interviews and performance reviews. I'm not convinced that everyone should do this – not everyone enjoys or is good at it – but there is huge potential if you are enthusiastic. And there is more expectation on scientists to do this: to gain recognition, to help bring trust back to scientists, and to right some misunderstandings. (And by the way, talks and teaching don't count here.)

And not everyone goes to science festivals. It is up to us to provide alternative and interesting things for those people. There are a few people who won't be interested in science… But there are many more people who don't have time or don't see the appeal to them. These people deserve access to new research… And there are many ways to communicate that research. New ideas are always worth trying, and can attract new people and spark dialogue you'd never expect.

So, article writing is a great way to reach out… Not just in science magazines (or on personal blogs). Newspapers and magazines will often print science articles – reach out to them. And you can pitch other places too – Cosmo prints science. Mainstream publications are desperate for people who understand science to write about it in engaging ways – and sometimes you'll be paid for your work as well.

Schools are obvious, but they are great ways to access people from all backgrounds. You’ll do extra well if you can connect it to the current curriculum! Put the effort in to build a memorable activity or event. Send them home with something fun and you may well reach parents as well…

More unusual events would be things like theatre, for instance Lady Scientists Stitch and Bitch. Stitch and Bitch is an international thing where you get together and sew and craft and chat. This show was a play about travelling back in time to gather all the key lady scientists, who sit down to discuss science over some knitting and sewing. Because it was theatre it drew an extremely diverse group, not people who usually go to science events. When you work with non-scientists you get access to a whole new crowd.

Something a bit more unusual… Soapbox Science, which I brought to Glasgow in 2015. It's science busking where you talk about your cutting-edge research. It is often attached to science festivals but out in public, drawing a crowd from those shopping, visiting museums, etc. It's highly interactive. Most attendees had not been to a science event before – they didn't go out to see science, but they enjoyed it…

And finally, interact with local communities. The WI have science events; so do Scouts and Guides, and meet-up groups… You can just contact and reach out to those groups. They have questions of their own. It allows you to speak to really interesting groups. But it does require lots of time. I was based in Glasgow, now in Falkirk, and I've just done some of this with schools in the Gorbals, where we knew that the kids rarely go on to science subjects…

So, this is really worth doing. Your work, if it is taxpayer-funded, should be accessible to the public. Some people don't think they have an interest in science – some are right, but others just remember dusty chalkboards and bland textbooks. You have to show them it's something more than that.

What helps or hinders science communication by early career researchers? – Lewis MacKenzie

I’m a postdoc at the University of Leeds. I’m a keen science communicator and I try to get out there as much as possible… I want to talk about what helps or hinders science communication by early career researchers.

So, who are early career researchers? Well, undergraduates are a huge pool of early career researchers and scientists who tend to be untapped; also PhD students; also postdocs. There are some shared barriers here: travel costs, time… That is especially the case in inaccessible parts of Scotland. There is a real issue that science communication is work (or training) but is rarely recognised as such. And not all supervisors have a positive attitude to science communication. As well as all the other barriers to careers in science, of course.

Let's start with science communication training. I've been through the system as an undergraduate, PhD student and postdoc. A lot of training is (rightly) targeted at PhD students, often around writing, conferences, elevator pitches, etc. But issues/barriers for ECRs include: proactive sci comm is often not formally recognised as training/CPD/workload – especially at evenings and weekends; undergraduate sci comm modules are minimal or non-existent (you get dedicated sci comm masters now, so there is lots to explore); and there are relatively poor sci comm training opportunities for postdocs. And across the board, media skills training is pretty much limited – how do you make YouTube videos, podcasts, web comics, or write in a magazine? That's where a lot of science communication takes place!

Sci comm in schools includes some great stuff. STEMNET is an excellent route for ECRs, industry, retirees, etc. to volunteer – with some basic training, background checks, and a contact hub connecting schools and volunteers. However the school system and curricula are confusing (especially in England), and age-appropriate communication is hard to get right. And just getting to the schools can be tricky – most PhD students and sci comm people won't have a car. It's basic, but it's an important barrier.

Science communication competitions are quite widespread. They tend to be aimed at PhD students, the incentives being experience, training and prizes. But there are issues/barriers for ECRs: the often conventional "stand and talk" format; they are not usually collaborative – even though team work can be brilliant, and the big famous science communicators work with teams to put their shows together; and the intense pressure of competition can be off-putting… Some alternative formats would help with that.

Conferences… Now there was a tweet earlier this week from @LizyLowe suggesting that every conference should have a public engagement strand – how good would that be?!

Research Grant “Impact Plans”: major funders now require “impact plans” revolving around science communication. That makes time and money for science communication, which is great. But there are issues. The grant writer often designates activities before ECRs are recruited. These prescriptive impact plans aren’t very inspiring for ECRs. Money may be inefficiently spent on things like expensive web design. I think we need a more agile approach, to include input from ECRs once recruited.

Finally I wanted to finish with Science Communication Fellowships. These are run by the likes of the Wellcome Trust (Engagement Fellowships) and the STFC. These are for the Olympic gold medallists of sci comm. But they are not great for ECRs. The dates are annual and inflexible – and the process takes over 6 months – it is a slow decision-making process. And they are intensely competitive, so not very ECR friendly, which is a shame as many sci comm people are ECRs. So perhaps more institutions or agencies should offer sci comm fellowships? And a continuous application process with shorter spells?

To sum up… ECRs at different career stages require different training and organisational support to enable science communication. And science communication needs to be recognised as formal work/training/education – not an out of hours hobby! There are good initiatives out there but there could be many more.

PANEL DISCUSSION – Michael Markie, F1000 (MM); Anna Ritchie, Mendeley, Elsevier (AR); Becky Douglas (BD); Lewis MacKenzie (LM) – chaired by Joanna Young (JY)

Q1 (JY): Picking up on what you said about Pathways to Impact statements… What advice would you give to ECRs if they are completing one of these? What should they do?

A1 (LM): It’s quite a weird thing to do… There are two strands: this research will make loads of money and be commercialised; and the science communication strand. It’s easier to say you’ll do a science festival event, harder to say you’ll do a press release… You can say you will blog your work once a month, or tweet a day in the lab… You can do that. In my fellowship application I proposed a podcast on biophysics that I’d like to do. You can be creative with your science communication… But there is a danger that people aren’t imaginative and make it a box-ticking thing. Just doing a science festival event and a webpage isn’t that exciting. And those plans are written once… But projects run for maybe three years… Things change, skills change, people on the team change…

A1 (BD): As an ECR you can ask for help – ask supervisors, peers, ask online, ask colleagues… You can always ask for advice!

A1 (MM): I would echo that you should ask experienced people for help. And think tactically as different funders have their own priorities and areas of interest here too.

Q2: I totally agree with the importance of communicating your science… But showing impact of that is hard. And not all research is of interest to the public – playing devil’s advocate – so what do you do? Do you broaden it? Do you find another way in?

A2 (LM): Taking a step back and talking about broader areas is good… I talk a fair bit about undergraduates as science communicators… They have really good broad knowledge and interest. They can be excellent. And this is where things like Soapbox Science can be so effective. There are other formats too… Things like Bright Club, which communicates research through comedy… That’s really different.

A2 (BD): I would agree with all of that. I would add that if you want to measure impact then you have to think about it from the outset – will you count people, use some sort of voting, or questionnaires? You have to plan this stuff in. The other thing is that you have to pitch things carefully to your audience. If I run events on gravitational waves I will talk about space and black holes… Whereas with a 5 year old I ask about gravity and we jump up and down, so they understand what is relevant to them in their lives.

A2 (LM): In terms of metrics for science communication… I was at the British Science Association conference a few years back and this was a major theme… Becky mentioned getting kids to post notes in boxes at sessions… Professional science communicators think a great deal about this… Maybe not so much us “Sunday Fun Run” type people, but we should engage more.

Comment (AR): When you prepare an impact statement are you asked for metrics?

A2 (LM): Not usually… They want impact but don’t ask about that…

A2 (BD): Whether or not you are asked for details of how something went, you do want to know how you did… And even if you just ask “Did you learn something new today?”, that can be really helpful for understanding how it went.

Q3: I think there are too many metrics… As a microbiologist… which ones should I worry about? Should there be a module at the beginning of my PhD to tell me?

A3 (AR): There is no one metric… We don’t want a single number to sum us up. There are so many metrics precisely because one number isn’t enough… There is experimentation going on with what works, and what works for you… So be part of the conversation, and be part of the change.

A3 (MM): I think there are too many metrics too… We are experimenting. Altmetrics are indicators, there are citations, that’s tangible… We just have to live with a lot of them all at once at the moment!

UNCONFERENCE SESSION 2: Preprints: A journey through time – Graham Steel

This will be a quick talk plus plenty of discussion space… From the onset of thinking about this conference I was very keen to talk about preprints…

So, who knows what a preprint is? There are plenty of different definitions out there – see Neylon et al. 2017. But we’ll take the Wikipedia definition for now. I thought preprints dated to the 1990s. But I found a paper that referenced a preprint from 1922!

Let’s start there… Preprints were ticking along fine… But then a fightback began. In 1966 preprints were made outlaws, when Nature wanted to take “lethal steps” to end preprints. In 1969 we had a thing called the “Ingelfinger Rule” – we’ll come back to that later… Technology-wise, various things ticked along… In 1989 Tim Berners-Lee came along; in 1991 the web went live at CERN, and arXiv was set up and grew swiftly… About 8k preprints were being uploaded to arXiv each month as of 2016. Then, from 2007-12, we had Nature Precedings…

But in 2007, the fightback began again… By 2012 the Ingelfinger rule was creating stress… There are almost 35k journals; only 37 still use the Ingelfinger rule… But they include key journals like Cell.

But we also saw the launch of bioRxiv in 2013. And we’ve had an explosion of preprints since then… Also in 2013 the Center for Open Science was set up with $5m. It provides a central space for preprints, with over 2m preprints so far. There are now a LOT of new …Xiv preprint sites. In 2015 we saw the launch of the ASAPbio movement.

Earlier this year Mark Zuckerberg’s Chan Zuckerberg Initiative – which has pledged billions for science – invested in bioRxiv… But everything comes at a price…

Scotland spends on average £11m per year to access research through journals. The best average figure for APCs I could find is $906. Per preprint it’s $10. If you want to post a preprint you have to check the terms of your journal – these are usually extremely clear. Best to check in SHERPA/RoMEO.

If you want to find out more about preprints there is a great Twitter list, also some recommended preprints reading. Find these slides: slideshare.net/steelgraham and osf.io/zjps6/.

Q&A

Q1: I found Sherpa/Romeo by accident…. But really useful. Who runs it?

A1: It’s funded by Jisc

Q2: How about findability…

A2: ArXiv usually points to where this work has been submitted. And you can go back and add the DOI once published.

Q2: It’s acting as a static archive then? To hold the green copy?

A2: And there is collaborative activity across that… And there is work to make those findable, to share them, they are shared on PubMed…

Q2: One of the problems I see is purely discoverability… Making it easy to find on Google. And integration into knowledgebases, so it can be found in libraries, in portals… It’s hard for a researcher looking for a piece of research… They look for a subject, a topic, to search an aggregated platform and link out to it… To find the repository… So people know they have legal access to preprint copies.

A2: You have CORE at the Open University, which aggregates preprints and suggests additional items when you search. There is ongoing work to integrate with CRIS systems, which are frequently commercial, so there are interoperability issues here.

Comment: ArXiv is still the place for high energy physics, so it is worth researchers going directly to…

Q3: Can I ask about preprints and research evaluation in the US?

A3: It’s an important way to get the work out… But the lack of peer review is an issue there, so it’s still emerging…

GS: My last paper was taking forever to come out; we thought it wasn’t going to happen… We posted to PeerJ but discovered that the journal did use the Ingelfinger Rule, which scuppered us…

Comment: There are some publishers that want to put preprints on their own platform, so everything stays within their space… How does that sit/conflict with what libraries do…

GS: It’s a bit “us! us! us!”

Comment: You could see all submitted to that journal, which is interesting… Maybe not health… What happens if not accepted… Do you get to pull it out? Do you see what else has been rejected? Could get dodgy… Some potential conflict…

Comment: I believe it is positioned as a separate entity, but with a path of least resistance… It’s a question… The thing is… if we want preprints to sit more in academia as opposed to with publishers… that means academia has to have the infrastructure to do that, to make connected repositories discoverable and aggregated… It’s a potential competitive relationship… Interesting to see how it plays out…

Comment: For Scopus and Web of Science… Those won’t take preprints… Takes ages… And do you want to give up more rights to the journals… ?

Comment: Can see why people would want multiple copies held… That seems healthy… My fear is it requires a lot of community based organisation to be a sustainable and competitive workflow…

Comment: Worth noting the radical “platinum” open access… Lots of preprints out there… Why not get authors to submit them, and organise them into a free, open journal without a publisher… That’s Tim Gowers’ thing… It’s not hard to put together a team to peer review thematically and put out issues of a journal with no charges…

GS: That’s very similar to the Open Library of Humanities… And the Wellcome Trust & Gates Foundation stuff, and the big EU platform. The Gates one could be huge. Wellcome Trust is relatively small so far… But EU-wide there will be major ramifications…

Comment: Platinum is more about overlay journals… Also like Scope3 and they do metrics on citations etc. to compare use…

GS: In open access we know about green and gold; with platinum it’s free to author and reader… But the use of these words differs in different contexts…

Q4: What do you think the future is for pre-prints?

A4 – GS: There is a huge boom… There’s currently some duplication of central open preprint platforms. But information on use is clear, and uptake is on the rise… It will plateau at some point, like PLOS ONE. They launched in 2006 and probably plateaued around 2015. But it is number 2 in the charts of mega-journals, behind Scientific Reports. They increased APCs (to around $1,450) and that didn’t help (especially as they were profitable)…

SESSION THREE: Raising your research profile: online engagement & metrics

Green, Gold, and Getting out there: How your choice of publisher services can affect your research profile and engagement – Laura Henderson, Editorial Program Manager, Frontiers

We are based in Lausanne in Switzerland. We are a fully digital, fully open access publisher. All of our 58 journals are published under CC-BY licences. And the organisation was set up by scientists who wanted to change the landscape. So I want to talk today about how this can change your work.

What is traditional academic publishing?

Typically readers pay – journal subscriptions via institution/library, or pay-per-view. Given the costs and number of articles they are expensive – $14B of journal revenue in 2014 works out at around $7k per article (roughly 2 million articles a year). It’s slow too… The journal rejection cascade can take 6 months to a year each time. Up to 1 million papers – valid papers – are rejected every year. And access is limited: around 80% of research papers are behind subscription paywalls. So knowledge gets out very slowly and inaccessibly.

By comparison, open access… Well, Green OA allows you to publish and then self-archive your paper in a repository where it can be accessed for free. You can use an institutional or central repository, or I’d suggest both. And there can be a delay due to embargo. Gold OA makes research output immediately available from the publisher, and you retain the copyright, so no embargoes. It is fully discoverable via indexing and professional promotion services to relevant readers. There is no subscription fee to the reader, but it usually involves APCs paid by the institution.

How does Open Access publishing compare? Well, it inverts the funding – the institution/grant funder supports authors directly, rather than paying huge subscription fees for packages dictated by publishers. It’s cheaper – Green OA is usually free. The Gold OA average fee is c. $1000–$3000 – actually around half what is paid per article under subscription publishing. We do see projections of open access overtaking subscription publishing by 2020.

So, what benefits does open access bring? Well there is peer-review; scalable publishing platforms; impact metrics; author discoverability and reputation.

And I’d now like to show you what you should look for from any publisher – open access or others.

Firstly, you should expect basic services: quality assurance and indexing. Peter Suber suggests checking the DOAJ – the Directory of Open Access Journals. You can also see if the publisher is part of OASPA, which excludes publishers who fail to meet their standards. What else? Look for peer review and good editors – see the joint COPE/OASPA/DOAJ Principles of Transparency and Best Practice in Scholarly Publishing. So you need clear peer review processes. And you need a governing board and editors.

At Frontiers we have an impact-neutral peer review process. We don’t screen for papers with highest impact. Authors, reviewers and the handling Associate Editor interact directly with each other in the online forum. The names of editors and reviewers are published on the final version of the paper. And this leads to an average of 89 days from submission to acceptance – an industry-leading timing… And that’s what won an ALPSP Innovation Award.

So, what are the extraordinary services a top OA publisher can provide? Well, altmetrics are more readily available now. Digital articles are accessible and trackable. In Frontiers our metrics are built into every paper… You can see views, downloads, and reader demographics. And that’s post-publication analytics that doesn’t rely on impact factor. And it is community-led impact – your peers decide the impact and importance.

How discoverable are you? We launched a bespoke built-in networking profile for every author and user: Loop. It scrapes all major index databases to find your work – constantly updating. It’s linked to ORCID and is included in the peer review process. When people look at your profile you can truly see your impact in the world.

In terms of how peers find your work, we have article alerts going to 1 million people, and a newsletter that goes to 300k readers. And our articles have had 250 million article views and downloads, with hotspots in Mountain View, California, and in Shenzhen, and areas of development in the “Global South”.

So when you look for a publisher, look for a publisher with global impact.

What are all these dots and what can linking them tell me? – Rachel Lammey, Crossref

Crossref is a not-for-profit organisation. So… We have articles out there, datasets, blogs, tweets, Wikipedia pages… We are really interested in understanding these links. We are doing that through Crossref Event Data, tracking the conversation, mainly around objects with a DOI. The main way we use and mention publications is in the citations of articles. That’s the traditional way to discuss and understand research. But research is being used in lots of different ways now – Twitter and Reddit…

So, where does Crossref fit in? It is the DOI registration agency for scholarly content. Publishers register their content with us. URLs change and break… And that means you need something more persistent, so the work can still be found and used… Last year at ReCon we tried to find DOI gaps in reference lists – hard to do. Even within journals, publications move around… and switch publishers… The DOI fixes that reference. We are sort of a switchboard for that information.

I talked about citations and references… Now we are looking beyond that. It is about capturing data and relationships so that understanding and new services (by others) can be built… As such it’s an API (Application Programming Interface) – it’s lots of data rather than an interface. So it captures subject, relation, object: tweets, mentions, etc. We are generating this data (as of yesterday we’ve seen 14m events), but we are not doing anything with it ourselves, so this is a clean set of data to do further work on.
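
[Note: as a rough sketch of what querying that data looks like – the Crossref Event Data endpoint and field names below are as I understand them at the time of writing, so do check the current documentation before relying on them:]

    # A minimal sketch: fetch Crossref Event Data events for one DOI.
    # Endpoint and field names are assumptions to verify against the docs.
    import requests

    resp = requests.get(
        "https://api.eventdata.crossref.org/v1/events",
        params={"obj-id": "10.1371/journal.pone.0160617",  # any DOI of interest
                "rows": 100},
    )
    resp.raise_for_status()

    # Each event is a subject-relation-object triple, e.g.
    # "<tweet> discusses <DOI>", with a source and a timestamp.
    for event in resp.json()["message"]["events"]:
        print(event["source_id"], event["relation_type_id"], event["subj_id"])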

We’ve been doing work with the NISO Working Group on altmetrics, but again, providing the data, not the analysis. So, what can this data show? We see citation rings/friends gaming the machine; potential peer review scams; citation patterns. How can you use this data? Almost any way. Come talk to us about Linked Data, Article Level Metrics, general discoverability, etc.

We’ve done some work ourselves… For instance the Live Data from all sources – including Wikipedia citing various pages… We have lots of members in Korea, and started looking just at citations on Korean Wikipedia. It’s free under a CC0 licence. If you are interested, go make something cool… Come ask me questions… And we have a beta testing group, and we welcome your feedback and experiments with our data!

The wonderful world of altmetrics: why researchers’ voices matter – Jean Liu, Product Development Manager, Altmetric

I’m actually five years out of graduate school, so I have some empathy with PhD students and ECRs. I really want to go through what altmetrics are and what measures there are. It’s not controversial to say that altmetrics have been experiencing a meteoric rise over the last few years… That is partly because we have so much more to draw upon than the traditional journal impact factors, citation counts, etc.

So, who are Altmetric? We have about 20 employees, founded in 2011 and all based in London. And we’ve started to see that people are receptive to altmetrics, partly because of the (near) instant feedback… We tune into the Twitter firehose – that phrase is apt! Altmetrics also showcase many “flavours” of attention and impact that research can have – and not just articles. And the signals we track are highly varied: policy documents, news, blogs, Twitter, post-publication peer review, Facebook, Wikipedia, LinkedIn, Reddit, etc.

Altmetrics also have limitations. They are not a replacement for peer review or citation-based metrics. They can be gamed – but data providers have measures in place to guard against this. We’ve seen interesting attempts at gaming – but they are often caught…

Researchers are not only the ones who receive attention in altmetrics; they are also the ones generating the attention that makes up altmetrics – but not all attention is high quality or trustworthy. We don’t want to suggest that researchers should be judged just on altmetrics…

Meanwhile universities are asking interesting questions: how can our researchers change policy? Which conference can I send people to that will be most useful? Etc.

So, let’s take the topic of “diabetic neuropathy”. Looking around we can see a blog, an NHS/NICE guidance document, and a piece in The Conversation. A whole range of items here. And you can track attention over time… by volume, but you can also look at influencers across e.g. news outlets, policy outlets, blogs and tweeters. And you can understand where researcher voices feature (all are blogs). And I can then compare news and policy and see the difference. The profiles for news and blogs are quite different…

How can researchers’ voices be heard? Well, you can write for a different audience, you can raise the profile of your work… You can become that “go-to” person. You also want to be really effective when you are active – altmetrics can help you to understand where your audience is and how they respond, to understand what is working well.

And you can find out more by trying the Altmetric bookmarklet browser plugin, by exploring these tools on publishing platforms (where available), or by taking a look at altmetric.com.

How to help more people find and understand your work – Charlie Rapple, Kudos

I’m sorry to be the last person on the agenda, you’ll all be overwhelmed as there has been so much information!

I’m one of the founders of Kudos and we are an organisation dedicated to helping you increase the reach and impact of your work. There is such competition for funding, a huge growth in outputs, there is a huge fight for visibility and usage, a drive for accountability and a real cult of impact. You are expected to find and broaden the audience for your work, to engage with the public. And that is the context in which we set up Kudos. We want to help you navigate this new world.

Part of the challenge is knowing where to engage. We did a survey last year with around 3000 participants to ask how they share their work – conferences, academic networking sites, and conversations with colleagues all ranked highly; whilst YouTube, SlideShare, etc. are less used.

Impact is built on readership – impacts cross a variety of areas… But essentially it comes down to getting people to find and read your work. So, for me it starts with making sure you increase the number of people reaching and engaging with your work. Hence the publication is at the centre – for now. That may well be changing as other material is shared.

We’ve talked a lot about metrics; there are very different ones and some will matter more to you than others. Citations have high value, but so do mentions, clicks, shares, downloads… Do take the time to think about these. And think about how your own actions and behaviours contribute back to those metrics… So if you email people about your work, track that to see if it works… Make those connections… Everyone has their own way and, as Nicola was saying in the Digital Footprint session, communities exist already; you have to get work out there… And your metrics have to be about correlating what happens – readership and citations. Kudos is a management tool for that.

In terms of justifying the time, the key point is that communications do increase impact. We have been building up data on how that takes place. A team from Nanyang Technological University did a study of our data in 2016 and saw that authors using the Kudos tools – promoting their work – had 23% higher growth in downloads of full text on publisher sites. And that really shows the value of doing that engagement. It will actually lead to meaningful results.

So a quick look at how Kudos works… It’s free for researchers (www.growkudos.com) and it takes about 15 minutes to set up, and about 10 minutes each time you publish something new. You can find a publication, and you can use your ORCID if you have one… It’s easy to find your publication, and once you have, you have a page for it where you can create a plain language explanation of your work and why it is important – that is grounded in talking to researchers about what they need. For example: http://bit.ly/plantsdance. That plain text is separate from the abstract. It’s that first quick overview. The advantage of this is that it is easier for people within your field to skim and scan your work; people in academia but outside your field can get past the terminology and understand what you’ve said; and people outside academia can get a handle on research and apply it in non-academic ways. People can actually access your work and actually understand it. There is a lot of research to back that up.

Also on the publication page you can add all the resources around your work – code, data, videos, interviews, etc. So for instance Claudia Sick does work on baboons and why they groom where they groom – that page includes the article and all of that press coverage together. The publication page gives you a URL, and you can post to social media from within Kudos. You can copy the trackable link and paste it wherever you like. The advantage of doing this in Kudos is that we can connect that up to all of your metrics and your work. You can get them all in one place, and map them against what you have done to communicate. And we map those actions to show which communications are more effective for sharing… You can really start to refine your efforts… You might have built networks in one space but the value might all be in another space.

Sign up now – we are about to launch a game on building up your profile and impact, which scores your research impact and lets you compare with others.

PANEL DISCUSSION – Laura Henderson, Editorial Program Manager, Frontiers (LH); Rachel Lammey, Crossref (RL); Jean Liu, Product Development Manager, Altmetric (JL); Charlie Rapple, Kudos (CR). 

Q1: Really interesting but how will the community decide which spaces we should use?

A1 (CR): Yes, in the Nanyang work we found that most work was shared on Facebook, but more links were engaged with on Twitter. There is more to be done, and more to filter through… But we have to keep building up the data…

A1 (LH): We are coming from the same sort of place as Jean there; altmetrics are built into Frontiers, connected to ORCID, and Loop is built to connect to institutional plugins (a totally open plugin). But it is such a challenge… Facebook, Twitter, LinkedIn, Snapchat… It’s usually personal choice really; we just want to make it easier…

A1 (JL): It’s about interoperability. We are all working in it together. You will find certain stats on certain pages…

A1 (RL): It’s personal choice, it’s interoperability… But it is about options. Part of the issue with the impact factor is being judged by something you have no choice about or influence upon… And I think that we need to give people new tools, ways to select what is right for them.

Q2: These seem like great tools, but how do we persuade funders?

A2 (JL): We have found funders being interested independently, particularly in the US. There is this feeling across the scholarly community that things have to change… And funders want to look at what might work, they are already interested.

A2 (LH): We have an office in Brussels which lobbies the European Commission; we are trying to get our voice for Open Science heard, to make a difference to policies and mandates… The impact factor has been convenient, it’s well embedded, it was designed by an institutional librarian, so we are out lobbying for change.

A2 (CR): Convenience is key. Nothing has changed because nothing has been convenient enough to replace the impact factor. There is a lot of work and innovation in this area, and it is not only on researchers to make that change happen, it’s on all of us to make that change happen now.

Jo Young (JY): To finish, a few thank yous… Thank you all for coming along today, to all of our speakers, and a huge thank you to Peter and Radic (our cameramen), to Anders, Graham and Jan for their work in planning this. And to Nicola and Amy who have been liveblogging, and to all who have been tweeting. Huge thanks to Crossref, Frontiers, F1000, JYMedia, and PLoS.

And with that we are done. Thanks to all for a really interesting and busy day!

 

Jun 152017
 

I am again at the IIPC WAC / RESAW Conference 2017 and today I am in the very busy technical strand at the British Library. See my Day One post for more on the event and on the HiberActive project, which is why I’m attending this very interesting event.

These notes are live so, as usual, comments, additions, corrections, etc. are very much welcomed.

Tools for web archives analysis & record extraction (chair Nicholas Taylor)

Digging documents out of the archived web – Andrew Jackson

This is the technical counterpoint to the presentation I gave yesterday… So I talked yesterday about the physical workflow of catalogue items… We found that the Digital ePrints team had started processing eprints the same way…

  • staff looked in an Outlook calendar for reminders
  • looked for new updates since last check
  • download each to local folder and open
  • check catalogue to avoid re-submitting
  • upload to internal submission portal
  • add essential metadata
  • submit for ingest
  • clean up local files
  • update stats sheet
  • Then ingest, usually automated (but can require intervention)
  • Updates catalogue once complete
  • New catalogue records processed or enhanced as necessary.

It was very manual, and very inefficient… So we have created a harvester:

  • Setup: specify “watched targets” then…
  • Harvest (harvester crawls targets as usual) –> Ingested… but also…
  • Document extraction:
    • spot documents in the crawl
    • find landing page
    • extract machine-readable metadata
    • submit to W3ACT (curation tool) for review
  • Acquisition:
    • check document harvester for new publications
    • edit essential metadata
    • submit to catalogue
  • Cataloguing
    • cataloguing records processed as necessary

This is better but there are challenges. Firstly, what is a “publication”? With the eprints team there was a one-to-one print and digital relationship. But now, no more one-to-one. For example, gov.uk publications… An original report will have an ISBN… But that landing page is a representation of the publication; that’s where the assets are… When stuff is catalogued – and this is what can frustrate technical folk – you take the date and text from the page, honouring what is there rather than normalising it… We can dishonour intent by the way we capture pages… It is challenging…

MARC is initially alarming… For a developer used to current data formats, it’s quite weird to get used to. But really it is just encoding… There is how we say we use MARC, how we do use MARC, and where we want to be now…

One of the intentions of the metadata extraction work was to provide an initial guess at the catalogue data – hoping to save cataloguers and curators time. But you probably won’t be surprised that the names of authors etc. in the document metadata are rarely correct. We start with the worst extractor and layer better ones on top, so we have the best shot. What works best is extracting from the HTML. Gov.uk is a big and consistent publishing space so it’s worth us working on extracting that.

What works even better is the gov.uk API data – it’s in JSON, it’s easy to parse, and it’s worth coding against as gov.uk is a big publisher for us.
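
[Note: as a rough sketch of the sort of thing Andrew means – the /api/content/ route and the field names below are my assumptions about the gov.uk content API, so verify them against the live documentation:]

    # A minimal sketch: pull machine-readable metadata for a gov.uk
    # publication from its content API. Path and field names are
    # assumptions to check against the live API.
    import requests

    path = "government/publications/some-report"  # hypothetical publication path
    doc = requests.get(f"https://www.gov.uk/api/content/{path}").json()

    # Fields like these tend to be far more reliable than metadata
    # embedded in the PDFs themselves.
    record = {
        "title": doc.get("title"),
        "description": doc.get("description"),
        "first_published": doc.get("first_published_at"),
    }
    print(record)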

But now we have to resolve references… Multiple use cases for “records about this record”:

  • publisher metadata
  • third party data sources (e.g. Wikipedia)
  • Our own annotations and catalogues
  • Revisit records

We can’t ignore the revisit records… Have to do a great big join at some point… To get best possible quality data for every single thing….

And this is where the layers of transformation come in… Lots of opportunities to try again and build up… But… when I retry document extraction I can accidentally run up another chain each time… If we do our Solr searches correctly it should be easy, so we will be correcting this…

We do need to do more experimentation in future… Multiple workflows bring synchronisation problems. We need to ensure documents are accessible when discoverable. And we need to be able to re-run automated extraction.

We want to iteratively improve automated metadata extraction:

  • improve HTML data extraction rules, e.g. Zotero translators (and I think LOCKSS are working on this).
  • Bring together different sources
  • Smarter extractors – Stanford NER, GROBID (built for sophisticated extraction from ejournals)

And we still have that tension between what a publication is… A tension between established practice and publisher output. We need to trial different approaches with catalogues and users… and close that whole loop.

Q&A

Q1) Is the PDF you extract going into another repository… You probably have a different preservation goal for those PDFs and the archive…

A1) Currently the same copy for archive and access. Format migration probably will be an issue in the future.

Q2) This is quite similar to issues we’ve faced in LOCKSS… I’ve written a paper with Herbert Van de Sompel and Michael Nelson about this thing of describing a document…

A2) That’s great. I’ve been working with the Government Digital Service and they are keen to do this consistently….

Q2) Geoffrey Bilder also working on this…

A2) And that’s the ideal… To improve the standards more broadly…

Q3) Are these all PDF files?

A3) At the moment, yes. We deliberately kept scope tight… We don’t get a lot of ePub or open formats… We’ll need to… Now publishers are moving to HTML – which is good for the archive – but that’s more complex in other ways…

Q4) What does the user see at the end of this… Is it a PDF?

A4) This work ends up in our search service, and that metadata helps them find what they are looking for…

Q4) Do they know its from the website, or don’t they care?

A4) Officially, the way the library thinks about monographs and serials would suggest that the user doesn’t care… But I’d like to speak to more users… The library does a lot of downstream processing here too…

Q4) For me as an archivist, all that data on where the document is from, what issues there were in accessing it, etc. would be extremely useful…

Q5) You spoke yesterday about engaging with machine learning… Can you say more?

A5) This is where I’d like to do more user work. The library is keen on subject headings – that’s a big high-level challenge, so it’s quite amenable to machine learning. We have a massive golden data set… There’s at least a masters thesis in there, right! And if we built something, then ran it over the 3 million-ish items with little metadata, it could be incredibly useful. In my opinion this is what big organisations will need to do more and more of… making best use of human time to tailor and tune machine learning to do much of the work…

Comment) That thing of everything ending up as a PDF is on the way out, by the way… You should look at Distill.pub – a new journal from Google and Y Combinator – and that’s the future of these sorts of formats; it’s JavaScript and GitHub. Can you collect it? Yes, you can. You can visit the page, switch off the network, and it still works… And it’s there and will update…

A6) As things get more dynamic, the re-collecting issue gets more and more important. That’s hard for the organisation to adjust to.

Nick Ruest & Ian Milligan: Learning to WALK (Web Archives for Longitudinal Knowledge): building a national web archiving collaborative platform

Ian: Before I start, thank you to my wider colleagues and funders as this is a collaborative project.

So, we have fantastic web archival collections in Canada… They collect political parties, activist groups, major events, etc. But, whilst these are amazing collections, they aren’t accessed or used much. I think this is mainly down to two issues: people don’t know they are there; and the access mechanisms don’t fit well with their practices. Maybe when the Archive-It API is live that will fix it all… Right now though it’s hard to find the right thing, and the Canadian archive is quite siloed. There are about 25 organisations collecting, most using the Archive-It service. But, if you are a researcher… to use web archives you really have to be interested and engaged, you need to be an expert.

So, building this portal is about making this easier to use… We want web archives to be used on page 150 of some random book. And that’s what the WALK project is trying to do. Our goal is to break down the silos, to take down walls between collections and between institutions. We are starting out slow… We signed Memoranda of Understanding with Toronto, Alberta, Victoria, Winnipeg, Dalhousie, and Simon Fraser University – that represents about half of the archives in Canada.

We work on workflow… We run workshops… We separate out the collections so that postdocs can look at them…

We are using Warcbase (warcbase.org) and command line tools. We transferred data from the Internet Archive and generate checksums; we generate scholarly derivatives – plain text, hypertext graphs, etc. In the front end you enter basic information, describe the collection, and make sure that users can engage directly themselves… And those visualisations are really useful… Looking at a visualisation of the Canadian political parties and political interest group web crawls you can track changes, although that may include crawler issues.
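
[Note: Warcbase itself runs on Scala/Spark, but as a rough illustration of the kind of derivative generation described – deriving plain text from a WARC – here is a minimal sketch using the Python warcio library; the filename is hypothetical:]

    # Minimal plain-text derivative from a WARC, using warcio
    # (pip install warcio beautifulsoup4). Illustrative only -- the
    # WALK pipeline itself uses Warcbase on Spark.
    from warcio.archiveiterator import ArchiveIterator
    from bs4 import BeautifulSoup

    with open("crawl.warc.gz", "rb") as stream:   # hypothetical WARC file
        for record in ArchiveIterator(stream):
            if record.rec_type != "response" or record.http_headers is None:
                continue
            if "text/html" not in (record.http_headers.get_header("Content-Type") or ""):
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            text = BeautifulSoup(record.content_stream().read(), "html.parser").get_text()
            print(url, " ".join(text.split())[:80])  # one line of plain text per page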

Then, with all that generated, we create landing pages, including tagging, data information, visualizations, etc.

Nick: So, on a technical level… I’ve spent the last ten years in open source digital repository communities… This community is small and tight-knit, and I like how we build and share and develop on each other’s work. Last year we presented webarchives.ca. We’ve indexed 10 TB of WARCs since then, representing 200+ million Solr docs. We have grown from one collection and we have needed additional facets: institution; collection name; collection ID, etc.
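
[Note: as a rough sketch of the kind of faceted query that schema enables – the Solr endpoint and field names here are hypothetical stand-ins, not the production schema:]

    # A faceted full-text query against a web archive Solr index,
    # via pysolr (pip install pysolr). Endpoint and fields hypothetical.
    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/webarchives")
    results = solr.search(
        "pipeline protest",                       # full-text query
        **{"facet": "true",
           "facet.field": ["institution", "collection_id"],
           "rows": 10},
    )
    for doc in results:
        print(doc.get("url"), doc.get("institution"))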

Then we have also dealt with scaling issues… from a 30-40 GB index to a 1 TB index. You probably think that’s kinda cute… But we do have more scaling to do… So we are learning from others in the community about how to manage this… We have Solr running on OpenStack… Right now it isn’t at production scale, but it’s getting there. We are looking at SolrCloud and potentially using a shard per collection.

Last year we had a Solr index using the Shine front end… It’s great but… it doesn’t have an active open source community… We love the UK Web Archive but… Meanwhile there is Blacklight, which is in wide use in libraries. There is a bigger community, better APIs, bug fixes, etc… So we have set up a prototype called WARCLight. It does almost all that Shine does, except the tree structure and the advanced searching…

Ian spoke about derivative datasets… For each collection, via Blacklight or Scholars Portal, we want domain/URL counts, full text, and graphs. Rather than users having to do the work, they can just engage with particular datasets or collections.

So, that goal Ian talked about: one central hub for archived data and derivatives…

Q&A

Q1) Do you plan to make graphs interactive, by using Kibana rather than Gephi?

A1 – Ian) We tried some stuff out… One colleague tried R in the browser… That was great but didn’t look great in the browser. But it would be great if the casual user could look at drag and drop R type visualisations. We haven’t quite found the best option for interactive network diagrams in the browser…

A1 – Nick) Generally the data is so big it will bring down the browser. I’ve started looking at Kibana for stuff so in due course we may bring that in…

Q2) Interesting as we are doing similar things at the BnF. We did use Shine, looked at Blacklight, but built our own thing…. But we are looking at what we can do… We are interested in that web archive discovery collections approaches, useful in other contexts too…

A2 – Nick) I kinda did this the ugly way… There is a more elegant way to do it but haven’t done that yet..

Q2) We tried to give people WARC files… Our actual users didn’t want that; they want full text…

A2 – Ian) My students are quite biased… Right now if you search it will flake out… But by fall it should be available; I suspect that full text will be of most interest… Sociologists etc. think that the network diagram view will be interesting, but it’s hard to know what will happen when you give them that. People are quickly put off by raw data without visualisation though, so we think it will be useful…

Q3) Do you think in a few years’ time…?

A3) Right now that doesn’t scale… We want this more cloud-based – that’s our next 3 years and next wave of funded work… We do have capacity to write new scripts right now as needed, but when we scale that will be harder…

Q4) What are some of the organisational, admin and social challenges of building this?

A4 – Nick) Going out and connecting with the archives is a big part of this… Having time to do this can be challenging… “Is an institution going to devote a person to this?”

A4 – Ian) This is about making this more accessible… People are more used to Blacklight than Shine. People respond poorly to WARC. But they can deal with PDFs and CSVs; those are familiar formats…

A4 – Nick) And when I get back I’m going to be doing some work and sharing, to enable an actual community to work on this…

Gregory Wiedeman: Automating access to web archives with APIs and ArchivesSpace

A little bit of context here… At University at Albany, SUNY, we are a public university with state records laws that require us to archive. This is consistent with traditional collecting. But we have no dedicated web archives staff – so no capacity for lots of manual work.

One thing I wanted to note is that web archives are records. Some have a paper equivalent, or did for many years (e.g. the Undergraduate Bulletin). We also have things like Word documents. And then we have things like university sports websites, some of which we do need to keep…

The seed isn’t a good place to manage these as records. But archives theory and practice adapt well to web archives – they are designed to scale, they document and maintain context, with relationships to other content, and a strong emphasis on the history of records.

So, we use DACS – Describing Archives: A Content Standard – to describe archives, so why not use it for web archives? It focuses on intellectual content, is ignorant of formats, and is designed for pragmatic access to archives. We also use ArchivesSpace – a modern tool for aggregated records that allows curators to add metadata about a collection. And it interleaves with our physical archives.

So, for any record in our collection… You can specify a subject… A Python script goes to look at our CDX, looks at the numbers, schedules processes, and records the extents and dates collected as we crawl a collection… And then it shows in our catalogue… So we have our paper records and our digital captures together. Users can find an item, and only then do they need to think about format and context. And there is an awesome article by David Graves(?) which argues that aggregation encourages new discovery…
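
[Note: as a rough sketch of that CDX summarising step – assuming the common space-delimited CDX format with the URL key in column 1 and the 14-digit timestamp in column 2:]

    # Summarise a CDX file: captures per URL, plus overall date range.
    from collections import Counter

    captures = Counter()
    first, last = {}, {}

    with open("collection.cdx") as cdx:           # hypothetical CDX file
        for line in cdx:
            if not line.strip() or line.startswith(" CDX"):  # skip header/blank
                continue
            urlkey, timestamp = line.split()[:2]
            captures[urlkey] += 1
            first[urlkey] = min(first.get(urlkey, timestamp), timestamp)
            last[urlkey] = max(last.get(urlkey, timestamp), timestamp)

    print(f"{len(captures)} URLs, {sum(captures.values())} captures, "
          f"spanning {min(first.values())} to {max(last.values())}")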

Users need to understand where web archives come from. They need provenance to frame of their research question – it adds weight to their research. So we need to capture what was attempted to be collected – collecting policies included. We have just started to do this with a statement on our website. We need a more standardised content source. This sort of information should be easy to use and comprehend, but hard to find the right format to do that.

We also need to capture what was collected. We are using the Archive-It Partner Data API, part of the Archive-It 5.0 system. That API captures:

  • type of crawl
  • unique ID
  • crawl result
  • crawl start, end time
  • recurrence
  • exact date, time, etc.

This looks like a big JSON file. Knowing what has been captured – and not captured – is really important for understanding context. What can we do with this data? Well, we can see what’s in our public access system, we can add metadata, and we can present some start times, non-finish issues etc. on product pages. BUT… it doesn’t address issues at scale.
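
[Note: a hedged sketch of how such a feed might be summarised for a finding aid – the endpoint, parameters and field names below are hypothetical, so check the Archive-It Partner Data API documentation for the real ones:]

    # Summarise crawl jobs for one collection from a partner-data feed.
    # Endpoint and field names are hypothetical placeholders.
    import requests

    crawls = requests.get(
        "https://partner.archive-it.org/api/crawl_job",  # hypothetical endpoint
        params={"collection": 1234},
    ).json()                                             # assume a JSON list of jobs

    for crawl in crawls:
        print(crawl.get("id"), crawl.get("type"),
              crawl.get("start_date"), crawl.get("end_date"),
              crawl.get("status"))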

So, we are now working on a new open digital repository using the Hydra system – though it’s not called that anymore! Possibly we will expose the data in the API. We need a standardised data structure that is independent of tools. And we also have a researcher education challenge – the archival description needs to be easy to use, re-share and understand.

Find our work – sample scripts, command line query tools – on Github:

http://github.com/UAlbanyArchives/describingWebArchives

Q&A

Q1) Right now people describe collection intent, crawl targets… How could you standardise that?

A1) I don’t know… We need an intellectual definition of what a crawl is… and what the depth of a crawl is… They can produce very different results and WARC files… We need to articulate this in a way that is clear for others to understand…

Q1) Anything equivalent in the paper world?

A1) It is DACS, but in the paper world we don’t get that granular… This is really specific data we weren’t really able to get before…

Q2) My impression is that ArchivesSpace isn’t built with discovery of archives in mind… What would help with that?

A2) I would actually put less emphasis on web archives… Long term you shouldn’t have all these things captured. We just need a good API access point really… I would rather it be modular, I guess…

Q3) Really interesting… the definition of Archive-It, what’s in the crawl… And interesting to think about conveying what is in the crawl to researchers…

A3) From what I understand the Archive-It people are still working on this… with documentation to come. But we need a granular way to do that… Researchers don’t care too much about the structure… They don’t need all those counts, but you need to convey some key issues, what the intellectual content is…

Comment) Looking ahead to the WASAPI presentation… Some steps towards vocabulary there might help you with this…

Comment) I also raised that sort of issue for today’s panels – high-level information on crawl or collection scope. Researchers want to know when crawlers don’t collect things, when they stop – usually to do with freak-outs about what isn’t retained… But that idea of understanding absence really matters to researchers… It is really necessary to get some of that across… There is a crapton of data in the Partner API – most isn’t super interesting to researchers, so some community effort to find 6 or 12 data points that can explain that crawl process, gaps, etc. would help…

A4) That issue of understanding users is really important, but also hard as it is difficult to understand who our users are…

Harvesting tools & strategies (Chair: Ian Milligan)

Jefferson Bailey: Who, what, when, where, why, WARC: new tools at the Internet Archive

Firstly, apologies for any repetition between yesterday and today… I will be talking about all sorts of updates…

So, WayBack Search… You can now search the WayBackMachine… including keyword, host/domain search, etc. The index is built on inbound anchor text links to a homepage. It is pretty cool and it’s one way to access this content that is not URL based. We also wanted to look at domain and host routes into this… So, if you look at the page for, say, parliament.uk, you can now see statistics and visualisations. And there is an API so you can make your own visualisations – for hosts or for domains.

We have done stat counts for specific domains or crawl jobs… The API is all in JSON, so you can just parse it – for example, to see how much of what is archived for a domain is in the form of PDFs.
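
[Note: a rough sketch of that kind of parsing – the endpoint and response shape below are illustrative assumptions, not the documented API; the point is just that a mimetype breakdown is one JSON call away:]

    # Fraction of a domain's captures that are PDFs, from a JSON stats API.
    # Endpoint and response shape are hypothetical.
    import requests

    stats = requests.get(
        "https://web.archive.org/__wb/search/host",   # hypothetical endpoint
        params={"q": "parliament.uk"},
    ).json()

    counts = stats.get("mimetypes", {})   # e.g. {"application/pdf": 1234, ...}
    total = sum(counts.values())
    pdfs = counts.get("application/pdf", 0)
    print(f"{pdfs / total:.1%} of captures are PDFs" if total else "no data")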

We also now have search by format using the same idea – the anchor text, the file and URL path – and you can search for media assets. We don’t have exciting front end displays yet… But I can search for e.g. puppy, mime type: video, 2014… and get lots of awesome puppy videos [the demo is the Puppy Bowl 2014!]. This media search is available in the WayBackMachine for some media types… And you can again present this in the format and display you’d like.

For search and profiling we have a new 14-column CDX including new language, simhash and sha256 fields. Language will help users find material in their local/native languages. The simhash is pretty exciting… it allows you to see how much a page has changed. We have been using it with Archive-It partners… and it is pretty good. For instance, seeing a government blog change month to month shows the (dis)similarity.
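
[Note: a minimal sketch of why a per-capture simhash is useful – the Hamming distance between two 64-bit simhash values approximates how much a page changed between captures; the hex values below are made up for illustration:]

    # Compare two captures of a page via their simhash fingerprints.
    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")     # number of differing bits

    jan = int("8c149bbf2d1e6f30", 16)    # hypothetical simhash, January capture
    feb = int("8c149bbf2d5e2f31", 16)    # hypothetical simhash, February capture

    d = hamming(jan, feb)
    print(f"{d}/64 bits differ -> {'minor edit' if d < 8 else 'substantial change'}")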

For those that haven’t seen the capture tool – Brozzler is in production in Archive-It, with 3 dozen organisations using it. This has also led to warcprox developments too. It was intended for AV and social media stuff. We have a Chromium cluster… It won’t do domain harvesting, but it’s good for social media.

In terms of crawl quality assurance we are working with the Internet Memory Foundation to create quality tools. These build on internal crawl priorities work at IA – crawler beans, comparison testing. And this is about quality at scale. And you can find reports on the associated work we did on the WayBackMachine’s crawl quality. We are also looking at tools to monitor crawls for partners, trying to find large-scale crawling quality issues as they happen… There aren’t great analytics… But there are domain-scale monitoring, domain-scale patch crawling, and Slack integrations.

For domain-scale patch crawling we use WAT analysis for embeds and the most-linked content. We rank by inbound links and add to the crawl. ArchiveSpark is a framework for cluster-based data extraction and derivation (WA+).

Although this is a technical presentation we are also doing an IMLS funded project to train public librarians in web archiving to preserve online local history and community memory, working with partners in various communities.

Other collaborations and research include our End of Term web archive 2016/17, covering the administration change… No one is official custodian for the .gov domain. And this year the widespread deletion of data has given this work greater profile than usual. This time the work was with IA, LOC, UNT, GWU, and others: 250+ TB of .gov/.mil as well as White House and Obama social media content.

There has already been discussion of the Partner Data API. We are currently re-building this, so come talk to me if you are interested. We are working with partners to make sure it is useful, makes sense, and is made more relevant.

We take a lot of WARC files from people to preserve… So we are looking to see how we can get partners to do this with and for us. We are developing a pipeline for automated WARC ingest for web services.

There will be more on WASAPI later, but this is part of work to ensure web archives are more accessible… And that uses API calls to connect up repositories.

We have also built a WAT API that allows you to query most of the metadata for a WARC file. You can feed it URLs, and get back what you want – except the page type.

We have new portals and searches now and coming. This is about putting new search layers on TLD content in the WayBackMachine… So you can pick media types, and just from one domain, and explore them all…

And with a statement on what archives should do – involving a gif of a centaur entering a rainbow room – that’s all… 

Q&A

Q1) What are the implications of the new headless browsing capabilities in Chrome for Brozzler?

A1 – audience) It changes how fast you can do things, not really what you can do…

Q2) What about HTTP POST for WASAPI?

A2) Yes, it will be in the Archive-It web application… We’ll change a flag and then you can go and do whatever… And there is reporting on the backend. It doesn’t usually affect crawl budgets, and it should be pretty automated… There is a UI… Right now we do a lot manually; the idea is to do it less manually…

Q3) What do you do with pages that don’t specify encoding… ?

A3) It doesn’t go into URL tokenisation… We would wipe character encoding in anchor text – it gets cleaned up before Elasticsearch…

Q4) Is the simhash computed before or after the capture? And can it be used for deduplication?

A4) After capture, before CDX writing – it is part of that process. Yes, it could be used for deduplication, although we already do URL deduplication… But we could compare to a previous simhash to work out if another copy is needed… We really were thinking about visualising change…

Q5) I’m really excited about WATs… What scale will it work on?

A5) The crawl is around 100 TB – we mostly use the existing WARC and JSON pipeline… It performs well on something large. But with a lot of URLs, it could be a lot to parse.

Q6) With quality analysis and improvement at scale, can you tell me more about this?

A6) We’ve given the IMF access to our own crawls… But we have been comparing our own crawls to our own crawls… Comparing to Archive-It is more interesting… And looking at domain level… We need to share some similar-size crawls – BL and IA – and figure out how results look and differ. It won’t be content-based at that stage; it will be hosts and URLs and things.

Michele C. Weigle, Michael L. Nelson, Mat Kelly & John Berlin: Archive what I see now – personal web archiving with WARCs

Mat: I will be describing tools here for web users. We want to enable individuals to create personal web archives in a self-contained way, without external services. Standard web archiving tools are difficult for non-IT experts. “Save page as” is not suitable for web archiving. Why do this? It’s for people who don’t want to touch the command line, but also to ensure content is preserved that wouldn’t otherwise be. More archives are more better.

It is also about creation and access, as both elements are important.

So, our goals involve advancing development of:

  • WARCreate – create WARC from what you see in your browser.
  • Web Archiving Integration Layer (WAIL)
  • Mink

WARCreate is… a Chrome browser extension to save WARC files from your browser; no credentials pass through 3rd parties. It heavily leverages the Chrome webRequest API. But it was built in 2012, and APIs and libraries have evolved, so we had to work on that. We also wanted three new modes for browser-based preservation: record mode – retain a buffer as you browse; countdown mode – preserve a reloading page on an interval; event mode – preserve a page when automatically reloaded.

So you simply click on the WARCreate button in the browser to generate WARC files – designed for non-technical people.

Web Archiving Integration Layer (WAIL) is a stand-alone desktop application; it offers collection-based web archiving, and includes Heritrix for crawling, OpenWayback for replay, and Python scripts compiled to OS-native binaries (.app, .exe). One recent advancement was a new user interface. We ported it from Python to Electron – using web technologies to create native apps, which means you can use web-native languages to help you preserve. We also moved from a single archive to collection-based archiving. We also moved from OpenWayback to pywb. And we have started doing native Twitter integration – over time and hashtags…

So, the original app was a tool to enter a URI and then get a notification. The new version is a little more complicated but provides that new collection-based interface. Right now both of these are out there… Eventually we’d like to merge the functionality. So, as an example, looking at the UK election as a collection… You can enter information, then crawl within defined boundaries… You can kill processes, or restart an old one… And this integrates with Heritrix to give the status of a task… And if you want to archive Twitter you can enter a hashtag and interval, and you can also do some additional filtering with keywords, etc. And then once it’s running you’ll get notifications.

Mink… is a Google Chrome browser extension. It indicates the archival capture count as you browse, and quickly submits a URI to multiple archives from the UI. (The name comes from Mink(owski) space.) Our recent enhancements include adding the number of archived pages to an icon at the bottom of the page, allowing users to set preferences on how to view large sets of mementos, and communication with user-specified or local archives…
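
[Note: a rough sketch of the aggregator lookup behind that capture count – fetching a Memento TimeMap for a URL and counting mementos; the Time Travel service URL is an assumption to verify:]

    # Count known captures of a URL via a Memento TimeMap (link format).
    import requests

    url = "https://www.bl.uk/"
    tm = requests.get(f"http://timetravel.mementoweb.org/timemap/link/{url}")

    # Link-format TimeMaps separate entries with commas; memento entries
    # carry rel="memento" (or "first memento" / "last memento").
    entries = tm.text.split(",")
    mementos = [e for e in entries
                if 'rel="memento"' in e
                or 'rel="first memento"' in e
                or 'rel="last memento"' in e]
    print(f"{len(mementos)} captures known to the aggregator")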

The old Mink interface could be affected by the page’s CSS, as it sat in the DOM. So we have moved to the shadow DOM, making it more reliable and easy to use. And you then have more consistent, intuitive Miller columns for many captures. It’s an integration of the live and archived web, whilst you are viewing the live web. And you can drill down by year, month, day, etc., refining to what you want to look at. And there is an icon in Mink to make a request to save the page now – with notification of status.

So, in terms of tool integration… We want to ensure integration between Mink and WAIL, so that Mink points to local archives. In the future we want to decouple Mink from the external Memento aggregator – with a client-side customisable collection of archives instead.

See: http://bit.ly/iipcWAC2017 for tools and source code.

Q&A

Q1) Do you see any qualitative difference in capture between WARCreate and WARC recorder?

A1) We capture the representation right at the moment you saw it… Not the full experience for others, but for you in a moment of time. And that’s our goal – what you last saw.

Q2) Who are your users, and do you have a sense of what they want?

A2) We have a lot of digital humanities scholars wanting to preserve Twitter and Facebook – the stream as it is now, exactly as they see it. So that’s a major use case for us.

Q3) You said it is watching as you browse… What happens if you don’t select a WARC?

A3) If you have hit record you could build up content as pages reload and are in that record mode… It will impact performance but you’ll have a better capture…

Q3) Just a suggestion but I often have 100 tabs open but only want to capture something once a week so I might want to kick it off only when I want to save it…

Q4) That real time capture/playback – are there cool communities you can see using this…

A4) Yes, I think with CNN coverage of a breaking storm allows you to see how that story evolves and changes…

Q5) Have you considered a mobile version for social media/web pages on my phone?

A5) Not currently supported… Chrome doesn’t support that… There is an app out there that lets you submit to archives, but not to create WARC… But there is a movement to making those types of things…

Q6) Personal archiving is interesting… But it is jailed in my laptop… great for personal content… But can I share my WARC files with the wider community?

A6) That’s a good idea… And more captures is better… So there should be a way to aggregate these together… I am currently working on that, but you should be able to specify what is shared and what is not.

Q6) One challenge there is about organisations and what they will be comfortable with sharing/not sharing.

Lozana Rossenova and Ilya Kreymer, Rhizome: Containerised browsers and archive augmentation

Lozana: As you probably know, Webrecorder provides high fidelity interactive recording of any web site you browse – and how you engage with it. And we have recently released a desktop app built on Electron.

Webrecorder is a worm’s eye view of archiving, tracking how users actually move around the web… For instance, for Instagram and Twitter posts around #lovewins you can see the quality is high. Webrecorder uses symmetrical archiving – in the live browser and in a remote browser… And you can capture then replay…

In terms of how we organise webrecorder: we have collections and sessions.

The thing I want to talk about today is remote browsers, and my work with Rhizome on internet art. A lot of these works actually require old browser plugins and tools… So Webrecorder enables capture and replay even where the technology is no longer available.

To clarify: the programme says “containerised” but we now refer to this as “remote browsers” – still using Docker containers to run these various older browsers.

When you go to record a site you select the browser, and the site, and it begins the recording… The Java applet runs and shows you a visualisation of how it is being captured. You can do this with Flash as well… If we open the same multimedia in a normal (Chrome) browser, it doesn’t work. Restoration is easier when it is just Flash; you need other approaches to capture Flash with other dependencies and interactions.

Remote browsers are really important for Rhizome work in general, as we use them to stage old artworks in new exhibitions.

Ilya: I will be showing some upcoming beta features, including ways to use Webrecorder to improve other archives…

Firstly, which other web archives? So I built a public web archives repository:

https://github.com/webrecorder/public-web-archives

And with this work we are using WAM – the Web Archiving Manifest. At the moment we have added a WARC source URI and a WARC creation date field to the WARC header.

So, Jefferson already talked about patching – patching remote archives from the live web… This is an approach where we patch either from the live web or from other archives, depending on what is available or missing. So, for instance, if I look at a Washington Post page in the archive from 2nd March… It shows how other archives are being patched in to deliver me a page… In the collection I have a thing called “patch” that captures this.

Once pages are patched, we introduce extraction… We are extracting again using remote archiving and automatic patching, so you combine the extraction and patching features. You create two patches and two WARC files. I’ll demo that as well… So, here’s a page from the CCA website and we can patch that… And then extract that… And then when we patch again we get the images, the richer content, a much better recording of the page. So we have two WARCs here – one from the British Library archive, one from the patching – that might be combined and used to enrich that partial UKWA capture.

Similarly we can look at a CNN page and take patches from e.g. the Portuguese archive. And once it is done we have a more complete archive… When we play this back you can display the page as it appeared, and patch files are available for archives to add to their copy.

So, this is all in beta right now but we hope to release it all in the near future…

Q&A

Q1) Every web archive already has a temporal issue where content may come from other dates than the page claims… But you could aggravate that problem. Have you considered this?

A1) Yes. There are timebounds for patching. And also around what you display to the user so they understand what they see… e.g. to patch only within the week or the month…

Q2) So it’s the closest date to what is in web recorder?

A2) The other sources are the closest successful result on/closest to the date from another site…

Q3) Rather than a fixed window for collection, seeing frequency of change might be useful to understand quality/relevance… But I think you are replaying…

A3) Have you considered a headless browser… with the address bar…

A3 – Lozana) Actually for us the key use case is about highlighting and showcasing old art works to the users. It is really important to show the original page as it appeared – in the older browsers like Netscape etc.

Q4) This is incredibly exciting. But how difficult is the patching… What does it change?

A4) If you take a good capture and a static image is missing… those are easy to patch in… If it is highly contextualised – like Facebook – that is difficult to do.

Q5) Can you do this in realtime… So you archive with Perma.cc then you want to patch something immediately…

A5) This will be in the new version I hope… So you can check other sources and fall back to other sources and scenarios…

Comment – Lozana) We have run UX work with an archiving organisation in Europe for cultural heritage, and their use case is that they use Archive-It and do QA the next day… A crawl might miss something on a highly dynamic site, so they want to be able to patch it pretty quickly.

Ilya) If you have an archive that is not in the public archive list on GitHub please do submit it as a pull request and we’ll be able to add it…

Leveraging APIs (Chair: Nicholas Taylor)

Fernando Melo and Joao Nobre: Arquivo.pt API: enabling automatic analytics over historical web data

Fernando: We are a publicly available web archive, mainly of Portuguese websites from the .pt domain. So, what can you do with our API?

Well, we built our first image search using our API, for instance a way to explore Charlie Hebdo materials; another application enables you to explore information on Portuguese politicians.

We support the Memento protocol, and you can use the Memento API. We are one of the time gates for Time Travel searches. And we also have full-text search as well as URL search, through our OpenSearch API, which we have extended to support temporal searches of the Portuguese web. Find this at: http://arquivo.pt/apis/opensearch/. Full-text search requests can be made through a URL query, e.g. http://arquivo.pt/opensearch?query=euro 2004 would search for mentions of euro 2004, and you can add parameters to this, or search as a phrase rather than keywords.
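
[Sidenote: a minimal sketch of such a full-text query from Python, based only on the URL pattern above; see http://arquivo.pt/apis/opensearch/ for the authoritative parameter list.]

```python
# Full-text search against the Arquivo.pt OpenSearch endpoint, following the
# URL pattern described in the talk. Only `query` is taken from the talk;
# the other supported parameters are documented at arquivo.pt/apis/opensearch/.
import requests

resp = requests.get(
    "http://arquivo.pt/opensearch",
    params={"query": '"euro 2004"'},  # quotes search the phrase, not keywords
    timeout=30,
)
print(resp.status_code)
print(resp.text[:500])  # OpenSearch responses are typically XML (Atom/RSS)
```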

You can also search mime types – so just within PDFs, for instance. And you can run URL searches – e.g. all pages from the New York Times website… And if you provide time boundaries, the search will look for the capture nearest to the requested date.

Joao: I am going to talk about our image search API. This works on keyword searches; you can include operators such as limiting to images from a particular site, or to particular dates… Results are ordered by relevance, recency, or type. You can also run advanced image searches, such as for icons, and you can use quotation marks for names or phrases.

The request parameters include:

  • query
  • stamp – timestamp
  • start – first index of search
  • safeImage (yes; no; all) – restricts search only to safe images.

The response is returned in JSON with total results and, for each image, URL, width, height, alt, score, timestamp, mime, thumbnail, nsfw and pageTitle fields.
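
[Sidenote: a hypothetical call using the request parameters and response fields just described. The endpoint path and the name of the results field are my placeholders, not the documented API – check http://arquivo.pt/apis before relying on them.]

```python
# Sketch of an image-search request built from the parameters listed above.
# NOTE: the "/imagesearch" path and "responseItems" field are assumptions.
import requests

params = {
    "query": "Charlie Hebdo",
    "stamp": "20150107000000",  # timestamp
    "start": 0,                 # first index of search
    "safeImage": "yes",         # restrict to safe images
}
resp = requests.get("http://arquivo.pt/imagesearch", params=params, timeout=30)
data = resp.json()
for img in data.get("responseItems", []):  # field name assumed
    print(img["url"], img["width"], img["height"], img["pageTitle"])
```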

More on all of this: http://arquivo.pt/apis

Q&A

Q1) How do you classify safe for work/not safe for work?

A1 – Fernando) This is a closed beta version. The safe/nsfw classification is based on a training set from Yahoo. We are not in the business of blocking things, but we want to be able to exclude shocking images if needed.

Q1) We have this same issue in the GifCities project – we have a manually curated training set to handle that.

Comment) Maybe you need to have more options for that measure to provide levels of filtering…

Q2) With that json response, why did you include title and alt text…

A2) We process image and extract from URL, the image text… So we capture the image, the alt text, but we thought that perhaps the page title would be interesting, giving some sense of context. Maybe the text before/after would also be useful but that takes more time… We are trying to keep this working

Q3) What is the thumbnail value?

A3) It is in Base64. But we can make that clearer in the next version…

Nicholas Taylor: Lots more LOCKSS for web archiving: boons from the LOCKSS software re-architecture

This is following on from the presentation myself and colleagues did at last year’s IIPC on APIs.

LOCKSS came about from a serials librarian and a computer scientist. They were thinking about emulating the best features of the system for preserving print journals, allowing libraries to conserve their traditional role as preservers. LOCKSS boxes would sit in each library, collecting from publishers’ websites, providing redundancy, and sharing with other libraries if and when a publication was no longer available.

18 years on this is a self-sustaining programme running out of Stanford, with tens of networks and hundreds of partners. Lots of copies isn’t exclusive to LOCKSS, but its decentralised replication model addresses the fact that long-term bit integrity is hard to ensure, and that more (correlated) copies don’t necessarily keep things safe – they can make content vulnerable to attackers. This model is community approved, published on, and well established.

Last year we started re-architecting the LOCKSS software as a series of web services. Why do this? Well, to reduce support and operation costs – taking advantage of other software from the web and web archiving worlds; to de-silo components and enable external integration – we want components to find use in other systems, especially in web archiving; and to prepare to evolve with the web, adapting our technologies accordingly.

What that means is that LOCKSS systems will treat WARC as a storage abstraction, and more seamlessly handle processing layers, proxies, etc. We already integrate Memento, but this will also let us engage with WASAPI – of which there will be more in the next talk.

We have built a service for bibliographic metadata extraction, for web harvest and file transfer content; we can map values in the DOM tree to metadata fields; we can retrieve downloadable metadata from expected URL patterns; and we can parse RIS, and XML by schema. That model shows our bias towards bibliographic material.
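
[Sidenote: LOCKSS’s extractors are Java plugins, but the underlying idea – mapping values in the DOM tree to bibliographic metadata fields – can be sketched in a few lines of Python. The citation_* meta tag names follow the common HighWire-style convention; the field mapping itself is illustrative.]

```python
# Minimal illustration of DOM-to-metadata mapping using HighWire-style tags.
from html.parser import HTMLParser

FIELD_MAP = {                       # meta tag name -> bibliographic field
    "citation_title": "title",
    "citation_author": "author",
    "citation_doi": "doi",
    "citation_publication_date": "date",
}

class MetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        field = FIELD_MAP.get(a.get("name", ""))
        if field:
            self.record.setdefault(field, []).append(a.get("content", ""))

parser = MetaExtractor()
parser.feed('<meta name="citation_title" content="An Example Article">'
            '<meta name="citation_doi" content="10.1000/example">')
print(parser.record)  # {'title': ['An Example Article'], 'doi': ['10.1000/example']}
```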

We are also using plugins to make bibliographic objects and their metadata on many publishing platforms machine-intelligible. We mainly work with publishing/platform heuristics for platforms like Atypon, Digital Commons, HighWire, OJS and Silverchair. These vary, so we have a framework for them.

The use cases for metadata extraction include applying it to consistent subsets of content in larger corpora; curating PA materials within broader crawls; retrieving faculty publications online; or retrieving from university CMSs. You can also undertake discovery via bibliographic metadata, with your institution’s OpenURL resolver.

As described in a 2005 D-Lib paper by DSHR et al., we are looking at on-access format migration – for instance, x-bitmap to GIF.

Probably the most important core preservation capability is the audit and repair protocol. Network nodes conduct polls to validate the integrity of distributed copies of data chunks. More nodes = more security – more nodes can be down, and more copies can be corrupted, while the network still functions. The nodes do not trust each other in this model and responses cannot be cached. When copies do not match, the node audits and repairs.
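
[Sidenote: a toy sketch of the idea behind nonce-salted polling – each poll uses a fresh nonce, so a node has to hash its own copy and cannot cache or copy anyone else’s answer. This is the concept only, not the actual LOCKSS protocol.]

```python
# Toy model of audit-and-repair voting over nonce-salted hashes.
import hashlib
import os

def vote(content: bytes, nonce: bytes) -> str:
    # A fresh nonce per poll means this digest cannot be precomputed or cached.
    return hashlib.sha256(nonce + content).hexdigest()

nonce = os.urandom(16)
copies = {
    "node-a": b"chunk of archived data",
    "node-b": b"chunk of archived data",
    "node-c": b"chunk of archived dat@",  # silently corrupted copy
}
votes = {node: vote(data, nonce) for node, data in copies.items()}
majority = max(set(votes.values()), key=list(votes.values()).count)
for node, v in votes.items():
    if v != majority:
        print(node, "disagrees with the majority: audit and repair this copy")
```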

We think that functionality may be useful in other distributed digital preservation networks, in repository storage replication layers. And we would like to support varied back-ends including tape and cloud. We haven’t built those integrations yet…

To date our progress has addressed the WARC work. By the end of 2017 we will have Docker-ised components, a web harvest framework, and a polling and repair web service. By the end of 2018 we will have IP address and Shibboleth access to OpenWayback…

By all means follow along and plug in. Most of our work is in a private repository, which then copies to GitHub. And we are moving more towards a community-orientated software development approach, collaborating more, and exploring use of LOCKSS technologies in other contexts.

So, I want to end with some questions:

  • What potential do you see for LOCKSS technologies for web archiving, other use cases?
  • What standards or technologies could we use that we maybe haven’t considered?
  • How could we help you to use LOCKSS technologies?
  • How would you like to see LOCKSS plug in more to the web archiving community?

Q&A

Q1) Will these work with existing LOCKSS software, and do we need to update our boxes?

A1) Yes, it is backwards compatible. And the new features are containerised so that does slightly change the requirements of the LOCKSS boxes but no changes needed for now.

Q2) Where do you store bibliographic metadata? Or is it in the WARC?

A2) It is separate from the WARC, in a database.

Q3) With the extraction of the metadata… We have some resources around translators that may be useful.

Q4 – David) Just one thing on your simplified example… For each node… they all have to calculate a new separate nonce… None of the answers are the same… They all have to do all the work… It’s actually a system where untrusted nodes are compared… And several nodes can’t gang up on the others… Each peer randomly decides when to poll on things… There is no leader here…

Q5) Can you talk about format migration…

A5) It’s a capability already built into LOCKSS but we haven’t had to use it…

A5 – David) It’s done on the requests in HTTP, which include acceptable formats… You can configure this so that if an acceptable format isn’t found, then you transform it to an acceptable format… (see the paper mentioned earlier). It is based on mime type.

Q6) We are trying to use LOCKSS as a generic archive crawler… Is that still how it will work…

A6) I’m not sure I have a definitive answer… LOCKSS will still be web harvesting-based. But it will be interesting to hear about approaches that are not web harvesting-based.

A6 – David) Also interesting for CLOCKSS which are not using web harvesting…

A6) For the CLOCKSS and LOCKSS networks – the big networks – the web harvesting portfolio makes sense. But other networks with other content types, that is becoming more important.

Comment) We looked at doing transformation that is quite straightforward… We have used an API

Q7) Can you say more about the community project work?

A7) We have largely run LOCKSS as more of an in-house project, rather than a community project. We are trying to move it more in the direction of, say, Blacklight, Hydra, etc. That is a culture change, but we see it as a benchmark of success for this re-architecting project… We are also in the process of hiring a partnerships manager, and that person will focus more on creating documentation, doing developer outreach, etc.

David: There is a (fragile) demo that has a lot of this… The goal is to continue that through the LAWS project, as a way to try this out… You can (cautiously) engage with that at demo.laws.lockss.org but it will be published to GitHub at some point.

Jefferson Bailey & Naomi Dushay: WASAPI data transfer APIs: specification, project update, and demonstration

Jefferson: I’ll give some background on the APIs. This is an IMLS funded project in the US looking at Systems Interoperability and Collaborative Development for Web Archives. Our goals are to:

  • build WARC and derivative dataset APIs (AIT and LOCKSS) and test via transfer to partners (SUL, UNT, Rutgers) to enable better distributed preservation and access
  • Seed and launch community modelled on characteristics of successful development and participation from communities ID’d by project
  • Sketch a blueprint and technical model for future web archiving APIs informed by project R&D
  • Technical architecture to support this.

So, we’ve already run WARC and digital preservation surveys. 15-20% of Archive-It users download and locally store their WARCs – for various reasons – and that number is small and hasn’t really moved; that’s why data transfer was a core area. We are doing online webinars and demos. We ran a national symposium on API-based interoperability and digital preservation, and we have white papers to come from this.

Development-wise we have created a general specification, a LOCKSS implementation, an Archive-It implementation, Archive-It API documentation, and testing and utility work (in progress). All of this is on GitHub.

The WASAPI Archive-It Transfer API is written in Python, meets all general-specification criteria, with the Swagger YAML in the repos. Authorisation uses the AIT Django framework (same as the web app), and is not defined in the general specification; we are using browser cookies or HTTP basic auth. We have a basic endpoint (in production) which returns all WARCs for that account; base/all results are paginated. In terms of query parameters you can use: filename; filetype; collection (ID); crawl (ID for an AIT crawl job), etc.

So what do you get back? A JSON object has: pagination, count, request-url, includes-extra. Files have fields including account (Archive-It ID); checksums; collection (Archive-It ID); crawl; crawl time; crawl start; filename; filetype; locations; size. And you can request these through simple HTTP queries.
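
[Sidenote: put together, a WASAPI query from Python might look like the sketch below. The partner.archive-it.org path and the exact response shape are my reading of the talk plus the public spec – treat the details as assumptions and check the GitHub documentation.]

```python
# List WARC files for an account via the WASAPI transfer API.
# The endpoint path and credential handling are assumptions; see the WASAPI
# specification and Archive-It documentation on GitHub for the real details.
import requests

resp = requests.get(
    "https://partner.archive-it.org/wasapi/v1/webdata",  # assumed base path
    params={"collection": 1234, "filetype": "warc"},     # params from the talk
    auth=("myuser", "mypassword"),                       # HTTP basic auth
    timeout=60,
)
resp.raise_for_status()
data = resp.json()
print(data["count"], "matching files")
for f in data["files"]:
    print(f["filename"], f["size"], f["locations"][0])
```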

You can also submit jobs for generating derivative datasets. We use existing query language.

In terms of what is to come, this includes:

  1. Minor AIT API features
  2. Recipes and utilities (testers welcome)
  3. Community building research and report
  4. A few papers on WA APIs
  5. Ongoing surveys and research
  6. Other APIs in WASAPI (past and future)

So we need some way to bring together these APIs regularly. And also an idea of what other APIs we need to support, and how to prioritise that.

Naomi: I’m talking about the Stanford take on this… These are the steps Nicholas, as project owner, does to download WARC files from Archive-It at the moment… It is a 13-step process… And this grant-funded work focuses on simplifying the first six steps and making the process more manageable and efficient. As a team we are really focused on not being dependent on bespoke software; things must be maintainable, with continuous integration set up, excellent test coverage, and automation. There is a team behind this work, and this was their first touch of any of this code – you had three neophytes working on this with much to learn.

We are lucky to be just down the corridor from LOCKSS. Our preferred language is Ruby, but Java would work best for LOCKSS, so we leveraged LOCKSS engineering here.

The code is at: https://github.com/sul-dlss/wasapi-downloader/.

You only need Java to run the code. And all arguments are documented in GitHub. You can also view a video demo on YouTube.

These videos are how we share our progress at the end of each Agile sprint.

In terms of work remaining, we have various tweaks, pull requests, etc. to ensure it is production-ready. One of the challenges so far has been thinking about crawls and patches, and the context of the WARC.

Q&A

Q1) At Stanford are you working with the other WASAPI APIs, or just the downloads one?

A1) I hope the approach we are taking is a welcome one. We have a lot of projects taking place, and we are limited by available software engineering cycles for archives work.

Note that we do need a new README on GitHub.

Q2) Jefferson, you mentioned plans to expand the API, when will that be?

A2 – Jefferson) I think that it is pretty much done and stable for most of the rest of the year… WARCs do not have crawl IDs or start dates – hence adding crawl time.

Naomi: It was super useful that the team that built the downloader was separate from the team building the WASAPI endpoint, as that surfaced a lot of the assumptions, issues, etc.

David: We have a CLOCKSS implementation pretty much building on the Swagger. I need to fix our ID… But the goal is that you will be able to extract stuff from a LOCKSS box using WASAPI using URL or Solr text search. But timing wise, don’t hold your breath.

Jefferson: We’d also like others feedback and engagement with the generic specification – comments welcome on GitHub for instance.

Web archives platforms & infrastructure (Chair: Andrew Jackson)

Jack Cushman & Ilya Kreymer: Thinking like a hacker: security issues in web capture and playback

Jack: We want to talk about securing web archives, and how web archives can get themselves into trouble with security… We want to share what we’ve learnt, and what we are struggling with… So why should we care about security as web archives?

Ilya: Well, web archives are not just a collection of old pages… No, high fidelity web archives run untrusted software. And there is an assumption that a live site is “safe” so there is nothing to worry about… but that isn’t right either…

Jack: So, what could a page do that could damage an archive? Not just a virus or a hack… but more than that…

Ilya: Archiving local content… Well, a capture system could have privileged access – to local ports, a network server or local files. It is a real threat, and could capture private resources into a public archive. So, mitigation: network filtering and sandboxing; don’t allow capture of local IP addresses…
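
[Sidenote: the “don’t capture local addresses” mitigation can be as simple as refusing to fetch anything that resolves to a private, loopback or link-local address – a minimal sketch below. A real crawler also needs to re-check at connect time, since DNS answers can change between check and fetch.]

```python
# Refuse to capture URLs that resolve to private/loopback/link-local space.
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_to_capture(url: str) -> bool:
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])  # sockaddr[0] is the IP string
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False
    return True

print(is_safe_to_capture("http://127.0.0.1:8080/admin"))  # False
```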

Jack: Threat: hacking the headless browser. Modern captures may use PhantomJS or other browsers on the server, and most browsers have known exploits. Mitigation: sandbox your VM.

Ilya: Stealing user secrets during capture… In a normal web flow you have other things open in the browser. Partial mitigation: rewriting – rewrite cookies to the exact path only; rewrite JS to intercept cookie access. Mitigation: separate recording sessions – for Webrecorder, use separate recording sessions when recording credentialed content. Mitigation: remote browsers.
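
[Sidenote: the cookie-narrowing idea might look something like this – rewrite a Set-Cookie header so the cookie is pinned to the exact path being recorded rather than the whole site. The header handling here is deliberately simplistic.]

```python
# Narrow a Set-Cookie header to the exact path captured in this session.
def narrow_cookie_path(set_cookie: str, exact_path: str) -> str:
    parts = [p.strip() for p in set_cookie.split(";")]
    # Drop broad scoping attributes, then pin the cookie to one path.
    kept = [p for p in parts
            if not p.lower().startswith(("path=", "domain="))]
    kept.append("Path=" + exact_path)
    return "; ".join(kept)

print(narrow_cookie_path("session=abc123; Path=/; Domain=.example.com",
                         "/record/my-session/page"))
# -> session=abc123; Path=/record/my-session/page
```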

Jack: So assume we are running MyArchive.com… Threat: cross-site scripting to steal the archive login.

Ilya: Well you can use a subdomain…

Jack: Cookies are separate?

Ilya: Not really… In IE10 an archive within the archive might steal the login cookie. And in all browsers a site can wipe and replace cookies.

Mitigation: run the web archive on a separate domain from everything else. Use iFrames to isolate web archive content: load the web archive app from the app domain, and load iFrame content from a content domain – as Webrecorder and Perma.cc both do.

Jack: Now, in our content frame… how bad could it be if that content leaks? What if we have live web leakage on playback? This can happen all the time… It’s hard to stop entirely… JavaScript can send messages back and fetch new content… to mislead, track users, or rewrite history. Bonus: for private archives – any of your captures could export any of your other captures.

The best mitigation is a Content-Security-Policy header, which can limit access to the web archive domain.
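
[Sidenote: in, say, a Flask-based replay app, that mitigation is one header on every replayed response. The domain names below are placeholders; a production policy needs careful tuning.]

```python
# Serve replayed content with a CSP that only allows the archive's domains,
# so archived JavaScript cannot call out to the live web.
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/replay/<path:url>")
def replay(url):
    resp = make_response("...archived content would be served here...")
    resp.headers["Content-Security-Policy"] = (
        "default-src 'self' https://content.myarchive.example"  # placeholder
    )
    return resp

if __name__ == "__main__":
    app.run()
```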

Ilya: Threat: showing different page contents when archived… Pages can tell they’re in an archive and act differently. Mitigation: run the archive in a containerised/proxy mode browser.

Ilya: Threat: banner spoofing… This is a dangerous but quite easy-to-execute threat. Pages can dynamically edit the archive’s banner…

Jack: Suppose I copy the code of a page that was captured and plant fake evidence, change the metadata of the date collected, and/or the URL bar…

Ilya: You can’t do that in Perma because we use frames. But if you don’t separate banner and content, this is a fairly easy exploit to pull off… So, mitigation: use iFrames for replay; don’t inject the banner into the replay frame… It’s a fidelity/security trade-off…

Jack: That’s our top 7 tips… But what next? What we introduce today is a tool called http://warc.games. This is a version of Webrecorder with every security problem possible turned on… You can run it locally on your machine to try all the exploits and think about mitigations and what to do about them!

And you can find some exploits to try, some challenges… Of course if you actually find a flaw in any real system please do be respectful.

Q&A

Q1) How much is the bug bounty?! [laughs] What do we do about the use of very old browsers…

A1 – Jack) If you use an old browser you may be compromised already… But we use the most robust solution possible… In many cases there are secure options that work with older browsers too…

Q2) Any trends in exploits?

A2 – Jack) I recommend the book The Tangled Web… And there is an aspect that when you run a web browser there will always be some sort of issue.

A2 – Ilya) We have to get around security policies to archive the web… It wasn’t designed for archiving… But that raises its own issues.

Q3) Suggestions for browser makers to make these safer?

A3) Yes, but… how do you do this with current protocols and APIs?

Q4) Does running old browsers and escaping from containers keep you awake at night?

A4 – Ilya) Yes!

A4 – Jack) If anyone is good at container escapes please do write that challenge as we’d like to have it in there…

Q5) There’s a great article called “Familiarity breeds contempt” which notes that old browsers and software get more vulnerable over time… It is particularly a big risk where you need old software to archive things…

A5 – Jack) Thanks David!

Q6) Can you say more about the headers being used?

A6) The idea is we write the CSP header to only serve from the archive server… And they can be quite complex… You may want to add something of your own…

Q7) May depend on what you see as a security issue… for me it may be about the authenticity of the archive… By building something in the website that shows different content in the archive…

A7 – Jack) We definitely think that changing the archive is a security threat…

Q8) How can you check the archives and look for arbitrary hacks?

A8 – Ilya) It’s pretty hard to do…

A8 – Jack) But it would be a really great research question…

Mat Kelly & David Dias: A collaborative, secure, and private InterPlanetary WayBack web archiving system using IPFS

David: Welcome to the session on going InterPlanetary… We are going to talk about peer-to-peer and other technology to make web archiving better…

We’ll talk about the InterPlanetary File System (IPFS) and InterPlanetary Wayback (IPWB)…

IPFS is also known as the distributed web, moving from location-based to content-based addressing… As we are aware, the web has some problems… You have experience of using a service, accessing email, using a document… There is some break in connectivity… And suddenly all those essential services are gone… Why? Why do we need to have services working in such a vulnerable way? Even a simple page: you lose a connection and you get a 404. Why?

There is a real problem with permanence… We have this URI, the URL, telling us the protocol, location and content path… But when we come back later – weeks or months – that content has moved elsewhere… Either somewhere else you can find, or somewhere you can’t. Sometimes it’s like the content has been destroyed… But every time people see a webpage, they download it to their machine… These issues come from location addressing…

In content addressing we tie content to a unique hash that identifies the item… So a Content Identifier (CID) allows us to do this… And then, in a network, when I look for that data… If there is a disruption to the network, we can ask any machine where the content is… And a node near you can show you what is available before you ever go to the network.
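
[Sidenote: the core of content addressing fits in a few lines – derive the identifier from the bytes themselves, so any node holding the same bytes can answer. Real IPFS CIDs are multihash/CID-encoded rather than the bare hex digest used here.]

```python
# Toy content-addressed store: the key is a hash of the content itself.
import hashlib

def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # stand-in for a real IPFS CID

store = {}
page = b"<html>my archived page</html>"
cid = content_id(page)
store[cid] = page                        # "publish": index the bytes by CID
assert store[content_id(page)] == page   # retrieval verifies itself by hash
print(cid)
```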

IPFS is already used in video streaming (inc. Netflix), legal documents, 3D models – with HoloLens for instance – for games, for scientific data and papers, blogs and webpages, and totally distributed web apps.

IPFS allows all of this to be distributed and available offline, saves space, optimises bandwidth usage, etc.

Mat: So I am going to talk about IPWB. The motivation here is that the persistence of archived web data is dependent on the resilience of the organisation and the availability of the data. The design extends the CDXJ format with an indexing and IPFS dissemination procedure, and a replay and IPFS pull procedure. An adapted CDXJ entry adds the hashes for the content to the metadata structure.
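
[Sidenote: conceptually, an IPWB-style CDXJ entry couples a SURT-ordered URI key and timestamp with a JSON block pointing at the WARC record’s header and payload hashes in IPFS. Field names below are illustrative rather than the exact IPWB schema – see https://github.com/oduwsdl/ipwb for the real format.]

```python
# Build an illustrative CDXJ-style index line with IPFS locators.
import json

entry_key = "com,example)/"       # SURT-form of http://example.com/
timestamp = "20170614120000"
record = {
    # Hashes would come from `ipfs add` of the WARC header and payload.
    "locator": "urn:ipfs/<header-hash>/<payload-hash>",
    "mime_type": "text/html",
    "status_code": "200",
}
print(entry_key, timestamp, json.dumps(record))
```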

Dave: One of the ways IPFS is pushing at the boundary is in the browser tab, in a browser extension, and in a service worker acting as a proxy for requests the browser makes, with no changes to the interface (that one is definitely in alpha!)…

So IPWB can expose the content to IPFS, and then you can connect and do everything in the browser without needing to download and execute code on your machine. Building it into the browser makes it easy to use…

Mat: And IPWB enables privacy, collaboration and security, building the encryption method and key into the WARC. Similarly CDXJs may be transferred for our users’ replay… Ideally you won’t need a CDXJ on your own machine at all…

We are also rerouting, rather than rewriting, for archival replay… We’ll be presenting on that later this summer…

And I think we just have time for a short demo…

For more see: https://github.com/oduwsdl/ipwb

Q&A

Q1) Mat, I think that you should tell that story of what you do…

A1) So, I looked for files on another machine…

A1 – Dave) When Mat has the archive file on a remote machine… someone looks for this hash on the network – “send it my way as I have it”… So when Mat looked, it replied… the content was discovered… request issued, content received… and presented… And that also lets you capture pages appearing differently in different places and easily access them…

Q2) With the hash addressing, are there security concerns…

A2 – Dave) We use multihash, with SHA-256 as the default… But you can use different hash functions; they just verify the link… In IPFS we prevent that issue with self-describing hash functions…

Q3) The problem is that the hash function does end up in the URL… and it will decay over time because the hash function will decay… It’s a really hard problem to solve – making a choice now that may be wrong… But there is no way of knowing the right choice.

A3) At least we can use the hash function to indicate whether it looks likely to be the right or wrong link…

Q4) Is hash functioning itself useful with or without IPFS… Or is content addressing itself inherently useful?

A4 – Dave) I think IPLD is useful anyway… So with legal documents, where links have to stay intact and not be part of the open web, IPFS can work to restrict that access but still make this more useful…

Q5) If we had a content addressable web, almost all these web archiving issues would be resolved… It is hard to know if content is in Archive 1 or Archive 2; a content addressable web would make content easier to archive. Important to keep in mind…

A5 – Dave) I 100% agree! A content-addressed web lets you understand what is important to capture. And IPFS saves a lot of bandwidth and a lot of storage…

Q6) What is the longevity of the hashes and how do I check that?

A6 – Dave) OK, you can check the integrity of the hash. And we have filecoin.io, which is a blockchain-based storage network and cryptocurrency, and that does handle this information… using an address in a public blockchain… That’s our solution for some of those specific problems.

Andrew Jackson (AJ), Jefferson Bailey (JB), Kristinn Sigurðsson (KS) & Nicholas Taylor (NT): IIPC Tools: autumn technical workshop planning discussion

AJ: I’ve been really impressed with what I’ve seen today. There is a lot of enthusiasm for open source and collaborative approaches and that has been clear today and the IIPC wants to encourage and support that.

Now, in September 2016 we had a hackathon but there were some who just wanted to get something concrete done… And we might therefore adjust the format… Perhaps pre-define a task well ahead of time… But also run a parallel track for the next hackathon/more experimental side. Is that a good idea? What else might work?

JB: We looked at Archives Unleashed, and we did a White House Social Media Hackathon earlier this year… This is a technical track but… it’s interesting to think about what kind of developer skills/what mix will work best… We have lots of web archiving engineers, but they don’t use the software that comes out of it… We find it useful to have archivists in the room…

Then, from another angle, at the hackathons… IIPC doesn’t have a lot of money and travel is expensive… The impact of that gets debated – it’s a big budget line for 8-10 institutions out of 53 members. The outcomes are obviously useful but… expecting people to be totally funded for days on end across the world isn’t feasible… So maybe more little events, or fewer bigger events, could work…

Comment 1) Why aren’t these sessions recorded?

JB: Too much money. We have recorded some of them… Sometimes it happens, sometimes it doesn’t…

AJ: We don’t have in-house skills, so it’s third party… And that’s the issue…

JB: It’s a quality thing…

KS: But also, when we’ve done it before, it’s not heavily watched… And the value can feel questionable…

Comment 1) I have a camera at home!

JB: People can film whatever they want… But that’s on people to do… IIPC isn’t an enforcement agency… But we should make it clear that people can film them…

KS: For me… You guys are doing incredible things… And it’s things I can’t do at home. The other aspect is that… There are advancements that never quite happened… But I think there is value in the unconference side…

AJ: One of the things with unconference sessions is that…

NT: I didn’t go to the London hackathon… Now that we have a technical team, it’s more appealing… The conference in general is good for surfacing issues we have in common… such as extraction of metadata… But there is also the question of when we sit down to deal with some specific task… That could be useful for taking things forward…

AJ: I like the idea of a counter conference, focused on the tools… I was a bit concerned that if there were really specific things… What does it need to be to be worth your organisation flying you to it… Too narrow and it’s exclusionary… Too broad and maybe it’s not helpful enough…

Comment 2) Worth seeing the model used by Python – they have a sprint after their conference. That isn’t an unconference but it lets you come together. Mozilla Fest Sprint picks a topic and then next time you work on it… Sometimes other organisations with less money are worth looking at… And for things like crowdsourcing coverage etc… there must be models…

AJ: This is cool… You will have to push on this…

Comment 3) I think that tacking on to a conference helps…

KS: But challenging to be away from office more than 3/4 days…

Comment 4) Maybe look at NodeJS Community and how they organise… They have a website, NodeSchool.io with three workshops… People organise events pretty much monthly… And create material in local communities… Less travel but builds momentum… And you can see that that has impact through local NodeJS events now…

AJ: That would be possible to support as well… with IIPC or organisational support… Bootstrapping approaches…

Comment 5) Other than hackathon there are other ways to engage developers in the community… So you can engage with Google Summer of Code for instance – as mentors… That is where students look for projects to work on…

JB: We have two GSoC students and something like eight working without funding at the moment… But it’s non-trivial to manage that…

AJ: Onboarding new developers in any way would be useful…

Nick: Onboarding into the weird and wacky world of web archiving… If IIPC can curate a lot of onboarding stuff, that would be really good for potential newcomers… for getting started… Not relying on a small number of people…

AJ: We have to be careful as IIPC tools page is very popular, but hard to keep up to date… Benefits can be minor versus time…

Nick: Do you have GitHub? Just put up an awesome list!

AJ: That’s a good idea…

JB: Microfunding projects – sub $10k is also an option for cost recovered brought out time for some of these sorts of tasks… That would be really interesting…

Comment 6) To expand on what Jefferson and Nick were saying… I’m really new – I went to my first IIPC event in April. I am enjoying this and learning a lot… I’ve been talking to a lot of you… That would really help more people get the technical environment right… Organisations want to get into archiving on a small scale…

Olga: We do have a list on GitHub… but it is not up to date or well used…

AJ: We do have this document, and we have GitHub… But we could refer to each other… and point to the getting started stuff (only). I’d rather get away from lists…

Comment 7) Google has an OpenSource.guide page – could take inspiration from that… Licensing, communities, etc… Very simple plain English getting started guide/documentation…

Comment 8) I’m very new to the community… And I was wondering to what extent you use Slack and Twitter between events to maintain these conversations and connections?

AJ: We have a Slack channel – we haven’t publicised it particularly, but it’s there… And on Twitter you should tweet @NetPreserve and they will retweet, then this community will see it…

Apr 092017
 
Digital Footprint MOOC logo

Last Monday we launched the new Digital Footprint MOOC, a free three week online course (running on Coursera) led by myself and Louise Connelly (Royal (Dick) School of Veterinary Studies). The course builds upon our work on the Managing Your Digital Footprints research project, campaign and also draws on some of the work I’ve been doing in piloting a Digital Footprint training and consultancy service at EDINA.

It has been a really interesting and demanding process working with the University of Edinburgh MOOCs team to create this course, particularly focusing in on the most essential parts of our Digital Footprints work. Our intention for this MOOC is to provide an introduction to the issues and equip participants with appropriate skills and understanding to manage their own digital tracks and traces. Most of all we wanted to provide a space for reflection and for participants to think deeply about what their digital footprint means to them and how they want to manage it in the future. We don’t have a prescriptive stance – Louise and I manage our own digital footprints quite differently but both of us see huge value in public online presence – but we do think that understanding and considering your online presence and the meaning of the traces you leave behind online is an essential modern life skill and want to contribute something to that wider understanding and debate.

Since MOOCs – Massive Open Online Courses – are courses which people tend to take in their own time, for pleasure and interest but also as part of their CPD and personal development, the format seemed like a good fit for digital footprint skills and reflection, along with some of the theory and emerging trends from our research work. We also think the course has potential to be used in supporting digital literacy programmes and activities, for those looking for skills for transitioning into and out of education, and in developing their careers. On that note we were delighted to see the All Aboard: Digital Skills in Higher Education‘s 2017 event programme running last week – their website, created to support digital skills in Ireland, is a great complementary resource to our course, which we made a (small) contribution to during their development phase.

Over the last week it has been wonderful to see our participants engaging with the Digital Footprint course, sharing their reflections on the #DFMOOC hashtag, and really starting to think about what their digital footprint means for them. From the discussion so far the concept of the “Uncontainable Self” (Barbour & Marshall 2012) seems to have struck a particular chord for many of our participants, which is perhaps not surprising given the degree to which our digital tracks and traces can propagate through others posts, tags, listings, etc. whether or not we are sharing content ourselves.

When we were building the MOOC we were keen to reflect the fact that our own work sits in a context of, and benefits from, the work of many researchers and social media experts both in our own local context and the wider field. We were delighted to be able to include guest contributors including Karen Gregory (University of Edinburgh), Rachel Buchanan (University of Newcastle, Australia), Lilian Edwards (Strathclyde University), Ben Marder (University of Edinburgh), and David Brake (author of Sharing Our Lives Online).

The usefulness of making these connections across disciplines and across the wider debate on digital identity seems particularly pertinent given recent developments that emphasise how fast things are changing around us, and how our own agency in managing our digital footprints and digital identities is being challenged by policy, commercial and social factors. Those notable recent developments include…

On 28th March the US Government voted to remove restrictions on the sale of data by ISPs (Internet Service Providers), potentially allowing them to sell an incredibly rich picture of browsing, search, behavioural and intimate details without further consultation (you can read the full measure here). This came as the UK Government mooted the banning of encryption technologies – essential for private messaging, financial transactions, access management and authentication – claiming that terror threats justified such a wide ranging loss of privacy. Whilst that does not seem likely to come to fruition given the economic and practical implications of such a measure, we do already have the  Investigatory Powers Act 2016 in place which requires web and communications companies to retain full records of activity for 12 months and allows police and security forces significant powers to access and collect personal communications data and records in bulk.

On 30th March, a group of influential privacy researchers, including danah boyd and Kate Crawford, published Ten simple rules for responsible big data research in PLoS ONE. The article/manifesto is an accessible and well argued guide to the core issues in responsible big data research. In many ways it summarises the core issues highlighted in the excellent (but much more academic and comprehensive) AoIR ethics guidance. The PLoS ONE article is notably directed to academia as well as industry and government, since big data research is at least as much a part of commercial activity (particularly social media and data driven start ups, see e.g. Uber’s recent attention for profiling and manipulating drivers) as traditional academic research contexts. Whilst academic research does usually build ethical approval processes (albeit conducted with varying degrees of digital savvy) and peer review into research processes, industry is not typically structured in that way and is often not held to the same standards, particularly around privacy and boundary crossing (see, e.g., Michael Zimmer’s work on both academic and commercial use of Facebook data).

The Ten simple rules… are also particularly timely given the current discussion of Cambridge Analytica and its role in the 2016 US Election, and the UK’s EU Referendum. An article published in Das Magazin in December 2016, and a subsequent English language version published on Vice’s Motherboard, have been widely circulated on social media over recent weeks. These articles suggest that the company’s large scale psychometrics analysis of social media data essentially handed victory to Trump and the Leave/Brexit campaigns, which naturally raises personal data and privacy concerns as well as influence, regulation and governance issues. There remains some scepticism about just how influential this work was… I tend to agree with Aleks Krotoski (social psychologist and host of BBC’s The Digital Human) who – speaking with Pat Kane at an Edinburgh Science Festival event last night on digital identity and authenticity – commented that she thought the Cambridge Analytica work was probably a mix of significant hyperbole but also some genuine impact.

These developments focus attention on access, use and reuse of personal data and personal tracks and traces, and that is something we hope our MOOC participants will have the opportunity to pause and reflect on as they think about what they leave behind online when they share, tag and delete, and particularly when they consider terms and conditions, privacy settings and how they curate what is available and to whom.

So, the Digital Footprint course is launched and open to anyone in the world to join for free (although Coursera will also prompt you with the – very optional – possibility of paying a small fee for a certificate), and we are just starting to get a sense of how our videos and content are being received. We’ll be sharing more highlights from the course, retweeting interesting comments, etc. throughout this run (which began on Monday 3rd April), but also future runs since this is an “on demand” MOOC which will run regularly every four weeks. If you do decide to take a look then I would love to hear your comments and feedback – join the conversation on #DFMOOC, or leave a comment here or email me.

And if you’d like to find out more about our digital footprint consultancy, or would be interested in working with the digital footprints research team on future work, do also get in touch. Although I’ve been working in this space for a while this whole area of privacy, identity and our social spaces seems to continue to grow in interest, relevance, and importance in our day to day (digital) lives.

 

Mar 142017
 

Today and tomorrow I’m in Birmingham for the Jisc Digifest 2017 (#digifest17). I’m based on the EDINA stand (stand 9, Hall 3) for much of the time, along with my colleague Andrew – do come and say hello to us – but will also be blogging any sessions I attend. The event is also being livetweeted by Jisc and some sessions livestreamed – do take a look at the event website for more details. As usual this blog is live and may include typos, errors, etc. Please do let me know if you have any corrections, questions or comments. 

Plenary and Welcome

Liam Earney is introducing us to the day, with the hope that we all take something away from the event – some inspiration, an idea, the potential to do new things. Over the past three Digifest events we’ve taken a broad view. This year we focus on technology expanding and enabling learning and teaching.

LE: So we will be talking about questions we asked through Twitter and through our conference app with our panel:

  • Sarah Davies (SD), head of change implementation support – education/student, Jisc
  • Liam Earney (LE), director of Jisc Collections
  • Andy McGregor (AM), deputy chief innovation officer, Jisc
  • Paul McKean (PM), head of further education and skills, Jisc

Q1: Do you think that greater use of data and analytics will improve teaching, learning and the student experience?

  • Yes 72%
  • No 10%
  • Don’t Know 18%

AM: I’m relieved at that result as we think it will be important too. That is backed up by evidence emerging in the US and Australia around data analytics use in retention and attainment. There is a much bigger debate around AI and robots, and around Learning Analytics there is that debate about humans and data, and how human and machine can work together. We have several sessions in that space.

SD: Learning Analytics has already been around its own hype cycle… We had huge headlines about the potential about a year ago, but now we are seeing much more in-depth discussion, discussion around making sure that our decisions are data informed… There is concern around the role of the human here, but the tutors, the staff, are the people who access this data and work with students, so it is about human and data together, and that’s why adoption is taking a while as they work out how best to do that.

Q2: How important is organisational culture in the successful adoption of education technology?

  • Total make or break 55%
  • Can significantly speed it up or slow it down 45%
  • It can help but not essential 0%
  • Not important 0%

PM: Where we see education technology adopted we do often see that organisational culture can drive technology adoption. An open culture – for instance Reading College’s open door policy around technology – can really produce innovation and creative adoption, as people share experience and ideas.

SD: It can also be about what is recognised and rewarded. About making sure that technology is more than what the innovators do – it’s something for the whole organisation. It’s not something that you can do in small pockets. It’s often about small actions – sharing across disciplines, across role groups, about how technology can make a real difference for staff and for students.

Q3: How important is good quality content in delivering an effective blended learning experience?

  • Very important 75%
  • It matters 24%
  • Neither 1%
  • It doesn’t really matter 0%
  • It is not an issue at all 0%

LE: That’s reassuring, but I guess we have to talk about what good quality content is…

SD: I think materials – good quality primary materials – make a huge difference, there are so many materials we simply wouldn’t have had (any) access to 20 years ago. But also about good online texts and how they can change things.

LE: My colleague Karen Colbon and I have been doing some work on making more effective use of technologies… Paul you have been involved in FELTAG…

PM: With FELTAG, I was pleased when that came out 3 years ago, but I think only now have we moved on from the myth of 10% online making blended learning… and towards a proper debate about what blended learning is, and what is relevant, not just what is described. And the need for good quality support to enable that.

LE: What’s the role for Jisc there?

PM: I think it’s about bringing the community together, about focusing on the learner and their experience, rather than the content, to ensure that overall the learner gets what they need.

SD: It’s also about supporting people to design effective curricula too. There are sessions here, talking through interesting things people are doing.

AM: There is a lot of room for innovation around the content. If you are walking around the stands there is a group of students from UCL who are finding innovative ways to visualise research, and we’ll be hearing pitches later with some fantastic ideas.

Q4: Billions of dollars are being invested in edtech startups. What impact do you think this will have on teaching and learning in universities and colleges?

  • No impact at all 1%
  • It may result in a few tools we can use 69%
  • We will come to rely on these companies in our learning and teaching 21%
  • It will completely transform learning and teaching 9%

AM: I am towards the 9% here, there are risks but there is huge reason for optimism here. There are some great companies coming out and working with them increases the chance that this investment will benefit the sector. Startups are keen to work with universities, to collaborate. They are really keen to work with us.

LE: It is difficult for universities to take that punt, to take that risk on new ideas. Procurement, governance, are all essential to facilitating that engagement.

AM: I think so. But I think if we don’t engage then we do risk these companies coming in and building businesses that don’t take account of our needs.

LE: Now that’s a big spend taking place for that small potential change that many who answered this question perceive…

PM: I think there are savings that will come out of those changes potentially…

AM: And in fact that potentially means saving money on tools we currently use by adopting new ones, and investing that in staff…

Q5: Where do you think the biggest benefits of technology are felt in education?

  • Enabling or enhancing learning and teaching activities 55%
  • In the broader student experience 30%
  • In administrative efficiencies 9%
  • It’s hard to identify clear benefits 6%

SD: I think many of the big benefits we’ve seen over the last 8 years have been around things like online timetables – the wider student experience and administrative spaces. But we are also seeing that, when used effectively, technology can really enhance the learning experience. We have a few sessions here around that. Key here are the digital capabilities of staff and students – whether awareness, confidence, or understanding of fit with disciplinary practice. Lots here at Digifest around digital skills. [sidenote: see also our new Digital Footprint MOOC which is now live for registrations]

I’m quite surprised that 6% thought it was hard to identify clear benefits… There are still lots of questions there, and we have a session on evidence based practice tomorrow, and how evidence feeds into institutional decision making.

PM: There is something here around the Apprentice Levy which is about to come into place. A surprisingly high percentage of employers aren’t aware that they will be paying that actually! Technology has a really important role here for teaching, learning and assessment, but also tracking and monitoring around apprenticeships.

LE: So, with that, I encourage you to look around, chat to our exhibitors, craft the programme that is right for you. And to kick that off here is some of the brilliant work you have been up to. [we are watching a video – this should be shared on today’s hashtag #digifest17]
And with that, our session ended. For the next few hours I will mainly be on our stand but also sitting in on Martin Hamilton’s session “Loving the alien: robots and AI in education” – look out for a few tweets from me and many more from the official live tweeter for the session, @estherbarrett.

Plenary and keynote from Geoff Mulgan, chief executive and CEO, Nesta (host: Paul Feldman, chief executive, Jisc)

Paul Feldman: Welcome to Digifest 2017, and to our Stakeholder Meeting attendees who are joining us for this event. I am delighted to welcome Geoff Mulgan, chief executive of Nesta.

Geoff: Thank you all for being here. I work at Nesta. We are an investor in quite a few ed tech companies, and we run a lot of experiments in schools and universities… And I want to share with you two frustrations. The whole area of ed tech is, I think, one of the most exciting, perhaps ever! But the whole field is frustrating… In Britain we have phenomenal tech companies, and phenomenal universities high in the rankings… But too rarely do we bring these together, and we don’t see that vision from ministers either.

So, I’m going to talk about the promise – some of the things that are emerging and developing. I’ll talk about some of the pitfalls – some of the things that are going wrong. And some of the possibilities of where things could go.

So, first of all, the promise. We are going through yet another wave – or series of waves – of Google, Watson, DeepMind, Fitbits, sensors… We are at least 50 years into the “digital revolution” and yet the pace of change isn’t letting up – Moore’s Law still applies. So, finding the applications is as exciting and challenging as ever.

Last year DeepMind defeated a champion of Go. People thought it was impossible for a machine to win at Go, because of the intuition involved. That cutting edge technology is now being used in London with blood test data to predict who may be admitted to hospital in the next year.

We have also seen these free online bitesize platforms – Coursera, Udacity, etc. – these challenges to traditional courses. And we have Google Translate, which in November 2016 adopted a neural machine translation engine that can translate whole sentences… Google Translate may be a little clunky still, but we are moving toward that Hitchhiker’s Guide to the Galaxy idea of the Babel fish. In January 2017 a machine-learning powered poker bot outcompeted 20 of the world’s best. We are seeing more of these events… The Go contest was observed by 280 million people!

Much of this technology is feeding into this emerging Ed Tech market. There are MOOCs, there are learning analytics tools, there is a huge range of technologies. The UK does well here… When you talk about education you have to talk about technology, not just bricks and mortar. This is a golden age but there are also some things not going as they should be…

So, the pitfalls. There is a lack of understanding of what works. Nesta did a review 3 years ago of school technologies and it was quite negative in terms of return on investment. And the OECD similarly compared spend with learning outcomes and found a negative correlation. One of the odd things about this market is that it has invested very little in using control groups and gathering the evidence.

And where is the learning about learning? When the first MOOCs appeared I thought it was extraordinary that they showed little interest in decades of knowledge and understanding about elearning, distance learning, and online learning. They just shared materials. It’s not just the cognitive elements – you need peers, you need someone to talk to. There is a common finding over decades that you need that combination of peer and social elements and content – that’s one of the reasons I like FutureLearn, as it combines those more directly.

The other thing that is missing is the business models. Few ed tech companies make money… They haven’t looked at who will pay, or how much they should pay… And I think that reflects, to an extent, the world view of computer scientists…

And I think that, business model wise, some of the possibilities are quite alarming. Right now many of the digital tools we use are based on collecting our data – the advertiser is the customer, you are the product. And I think some of our ed tech providers, having failed to raise income from students, are somewhat moving in that direction. We are also seeing household data and the internet of things, and my guess is that the impact of these will raise much more awareness of privacy, security, and use of data.

The other thing is jobs and future jobs. Some of you will have seen these analyses of jobs and the impact of computerisation. Looking over the last 15 years we’ve seen big shifts here… Technical and professional knowledge has been relatively well protected. But there is also a study (Frey, C and Osborne, M 2013) that looks at those at low risk of computerisation and automation – dentists are safe! – and those at high risk, which includes estate agents and accountants, but also actors and performers. We see huge change here. In the US one of the most popular jobs in some areas is truck driver – and they are at high risk here.

We are doing work with Pearson to look at job market requirements – this will be published in a few months’ time – to help educators prepare students for this world. The jobs likely to grow are around creativity, social intelligence, and also dexterity – walking over uneven ground, fine manual skills. If you combine those skills with deep knowledge of technology, or of specialised fields, you should be well placed. But we don’t see schools and universities shaping their curricula to these types of needs. Is there a conscious effort to look ahead and to think about what 16-22 year olds should be doing now to be well placed in the future?

In terms of more positive possibilities… Some of those I see coming into view… One of these is Skills Route, which was launched for teenagers. It’s an open data set which generates a data-driven guide for teenagers about which subjects to study – allowing them to see what jobs they might get, what income they might attract, even how happy they will be, depending on their subject choices. These insights are driven by data, including understanding of what jobs may be there in 10 years’ time. Students may have a better idea of what they need than many of their teachers, their lecturers, etc.
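[sidenote: as an illustration of the kind of lookup a guide like Skills Route could be built on – this is a minimal Python sketch with an entirely invented dataset and column names, not Skills Route’s actual data or method:]

    import pandas as pd

    # Invented open data: one row per subject combination, with
    # modelled outcomes attached (all figures illustrative only).
    outcomes = pd.DataFrame({
        "subjects": ["maths+physics", "art+design", "biology+chemistry"],
        "example_job": ["data analyst", "graphic designer", "lab technician"],
        "median_earnings": [32000, 24000, 27000],
        "projected_demand_2027": ["high", "medium", "high"],
    })

    def guide(chosen_subjects):
        # Look up the modelled outcomes for a teenager's subject choices.
        return outcomes[outcomes["subjects"] == chosen_subjects]

    print(guide("maths+physics"))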

We are also seeing a growth of adaptive learning. We are an investor in CogBooks, which is a great example. This is a game changer in terms of how education happens. The way the AI is built makes it easier for students to have materials adapt to their needs and their styles.
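[sidenote: the talk doesn’t describe CogBooks’ actual algorithms, but the core loop of adaptive learning can be sketched very simply – keep an estimate of the learner’s mastery of each topic, update it after every answer, and serve the material they most need next. All names and parameters below are hypothetical:]

    # Minimal adaptive-learning loop: track estimated mastery per topic,
    # update after each answer, and pick the weakest topic next.
    mastery = {"fractions": 0.5, "ratios": 0.5, "percentages": 0.5}
    LEARNING_RATE = 0.2

    def record_answer(topic, correct):
        # Nudge the mastery estimate toward 1 (correct) or 0 (incorrect).
        target = 1.0 if correct else 0.0
        mastery[topic] += LEARNING_RATE * (target - mastery[topic])

    def next_topic():
        # Serve material for the topic with the lowest estimated mastery.
        return min(mastery, key=mastery.get)

    record_answer("fractions", correct=True)
    record_answer("ratios", correct=False)
    print(next_topic())  # -> "ratios"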

My colleagues are working with big cities in England, including Birmingham, to establish Offices of Data Analytics (and data marketplaces), which can enable understanding of, for example, which buildings are at risk of fire, so risks can be mitigated before fire fighting is needed. I think there are, again, huge opportunities for education. Get into conversations with cities and towns, to use the data commons – which we have but aren’t (yet) using to the full extent of its potential.
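[sidenote: the fire-risk example is essentially a standard supervised learning problem – train on buildings with known outcomes, then rank the rest by predicted risk so inspections go where they matter most. A minimal scikit-learn sketch, with entirely invented features and data:]

    from sklearn.linear_model import LogisticRegression

    # Invented features per building: [age_years, past_callouts, has_sprinklers]
    X_train = [[80, 3, 0], [10, 0, 1], [55, 1, 0], [5, 0, 1]]
    y_train = [1, 0, 1, 0]  # 1 = had a fire incident

    model = LogisticRegression().fit(X_train, y_train)

    # Rank unseen buildings by predicted risk, highest first.
    candidates = [[70, 2, 0], [15, 0, 1]]
    risks = model.predict_proba(candidates)[:, 1]
    for building, risk in sorted(zip(candidates, risks), key=lambda p: -p[1]):
        print(building, round(risk, 2))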

We are doing a project called Arloesiadur in Wales which is turning big data into policy action. This allows policy makers in Welsh Government to have a rich real time picture of what is taking place in the economy, including network analyses of investors and researchers, to help understand emerging fields and targets for new investment and support. This turns the hit and miss craft skill of investment into something more accurate, more data driven. Indeed, work on the complexity of the economy shows that economic complexity maps to higher average annual earnings. This goes against some of the smart cities expectations – which push toward more homogeneous environments. Instead, diversity and complexity are beneficial.
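[sidenote: the network analysis side of this is easy to picture – a sketch with the networkx library, using invented collaboration data; Arloesiadur’s actual pipeline is of course far richer:]

    import networkx as nx

    # Invented edges linking investors/researchers to fields they work in.
    edges = [
        ("Investor A", "sensors"), ("Investor A", "med-tech"),
        ("Lab X", "sensors"), ("Lab X", "data science"),
        ("Investor B", "data science"), ("Lab Y", "med-tech"),
    ]
    G = nx.Graph()
    G.add_edges_from(edges)

    # Fields connected to many distinct actors may signal emerging clusters.
    centrality = nx.degree_centrality(G)
    fields = {field for _, field in edges}
    for field in sorted(fields, key=lambda f: -centrality[f]):
        print(field, round(centrality[field], 3))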

We host at Nesta the “Alliance for Useful Evidence”, which includes a network of around 200 people trying to ensure evidence is used and useful. Out of that we have a series of “What Works” centres – NICE (health and care); Education Endowment Foundation; Early Intervention Foundation; Centre for Ageing Better; College of Policing (crime reduction); Centre for Local Economic Growth; What Works Wellbeing… But bizarrely we don’t have one of these for education and universities. These centres help organisations to understand where evidence for particular approaches exists.

To try and fill the gap a bit for universities we’ve worked internationally with the Innovation Growth Lab to understand investment in research and what actually works – applying scientific methods to areas on the boundaries of the university. In many ways our current environment does very little of that.

The other side of this is the issue of creativity. In China the president of one university felt it wasn’t enough for students to be strong in engineering; they needed to solve problems. So we worked with them to create programmes for students to create new work, addressing problems and questions without existing answers. There are comparable programmes elsewhere – students facing challenges and problems, not starting with the knowledge. It’s only part of the solution… But some work like this can work really well. At Harvard students are working with local authorities and there is a lot of creative collaboration across ages, experience, and approaches. In the UK there isn’t any university doing this at serious scale, and I think this community can have a role here…

So, what to lobby for? I’ve worked a lot with government – we’ve worked with about 40 governments across the world – and I’ve seen vice chancellors and principals who have access to government, and they usually lobby for something that looks like the present – small changes. I have never seen them lobby for substantial change, for more connection with industry, for investment and ambition at the very top. The leaders argue for the needs of the past, not the present. That isn’t true in other industries – they look ahead, and make that central to their case. I think that’s part of why we don’t see this coming together in an act of ambition like we saw in the 1960s when the Open University was founded.

So, to end…

Tilt Brush is one of the most interesting things to emerge in the last few years – a 3D virtual world that allows you to paint in the air with a virtual brush. It is exciting because no-one knows how to do this yet; it is uncharted territory. It will be, I think, a powerful learning tool. It’s a way to experiment and learn…

But the other side of the coin… The British public’s favourite painting is The Fighting Temeraire… An ugly steamboat tugs a beautiful old sailing ship in to be smashed up. It is about technological change… but also about why change is hard. The old ship is more beautiful, tied up with woodwork and carpentry skills, culture, songs… There is a real poetry… But its message is that if we don’t go through that, we don’t create space for the new. We are too attached to the old models to let them go – especially the leaders who came through those old models. We need to create those Tilt Brushes, but we also have to create space for the new to breathe as well.

Q&A

Q1 – Amber Thomas, Warwick) Thinking about the use of technology in universities… There is research on technology in education, and I think you point to a disconnect between the big challenges from research councils and how research is disseminated, a disconnect between policy and practice, and a lack of availability of information to practitioners. But also I wanted to say that BECTA used to have some of that role for experimentation, and that went in the “bonfire of the quangos”. And what should Jisc’s role be here?

A1) There is all of this research taking place but it is often not used. That emphasis on “Useful Evidence” is important. Academics are not always good at this… What will enable a busy head teacher, a busy tutor, to actually understand and use that evidence? There are some spaces for education at schools level but there is a gap for universities. BECTA was a loss. There is a lack of Ed Tech strategy. There is real potential. To give an example… We have been working with finance, forcing banks to open up data, with banks required by the regulator to fund creative use of that data to help small firms understand their finances. That’s a very different role for the regulator… But I’d like to see institutions willing to do more of that.

A1 – PF) And I would say we are quietly activist.

Q2) To go back to the Hitchhikers Guide issue… Are we too timid in universities?

A2) There is a really interesting history of radical universities – some with no lectures, some with no walls, in Paris a short-lived experiment handing out degrees to strangers on buses! Some were totally student driven. My feeling is that that won’t work; it’s like music – you need some structure, some grammars… I like challenge driven universities because they aren’t *that* groundbreaking… You have some structure and content, you have interdisciplinary teams, you have assessment there… It is a space for experimentation. You need some systematic experimentation on the boundaries… Some creative laboratories on the edge to inform the centre, with some of that quite radical. And I think that we lack those… Things like the Coventry SONAR (?) course for photography which allowed input from the outside, a totally open course including discussion and community… But those sorts of experiments tend not to be in a structure… And I’d like to see systematic experimentation.

Q3 – David White, UAL) When you put up your ed tech slide, a lot of students wouldn’t recognise that as they use lots of free tools – Google etc. Maybe your old warship is actually the market…

A3) That’s a really difficult question. In any institution of any size, students will make use of the cornucopia of free things – Google Hangouts and YouTube. That’s probably why the Ed Tech industry struggles so much – people are used to free things. Google isn’t free – you indirectly pay through the sale of your data, as with Facebook. Wikipedia is free but philanthropically funded. I don’t know if that model of Google etc. can continue as we become more aware of data and data use concerns. We don’t know where the future is going… We’ve just started a new project with Barcelona and Amsterdam around the idea of the Data Commons, which doesn’t depend on the sale of data to advertisers etc., but that faces the issue of who will pay. My guess is that the free data-based model may last up to 10 years, but then something will change…

How can technology help us meet the needs of a wider range of learners?

Pleasing Most of the People Most of the Time – Julia Taylor, subject specialist (accessibility and inclusion), Jisc.

I want to tell you a story about buying LEGO for a young child… My kids loved LEGO and it’s changed a lot since then… I bought a child this pack with lots of little LEGO people with lots of little hats… And this child just sort of left all the people on the carpet, because they wanted the LEGO people to choose their own hats and toys… And that was disappointing… I use that example because there is an important role in helping individuals find the right tools for themselves. The ultimate goal of digital skills and inclusion is to give people the skills and confidence to choose and use the appropriate tools…

We’ve never had more tools for giving people independence… But what is the potential of technology, and how can it be selected and used? We’ll hear more about delivery and use of technology in this context. But I want to talk about what technology is capable of delivering…

Technology gives us the tools for digital diversity, allowing students to be independent in how they access and engage with our content. That kind of collaboration can be as meaningful internationally as it is for learners who have to fit studies around, say, shift work. It allows learners to do things the way they want to do them. That idea of independent study through digital technology is really important. So these tools afford digital skills; the tools remove barriers and/or enable students to overcome them. Technology allows learners with different needs to overcome challenges – perhaps physical disability, perhaps remote location, perhaps little free time. Technology can help people take those small steps to start or continue their education. It’s as much about that as those big global conversations.

It is also the case that technology can be a real motivator and attraction for some students. And the technology can be about overcoming a small step – dealing with potential intimidation at new technology – through to much more radical forms that keep people engaged… So when you have tools aimed at the larger end of the scale, you also enable people at the smaller end of the scale. Students do have expectations, and some are involved in technology as a lifestyle, or as a lifeline that supports their independence… They are using apps and tools to run their lives. That is the direction of travel with people, and with young people in particular. Technology is an embedded part of their lives. And we should work with that – perhaps even encourage them to use more technology, to depend on it more. Many of us in this room won’t have met a young visually impaired person who doesn’t have an iPhone, as those devices allow them to read, to engage, to access their learning materials. Technology is a lifeline here. That’s one example, but there are others… Autistic students may be using an app like “Brain in Hand” to help them engage with travel, with people, with education. We should encourage this use, and we do encourage this use of technology.

We encourage learners to check if they can:

  • Personalise and customise the learning environment
  • Get text books in alternative formats – that they can adapt and adjust as they need
  • Find out about the access features of loan devices and platforms – there are features built into the devices and platforms you use and require students to use. How much do you know about the accessibility of the learning platforms you buy into?
  • Get accessible course notes in advance of lectures – notes that can be navigated and adapted easily, taking away unnecessary barriers and ensuring documents are accessible to the maximum number of people.
  • Use productivity tools and personal devices everywhere – many people respond well to text to speech; it’s useful for visually impaired students, but also for dyslexic students too.

Now, we encourage organisations to make their work accessible to the most people possible. For instance, a free and readily available text to speech tool provides technology that we know works for some learners, across a wide range of learners. That helps those with acute needs, but will also benefit other learners, including some who would never disclose a challenge or disability.
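[sidenote: the barrier really is low here – a free, offline text to speech pass over a set of course notes takes only a few lines. A sketch using the open source pyttsx3 library; any similar tool would do, and the filename is invented:]

    import pyttsx3  # free, offline text-to-speech (pip install pyttsx3)

    engine = pyttsx3.init()
    engine.setProperty("rate", 150)  # slow the voice slightly for clarity

    with open("course_notes.txt") as f:
        engine.say(f.read())
    engine.runAndWait()  # speaks the queued text aloud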

So, when you think about technology, think about how you can reach the widest possible range of learners. This should be part of course design, staff development… All areas should include accessible and inclusive technologies.

And I want you now to think about the people and infrastructure required and involved in these types of decisions…  So I have some examples here about change…

What would you need to do to enable a change in practice like this learner statement:

“Usually I hate fieldwork. I’m disorganised, make illegible notes, can’t make sense of the data because we’ve only got little bits of the picture until the evening write up…” 

This student isn’t benefitting from the fieldwork until the information is all brought together. The teacher dealt with this by bringing data, information, etc. together on the learner’s phone, including QR codes to help them learn… That had an impact, and the student continues:

“But this was easy – Google forms. Twitter hashtags. Everything on the phone. To check a technique we scanned the QR code to watch videos. I felt like a proper biologist… not just a rubbish notetaker.”
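[sidenote: generating that kind of QR code is trivial to automate – one code per technique video, printed onto the field handout. A sketch with the open source qrcode library; the video URLs are invented:]

    import qrcode  # pip install qrcode[pil]

    # Invented mapping of fieldwork techniques to demo videos.
    videos = {
        "quadrat-sampling": "https://example.org/videos/quadrat",
        "ph-testing": "https://example.org/videos/ph",
    }

    for technique, url in videos.items():
        img = qrcode.make(url)           # returns a PIL image
        img.save(technique + "-qr.png")  # print onto the field handout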

In another example, a student who didn’t want to speak in a group was able to use a Text Wall to participate in a way that worked for them.

In another case, a student didn’t want to blog, but it was compulsory on their course. Then the student discovered they could use voice recognition in Google Docs, and how to make podcasts and link them in… That option was available to everyone.
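[sidenote: the same voice-to-text route works outside Google Docs too – a sketch with the open source SpeechRecognition library, transcribing a recorded clip into text for a blog post; the filename is invented:]

    import speech_recognition as sr  # pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile("blog_entry.wav") as source:
        audio = recognizer.record(source)  # read the whole clip

    # Uses a free web API; other recognisers (e.g. Sphinx) work offline.
    text = recognizer.recognize_google(audio)
    print(text)  # paste into the blog, or publish alongside the podcast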

Comment: We are a sixth form college. We have a student who is severely dyslexic and he really struggled with classwork. Using voice recognition software has been transformative for that student, and now they are achieving the grades and results they should have been achieving all along.

So, what is needed to make this stuff happen? How can we make it easy for change to be made… Is inclusion part of your student induction? It’s hard to gauge from the room how much of this is endemic in your organisations. You need to think about how far down the road you are, and what else needs to be done so that the majority of learners can access podcasts, productivity tools, etc.

[And with that we are moving to discussion.]

It’s great to hear you all talking, and I thought it might be useful to finish by asking you to share some of the good things that are taking place…

Comment: We have an accessibility unit – a central unit – and that unit provides workshops on technologies for all of the institution, and we promote those heavily in all student inductions. Also I wanted to say that note taking sometimes is the skill that students need…

JT: I was thinking someone would say that! But I wanted to make the point that we should be providing these tools and communicating that they are available… There are things we can do, but it requires us to understand what technology can do to lower the barrier, and to engage staff properly. Everyone needs to be able to use technology, and to promote its use…

The marker by which we are all judged is the success of our students. Technology must be inclusive for that to work.

You can find more resources here:

  • Chat at Todaysmeet.com/DF1734
  • Jisc A&I Offer: TinyURL.com/hw28e42
  • Survey: TinyURL.com/jd8tb5q

How can technology help us meet the needs of a wider range of learners? – Mike Sharples, Institute of Educational Technology, The Open University / FutureLearn

I wanted to start with the idea of accessibility and inclusion. As you may already know, the Open University was established in the late 1960s to open up university to a wider range of learners… In 1970, 19% of our students hadn’t been to university before; now it’s 90%. We’re rather pleased with that! As a diverse and inclusive university, accessibility and inclusivity are essential. As we move towards more interactive courses, we have to work hard to make fieldtrips accessible to people who are not mobile, to ensure all of our astronomy students have access to telescopes, etc.

So, how do we do this? The learning has to be future orientated, suited to what learners will need in the future. I like the kinds of jobs you see on Careers 2030 – Organic Voltaics Engineer, Data Wrangler, Robot Counsellor – the kinds of work roles that may be there in the future. At the same time as looking to the future, we need to think about what it means to be in a “post truth era” – with accessibility of materials, and access to the educational process too. We need a global open education.

So, FutureLearn is a separate but wholly owned company of the Open University. There are 5.6 million learners and 400 free courses. We have 70 partner institutions; 70% of learners are from outside the UK, 61% are female, and 22% have had no other tertiary education.

When we came to build FutureLearn we had a pretty blank slate. We had edX and similar, but they weren’t based on any particular pedagogy – built around extending the lecture, and around personalised quizzes etc. As we set up FutureLearn we wanted to encourage a social constructivist model, and the idea of “Learning as Conversation” – based on the idea that all learning is based on conversation: with ourselves, with our teachers and their expertise, and with other learners, to try and reach shared understanding. And that’s the brief our software engineers took on. We wanted it to be scalable, and for every piece of content to have conversation around it – so that rather than sending you to forums, the conversation sat with the content. And also the ideas of peer review, of study groups, etc.
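[sidenote: that design decision – conversation attached to every piece of content rather than hived off in a forum – is easy to picture as a data model. A minimal Python sketch, purely illustrative and not FutureLearn’s actual schema:]

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Comment:
        author: str
        text: str
        replies: List["Comment"] = field(default_factory=list)

    @dataclass
    class ContentStep:
        # Every step (video, article, quiz) carries its own conversation,
        # rather than pointing learners off to a separate forum.
        title: str
        body: str
        conversation: List[Comment] = field(default_factory=list)

    step = ContentStep("Intro video", "Welcome to the course...")
    step.conversation.append(Comment("mentor", "What surprised you here?"))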

So, for example, the University of Auckland have a course on Logical and Critical thinking. Linked to a video introducing the course is a conversation, and that conversation includes facilitative mentors… And engagement there is throughout the conversation… Our participants have a huge range of backgrounds and locations and that’s part of the conversation you are joining.

Now, 2012 was the year of the MOOC, but now they are becoming embedded, and MOOCs need to be taken seriously as part of campus activities, as part of blended learning. In 2009 the US Department of Education undertook a major meta-study comparing online and face to face teaching in higher education. On average, students in online learning conditions performed better than those receiving face to face teaching, but those undertaking a blend of campus and online did better still.

So, we are starting to blend campus and online, with campus students accessing MOOCs, with projects and activities that follow up MOOCs, and we now have the idea of hybrid courses. For example, FutureLearn has just offered a full postgraduate course with Deakin University. MOOCs are no longer far away from campus learning; they are blending together in new ways of accessing content and accessing conversation. And it’s the flexibility of study that is so important here. There are also new modes of learning (e.g. flipped learning), as well as global access to higher education, including free courses, global conversation and knowledge sharing; the idea of credit transfer and a broader curriculum enabled by that; and the concept of disaggregation – affordable education, pay for use? At the OU only about a third of our students use the tutoring they are entitled to, so perhaps only those who use tutoring should pay.

As Geoff Mulgan said, we do lack evidence – though that is happening. But we also really need new learning platforms that will support free as well as accredited courses, and that enable accreditation, credit transfer, badging, etc.

Q&A

Q1) How do you ensure the quality of the content on your platform?

A1) There are a couple of ways… One is in our selective choice of which universities (and other organisations) we work with. So that offers some credibility and assurance. The other way is through the content team, who advise every partner, every course, everyone who creates content for FutureLearn. And there are quite a few quality standards – quite a lot of people at FutureLearn came from the BBC and they come with a very clear idea of quality – there is diversity in the offer but the quality is good.

Q2) What percentage of FutureLearn learners “complete” the course?

A2) In general it’s about 15-20%. Those 15% or so have opportunities they wouldn’t otherwise have had. We’ve also done research on who drops out and why… Most (95%) say “it’s not you, it’s me”. Some of those are personal and quite emotional reasons, but mainly life has just gotten in the way and they want to return. Of the remaining 5%, about half felt the course wasn’t at quite the right level for them; the other half just didn’t enjoy the platform – it wasn’t right for them.

So, now over to you to discuss…

  1. What pedagogy, what ways of doing teaching and learning, would you bring in?
  2. What evidence? What would constitute success in terms of teaching and learning?

[Discussion]

Comments: MOOCs are quite different from modules and programmes of study… Perhaps there is a branching off… More freestyle learning… The learner gets value from whatever paths they go through…

Comments: SLICCs at Edinburgh enable students to design their own module, reflecting and graded against core criteria, but in a project of their own shaping. [read more here]

Comments: Adaptive learning can be a solution to that freestyle learning process… That allows branching off, and the algorithm learns from the learners… There is also the possibility of breaking a course down to its smallest components and building on that.

I want to focus for a moment on technology… Is there something that we need?

Comments: We ran a survey of our students about technologies… Overwhelmingly our students wanted their course materials available, they weren’t that excited by e.g. social media.

Let me tell you a bit about what we do at the Open University… We run lots of courses, each looks different, and we have a good picture of retention, student satisfaction, and exam scores. We find that overwhelmingly students like content – video, text and a little bit of interactivity. But students are retained more if they engage in collaborative learning. In terms of student outcomes… The lowest outcomes are for courses that are content heavy… There is a big mismatch between what students like and what they do best with.

Comment: There is some research on learning games that also shows satisfaction at the time doesn’t always map to attainment… Stretching our students is effective, but it’s uncomfortable.

Julia Taylor: Please do get in touch if you have more feedback or comments on this.

Dec 052016
 
Image credit: Brian Slater

This is a very wee blog post/aside to share the video of my TEDxYouth@Manchester talk, “What do your digital footprints say about you?”:

You can read more on the whole experience of being part of this event in my blog post from late November.

It would appear that my first TEDx, much like my first Bright Club, was rather short and sweet (safely within my allotted 14 minutes). I hope you enjoy it, and I would recommend catching up with my fellow speakers’ talks:

  • Kat Arney [video embedded]
  • Ben Smith [video embedded]
  • VV Brown [video embedded]
  • Ben Garrod [video embedded]

I gather that the videos of the incredible teenage speakers and performers will follow soon.