Jun 21, 2018
 

Today I am at the Digital Scotland 2018 Conference in Glasgow – if you are along do give me a wave (you'll find me easily from the glare of my mirrored protractor brooch!). I'll be liveblogging today, with the usual caveats that I welcome any additions, corrections, etc.

Introduction by Conference Chair: Alisdair Gunn, Director, Framewire, & BIMA Scotland Council Member.
Good morning and thanks for coming today to the inaugural Digital Scotland conference. I'm Alisdair Gunn and I'm honoured to be your chair today. Today's conference brings together suppliers, local and national government, and technologists. At the beginning of 2017 the Scottish Government published the Digital Scotland strategy, and each of us plays a role in delivering that strategy, which aims for inclusive economic growth and to make Scotland one of the world's leading digital nations. We really encourage tweeting and sharing of today's conference, using #digitalscotland. The event is also being livestreamed by ProductForge. Due to overwhelming interest today I can also announce that the Digital Scotland event will run again next year.
Keynote: ‘Harnessing technology for the benefit of society’
Chris Yiu, Senior Policy Fellow for Technology, The Tony Blair Institute for Global Change.
I wanted to kick off with a bird's-eye view of technology and policy themes as we see them at the Institute. So, first I want to start with some numbers. Martha Lane Fox's Doteveryone charity surveyed people across the UK: 50% said "the internet has made life a lot better for people like me", but only 12% said "the internet has had a very positive impact on society". So, something has gone very wrong. What I'd like to talk about today is what those issues might be, and what we might do as policy makers.
Some of the questions here are issues like surveillance – are we being monitored, will there be another data breach? But that can be countered with rights. Manipulation – are my apps, the tools I use, manipulating me? Or will we use technology for wellbeing, for making lives better? Polarisation – fake news, polarised political views; but technology also has the potential to build and enable community. Stagnation vs prosperity; automation vs meaningful work and life; indifference vs fairness; excess – e.g. Bitcoin's environmental impact – vs sustainability; and vulnerability vs security. None of these questions are easily answered but we face them every day, and we have to think about how policy makers and politics address them.
Right now politics is no longer left or right. You can also think about the world as politics mapped against tech. We can visualise this as a grid: False Nostalgia; Incremental Progress; Exponential Progress; and Tech Nationalism (moving clockwise from bottom left). We mainly focus on that nostalgia for a pre-tech world and on incremental progress. We should be focusing on the other side of this graph – China is leading on that Tech Nationalism quadrant. But the real-world concern is about affordable good quality healthcare, education, social change, and those are unchanging concerns in many ways; we need those to shape our use of technology.
We also have increasing AI possibilities. As a hobby I keep a list of exciting AI examples (see deepindex.org). At first these were fun, silly things, but now it is supply chain management, risk management, predictive analysis in insurance, diagnostic medicine tools that outperform human specialists. Some see this as a threat to jobs; I think this is about freeing up human time for more important interactions. This stuff is real, the opportunity is there. And actually you see this technology every day in AirBnB, in Amazon, eBay, Facebook, Google, Instagram, Netflix, Spotify, Twitter, Uber… And these companies do powerful things with AI which raise all sorts of policy challenges. Rules, laws and regulations are frequently pre-internet, whilst these companies have new business models, new ways of working. No matter what you think of these companies, they are delivering services to huge numbers of people, many of whom are satisfied with that experience. Amazon for instance is one of the most loved companies on the planet, transforming the high street – with shops closing… But the customer experience is also light years ahead, and it would be good to see more shops doing that, which is where service design comes in.
On service design… I try to book an appointment at my family doctor. Maybe once I'd have written a letter, then I could go in in person, or call… As we get further along, maybe I can book online without that telephone queue wait. Sounds good, and it sounds achievable… But those apps and the technology that pervades our lives have changed the rules… And if you haven't seen it you should read Jeff Bezos' letter to shareholders, where he says customers are "divinely discontent"… As user experience becomes better, perfect user experience is also racing away… So never mind that online booking of appointments – maybe I want to video call my doctor right now. And this is hugely important. My experience as a consumer looks one way; my experience as a citizen lags behind. Maybe there was always a lag, but the wider that gap in experience is, the less appealing and sustainable participation in society becomes. That's a threat to transactional stuff – moves to private schools or healthcare – but more importantly it means ceasing to try to make your voice heard…
But I think there are things we can do:
1. A structured dialogue between the change makers and the policy makers
2. A better approach to regulation that is built around the reality of the internet – that means root and branch reform built on the internet as the norm. That's harder to do today – it's a reserved issue. But it's also a global issue.
3. An ambitious policy platform that gives people hope about the future – be bold enough to see the potential to make a radical difference to the stuff that matters. That needs focus – we can’t do everything – and really requires work across sectors with government and local authorities working with private and third sectors.
I’m going to leave that here and will be around to engage in the conversation throughout the day. Thank you.
Digital Transformation Panel
Join us for a conversation on how digital is transforming Scotland, and what more needs to be done to fulfil the potential of our communities and the nation. Featuring:
  • Alisdair Gunn (AG), Chair of session
  • Colin Cook (CC), Director, Digital, The Scottish Government; 
  • Martyn Wallace (MW), Chief Digital Officer, ‎The Digital Office Scottish Local Government; 
  • Polly Purvis (PP), Chief Executive, ScotlandIS; 
  • Cat Leaver (CL), Project Director, Brand Scotland.
CC: Chris was absolutely right in an aspirational way – we really want to be ambitious and transformational. No one in Government thinks that transformational change is possible on our own; government has to work together with other sectors on common programmes and pieces of work. Equally importantly, government doesn't think that the public sector can work alone here, and sees the value of working with the private sector. We work in different ways with the private sector, and CivTech is a part of that, but there is more to do than that. We have more than achieved our targets on broadband rollout, we have key programmes on identity management, and we have more to do and are looking forward to what can come out of the discussions today.
MW: Echoing what Colin said, we are facing all kinds of challenges in the public sector. I came from the private sector and thought I knew the public sector, but it's so much more complex – the things we do every day saving lives, facing challenges; we have technical and HR skills to address as well as cultural shifts that need to happen. I was at an event on Friday asking if we were tinkering or whether we are transforming. I think in many ways we are still tinkering, but we need to be transforming… We have more pilots than Ryanair – we need to be more agile, iterate… We shouldn't have these big media-covered collapses if we do that. And we need public understanding that we can't do what we used to, as we just don't have the finances to do it. Sometimes you don't need a doctor – you need a GP, or an information source, and that saves money in the system. We have to have a balanced approach – investing but also changing the conversation at the top. There is great collaboration across Scottish and local government. We need common data standards and interoperability to get this right – otherwise we are handing over public money to service integrators that we can ill afford.
PP: We represent everyone – from Microsoft, CGI, SkyScanner, to small companies and everything in between. We work closely with government both to help them but also to hold their feet to the fire. We need to really focus on the opportunities that technology provide. There are such bright futures possible here, but we have to get more people involved and engaged. That is a really serious challenge. Lifelong learning has to be for all of us all the time to make this work.
AG: This is part of why agile and service design matter, and actually you could almost drop the private and public sector labels… It's all one sector in some ways. But as Martyn said, we need to communicate what we do here in Scotland.
CL: I am about to take that journey from private to public sector, like Martyn has. Brand Scotland is about telling our stories, helping those outside of Scotland understand our assets here, our stories, and highlighting innovation, the private sector successes here, the public sector achievements.
AG: I saw a TEDx talk where the speaker spoke of innovation, noting that the public sector is one of the best innovators, and funds universities to innovate too. Looking forward, Colin, what are the sorts of things you are thinking about. We have the Digital Strategy in place. What’s next? Should there be another strategy?
CC: We love a strategy! But that's not the critical thing. Picking up on what Cat said, we need a clear articulation of the type of digital nation that we want to be in Scotland. We are starting to recognise that and build on that. And actually there are particular aspects of Scotland wanting inclusive growth – broadband is very important to ensuring no one gets left behind. Standards are important; common approaches to online identity are important. We have new powers – social security, employability, etc. – which allow us to make changes in what we do, and there is a lot of opportunity. Going back to what Chris said, these are platforms and small changes, but the real change will happen where all conferences, all influencers are thinking about digital first. We are seeing that starting to happen and I'm confident that that's where we are going.
AG: And it's starting to think about the services side of things… There is a strong presence from the health sector here – what engagement do you see there?
CL: We ran a youth event alongside the big conference on digital health in Glasgow recently. We had wondered if young people would engage with the health innovation side of things, but there were 600 young people connecting with digital health, and they particularly reflected those shifting expectations and familiarity with digital experiences that Chris talked about.
MW: "Digital" isn't coming… It's here in your phones, it's the self-checkout at the supermarket… It is transforming day-to-day life already. Service design is crucial here, and we promote and adopt it in digital local government. We can't run things as we did in the past – I want to access healthcare through my Alexa, my connected home… I want feedback. That's proper transformation… Being kept alive and being looked after is crucial for people, and in practice that transformation just is digital – it doesn't need that special label anymore.
AG: Is there a skills gap here, a training requirement?
PP: In this room we are on the same page. We have digital skills and enthusiasm, but so much of Scotland is not there yet. Some of our agricultural companies and farms know they will lose labour in the future – they are starting to automate with robots. There will be jobs in the future… Young people understand that… But we all need those skills, that understanding of AI, robotics, IoT. We need to understand and upskill… Our school system is focused on skills education from childhood through university, but we need more than that: skills development throughout life to allow us to embrace technology and future opportunities.
MW: I wanted to say that SCVO do have a Digital Participation Charter – look at it, sign up. But it's not "basic" skills anymore, it's needed skills. We as organisations need to address this: we need skills development, we need to ensure we are ready for the future, we need to have conversations with staff. The basic most-needed skills are a great place to start and build on.
AG: We know that the private sector creates services and products… Do we see that flow between sectors? It is exciting to be in the public sector – big digital transformation projects. Are there things that the public sector can be talking to the private sector about?
CC: I've been in the public sector for 16 years. I think my learning now is as great as it has ever been in my career – you learn from the sector, from the market… Yes, there is a lot we can learn from. As we approach digital transformation we want to encourage co-location, we want to encourage skills transfer – we see change in ourselves, and we see changes in the companies working with us. That's a hugely inspirational place to be. We all struggle to attract and retain the talent we need, but we really need that; we need people to bring in their skills and expertise.
AG: We are seeing products coming out for companies to work on… CivTech is good – how does that scale?
MW: At the launch of a programme in London there was a lot of discussion of learning from the lead of the Scottish public sector. We have truly innovative companies in Scotland – how do we engage with that? CivTech is one of the best examples of that we've seen.
AG: How does Scotland benchmark internationally in this space?
CL: Surveys show we aren’t known for innovation and technology – lots of whisky, golf and salmon (I’m over simplifying that) – so there is a lot of work to do to tell those stories, and we all need to get involved in that.
Q&A
Q1 – Janet Roberts, EDINA) I would be interested to know how the inclusion agenda runs through your work. My organisation runs the Digimaps for Schools service promoting digital skills and critical thinking with young people. We are heavily involved in the Edinburgh and South East Scotland City Region Deal and wanted to ask you about how Scotland makes our strong inclusion ambitions happen.
A1 – PP) We have to be so careful in this room – we talk jargon that is inaccessible to many. The City Deal has huge possibility but also talks about data science, AI… and those won't be familiar terms for many. So how do we make sure that the City Deal reaches the 21% of people in the City Region living in poverty – mainly in more rural places? How do we make that work relevant to everybody and bring them into these opportunities so there is real local benefit? There are no easy answers there, but it's so important.
A1 – MW) People talk about digital by default – that's exclusionary – so we talk about digital first. We know there are connectivity challenges. We can push on 5G. We have some local authorities looking at poverty levels and connectivity, and at how they can ensure communities do have access to the internet. You have to iterate and work on this – you can't wait until everyone is online, but you have to be aware that not everyone is, or will want to engage online.
A1 – CC) We have relaunched the national performance framework which talks about opportunities for all and inclusive development. That doesn’t mean you don’t invest in centres of excellence – like Edinburgh – but it means spreading that knowledge and expertise more widely. The City Region Deal in Edinburgh is a great example of that work to ensure development is inclusive.
A1 – CL) We did workshops with Turning Point, with former homeless people… They were asking what data is, how they can have email – needed for job applications and systems – when they have no access to computers. So we really do need to ensure we don’t leave anyone out of this.
Q2) How do we make free wifi available across Scotland, including rural communities, so that vulnerable people, poorer communities, etc. have access to the internet and are able to participate here?
A2 – MW) Scotland has real challenges around connectivity – we have hills and valleys here that make it complex; small rural communities have few users, high costs and low return on investment. 5G has great potential, but it uses higher frequencies with shorter range, so you need more masts, more backhaul. But we do have shared masts across Scotland where O2/Vodafone boxes are on every mast to enable emergency services connectivity. You can get 99% of coverage but… I live in a rural community myself and know the challenge there. And we have to work in partnership and in new ways…
A2 – CC) We do have close working with companies – and that's fairly unique to Scotland. We are working on City Superfast Broadband, and there are 5G announcements today, so it is always improving.
KNOWLEDGE EXCHANGE MASTERCLASSES AND INTERACTIVE WORKSHOPS: 

Across the day there are three sessions running in parallel for each of these Masterclasses, so for all of these sessions I’ll be blogging from the ones I’ve signed up for. For this session that’s The Economy…

SESSION 1: THE ECONOMY: Supporting growth in the digital sector

Panellists:

  • Polly Purvis (PP), Chief Executive, ScotlandIS; 
  • Colin Cook (CC), Director, Digital, The Scottish Government;
  • Melinda Matthews Clarkson (MMC), Chief Executive, CodeClan; 
  • Maggie Morrison (MM), Vice President, Public Sector, CGI.
    Chaired by: Alisdair Gunn (AG), Director, Framewire & BIMA Scotland Council Member.

MMC: My background is twenty years in software and computer programming – my first application was instant messaging on a mainframe. I run CodeClan, and we are the only certified digital skills and code academy in the country. I am a huge cheerleader for change and I am so excited about what Scotland can and will become.

MM: I've over 35 years of experience in IT, starting out in a company working on telex and modems, but since then I have worked with many of the leading multinationals. I spent 14 years at Cisco and HP before joining CGI. My career in tech allowed me to work at the leading edge and travel the world. I worked for companies that still thrive, but also for a cluster of tech companies that no longer exist. The pace of change is huge, and what keeps me awake at night is ensuring that young people are ready for this, that we are ready for this, that we learn and relearn all the time. It was great to read in The Scotsman that technology adds £3.9B to the Scottish economy, but the pace of change is fast and we have to be ready for that.

AG: Does Scotland have the company base to be competitive?

MM: The scale of start-ups in Scotland is incredible – something like 30% of the UK tech start-ups come out of Scotland. But we need to get much better at exports – to be more like Ireland at that. As part of the work we do for Glasgow Council we are delivering 52,000 tablets to all Glasgow schools. Primary 1-6 will have a shared device; above that they will have a personal device. For children in Glasgow that should position them well to explore these devices, to understand what they can do. Educating parents in digital, in the future economy – we have to do all of this.

AG: Melinda, what’s your experience of working with companies?

MMC: The people who engage with us become the biggest fans. There just aren't enough engaging. And that means communicating what a graduate of CodeClan actually looks like. It's not (just) a barista retraining in 16 weeks; so many come in from tech and other sectors with a growth mindset, a really open mind to expand and develop their skills. When people do take on CodeClan graduates they are overwhelmed by the learning potential there. In Scotland there is such innovation and creativity, but there isn't that culture of competitiveness – and that is where Silicon Valley benefits. That helps you keep up with expectations as they change, so we need some culture change.

AG: What about using…

PP: The first thing I want to say is that Britain is a huge tech market, so sometimes growth can be in Britain, not just international. The figures on Scottish tech actually underplay the contribution: tech contributes £6B to the Scottish economy. Many of our companies do work internationally, but there is more to do. And we really need to get ecommerce skills out to other sectors – the death of the high street is a challenge, but it is also an opportunity. We need to encourage that. And it is concerning that ecommerce is nowhere in our curriculum in schools and education; that could add hugely. We have some fantastic ingredients, we just need to build on them… Not every company has an online presence – that's part of the problem. It's not about Scotland… Debenhams, House of Fraser – they wouldn't be in the mess they are in if they'd sorted their ecommerce. It's a wider problem.

MM: We've talked about goods and products, but there is also huge opportunity for other types of offer. Y Combinator takes tech businesses (often without a product) through an incubator, time in Silicon Valley – and AirBnB came out of that. We talk too much here about products and goods.

PP: One of the challenges here is that it’s hard to scale services. Products do scale fast… But where I agree with Maggie is that platforms do scale fast. Skyscanner is a product company and a platform company.

MM: It’s not either/or… But it’s “and”. We don’t focus enough or recognise the possibility of non-product offerings.

AG: How do we persuade people that being a backend enabler, a B2B company that helps other companies, is a good thing? We talked about CivTech earlier – how do we scale services out of that to a global thing?

CC: Scaling comes with the challenge of how you keep fast growing and large scale businesses in Scotland, so that the scaling benefits the economy here. We do try to scale services in different ways, we want to scale the CivTech concept in fact.

Q&A

Q1) I wanted to ask about the missing middle. With CivTech… For the public sector, how do we procure without "being bought by one of the big boys" being the answer?

A1 – MM) Deals above a certain size have a public benefit threshold. And for big contracts with councils we have a requirement for 25% of work to be undertaken by local businesses, and they have to be paid in 10 days. CGI's policy is to pay them (not acquire them). That forces multinationals to make sure that 25% of that work comes from local providers – it's not perfect, but it's the best solution from the public sector so far.

A1 – PP) We have worked with Scottish Government and things are improving. But many of our companies sell to health organisations, and they almost always sell south of the border first – it's too slow to get into the Scottish market.

A1 – CC) We need to talk earlier in the process before formal procurement. We need new approaches to partnering, some about sharing risk… Some issues are addressed by CivTech, some are addressed elsewhere.

SESSION 2: PUBLIC SERVICES: Designing services around users
Panellists:

  • Cat Macaulay (CM), Head of User Research and Service Design, The Scottish Government;
  • Leah Lockhart (LL), Engagement Consultant;
  • Clare Hillis (CH), Head of Public Sector, Vodafone (chairing);
  • Paul Duffy (PD), Co-Director for IT and Telecommunications, Belfast Health and Social Care Trust.

CH: I'm delighted to be here today to share some of our experiences of designing public sector services for users. We have with us today Cat Macaulay from Scottish Government, Leah Lockhart, engagement consultant, and Paul Duffy from Belfast Health and Social Care Trust.

PD: I'm here to represent health and digital really, and improving health and patient care. We look at services across health and social care in Belfast and Northern Ireland. We have too little money and too much expected of us, so service design and improvement is crucial to delivering patient care.

CM: I am head of a new department in Scottish Government. We have two big ambitions: to build and manage the user-centred design professions – user researchers, content designers, interaction designers, etc. – across Scotland; and to be fantastic at designing inclusive processes that are genuinely centred on the citizen – not only citizens who can read, or see, or hear, or have great mobility etc.

LL: I sit at the intersection of technology, academia, design and art and creative work. Generally I help people understand how to engage people, how to use the internet to engage with people. I'm working with SEPA at the moment on user research and service design and, breaking news, I'll be moving to Snook very soon.

CH: I wanted to ask you, Cat, about working with users who can't necessarily say what the challenge is.

CM: Most of the time when service design fails, we are solving the wrong problem. Typically we do things like: the paper form doesn’t work so we digitise the form and wonder why that didn’t work. In Scotland citizens have to drive services and drive policies – so we have to engage people much earlier in the process, and that matters particularly in the digital world where we have to be agile. We can’t put a problem on the back of an envelope, build a huge business case, go to procurement, and then wonder why it doesn’t work for citizens five years later… We need to get away from that.

PD: We don't want to apply "digital" as the label. We want service improvements where digital is just part of the solution. Too often we've looked at solutions first, changing working processes to fit the app we've bought. That doesn't work. What you need is for the people who will be changing to help imagine what could work better, how we can improve what we do. So we look at a problem, we look at how we can solve that and where digital does that… The first digital tools will always be the hardest to get in place. What do we mean when we talk about this? Are we going to invest? Are we going to do the hard work to embed it and make it work? If we don't – if we think digital is done to you, and that it will fix problems alone – then it fails.

LL: I think that change in power dynamics is important. We need to change how we engage… Chris talked earlier about the risk of people opting out because we aren't being relevant to them.

CH: When you do this, are there challenges around what citizens want and what you can do?

CM: Service design isn't about asking people what they want. It's about understanding what they need within the realm of what is possible. We currently tend to ask people what they want, when we should be asking what they need and how that can drive what we do. The benefit of that approach is that it enables a more mature conversation with citizens.

CH: What has worked best in identifying the right solutions?

PD: Far too many projects start without defining the problem. You have to define the problem, then identify the ideal solution, and iterate and review that. If you want to fail a digital project, start with unclear objectives! We have actually designed an onboarding process – physicians aren't managers, and they aren't usually experienced with budgets. I try to avoid pilots and proofs of concept… So our projects are tightly constrained and aim to do something intentional… We may not always achieve that to time or budget, but that constraint is important.

LL: I’ve been through a world of pain when I haven’t properly explained what I do with services, with organisations… So that they understand what we do, so that service design and user research becomes the norm.

Q1) I work for a company that builds products… I wonder how we achieve early access when engaging with authorities and organisations…

A1 – CM) We've done lots of work recently to push hard on having users engaged in products throughout the process. We engage users as early as possible in the project, to understand what we should be looking at, getting users to help us check that we are thinking along the right lines… We do research, we want collaborative sense-making – compare, contrast, and the differences are where interesting stuff emerges. We currently have over 2000 people on our social care panel, and ways to reach service users, for instance.

Q2) Quite often we tend to go to champions or representatives… How do we consult a representative range of citizens?

A2 – CM) We have a bugbear in the public sector about representativeness – this isn't a thing in the private sector. You will never have a representative sample of all users. Sometimes you will have lots of people, sometimes it is smaller groups. You will rarely be statistically representative. What you can do is triangulate, understand that information in context, to ensure the validity of what we do. And this isn't a science – design is not a science. You need co-design, engagement, and mixed methods.

A2 – LL) You have to be comfortable – and it’s hard for public services to take this on board right now – that this goes on all the time. You don’t then stop, you stay engaged.

Q3) So is this a choice between doing something agile now and long planned projects?

A3) These aren’t separate things. The advantage of agile is that you are rapidly iterating and trying things out. Working on some 3 year project where you are locked away working on a problem isn’t how the private sector works, that agile process is part of those projects.

A3 – PD) If you do agile you need the budget holder engaged – every meeting costs money so you have to stay constrained, you have to actually end up with something. You need to understand how many iterations are appropriate for a given budget. We are currently working on trauma management… Digital can provide a level of non-repudiation and ownership that paper and pager systems don’t allow… But people aren’t stupid so you get resistance, you will have hidden agendas here… And you have to accommodate that in your project budget. But we are about ready to roll that out… Once it’s out there, change is made, and then this service improvement can be applied to other areas. The change has to be owned by the people who will live with that working practice.

Q4) One of the challenges we have is that if your user research shows what you already know, then management is reluctant to invest in this work. How do we get management buy-in for this sort of work?

A4 – CM) Relentless resilience! We have to change. Our service design has reflected organisations, not citizen needs. It's a cultural change. We build culture change into our organisation, but we also have to acknowledge that we don't know the perfect solution yet. Anyone who says you can simply apply service design to public services is mad. This is really hard. It's not the same as designing the experience in a coffee shop here; it's about "ok, someone fell down the stairs and broke her leg this morning, she's a single mum of two and one of her kids is in a bit of trouble in school, and she's behind on her rent". This isn't easy stuff to fix.

A4 – PD) It's about doing this process right, being agile, showing the benefit of working in this way to improve services.

A4 – CM) One way to pick this up is our quality process, and standards. We have to hang in there; it's hard.

SESSION 3: HEALTH: Technology enabled health and social care in Scotland
Speakers

  • Michelle Brogan (MB), Home and Mobile Health Monitoring lead, NHS24; 
  • Hazel Archer (HA), Video Conference and Attend Anywhere Lead, NHS24; 
  • Liza McLean (LM), Head of eHealth Strategy and Policy, Scottish Government.
LM: Welcome to this session. I have responsibility for eHealth strategy and policy in Scottish Government. The context for this session is Scotland's Digital Health and Social Care Strategy. As you'll know, digital disruption is here. Health and social care has not, yet, taken full advantage of digital technology. When putting together this strategy I was reminded of the work Dr Marion Lennon (Strathclyde University) had done on Delivering Assisted Living Lifestyles at Scale (dallas), particularly around public and consumer readiness, health provider readiness, and key issues around practically putting this into practice. We have six domains to take forward – do look at the strategy online – but I'm really excited to hear about projects that are already taking place.
MB: I'm here with my colleague Hazel Archer. We are working on the technology enabled care programme in Scotland, and I'll cover our work to date. Then we'll talk about two examples: video consulting, and home and mobile health monitoring. Then we'll have some reflections on key learning.
This TEC (Technology Enabled Care) programme has been working over 3 years with 31 partnerships and 10 of the 14 health boards, and we work on a grants basis to help health boards transform their services. At the heart of our programme are work strands in: home and mobile health monitoring; video conferencing; digital platforms; expansion of telecare (including analogue-to-digital telecare); improvement and support. This isn't about technology, it's about transformational change in Scotland. So those areas of work don't stand alone; it's about leadership, service redesign, problem solving, etc.
In terms of how many people are using TEC services, we have about 73k people benefitting and are set to have 80k by the end of the project. Our achievements include the launch of Attend Anywhere video consultation. We have a new National Service Model to support local implementation of Home and Mobile Health Monitoring, and the largest scale-up of blood pressure monitoring in the UK. We are delivering a proof of concept and business case for the NHS. And our work has been recognised across Scotland.
Telecare enables a good level of support, in the home, for vulnerable older people who often live alone. By 2025 analogue phone lines will switch to digital; we need to be ready for that, so we have a big piece of work led by Martyn Wallace to make sure we can do this. We also have a roll-out of cCBT – computerised CBT services through a tool called "Beating the Blues" – which aims to reach 100% of the population, having started with 10% of the population reached by this technology.
We had a review of the TEC programme published in 2018 which has helped us to review what we do.
HA: We have a video conferencing estate established across the NHS – we estimate that it saves £5m per year in saved travel. But we've not really had that connection to the home. Why do we want to do this? Well, let's look at an individual living in the Highlands who is doing home dialysis. They need to be visited daily – that's a 6-hour round trip, so that nurse will only see one patient in a day. And in the Highlands and Islands, patient travel costs the NHS 5% of its entire budget. So, we've tried to do video conferencing in the home for a long time. Right now you can do this through browser tools… This is really a workflow change. Normal NHS workflow includes wait times and flexibility; video conferencing expects on-time delivery, but that isn't real life, so it's about how you shape that new workflow…
So we have a system (http://www.sctt.org.uk/attendanywhere/) and anyone can go in and start a call – no login or checking, no download, etc. There are logins for clinicians that limit what they can access and see, but also limit them to seeing patients with the appropriate conditions. We began roll-out in 2016; the first patient was in 2017. We are now up to almost 600 consultations. User feedback has been really positive – people find this easy to use, and 98% of people said they would "use it again". And success comes down significantly to what you choose to do – this isn't the right solution for all issues (e.g. you can't do an ultrasound over the internet).
Our busiest clinic is Grampian Gastro – one of the biggest challenges for IBS patients is travel, and this is a service that is mostly talking. It works really well. In the Western Isles a respiratory clinic involved a specialist flying up for the day; telecare works brilliantly here and saves two days of travel. The latest Western Isles clinic is for hand surgery – again that saves huge travel money.
We also have NHS Near Me (www.sctt.org.uk/attendanywhere/NHSNearMe) which is about the service model around these tools.
MB: I want to quickly talk about another exemplar here. It's not the technology; it's the wrap-around service model that adds value – it's about simple, well designed technology. So, home and mobile health monitoring isn't that new, but we were in "pilot-itis". The rise in smartphones, increased adoption, and positive interest from health providers have all made big changes. We want better health, better care and better value – remote care at a distance. So we are using home-enabled healthcare to really help us get early risk alerts (e.g. diabetic blood levels). One of the cases for change here is the fact that measuring blood pressure is the third commonest reason for attending primary care appointments – 1.2 million appointments – and that's set to double by 2030. In addition, measurements in the surgery are less accurate compared to home monitoring. Meanwhile 60% of patients don't take their blood pressure medications regularly/when they should. So we wanted a sustainable TEC-enabled model to support automated monitoring, alerts and interventions. The system we use is SMS-based but enables real-time monitoring by GPs, as well as allowing feedback, interventions, advice, etc. back from the healthcare provider. This is being rolled out to people with an initial diagnosis/high reading, through to those with long-term issues.
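To make that SMS workflow concrete, here's a minimal sketch of the kind of rule a system like this applies when a patient texts in a reading: parse the message, check it against clinician-set thresholds, and either record it or flag it for follow-up. The message format, function name and thresholds below are my own illustrative assumptions, not details of the actual "simple telehealth" service.

```python
import re

# Illustrative thresholds only -- in a real service these would be set
# per patient by the clinical team, not hard-coded.
SYSTOLIC_ALERT = 180   # mmHg
DIASTOLIC_ALERT = 110  # mmHg

# Accepts texts like "BP 150/95" or "bp 150 95".
READING = re.compile(r"^\s*BP\s+(\d{2,3})[/ ](\d{2,3})\s*$", re.IGNORECASE)

def handle_sms(text: str) -> str:
    """Parse an inbound blood pressure text and decide on a reply."""
    match = READING.match(text)
    if not match:
        return "Sorry, we couldn't read that. Please reply like: BP 150/95"
    systolic, diastolic = int(match.group(1)), int(match.group(2))
    if systolic >= SYSTOLIC_ALERT or diastolic >= DIASTOLIC_ALERT:
        # A real deployment would also flag this on the GP's real-time
        # dashboard and trigger the agreed intervention pathway.
        return "Thank you. This reading is high - your practice will contact you."
    return "Thank you, your reading has been recorded."

print(handle_sms("BP 152/88"))   # recorded
print(handle_sms("BP 190/120"))  # flagged for follow-up
```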
10 of the 14 health boards use the HMHM "simple telehealth" SMS solution, integrated into routine care. There is a national blueprint for a digitally enabled pathway for hypertension. It is currently reaching 5200 people and will scale to 10,000 – this is really scaling up. And the benefits include saving up to 40% of appointments, high levels of satisfaction, convenience, treatment compliance, improved blood pressure control and appropriate use of medication.
In summary we hope we’ve demonstrated that TEC is here and happening, delivering real impacts but we need more time, leadership and support for key areas. It’s about designing a new care model, not so much about the technology. It’s iterative scale up informed by data evaluation. We’ve worked strategically to use a “once for Scotland” approach – with standard approaches. But we also need to understand what implementation strategies work best for the future. And we are committed to long term service improvement.
CivTech® update: Alexander Holt, Head of CivTech, The Scottish Government.
CivTech is part of the Digital Directorate in the Scottish Government. We want to drive change and innovation that makes people's lives better. We want to create products and services that make real change. Procurement can be slow – months or years to do and roll out – and it can easily go wrong. Technology moves so quickly… See e.g. the work CENSIS are doing in this very building. The challenge is: how do you procure something that you don't know exists?
CivTech reframes the whole space… We look for challenges and put those into an ecosystem; there's a selection process and an accelerator, and one company per challenge gets £20k, then can go on to get £200k. And maybe it also goes global.
An example project here: we worked with SEPA, who wanted to improve flood monitoring. Gary came up with a solution using bumper sensors and GPS, and it's being rolled out across Scotland. We worked with Transport Scotland, and they found a way that driving over potholes could detect and report them – that's being rolled out. Lumera Health looked at triaging around dermatology in Glasgow. Dogfish Mobile helped Stirling Council pull data from multiple services into one platform.
So, we’ve run this twice and we’ve just launched CivTech 3.0 where challenges include illegal trading, social isolation, standards in rented housing, the circular economy around the reuse of medical products, staff satisfaction, young people and their digital footprint with young people at the core of that challenge… There is a real spectrum here.
The lesson learned here is that we don't labour away on solutions no one needs. You start with the client, the problem and the market. And the benefits of this process are cross-pollination, and this ecosystem of delivering services. We have a workshop programme of 40 workshops over 4 months covering product, business and personal development. And the feedback is that half of these are as useful for the public sector as they are for the private sector. We talk about entrepreneurial government – it's about skills and transformation of the public sector in a deep immersive experience. It's a collaboration with everyone.
What is the impact here? Well, it's the roll-out of solutions. It's Stirling Council making £1.5m in cost avoidance. It's 16 of 18 companies still in operation, and £1.5m in subsequent contracts. There is real economic benefit here. And the Cabinet Secretary for Finance and the Constitution backs this approach – and it's getting international interest. But I want to come back to something Chris said earlier about delivery and process…
We start with user need that needs to be met by a public sector organisation; there's an idea, with citizens and innovation centres and academia feeding in. Then that product needs to be developed further by public sector organisations… They need investment, and that needs scale, potentially going global. That leads to productivity and trust. Why should this come from the Digital Directorate? Well, it's about our policies on procurement, citizen engagement and service design, academic collaboration, business support, reuse, buy and build. We see this as Civtechnomics.
It's about moving from sandbox to widespread adoption. So a lesson we learnt here is the idea of adopting a 3-to-1 procurement approach – why not do that in other procurement? If you know of entrepreneurs or intrapreneurs, send them our way – applications close on 2nd July. So that's us: driving daring and innovation in the public sector.
State of the Tech Nation: Gerard Grech, Chief Executive, Tech Nation.
I think Alexander couldn't have set this up better. I'm going to give you a lot of stats on UK tech right now, and in relation to Europe. Digital innovation is changing, and it's changing fast. Tech Nation is a private company that is publicly and privately backed, and we've worked with 50 companies who have collectively raised $7Bn in 50 months, at all growth stages. We've seen what it takes to move from someone in a garage through to listing on the stock market as a company with technical and business skills. These companies are growing so quickly – we see companies going from 5 people to 500 or 2000 people in a year. It's about ambition, and Edinburgh, Glasgow and Dundee have to be ready for that.
So I'm going to start by talking about how innovation is evolving globally. Innovation is open, it's not linear anymore. There is process… But the more open you are to it, the more successful you are… But people like to control; this doesn't feel natural… Even in the UK, companies can set up and a few years later be worth a few billion. There are four key drivers of a tech ecosystem – and Scotland has a lot of competition here. (1) You need the talent – without that you will not get investment. That could be hobbyists, that could be people who've just been made redundant. (2) You need the ideas of where to disrupt, where the opportunities are. (3) It's about the infrastructure – what universities do with graduates and what you do with dropouts. Dropouts matter; we stigmatise dropping out, but universities are not the place for everyone, especially if you are a genius. I was in Norwich the other week chatting to a 19-year-old who had dropped out of university, but they'd understood how to support him, and he had a great network, and his company had benefitted from pivoting four times already, flexing to opportunities – and this was in Norwich, the end of the train line. (4) You need finance – and I know the Financial Conduct Authority enables useful experimental space for FinTech that is being replicated elsewhere.
Each year we publish the most comprehensive research on UK tech, using data from partners and 3400+ survey responses (80% outside London). All the data is open at data.world. Find it at TechNation.io. For instance, you can find data on popular programming languages broken down by city through the data we have from GitHub.
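If you want to poke at those figures yourself, a few lines of pandas will do it once you've downloaded a table. The file and column names below are guesses for illustration – the real schema lives with the open data at data.world (linked from TechNation.io):

```python
import pandas as pd

# Hypothetical file and column names -- adjust to whichever table you
# download from the open Tech Nation dataset on data.world.
df = pd.read_csv("github_languages_by_city.csv")

# Top five languages in one city, ranked by share of active repositories.
glasgow = df[df["city"] == "Glasgow"]
print(glasgow.sort_values("repo_share", ascending=False).head(5))
```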
The UK's digital tech sector grew 2.6 times faster than the rest of the economy in 2017. Digital tech turnover increased from £170Bn in 2016 to £184Bn, we are seeing staggering growth in jobs, and there is a new start-up every 50 seconds in the UK. In 2017 UK VC investment exceeded Germany, France and Sweden combined (and even the rest of Europe). And investment doubled from ~£4Bn to just under £8Bn.
There is, globally, a race for tech talent and that continues; London ranks 3rd in the latest tech startup ecosystem rankings. Talent is a global race, so Scotland isn't competing with London – it is competing with Austin, Texas, or LA… The UK accounts for 13 of Europe's unicorns, with a combined value of about $13Bn – about 37% of European value. The UK is Europe's second biggest source of capital. Many UK tech unicorns are driven through algorithmic control, attracting specialist high-paying jobs. When we look at pay levels, the more tech skills you have, the more those jobs pay.
The top 400 UK meetups (Glasgow is a top-10 meetup city with 12,000 members) highlight emerging trends: AI and blockchain are quickly taking centre stage – and Edinburgh and Glasgow are at the forefront here, especially after the announcement of the new blockchain incubator at Edinburgh University.
UK tech communities value quality of life, proximity to universities… Women are outnumbered 4 to 1 in digital tech companies. We just launched a new initiative with JP Morgan to increase the number of female tech founders.
The Scottish tech sector is growing quickly, with £3.9Bn of value to the economy. And different cities face different challenges – for instance in Cambridge the top challenge is Brexit. Read more in the full report.
Innovation to drive growth for all – Professor Charlie Jeffery, Vice Principal, Edinburgh University.
In a nutshell, the City Region Deal is 6 local authorities, business, universities and colleges developing a plan for investment in economic growth, supported by two governments; Heads of Terms were signed on 20 July 2017. We have positioned innovation as the driver of economic growth in the region, and a significant theme is that this is inclusive – that growth is for all, including affordable housing, transport infrastructure, and skills – which is where we are particularly concerned.
Edinburgh has a comparative advantage for data innovation as an ecosystem through the University of Edinburgh School of Informatics – one of the best in the world, and it has been for some time. After Informatics' devastating fire – a sensitive topic here in Glasgow right now, but something good came out of this – it was reshaped: they rethought what they do and could do, recognising the skills and talents of those coming through Informatics – there is a real talent flow. And Informatics Ventures came out of this. Secondly, we have a sophisticated tech business incubation/adoption/funding system: the UK's biggest incubator CodeBase, DataLab, and Engage Invest Exploit. And thirdly we have strong government backing: Scottish Enterprise investment in the School of Informatics of £270M, with further contributions taking it to half a billion of investment in skills.
Since 2005 we've seen the growth of new unicorns, new start-ups, incubators, and R&D capability in Edinburgh from big tech players including Microsoft, Disney Research etc. We want to replicate and scale. And we want to become the Data Capital of Europe – things like the Wayra blockchain incubator are part of that huge interest. That's all very good, but the work we are talking about is disruptive – for places, for employment, for demographics. Gerard talked about the lack of women in tech; we need to address that kind of inequality – how can education ensure girls make the subject choices that take them into those careers, for instance? We think of this as data citizens, data workers, and data professionals.
The skills part of the City Deal is an 8-year programme. How can we ensure that the 121,000 school leavers over that time come out with better skills than they otherwise would? We think we can. For instance, the newly built Newbattle High School, in one of the poorest parts of Midlothian, is working with us closely to ensure the curriculum builds data and tech in across the board. It's a pilot, but if it works – and we think it will – we can scale it. We are working with colleges on building data skills into the curriculum and into alternative routes such as apprenticeships. We have models like "data ambassadors" – how can we scale these ideas, this building of data skills, to hard-to-reach groups? And how can we drive adoption of data technologies – we are working with DataLab and Interface here. Small businesses are weak in adapting to the data economy – they won't be tech businesses, but their productivity could be enhanced by engagement with data. Can we do something here? Yes, we think we can.
We are coming up to the announcement of the final details of City Deal. My first meeting was in 2015. We’ve had a lot of meetings with government, local authorities, private companies, etc. We have built relationships and are ready to truly work collaboratively.
Why are we talking about Edinburgh and South East Scotland? Well there was a territorial opportunity, but all of these ideas are applicable nationally or internationally. None of this is bound by the geography of this deal. There is an international capability here.
Questions and discussion
Alisdair Gunn (AG): Tech Nation are national… Charlie, it was good to hear that the opportunities are not bounded. Gerard, do you want to talk about these recent changes and the bigger changes?
GG: When you talk to big companies like GE and ask "how do you look at the world?", they'd say "talent hubs" – San Francisco, New York, London, Chennai… These are hubs of talent and technology. But projects cannot be geographically bound – opportunities may be in Bangalore or Chennai… You have to make a city as welcoming as possible and combine UK talent with talent from across the world. And the technologies and skills may not be in your locality – they may be in China. Skyscanner are passionate about bringing that mindset in from across the world.
AG: I've seen that accelerators are starting to measure success not by the number of start-ups but by the number of international companies attracted.
AH (Alexander Holt): It depends on your aims. We wanted public services being done in a different way. The physical space was extremely important – bringing civil servants out of their day-to-day location and into a new space that enabled culture change, co-locating at CodeBase, bringing our companies in. This wasn't some weekend course – this is a six-month immersive experience. And that's why SEPA changed their procurement process. You see real change. The government of the twenty-first century is very different. You have places like Dubai opening up their services; if we don't, we'll be left behind. Charlie talked about that collaboration across sectors – that will create talent, improve services, and attract investment opportunities.
AG: You talked about Data Lab, coming out of a University and thinking with a service design perspective… That’s building talent…
CJ: Talking of boundaries… Perhaps the biggest boundary is between Edinburgh and Glasgow. We are so close together yet we tend to look in different directions… We should be thinking about collaboration in that frame. The FinTech sector is across both cities for instance, if we have capabilities we should collaborate to realise all those potential benefits.
AG: And that’s a good place to finish I think, that need to think internationally and compete globally.
SESSION 4: INFRASTRUCTURE:  Scottish Wide Area Network (SWAN), Masterclass by Capita

Speaker:

  • Mike MacDonald (MM), Head of Digital & Innovation, Capita;
  • Toni Gribben (TG), Scotland Manager, Cisco;
  • Anne Moises (AM)
MM: I get to look at all the cool and new stuff that happens on the SWAN network. A quick recap on SWAN: SWAN was born out of the McClelland Report in 2011, which recommended shared services, and SWAN has saved the Scottish Government about £30m. We have 6000 sites across 50 organisations providing bandwidth of up to 10 Gbps. We have 250 unbundled exchanges providing cost-effective connectivity closer to customers across the length and breadth of Scotland. It's a huge network which includes local authorities, NHS Scotland, SQA, Forestry Commission, etc.

We have been running 4 years, and we have choices to make: to further connect Scotland, further reduce cost, and be technology focused and connected; or to align with the digital strategy and connect the technology with that strategy. That latter path is the interesting one for us to take now.

The Scottish Government digital strategy for Scotland talks about the new digital economy, digital public services, skills, connectivity, cyber security, data and innovation, participation. We have connectivity, but we've also helped as a cornerstone customer to push out fibre to new locations, and we are happy to do that. We have that top-level strategy, but there are also sectoral strategies in Scotland with different needs – so we need to address those differing but related strategic needs. When you look at these, when you look at highly varied technology adoption levels, how do we cater to all of those needs?

We've focused on areas with real potential impact: smart healthcare, digital education, data-based outcomes, health and social care integration. And we are looking at analytics – offering things like titration on a Software-as-a-Service basis at a more affordable level. We see lots of IoT on SWAN, and it's all about the data and insights you derive from that. But we have question marks here… about what you need.

The way we want to bring this together is in the SWAN Marketplace. This brings together SMEs right up to big multinationals. We are in conversation with Procurement Scotland so we can see how we provide opportunities for SMEs to sell their services and products through the SWAN Marketplace. For instance, Ajenta came out of the Janet network, offering any-to-any videoconferencing. You can bridge connections across legacy conferencing kit through to Skype for Business. They also own "Push Doctor" – and they are actively looking to make that available through us. We are doing a formal launch of vscene 2.0 with these folk next week – do come along.

We see a lot of interest in IoT. We talk about Smart City, Town, Community… We like to think about Smart Scotland. I'll talk about some examples here. In the Highlands they have a big challenge around water: local authorities have to test the water temperature every month. That's usually one person in a van taking measurements. So we've put sensors in place to enable that testing to take place more efficiently. And we've also put a gateway in place in Angus for use by SEPA.

AM: I am absolutely delighted to be here to make an announcement. We are announcing a proposal – still to be formally agreed by the SWAN management board – for Phase 1 of the IoT enablement of SWAN: in the region of 50 LoRaWAN gateways, courtesy of Cisco. Expanding on our pilots, we are starting from the outside in – from the more rural areas of Scotland, rather than starting in the easier-to-do cities. We will be learning and developing and scaling up. This is back to the origins of SWAN – a starting point for innovation, and more than a network. Coverage is being driven by public sector IoT projects, in conjunction with public sector customers, CENSIS, innovation partners and SMEs to stimulate the market. This is funded from the SWAN Innovation Fund. And, once confirmed, it will begin roll-out in the summer. There is an accompanying press release going out just now.

TG: I am country manager for CISCO in Scotland. We are based in San Jose and we offer software services worth around $50Bn. We pride ourselves on delivering innovation. I’m going to describe LoRaWAN, some case studies, and an additional ask before you leave. CISCO sees IoT as the combination of people, process, information and things. Less than 1% of connectable “things” are connected to the internet. Estimates of the number of connected things by 2020 vary between 15Bn and 50Bn. That’s a huge opportunity for all of us in this room.

Our previous chief executive was a visionary around IoT, and from that we led the development of LoRaWAN – Long Range Wide Area Network – connectivity. We have never been more connected: we use Bluetooth, 3G, 4G and increasingly 5G. The beauty of LoRaWAN is that it needs only low energy and a low data rate to connect to the network. Sensors can run for 5 years on one battery: you implement once, then leave for 5 years. It’s low cost, runs on unlicensed spectrum, and one gateway can handle up to 10k devices. It’s hugely exciting for all of us.
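
[Aside for technically minded readers: to make the “low power, low data rate” point concrete, here is a minimal sketch in Python of the sort of tiny uplink message such a sensor sends and how an application might decode it. The payload layout is entirely hypothetical – real LoRaWAN payload formats are device-specific.]

    import struct

    def decode_uplink(payload: bytes) -> dict:
        """Decode a hypothetical 4-byte sensor uplink:
        2 bytes signed temperature in 0.01 degC, then 2 bytes battery in mV.
        (Illustrative only - real device payload formats vary.)"""
        temp_raw, battery_mv = struct.unpack(">hH", payload)
        return {"temperature_c": temp_raw / 100.0,
                "battery_v": battery_mv / 1000.0}

    # 0x08FC = 2300 -> 23.00 degC; 0x0E10 = 3600 -> 3.600 V
    print(decode_uplink(bytes.fromhex("08FC0E10")))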

In terms of use cases… There are 12 listed here and 10 are already in place: water and gas metering, street lighting, smart building, smart parking, tracking, leak detection and irrigation, water level and flood management, fault management, smoke detectors, smart energy and fast demand response, waste management, and traffic management. These represent huge cost savings in waste management and water management, for instance. But this takes culture and process change, and you need to be open to that.

SESSION 5: SKILLS: A digitally skilled nation
Panellists:

  • Donald McLaughlin (DM), Technology Sector Business Leader and Chair of Scotland’s Digital Technologies Skills Group; 
  • Kirsten Urquhart (KU), Digital & Smart-Tech Director, Young Scot;
  • Joshua Ryan-Saha (JRS), Skills Manager, The Data Lab.
  • Chaired by: Melinda Matthews Clarkson (MM), Chief Executive, CodeClan.
MM: We are here to talk about skills, so it’s a chance to go wild with all those questions you’ve been building up.
JRS: I am the Skills Manager at Data Lab, developing data skills across Scotland.
KU: I work at Young Scot and, whilst you might be familiar with our discount card, we also engage directly with young people involving them in skills, service design, etc.
DM: I chair the Digital Technologies Skills Group and we work with various organisations – including CodeClan.
MM: Every session I’ve been in has a skills dimension. We have 12,800 unfilled roles in Scotland with a tech dimension… How do we address that? How do we boost GDP through skills development? Let’s start with young people.
KU: For us working with young people, we try to talk about digital as not being the end point – not just coding, digital, cyber, but instead digital as an element in arts, creativity, transport. How do we focus on problem solving, creativity and softer skills that ready them for that work. A gendered element comes in here as well, in terms of understanding what tech skills are.
MM: Is there any job that doesn’t involve digital skills?
DM: Absolutely not. Those 12,800 unfilled jobs are specifically in tech, but that demand is growing, and exists outside the tech sector. To make Scotland economically successful we need to bring the tech sector to life, to reinforce the appeal of these careers, the creative and problem solving aspects.
JRS: We face real challenges here: we face Brexit, so we need home grown talent if access to the European talent pool reduces. And we need tech skills across professions, including hardcore coding skills. We have to address head on the sheer shortage of computing teachers – there are a third as many as there were 5 years ago, and that wasn’t enough then. So how can we increase that, and make those roles attractive?
DM: And doing that doesn’t even address those skills across the board – all teachers should be digitally skilled, as that is part of inspiring young people across the board.
JRS: The creative sector is really interesting – Scotland has a competitive advantage in games, and that’s creative and digital.
MM: There are jobs and opportunities… So why don’t we have those skills?
KU: I think it’s a confidence thing for many people. We talked about softer skills… We need to change the language a bit, we need to upskill, we need to bolster that confidence…
MM: That goes to the gender issue.
DM: The tech sector has an 80:20 split between male and female workers. There is some bias and unconscious bias about what the tech sector is, what the work looks like. We need more role models in tech and digital – careers, job satisfaction. And we need metaskills – not just coding but computational thinking, creative thinking, the ability to work in teams… If we focus on that… Half of the jobs people will do in 20 years haven’t been invented yet – we need broader reusable skills. And I think that will help bring women into the sector.
MM: How do we change that image – marketing or?
JRS: In Scotland we have Swift, we have GirlGeekScotland, we see great stuff every day. There are specific things we can do too. At Heriot-Watt we’ve created a two year data science conversion course – normally for that sort of MSc you need a computing background, so this course gets you up to speed and then focuses in. The Met Police intervened to improve the diversity of police officers, looking critically at every stage – we should do that with colleges, universities etc. Charlie Jeffreys spoke about making Edinburgh the Data Capital of Europe, and I think we should make it the best place for women to work in tech – having leaders like you Melinda, or like our CEO Gillian, really helps.
DM: Having those visible role models will help generate excitement in the sector.
KU: I think we have to work with parents and carers around removing stigmas around less traditional learner journeys – not always taking a university route. Apprenticeships and internships can be great for young people to get into this.
MM: Graduates from CodeClan won’t have “5 year” skills – what can we do for next level skills?
DM: I think rather than saying 5 years’ or 10 years’ coding skills, we look at bringing diversity – of gender, of age, but also of background – into the workplace. To be sustainable we need to think differently about the talent pool. And we need the talent pool to be much more flexible and prepared for industry.
JRS: We have to think about how we support learning and training continuously. Everything changes so quickly – all the predictions see the ability to learn as the key skill for the future. We may also need more risk in hiring. We need more support for training… And I think Scotland is in a great place to do this.
MM: When we look at all the universities training… We have 1000 graduates per year… But we have 12,800 unfilled roles. I think we need marketing and localised examples that show tech is fun and exciting to work on… Who would you pick as a role model?
KU: For me it has to be achievable. So I think it would be young people who’ve come through our programme and had great outcomes from Modern Apprenticeships.
DM: In some ways the best role models are the less known people. Apprenticeships can be brilliant for seeing the skills pipeline coming through. We need young women role models – not necessarily CEOs but those working in the sector with great careers.
KU: I think that role models who are women with families, young children etc. would be really helpful as so many role models right now are quite different.
Q&A
Q1) What do you think about traditional degree programmes as they are structured?
A1 – JRS) I think we’ve seen real lack of investment by the private sector in their own staff and skills. We need apprenticeships. We need in-work training. And there need to be a variety of models – e.g. training for those returning to work after having children. There needs to be more investment by the public and private sector.
A1 – KU) That model would work for us as a small charity – the space to train and develop skills.
A1 – DM) There needs to be a rebalancing of HE and FE and skills and lifelong learning. I think all university and college funding should be conditional on supporting lifelong learning.
A1 – MM) Scotland has the most amazing funding model for higher education. I would change it around so that it’s free if you are studying degrees that Scotland truly needs. Anyone can study, but you go for free if you are fulfilling a real skills challenge. Invest in the skills you need to grow…
A1 – JRS) I can see that we do need as many people trained up in software engineering etc. as possible. But actually in data science we need people with social sciences and creative backgrounds… We need to incentivise technical skills, but we don’t want to penalise subjects with a real impact on how we work and on ethics.
A1 – MM) Maybe that’s about embedding digital skills in social science degrees?
A1 – JRS) Actually, Edinburgh University and Edinburgh Napier University will be doing that under the City Region Deal.
Q2) My daughter had almost no computing teaching – she had computing classes in years 1 and 2, but when it came to options it wasn’t even there. And data understanding and analysis wasn’t there across the curriculum. I spoke to computing teachers about what was actually happening in class – no social media, as RM won’t allow that, and they had students breaking through firewalls… These people will be working for Google and Amazon, etc. – they should be encouraged to break through firewalls!
A2 – MM) Social isolation is a real risk and actually social media and community are a key part of educating people with the useful digital skills, especially older people.
Q3) Why isn’t there excitement about data analysis? It’s hard to explain, it’s hard to do, it’s boring… And those salaries shown earlier aren’t representative.
A3 – JRS) I don’t think it’s boring at all – it’s problem solving and making real change! Those salaries are what we see graduates of data science MScs attracting.
A3 – MM) Data based decision making is key across sectors, even if it’s backing up gut feelings.
A3 – JRS) We need more data literacy across the board, but gut feeling and decisions do come into it… And we have to bridge that human and data literacy.

As a final note on this session I’d remind readers of our Digital Footprint MOOC, a free self-led online course which covers managing your digital footprint and encourages critical thinking about social media, privacy and personal data. You can find it on Coursera here: https://www.coursera.org/learn/digital-footprint.

And with that we are (slightly abruptly) done for the day. Thanks to the organisers for a stimulating day of discussion and networking. 

 

May 16 2018
 

Today I am at the Digital Scholarship Day of Ideas, organised by the Digital Scholarship programme at University of Edinburgh. I’ll be liveblogging all day so, as usual, I welcome additions, corrections, etc. 

Welcome & Introduction – Melissa Terras, Professor of Digital Cultural Heritage, University of Edinburgh

Hi everyone, it is my great pleasure to welcome you to the Digital Day of Ideas 2018 – I’ve been on stage here before as I spoke at the very first one in 2012. I am introducing the day but want to give my thanks to Anouk Lang and Professor James Loxley for putting the event together and their work in supporting digital scholarship. Today is an opportunity to focus on digital research methods and work.

Later on I am pleased that we have speakers from sociology and economic sociology, and the nexus of that with digital techniques, areas which will feed into the Edinburgh Futures Institute. We’ll also have opportunity to talk about the future of digital methods, and particularly what we can do here to support that.

Lynn Jamieson – Introduction

Susan Halford is professor of sociology but also director of the institution-wide Web Science Institute.

Symphonic Social Science and the Future of Big Data Analytics – Susan J Halford, Professor of Sociology & Director of Web Science Institute, University of Southampton

Abstract: Recent years have seen ongoing battles between proponents of big data analytics, using new forms of digital data to make computational and statistical claims about the social world, and many social scientists who remain sceptical about the value of big data, its associated methods and claims to knowledge. This talk suggests that we must move beyond this, and offers some possible ways forward. The first part of the talk takes inspiration from a mode of argumentation identified as ‘symphonic social science’ which, it is suggested, offers a potential way forward. The second part of the talk considers how we might put this into practice, with a particular emphasis on visualisation and the role that this could play in overcoming disciplinary hierarchies and enabling in-depth interdisciplinary collaboration.

It’s a great pleasure to be here in very sunny Edinburgh, and to be speaking to such a wide ranging audience. My own background is geography, politics, English literature, sociology and, in recent years, computer science. That interdisciplinary background has been increasingly important as we start to work with data, new forms of data, new types of work with data, and new knowledge – but let’s query that – from that data. All this new work raises significant challenges, especially as those individual fields come from very different backgrounds. I’m going to look at this from the perspective of sociology and perhaps the social sciences; I won’t claim to cover all of the arts and humanities as well.

My talk today is based on work that I have been doing with Mike Savage on “big data” and the new forms of practice emerging around these new forms of data, and the claims being made about how we understand the social world. In this world there has been something of a stand off between data scientists and social scientists. Chris Anderson (in 2008), a writer for Wired, essentially claimed “the data will speak for itself” – you won’t need the disciplines. Many have pushed back hard on this. The push back is partly methodological: these data do not capture every aspect of our lives, they capture partial traces, often lacking in demographic detail (do we care? sociologists generally do…), and we know little of their promise. And it is very hard to work with this data without computational methods – tools for pattern recognition generally, not usually through sociological approaches. And these methods present concerning, sometimes ethically problematic, results as unproblematic. So, this is highly challenging. John Goldthorpe says “whatever big data may have for ‘knowing capitalism’, its value to social science has… remained open to questions…”.

Today I want to move beyond that stand off. The divisiveness and siloing of disciplines is destructive for the disciplines – it’s not good for social science and it’s not good for big data analytics either. From a social science perspective, that position marginalises the social sciences, sociology specifically, and makes us unable to take part in this big data paradigm which – love it or loathe it – has growing importance, influence, and investment. We have to take part in this for three major reasons: (1) it is happening anyway – it will march forward with or without us; (2) these new data and methods do offer new opportunities for social sciences research; and (3) we may be able to shape big data analytics as the field emerges – it is very much in formation right now. It’s also really bad for data science not to engage with the social sciences… Anderson and others made these claims ten years ago… Reality hasn’t really borne that out. In commercial contexts – recommendations, behaviour tracking and advertising – the data and analysis is doing that. But in actually drawing understanding of the world, it hasn’t really happened. And even the evangelists have moved on… Wired itself has moved to saying “big data is a tool, but should not be considered the solution”. Jeff Hammerbacher (co-credited with coining the term “data science” in 2008) said in 2013: “the best minds of my generation are thinking about how to make people click ads… that sucks”.

We have a wobble here, a real change in the discourse. We have a call for greater engagement with domain experts. We have a recognition that data are only part of the picture. We need to build a middle ground between those two positions of data science and social science. This isn’t easy… It’s really hard for a variety of reasons. There are bodies buried here… But rather than focus on that, I want to focus on how we take big steps forward here…

The inspiration here are three major social science projects: Bowling Alone (Robert Putnam); The Spirit Level (Richard Wilkinson and Kate Pickett); Capital (Thomas Piketty). These projects have made huge differences, influencing public policy and, in the case of Bowling Alone, really reshaping how governments make policy. These aren’t by sociologists. They aren’t connected as such. The connection we make in our paper is that we see a new style of social science argumentation – and we see it as a way that social scientists may engage in data analytics.

There are some big similarities between these books. They are all data driven. Sociology at the end of the 20th century was highly theoretical… At the beginning of the 21st century we see data driven works. And they haven’t done their own research generating data here; they have drawn on existing research data. Piketty has drawn together diverse tax data… But also Jane Austen quotes… Not just mixed methods but huge repurposing. These books don’t make claims for causality based on data; their claims for causality are supported by theory. However they present data throughout, supporting their arguments. Data is key, with images to hold the data together. There is a “visual consistency”. The books each have a key graph that essentially summarises the book. Putnam talks about social capital, Piketty talks about the rise and fall of wealth inequality in the 20th century.

In each of these texts data, method and visualisation are woven into a repeated refrain, combined with theory as a composite whole to make powerful arguments about the nature of social life and social change over the long term. We call this a “symphonic aesthetic”, as different instruments and refrains build, come in and go… and the whole is greater than the sum of the parts.

OK, that’s an observation about the narrative… But why does that matter? We think it’s a way to engage with and disrupt big data. There are similarities: re-purposing multiple and varied “found” data sources; an emphasis on correlation; use of visualisation. There are differences too: theoretical awareness; choice of data; temporality is different – big data has huge sets of data looking at tiny, focused and often real time moments, while social science takes long term comparisons – potentially over 100 years. The role of correlation is different: big data analytics looks for a result (at least in the early stages); in symphonic aesthetics there is a real interest in correlation through statistical and theoretical understandings. The practice of visualisation varies as well: in big data it is the result, in symphonic aesthetics it is part of the process, not the end of the process.

Those similarities are useful but there is much still to do: symphonic authors do not use new forms of digital data, their methods cannot simply be applied, big data demand new and unfamiliar skills and collaborations. So I want to talk about the prospective direction of travel around data; method; theory; visualisation practice.

So, firstly, data. If we talk about symphonic aesthetics we have to think about critical data pragmatism. That is about lateral thinking – redirection of data that exist already. And we have to move beyond naivety – we cannot claim they are “naturally occurring” mirrors/telescopes etc. They are deliberate socio-technical constructions. And we need to understand what the data are and what they are not: socio-technical processes of data construction (e.g. carefully constructed samples); understanding and using demographic biases (go with the biases and use the data as appropriate, rather than claiming they are representative; or maybe ignore that and look at network construction, flows, mobilities – e.g. John Murrey’s work).

Secondly, method. We have to be methodologically plural. Normally we do mixed methods – some quantitative, some qualitative. But most of us aren’t yet trained for computational methods, and that is a problem. Many of the most interesting things about these data – their scale, complexity etc. – are not things we can accommodate in our traditional methods. We need to extend our repertoire here. Social network analysis has a long and venerable history – we can scale up the more intensive, smaller version of social network analysis to large scale analysis. But we also need machine learning – supervised (with training sets) and unsupervised (without). This allows you to seek evidence of different, perhaps even contradictory, patterns. But also machine learning can help you find the structures and patterns in the data – which you may well not know in data sets at this scale.
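
[Aside: a minimal sketch of that supervised/unsupervised distinction, using scikit-learn in Python on synthetic data – the features and labels here are invented purely for illustration, standing in for features derived from digital traces.]

    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    # Synthetic data with a known group structure.
    X, y = make_blobs(n_samples=300, centers=3, random_state=0)

    # Unsupervised: look for structure without any training labels.
    clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    # Supervised: learn from a labelled training set, then predict new cases.
    model = LogisticRegression(max_iter=1000).fit(X, y)
    print(clusters[:10], model.predict(X[:10]))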

We have this quote from Ari Goldberg (2015): “sociologists often round up the usual suspects. They enter the metaphorical crime scene every day, armed with strong and well-theorised hypotheses about who the murderer should or at least plausibly might be.”

To be very clear I am not suggesting we outsource analysis to computational methods: we need to understand what the methods are doing and how.

Thirdly, theory. We have to use abductive reasoning – a constant interplay between data, method and theory. Initial methods may be informed by initial hunches, themes, etc. We might use those methods to see if there is something interesting there… Perhaps there isn’t, or perhaps you build upon this. That interplay and iterative process is, I suspect, something sociologists already do.

So, how do we bring this all together in practice? Most sociologists do not have a sophisticated understanding of the methods; and most computer scientists may understand the methods but not the theoretical elements. I am suggesting something end to end, with both sociologists and computer scientists working together.

It isn’t the only answer, but I am suggesting that visualisation becomes an analytical method, rather than a “result”. And thinking about a space for work where both sociological and computer science expertise are equally valid rather than combative. At best visualisations are “instruments for reasoning about quantitative information. Often the most effective way to describe, explore and summarise a set of numbers – even a very large set – is to look at pictures of those numbers” (Tufte 1998). Visualisations can act as interdisciplinary boundary objects. Beyond a mode of argumentation… visualisation becomes a mode of practice.

An example of this was a visualisation of the network of a hashtag, made collaboratively with my colleague Ramin, which developed over time as we asked each other questions about how the data was presented and what that means…
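
[Aside: in that spirit, a speculative sketch of treating a network drawing as part of the analysis rather than the end product – here with networkx and matplotlib in Python, and an invented edge list standing in for a hashtag network. Note that the layout algorithm is itself an analytical choice that shapes what you “see”.]

    import networkx as nx
    import matplotlib.pyplot as plt

    # Invented co-mention edges standing in for a hashtag network.
    edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
             ("dave", "erin"), ("erin", "frank"), ("carol", "dave")]
    G = nx.Graph(edges)

    pos = nx.spring_layout(G, seed=42)  # layout choices shape what you "see"
    nx.draw_networkx(G, pos, node_color="lightsteelblue")
    plt.axis("off")
    plt.show()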

In conclusion, sociology flourished in the C20th, developing methods, data and theory that gave us expertise in “the social” (a near monopoly). This is changing – new forms of data, new forms of expertise… And claims being made which we may, or may not, think are valid. And that stands on the work of sociologists. But there is some promise in the idea of the symphonic aesthetic: for data science – data science has to be credible and there is recognition of that (see for instance Cathy O’Neil’s work on data science, “Weapons of Math Destruction”, which also pushes in this direction); for sociological research – but not all of it, these won’t be the right methods for everyone; for public sociology – this is being used in lots of ways already: algorithmic sentencing debates, Cambridge Analytica… There is a real place for sociologists to reshape sociology in the public understanding. There are big epistemological implications here… Changing the data and methods changes what we study… But it has always been like that. Big data can do something different – not necessarily better, but different.

Q&A

Q1) I was really interested in your comments about visualisations as a method… Johanna Drucker talks about visual technology and visual discourse – and issues of visualisations being biased towards positivistic approaches – and advocates for getting involved in the design of visualisation tools.

A1) I’m familiar with these concepts. That work I did with Ramin is early speculative work… But it builds and is based on classic social network analysis so yes, I agree, that reflects some issues.

Q2 – Tim Squirrell) I guess my question is about the trade off between access and making meaningful critiques. Often sociology is about critiquing power and the methods by which power is transmitted. The more data proliferates, the more the data is locked behind doors – like the kind of data Facebook holds. And in order to access that data you have to compromise the kinds of critiques you can make. How do you navigate that narrow channel, to make critiques without compromising those…

A2) The field is quite unsettled… It looked settled a year ago but I think Cambridge Analytica will have a major impact… That may make the doors more closed… Or perhaps we will see these platforms – for instance Facebook – understanding that to retain credibility they have to create a segregation between their own use of the data, and research (not funded by Facebook), so that there is proper separation. But I’m not naive about how that will work in practice… Maybe we have to tread a careful line… And maybe that does mean not being critical in all the ways we might be, in every paper. Empirical data may help us make critical cases across the diverse range of scholarship taking place.

Q3 – Jake Broadhurst) Data science has been used in the social world already, how do we keep up and remain relevant?

A3) It is a pressing challenge. The academy does not have the scale or capacity to address data science in the way the private sector does. One of the big issues is ethics… And how difficult it is for academics to navigate the ethics of social media and social data. And it is right that we are bound to ethical processes in a way data scientists and even journalists are not. But it is also absolutely right that our ethics committees have to understand new methods, the realities of gold-standard consent, and other options where that is not feasible.

The discussion we are having now, in the wake of Cambridge Analytica, is crucial. Two years ago I’d ask students what data they felt was collected, they just didn’t know. And understanding that is part of being relevant.

Q4 – Karen Gregory) If you were taking up a sociology PhD next year, how would you take that up?

A4) My official response would be that I’d do a PhD in Web Science. We have a programme at University of Southampton, taking students from a huge array of backgrounds, and giving them all the same theoretical and methodological backgrounds. They then have to have 2 supervisors, from at least 2 different disciplines for their PhD.

Q5 – Kate Orton Johnson) How do we tackle the structures of HE that prevent those interdisciplinary projects, creating space, time, collaborative push to create the things that you describe?

A5) It’s a continuous struggle. Money helps – we’ve had £10m from EPSRC and that really helps. UKRI could help – I’m sceptical but hopeful about interdisciplinary possibilities here. Having PhD supervision across really different disciplines is a beautiful thing, you learn so much and it leads to new things. Universities talk about interdisciplinary work but the reality doesn’t always match up. Money helps. Interdisciplinary research helps. Collaboration on small scales – conference papers etc. also help.

Q6 – David, research in AI and Law) I found your comments about dialogues between data scientists and social scientists interesting… How can you achieve something similar with law scholars and data scientists… Especially if trying to avoid hierarchical issues. Law and data science is a really interesting space right now… GDPR, but also algorithmic accountability – legal aspects of equality, protected categories, etc. Very few users of big data have faced up to the risks of how they use the data, and the potential for legal challenge on the basis of discrimination. You have to find joint enthusiasm areas, and fundable areas, and that’s where you have to start.

The Economics Agora Online: Open Surveys and the Politics of Expertise – Tod van Gunten, Lecturer in Economic Sociology, University of Edinburgh

Abstract: In recent years, research centres in both the United States and United Kingdom have conducted open online surveys of professional economists in order to inform the public about expert opinion.  Media attention to a US-based survey has centred on early research claiming to show a broad policy consensus among professional economists.  However, my own research shows that there is a clear alignment of political ideology in this survey.  My talk will discuss the value and limitations of these online surveys as tools for informing the public about expert opinion.

Thank you for the invitation to speak today, and for Susan’s great and inspiring talk. I wouldn’t claim the label “symphonic” for this talk, but I think there is something of that spirit in it. This project is based on found and repurposed data. It isn’t particularly “big” data… But the “found” aspect of the data raises profound questions. Data never holds the answers on its own; it is always crucial to understand method and context. Visualisation is a big part of this. And it is about public sociology – so it hasn’t just been published in journals but in the popular press as well.

I am a sociologist who studies economists as a sociological object in their own right. So, there is a famous moment in 2008 when the Queen, in the midst of the largest global financial crisis since 1929, asked an economist “why did nobody notice it?”. Because she is the Queen, the British Academy convened a panel to respond to this question. And they said that lots of people did a good job, but that no-one had it as their job to put everything together. Meanwhile with Brexit we’ve seen economists as a profession receiving substantial criticism.

Economists are hugely influential; we study them because this is the politics of expertise. Economics is the most politically influential social science. So, I’m going to talk about properties we would like politically influential experts to have:

  1. A high level of professional consensus within the relevant community of experts. The gold standard here is climate science. If we have a community of experts that all agree, there seems to be a need for action. That’s a good principle.
  2. Form policy opinions independently of their own political ideology. We will receive and have confidence in advice from an independent expert more than someone presenting their own views.
  3. Acknowledge professional debate in expressing their views. That they acknowledge that issues are not settled.

So in this paper I want to look at how we may use data to measure these aspects. And I’ll be going through some theory around the cultural structure of belief spaces and how this relates to data, big data, in the context of economics – but this theory can be used in other contexts as well.

I want to open on the “economics agora” online. I want to talk about two surveys here – open online surveys of economists, run since the financial crisis. It is no coincidence that these have emerged at this time. These surveys are in the UK and in the USA. And unusually the results include publishing the full responses, and the names of the responders – with their consent. These are famous/well known individuals in their field. This allows us to do more… Bring in data that is not in the survey – the CVs of the respondents for instance, including universities, political activities, their co-authorship network, etc. The survey organisers’ goal is to inform the public, but finding patterns in the data requires aggregation and analysis. This isn’t just individual responses, but understanding the context of the data. And again, this isn’t big data, this is quite small data. But these approaches apply to big data too.

So one of these surveys is the Chicago Booth IGM Economic Experts Panel. Each month they put a question to 40 economists about some issue of the moment – the impact of autonomous cars for instance. The second survey is from the Centre for Macroeconomics, based in London, and again they ask a panel for responses. Typically the UK/European survey shows much more disagreement than the US survey.

There are a lot of issues with these surveys: they are small (the UK/EU one is expanding) and non-random samples; deliberately elitist samples (the US survey draws on the “top 7” economics departments in US universities, mainly Ivy League) – why would you take this sample? Well you wouldn’t really… But you have very high status economists. The UK survey has a much wider range in its sample. I think these surveys are great… But I think they should do a better job! Another problem is a high rate of “softball” questions – in the US survey, not in the UK/EU surveys. For instance “imposing new US tariffs on steel and aluminium will improve Americans’ welfare” – it’s timely, but we already know that there is high consensus here. We need to ask harder questions! And finally we need to think about the motivations of the people who produce the data – the survey designers are looking to raise the profile of the profession. In a Wall Street Journal piece the designers of the US survey talked about wanting to counteract the idea of a lack of consensus in the field – and they are the ones asking the questions.

Gordon and Dahl (2013) looked at views and consensus in the field based on the surveys. They presented this as showing a “remarkably high degree of consensus” and little variance across schools and departments, and thus asked how influential the field should be. This got big pick up… the Washington Post picked it up. Nobel winning economist Paul Krugman picked this up in his New York Times opinion column. He is on record (New York Times, 2009) as saying pretty much the opposite – that there is polarisation between the “saltwater” economists in the Keynesian camp, and the “freshwater” economists who are very much the opposite.

So, a bit of theory… What do we mean by consensus, polarisation, factions etc.? How do groups of people structure their belief systems? We have twenty years of literature and theory here around understanding belief systems, going back to political scientists in the 1960s. Philip Converse (1964) found that most American voters do not adhere to a coherent political ideology – this is still the case. Their belief systems are disorganised or “unconstrained” – so one belief does not let you predict another belief. For instance, comparing a belief that you should “reduce immigration” and one that you should “reduce corporate tax” could show little correlation; those beliefs don’t automatically go together. Now if you are a voter in the UK in 2018 there probably is more alignment. That pattern is a “constrained” or “aligned” belief system. If you look at polarisation you see clusters of correlation.
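
[Aside: a toy illustration in Python of “constraint” as correlation between belief items, with invented data – in the aligned population one answer predicts the other; in the unconstrained population it does not.]

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    n = 1000
    ideology = rng.normal(size=n)  # a latent left-right position

    # Aligned: both items driven by the same underlying ideology.
    aligned = pd.DataFrame({
        "reduce_immigration": ideology + rng.normal(scale=0.5, size=n),
        "reduce_corporate_tax": ideology + rng.normal(scale=0.5, size=n),
    })
    # Unconstrained: the two items are independent of each other.
    unconstrained = pd.DataFrame({
        "reduce_immigration": rng.normal(size=n),
        "reduce_corporate_tax": rng.normal(size=n),
    })

    print(aligned.corr().iloc[0, 1])        # high correlation: "constrained"
    print(unconstrained.corr().iloc[0, 1])  # near zero: "unconstrained"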

So, that paper on economists looks for clusters. I looked at polarisation to find latent ideology, noting partisanship (known involvement in e.g. political left or right leaning think tanks etc. – or marked as “none”), current department (freshwater vs saltwater) and belief dimension. Unsurprisingly those involved in Republican/conservative organisations and those with backgrounds in Democratic/liberal organisations were very different, leaning right and left respectively. This is the same data that generated that paper that showed consensus and little variance.

There is a high degree of consensus in this survey but you can also see ideological alignment. Those can be consistent. But it depends on what you think, and what you ask. The UK survey – more recently expanded to Europe – shows much less consensus. This could mean there is more consensus in the US than in Europe; but it could also mean that the questions being asked in the UK survey are harder questions. The UK survey asks very complex questions… e.g. “Do you agree that, in a period of great uncertainty and after a prolonged period of weak real wage growth, monetary policy makers can afford to wait for greater certainty about real wage developments and building inflationary pressure before raising interest rates?”. So, you can’t measure consensus without a comparison with another group. You can see consensus on a question, not of a group/community or set of beliefs.

So, looking at a recent UK/EU survey on anti-establishment views vs monetary conservatism, you can see a diversity of views here.

So, back to those qualities. Professional consensus is harder to measure than it first appears.

Respondents are asked to give both their vote on each question and their level of confidence. So, when experts give an opinion on hot topics you’d really want a low confidence score, to show you don’t have a partisan respondent on your hands. Looking at the data here in the US surveys we see a lot of overly confident responses. Respondents with a stronger ideological disposition (an aligned belief structure) exhibit systematic overconfidence. In general, across all questions, when asked politically salient questions they state higher confidence than on questions with little or no political salience.
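
[Aside: a sketch, with invented numbers, of the kind of comparison described – mean stated confidence on politically salient versus non-salient questions, grouped with pandas.]

    import pandas as pd

    # Invented responses: self-reported confidence on a 0-10 scale.
    responses = pd.DataFrame({
        "salient":    [True, True, True, False, False, False],
        "confidence": [9, 8, 9, 5, 6, 5],
    })

    # Systematic overconfidence would show up as a higher mean for salient items.
    print(responses.groupby("salient")["confidence"].mean())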

By way of conclusion… Am I joining ranks with Michael Gove – “people in this country have had enough of experts”? No. I would say something more nuanced. Arguably the professions in general, and economists in particular, have lost political legitimacy, and professional over-reach (“look how much consensus we have”) is not the answer. Claiming consensus where none exists is over-reach. Transparency about professional debate is always better than overstating consensus. Political legitimacy is a scarce resource and should be treated as such.

The economics agora online is a useful tool for studying the beliefs of an important community of experts… but survey designers should up their game. If you want an “unbiased” expert, choose someone whose belief structure is unconstrained. You probably want someone in the middle – people whose belief systems are not correlated. You need a theory of how groups form beliefs… So read cultural sociology!

Q&A

Q1) In thinking about the resistance to “naturally occurring data” and the idea of an “unbiased expert” – do you have a sense that that isn’t possible… Rather than getting that, should we instead shift the conversation to make the politics relevant – to be clear in a way that makes the numbers make sense…

A1) If we chose which experts to listen to, which do we listen to…

Q1) It was interesting to think of economists as “not political” – if that’s the conversation… I think the non-biased expert… That raises issues. We query whether that even exists… Maybe we can shift the conversation.

A1) I guess I would want to push back a little bit. I am sympathetic to the idea that there is no unbiased expert but… I do a lot of work on economists and how they influence policy. I think the world does need economists, especially for monetary policy and technical aspects of policy. So, having some tools to understand this profession, how they structure beliefs… We need more tools to unpack that set of questions… I’m trying to find ways to study this profession using quantitative tools and qualitative tools, and to understand its impact on politics and society.

Q2) You mentioned a graph to show polarisation – how did you do that?

A2) This is not based on data, this is based on theoretical patterns… A series of plots using a test data set to illustrate the patterns of the theory – it’s theoretical rather than empirical data.

Q3) A slight follow up… How much have you played with non-linear tools… Consensus and confidence… Research on scientific knowledge shows that people who know a little about science have higher confidence than those who know more… That could impact that data on confidence.

A3) We did look at non-linearity – it doesn’t make a big difference to some measures here.

Q4) What definition of “expert” are you using, and why?

A4) People with PhDs in economics. In the US case these are high status people in the field… In the UK/EU case it is broader. Most work as professors of economics; some work in the private sector, in financial firms. For my purposes it’s holding a PhD in economics… In other work I’ve done on organisations in Latin America you have senior political elites with those credentials, and a lot who don’t – boundary work becomes more important there.

Q5) I think some of the Chicago questions also go to the public. Have you looked at that?

A5) It’s not publicly available… I’ve been thinking about asking for that. But it would be interesting to know if members of the public structure their belief systems differently. There is some work that compares public beliefs to these questions.

Q6) I work on spatial models around expert agreement and disagreement – there are interesting measures there and on polarisation. Also dimensionality reduction, since you are trying to identify latent ideological positions… Not sure if you’ve looked at that. Political behaviour research has…

Q7) I wanted to ask about the very different types of respondents and samples you have between the US and UK/EU surveys. I was particularly wondering about the high status nature of the US experts and how much that status plays a part… You talked about doing some social network and contextual work here, so I was wondering the degree to which their network, co-authorship and professional standing feeds into wanting to be seen to take a particular view, or visibly agree.

A7) The social network part, and the co-authorship data, is going to lead to a paper. We found people who are closer in co-authoring papers are ideologically closer – not totally surprising… So there is a social approval thing and a selection bias. We think the latter is the more likely interpretation here – the homophily effect. Even when they co-author non-political papers, they still pick ideologically aligned authors. The status thing is interesting… The UK/EU expert group is less hierarchical – maybe that reflects practice. In terms of monitoring each other’s responses… I think it’s more a contrarian thing… They want to find ways to disagree… They can add comments… So lots of “My colleagues all think this, but if you think about it this other way you get this opposite response”.

Q8) My question/comment is about the “unconstrained” idea space – it feels funny and attractive… But also quite negative… Unconstrained… Disorganised… But you are talking about it as a positive quality. Does that suggest they haven’t thought this stuff through?

A8) I’m glad you asked this. This question came up in the 1960s and it was seen as terrible that ideologies didn’t align to political parties… The field has turned on its head now. In the 1960s this was seen as politically naive. Actually, more educated voters are seen to have more constrained beliefs… But with the economists that unconstrained belief system is good, as it shows that they are not bringing in their partisan/ideological standpoint. There is a contradiction there: the idea that the more information you have, the more constrained your belief system should be… But only to a point. There is a really interesting paper by ? de Surrey and Ari Goldberg that compares ideological voters and unconstrained voters, and finds a third group that is e.g. politically liberal and economically conservative. This is a really interesting area of the literature. There are a bunch of new methods that are getting us nearer that question…

We broke for lunch and workshops at this point… 

Workshops: Parallel workshop sessions – please see descriptors below.

  • Text Analysis for the Tech Beginner – Suzanne Black, PhD student in LLC
  • An Introduction to Digital Manufacture – Mike Boyd (uCreate Studio Manager, UoE)
  • ‘I have the best words’: Twitter, Trump and Text Analysis – Dave Elsmore (EDINA)
  • An Introduction to Databases, with Maria DB & Navicat – Bridget Moynihan (LLC, UoE)
  • Introduction to Data Visualisation in Processing – Jules Rawlinson (Music, ECA, UoE)
  • Jupyter Notebooks and The University of Edinburgh Noteable service – Overview and Introduction – James Reid (EDINA)
  • Obtaining and working with Facebook Data – Simon Yuill (Goldsmiths)

I attended the Introduction to Data Visualisation in Processing workshop which was really interesting, and left me wanting to have a further play to see where it may potentially be useful. 

Round Table Discussion

  • Melissa Terras (MT), Professor of Digital Cultural Heritage
  • Kirsty Lingstadt (KL), Head of Digital Library and Depute Director of Library and University Collections
  • Ewan McAndrew (EM), Wikimedian in Residence
  • Tim Squirrell (TS), PhD Student, Science, Technology and Innovation Studies, working on communities and expertise and negotiations of those concepts.

MT: I wanted to start with quite a personal place… I realised last year that I was sort of grieving for the internet. I grew up with the internet, it’s been a big part of my life and friendships… But the internet has taken a different turn… And there is a need to step away from that a bit to stay sane. There is a need to step back and reflect, and think about the University Space. I feel maybe we could have stepped in… The questions of Facebook, Twitter, the use of data… The human nature of trust… And how we use and engage and archive and preserve some of these spaces… I think that makes it interesting to an academic in the digital space right now.

EM: I think the idea of the web soured somewhat after Cambridge Analytica. Tim Berners-Lee spoke on Channel 4 News about how it’s not enough to build and run the open web – we have to look critically at what is being done with it, what people are building. I also thought of the Scottish Referendum, and the call at Strathclyde University in Glasgow upon all librarians to support political literacy. But that could be “universities” not just “libraries” – there is a need for much more information literacy as a service, almost.

KL: The role of the university is about knowledge and supporting and preserving knowledge, with the library central to that… As the digital world changes we need those skills of information literacy, to think critically about what we see on the web, and how we understand that. That’s an important thread the library offers and supports. The arts, humanities and social sciences really support that development of critical engagement, literacy, context and the origins of big data. I was very much chiming with CILIPS work on information literacy – the university library has a really important part to play here…

TS: I want to make three brief points, on engagement, expertise and access. One of the things I’ve observed on the web around online communities is that there is a tendency not to notice a community until something happens. I study some quite extreme communities, including the involuntary celibate community, and you can’t raise interest until people go out and kill people. We really need to see more engagement and understanding, not just treating these communities as objects of interest. The second point is about experts and what that means… I think that reification of expertise is naive at best, and often dangerous. Only engaging with experts, or corroborating your beliefs, or feeling that you only engage with an expert class, overlooks the way most people engage with issues. And finally on access… In light of Cambridge Analytica, Facebook has shut down access for all but their own research programme (run with funding councils). Doing that means only people working at the companies, or at elite universities with particular track records, get access…

Comment: Interesting that you mentioned Tim Berners-Lee, as he was the reason Web Science got set up at Southampton. The narrative was… I invented the web (discuss) and it has gone wrong (discuss). That was a perspective that didn’t problematise information or communication etc. The idea was that we would reengineer the web (discuss) as if it is technical, not a complex socio-technical network. I’m not being negative but supporting your statements. The restructuring of the Information Technology GCSE was a travesty – there was no attempt at critical engagement, just at programming. And it is really important that we envision what we want the web to be. There is no fixed idea of the web. We have gone down the rabbit hole of behavioural tracking and advertising as the only economic model… But we could play with that. I would make a pitch for utopianism… With Donna Haraway: looking at the trouble and thinking about what else we could do.

Comment: I wondered about… that sense of the internet as being what we hoped it could be… But also the issue of the attack on net neutrality in the US, and immediate recognition that that isn’t ok… How do we back away, not engage in the toxic parts of the internet… But also save the parts that are worth saving… Keeping an eye on legislation? Do we protect without participating?

MT: I immediately started to think of how we talk about bitcoin – very utopian visions, then turning it into a profit making machine, as has happened with the internet… How do we build structures that can be used to make money… Without that consuming the rest of it… The internet is consuming all the other stuff… I think bitcoin will be the same… The same people who had money 200 years ago will be the same people who’ll make money now… Partly information literacy, partly being cynical, being civic… Being alive to issues…

TS: I am going to say two contradictory sounding things… So many of these issues seem to be engineering solutions to social problems. I was at a conference with someone talking about a blockchain based education network, with a smart contract to validate credentials – taking the human out of the process, in order to improve the situation. Bitcoin is supposed to be trustless… But at some point you have a human interface, and it will fail… You will always face problems you couldn’t spot – unless you spoke to a social scientist. But what goes with that, for us as social scientists, is the need to engage with the engineering side of things… Lots of “if only we could have known what would happen with Cambridge Analytica”, but we’ve known about that for years… We struggle to be listened to by policy makers when compared with businesses, who have legitimate routes in, and argue for a lack of accountability. Platforms are not neutral; you can engineer the behaviours available in the space. You have to understand the feedback loop between administration and engineering.

EM: Thinking about democratisation… And thinking about utopian visions… Putting my Wikimedian hat on… I think that it has been amazing to see the work done by students here… There is real benefit to having a very transparent space online where you can query or change or contribute to the world. Wikipedia is committed to keeping the human element at its core. One of the ways that Wikipedia checks and balances the data is that you can’t edit certain protected pages unless you’ve had an account for four days.

KL: That’s where libraries of all kinds come in – a space or platform to trace the source, the archive materials… And digital data… Data curation and longer term lifecycles… Digital content being created… To check, to contribute.

Comment: There’s an interesting underlying narrative that the web has gone wrong, and that the economy has gone wrong… As if these structured inequalities are accidental but they are not, they are deliberate. We need a critical historical narrative of the web and how this has taken place…. And the historical narrative of where the web has come from. We need more engagement from the humanities here… There are underlying themes here.

Comment: From literary and fan fiction studies we have for years been talking to a literature and community that exists online and how that interacts online. Fan fiction is often written by women, by BME and LGBTQ and non-binary people… We have a cry of “own the servers” to avoid exploitation… Could anyone comment on that type of utopian vision – the local and the global… Who accesses the data…

KL: From my context of the library, it’s about putting materials out there to access what they need as equitably as possible… But that’s difficult… For archives and personal material there are restrictions and limitations for good reason… We haven’t cracked that perfectly… It is a challenge, there isn’t an easy answer to it…

EM: From a Wikipedia angle… Wikipedia had a conversation within and around the community about where the community is going by 2030… Where they were going, what they needed to do to share and access knowledge around the world… To enable better understanding… To more civic and better societies. But there are huge disparities of access. Out of that came the sense of knowledge not as a product but as a service. And the idea of knowledge equity – in terms of access but recognising only 10% of editors are female, it’s Northern Hemisphere orientated, only 2.5% of geotagged content relates to Africa. It’s not shying away from that, instead trying to address that over time… Which is why Wiki Project Medicine has created “the internet in a box” to enable access to a downloaded medical version of the content to improve access to information.

Comment: I’m from a biological sciences background… My question underpins everything here… We haven’t really touched on digital preservation – it’s a big and worrying thing. I’ve listened to comments on big gaps in digital data; it’s really difficult in the long term. How will that be affected by GDPR, and what can be done there in terms of preservation and access? We are looking more and more at the cloud… The carbon footprint of ICT is expected to be 40% by 2040. Thinking about preservation and the more and more carbon intensive nature of the web, what can universities do to tackle these issues?

KL: Digital preservation is close and dear to us. It is challenging and not easy. It’s not a commodity you can just buy; there isn’t one way to do this. We are trying to tackle certain areas. We are trying to preserve the university’s history. We also look actively at research data produced by the University. Addressing those two areas, there is still a huge area of web output and web archiving there… There is interest in the University output, but less interest in the wider context. We acknowledge that agenda and push it up in the university – and digital humanities helps here, and that means access to information which helps us make our case. GDPR does present complexity; it does mean working with encryption… For company/global content that’s broader.

Comment: In terms of the issue of experts… I think it’s interesting to see experts by credentials, or by reputation… And how that relates to the internet… It seems like a great way to be a self-made expert… To promote yourself as an expert because you have a blog. You may have stature and influence… But that’s very different from a PhD or an academic expertise… I’m interested that part of being an expert is admitting when you don’t know something… It seems the public wants experts to tell you the answer right now… What is the role of the internet right now here.

TS: I have a lot of thoughts on this. It’s basically my PhD. If I ramble… stop me… I think this is fundamentally about the way we reconceptualise expertise… There is the idea of it being reified, as rare and based on credentials, and that being in conflict with other types of self-made influence. Steven Taylor has a paper on experts across three types, including this group of self-made experts… They come to represent a much larger group of experts – it hasn’t democratised broadcast but it’s certainly opened up and broadened the field somewhat. When we understand expertise as only credentialed people in specific organisations, we limit communication. We have to be able to engage as compellingly as these people who are able to weaponise, essentially, nonsense, and see how we can be as engaging as them. We have to be provocative and interesting. We can’t expect people to just come and ask the right experts. The burden shouldn’t be on audiences; the burden should be on “experts” to be palatable and appealing as experts.

MT: The anti-expertise thing isn’t a new thing either… It goes right back to the founding of universities, particularly in the Victorian era… I have a book coming out on professors in children’s literature, with an accompanying anthology, and every single story is “the professor is rubbish”. All of them. All about not trusting experts, just when expertise was being formalised… The general populace ridiculing them… The internet has boosted that again. But on a positive note… Crowdsourcing is a positive development… We did a few crowdsourcing projects that truly changed access to and use of information – work that used to only be done by paleographers, looking at Jeremy Bentham’s papers… The internet helped us speed all that up… If we have the right platforms, the right structures, we can do the right things… But we can’t let “expertise is rubbish” perpetuate.

EM: Again with digital preservation, there is a cost attached… But if there are volunteers, and if there is a platform and little cost… You can do a lot, and archive a lot in public ways…

KL: I was going to add that the cultural heritage sector has an interesting relationship with working with the community… There is a tension about how and what people can contribute, and who can do it best. But the crowd is full of enthusiasm… As long as work is provenanced… that is a really good way to positively use the web.

Comment: In response to the Cambridge Analytica stuff… And why didn’t they listen to the social scientists… Isn’t GDPR an example of the law doing as good a job as it could… And data ownership… Legislative work in Europe on copyright and data ownership… If we want to set the right example, it’s not enough to throw up our hands in horror… You have to engage in legislative process… Laws do have an impact in cyberspace.

Comment: Business models – and how do we change that – it shapes the platform. Investment doesn’t go in equally – and as universities we do start ups, we do engagement with industry. How do we move beyond all of these businesses being set up by young wealthy guys, and opening that up… And reconceptualising success as more than just exit, and data as asset – and that being personal data. I also wanted to note that web archiving does take place – with the Internet Archive who operate in the more permissive US copyright context (and mirrored in Canada – they were concerned that Trump might interfere with the archive). There is a small but politically aware web archiving community but part of making that and any platform work is about acknowledging that there is cost to running platforms, to archiving materials…

Comment: That idea of “an expert” – surely we reconceptualise the expert as a distributed thing.

TS: Yes.

MT: And with that I’d like to thank the panel and draw this to a close. We hope to have some announcements in the next year about expanding this work, and this day takes place in an environment that contributed to my coming to Edinburgh, with the City Deal, and with the work driving Edinburgh to be the Data Driven Innovation capital of Europe.

May 022018
 

This morning I’m at the “Working with the British Library’s Digital Content, Data and Services for your research (University of Edinburgh)” event at the Informatics Forum to hear about work that has been taking place at the British Library Labs programme, and with BL data recently. I’ll be liveblogging and, as usual, any comments, questions, corrections or additions are welcome.

Introduction and Welcome – Professor Melissa Terras

Welcome to this British Library Labs event, this is about work that fits into wider work taking place and coming here at Edinburgh. British Library Labs works in a space that is changing all the time, and we need to think about how we as researchers can use digital content and this kind of work – and we’ll be hearing from some Edinburgh researchers using British Library data in their work today.

“What is British Library Labs? How have we engaged researchers, artists, entrepreneurs and educators in using our digital collections” – Ben O’Steen, Technical Lead, British Library Labs

We work to engage researchers, artists, entrepreneurs and educators to use our digital collections – we don’t build stuff, we find ways to enable access and use of our data.

The British Library isn’t just our building in St Pancras, we also have a huge document supply and storage facility in Boston Spa. At St Pancras we don’t just have the collections, we have space to work, we have reading rooms, and we have five underground floors hidden away there. We also have a public mission and a “Living Knowledge Vision” which helps us to shape our work.

British Library Labs has been running for four years now, funded by the Andrew W. Mellon Foundation, and we are in our third funded phase where we are trying to make this business as usual… So the BL supports the reader who wants to read 3 things, and the reader who wants to read 300,000 things. To do that we have some challenges to face to make things more accessible – not least helping people deal with the sheer scale of the collections. And we want to avoid people having to learn unfamiliar formats and methodologies which are about the library and our processes. We also want to help people explore the feel of collections, their “shape” – what’s missing, what’s there, why, and how to understand that. We also want to help people navigate data in new ways.

So, for the last few years we have been trying to help researchers address their own specific problems, but also trying to work out if each is part of a wider problem, to see where there are general issues. But a lot of what we have done has been about getting started… We have a lot of items – about 180 million – but any count we have is always an estimate. Those items include 14m books, 60m patents, 8m stamps, 3m sound recordings… So what do researchers ask for…

Well, researchers often ask for all the content we have. That hides the failure that we should have better tools for understanding what is there, and what they want. That is a big ask, and it means a lot of internal change. So, we try to give researchers as much as we have… Sometimes that’s TBs of data, sometimes GBs… And data might be all sorts of stuff – not just the text but the images, the bindings, etc. If we take a digitised item we have an image of the cover, we have pictures, we have text, and we also have OCR for these books – so when people ask for “all” of the book, is that the images, the OCR or both? One of those is much easier to provide…

Facial recognition is quite hot right now… That was one of the original reasons people wanted access to all of the illustrations – I run something called the Mechanical Curator to help highlight those images – they asked if they could have the images, so we now have over 1m images on Flickr. What we knew about each image was the book, and the page. All the categorisation and metadata now there has come from people and machines looking at the data. We worked with Wikimedia UK to find maps, using manual and machine learning techniques – kind of in competition – to identify those maps… And they have now been moved into georeferencing tools (bl.uk/maps) and fed back to Flickr and also into the catalogue… But that breaks the catalogue… It’s not the best way to do this, so that has triggered conversations within the library about what we do differently, what we do extra.

As part of the crowdsourcing I built an arcade machine – and we ran a game jam with several usable games to categorise or confirm categories. That’s currently in the hallway by the lifts in the building, and was the result of work with researchers.

We put our content out there under a CC0 license, and then we have awards to recognise great use of our data. And this was submitted – the official music video for “Hey There Young Sailor”, using that content! We also have the Off the Map competition – a curated set of data for undergraduate gaming students based on a theme… Every year there is something exceptional.

I mentioned the library catalogue being challenging, and people not always understanding that when you ask for everything, that isn’t everything that exists. But there are still holes… When we look at the metadata for our 19th century books we see huge amounts of data in [square brackets], meaning the data isn’t known but is the best suggestion. And this becomes more obvious when we look at work researcher Pieter Francois did on the collection – showing spikes in publication dates at 5 year intervals… which reflect the guesses at publication year, which tend to be e.g. 1800/1805/1810. So if you take intervals to shape your data, it will be distorted. And then what we have digitised is not representative of that, and it’s a very small part of the collection…
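That date-heaping effect is easy to picture. A minimal sketch with invented catalogue-style year fields, where the bracketed values stand for cataloguers’ guesses clustering on round years:

```python
from collections import Counter

# Catalogue-style year fields; bracketed values mark cataloguers' guesses,
# which tend to cluster on round years (1800, 1805, 1810, ...).
years = ["1803", "[1805]", "[1805]", "1806", "[1810]", "[1810]", "[1810]", "1811"]

counts = Counter(int(y.strip("[]")) for y in years)
for year in sorted(counts):
    print(year, "#" * counts[year])  # crude histogram showing spikes at guessed years
```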

There is bias in digitisation then, and we try to help others understand that. Right now our digitised collections are about 3% of our collections. Of the digitised material 15% is openly licensed. But only about 10% is online. About 85% of our collections can only be accessed “on site” as licenses were written pre-internet. We have been exploring that, and exploring what that means…
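For scale, a quick back-of-envelope calculation with the figures just quoted (all estimates):

```python
# Back-of-envelope arithmetic using the figures quoted above (all estimates).
total_items = 180_000_000
digitised = total_items * 0.03           # ~3% of the collection is digitised
openly_licensed = digitised * 0.15       # ~15% of the digitised material
print(f"~{digitised:,.0f} digitised items, ~{openly_licensed:,.0f} openly licensed")
```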

So, back to use of our data… People have a hierarchy of needs, from big broad questions down to filtered and specific queries… We have to get to the place where we can address those specific questions. We know we have messy OCR, so that needs addressing.

We have people looking for (sometimes terrible) jokes – see Victorian Humour run by Bob Nicholson based on his research – this is stuff that can’t be found with keywords…

We have Kavina Novrakas mapping political activity in the 19th Century. This looks different but uses the same data and the same platform – using Jupyter Notebooks. And we have researchers looking at black abolitionists. We have SherlockNet trying to do image classification… And we find work all over the place building on our data and on our images… We found a card game – Moveable Type – built on our images. And David Normal building montages of images. We’ve had the Poetic Places project.

So, we try to help people explore. We know that our services need to be better… And that our services shape expectations of the data – and can omit and hide aspects of the collections. Exploring data is difficult, especially with collections at this scale – and it often requires specific skills and capabilities.

British Library Labs working with University of Edinburgh and University of St Andrews Researchers

“Text Mining of News Broadcasts” – Dr. Beatrice Alex, Informatics (University of Edinburgh)

Today I’ll be talking about my work with speech data, which is funded by my Turing fellowship. I work in a group who have mainly worked with text, but this project has built on work with speech transcripts – and I am doing work on a project with news footage, and dialogues between humans and robots.

The challenges of working with speech include its particular characteristics: short utterances, interjections; speaker assumptions – different from e.g. newspaper text; turn taking. Often transcripts lack sentence boundaries, punctuation or case distinctions. And there are errors introduced by speech recognition.

So, I’m just going to show you an example of our work which you can view online – https://jekyll.inf.ed.ac.uk/geoparser-speech/. Here you can do real time speech recognition, and this can then also be run through the Edinburgh Geoparser to look for locations and identify them on the map. There are a few errors and, where locations haven’t been recognised in the speech recognition, they also don’t map well. The steps in this pipeline are speech recognition (ASR), then Google text restoration, and then text and data mining.
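The pipeline shape is worth making explicit. A minimal architectural sketch – the function names are hypothetical stand-ins for the real components (an ASR engine, a restoration tool, and the Edinburgh Geoparser):

```python
# A minimal sketch of the three-stage pipeline described above; the names and
# bodies are hypothetical stand-ins for the real components.
def transcribe(audio_path: str) -> str:
    """ASR: return a raw, lowercase, unpunctuated transcript."""
    ...

def restore(text: str) -> str:
    """Restore punctuation and casing (e.g. with a tool like Punctuator 2)."""
    ...

def geoparse(text: str) -> list:
    """Identify place names and resolve them to coordinates."""
    ...

def pipeline(audio_path: str) -> list:
    # Each stage's errors propagate: words the ASR misses can't be geoparsed.
    return geoparse(restore(transcribe(audio_path)))
```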

So, at the BL I’ve been working with Luke McKernan, lead curator for news and moving images. I have had access to a small set of example news broadcast files for prototype development. This is too small for testing/validation – I’d have to be onsite at BL to work on the full collection. And I’ve been using the CallHome collection (telephone transcripts) and BBC data which is available locally at Informatics.

So, looking at an example, we can see good text recognition. In my work I have implemented a case restoration step (for named entities and sentence-initial words) using rule-based lexicon lookup, and also using Punctuator 2 – an open source tool which adds punctuation. That works much better but isn’t yet at an ideal level. Meanwhile the Geoparser was designed for text, so it works well but misses things… Improvement work has taken place but there is more to do… And we have named entity recognition in use here too – looking for locations, names, etc.
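A rule-based lexicon lookup for case restoration can be sketched very simply. A minimal, toy version (the lexicon below is invented; the real system is far richer):

```python
import string

# Toy lexicon mapping lowercase forms to their proper casing.
LEXICON = {"edinburgh": "Edinburgh", "glasgow": "Glasgow", "bbc": "BBC"}

def restore_case(transcript: str) -> str:
    out = []
    for i, tok in enumerate(transcript.split()):
        word = tok.strip(string.punctuation)
        if not word:
            out.append(tok)
            continue
        cased = LEXICON.get(word, word)               # named entities from the lexicon
        if cased == word and (i == 0 or out[-1].endswith(".")):
            cased = word.capitalize()                 # sentence-initial words
        out.append(tok.replace(word, cased))
    return " ".join(out)

print(restore_case("the bbc reported flooding in edinburgh. travel was disrupted."))
# -> The BBC reported flooding in Edinburgh. Travel was disrupted.
```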

The next steps are to test the effect of ASR quality on text mining (using CallHome and BBC broadcast data) with formal evaluation; to improve the text mining on speech transcript data based on further error analysis; and longer term plans include applications in the healthcare sector.

Q&A

Q1) Could this technology be applied to songs?

A1) It could be – we haven’t worked with songs before but we could look at applying it.

“Text Mining Historical Newspapers” – Dr. Beatrice Alex and Dr. Claire Grover, Senior Research Fellow, Informatics (University of Edinburgh) [Bea Alex will present Claire’s paper on her behalf]

Claire is involved in an Administrative Data Research Centre Scotland project looking at local Scottish newspapers, text mining them, and connecting them to other work. Claire managed to get access to the BL newspapers through Cengage and Gale – with help from the University of Edinburgh Library. This isn’t all of the BL newspaper collection, but part of it. This collection of data is also now available for use by other researchers at Edinburgh. Issues we had here were that access to more recent newspapers is difficult, and the OCR quality. Claire’s work focused on three papers in the first instance, from Aberdeen, Dundee and Edinburgh.

Claire adapted the Edinburgh Geoparser to process the OCR format of the newspapers and added local gazetteer resources for Aberdeen, Dundee and Edinburgh from OS OpenData. Each article was then automatically annotated with paragraph, sentence and word mark-up; named entities – people, places, organisations; locations; and geo coordinates.
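A minimal sketch of that gazetteer step: place-name mentions in an article are matched against a local gazetteer to attach coordinates. The entries and coordinates below are illustrative approximations, not the OS OpenData resources used in the project:

```python
# Toy gazetteer of Aberdeen place names -> (lat, lon); values are approximate.
GAZETTEER = {
    "Torry":        (57.124, -2.078),
    "Woodside":     (57.171, -2.135),
    "Union Street": (57.145, -2.103),
}

def annotate_locations(text: str):
    """Return (mention, lat, lon) for each gazetteer entry found in the text."""
    return [(name, lat, lon) for name, (lat, lon) in GAZETTEER.items() if name in text]

article = "A meeting of the Parish Council was held in Torry, off Union Street."
for mention, lat, lon in annotate_locations(article):
    print(f"{mention}: {lat}, {lon}")
```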

So, for example, a scanned item from the Edinburgh Evening News from 1904 – it’s not a great scan, and the OCR is OK but erroneous. Named entities are identified, locations are marked. Because of the scale of the data Claire took just one year from most of the papers and worked with a huge number of articles, announcements, images etc. She also drilled down into the geoparsed newspaper articles.

So, for Aberdeen in 1922 there were over 19 million word/punctuation tokens and over 230,000 location mentions. She then used frequency methods and concordances to understand the data. For instance she looked for mentions of Aberdeen placenames by frequency – and that shows the regions/districts of Aberdeen – Torry, Woodside, and also Union Street… Then Claire dug down again… Looking at Torry, the mentions included Office, Rooms, Suit, etc., which gives a sense of the area – a place people rented accommodation in. In just the news articles (not ads etc.) the Torry mentions are about Council, Parish, Councillor, politics, etc.

Looking at concordances, Claire looked at “fish”, for instance, to see what else was mentioned and, in summary, she noted that the industry was depressed after WW1; there was unemployment in Aberdeen and the fishing towns of Aberdeenshire; there was competition from German trawlers landing Icelandic fish; there were hopes to work with Germany and Russia on the industry; and that government was involved in supporting the industry and taking action to improve it.
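A frequency-and-concordance view like this is easy to reproduce with NLTK. A minimal sketch, using an invented snippet in place of the OCRed articles (requires NLTK’s punkt tokenizer data):

```python
# Minimal frequency + concordance sketch with NLTK; the text is a stand-in
# for a year's worth of OCRed newspaper articles.
from nltk import FreqDist
from nltk.text import Text
from nltk.tokenize import word_tokenize  # requires: nltk.download('punkt')

raw = ("The fish trade remained depressed after the war. German trawlers "
       "landed Icelandic fish at Aberdeen, and the fish market suffered.")
tokens = word_tokenize(raw.lower())

print(FreqDist(tokens).most_common(5))   # crude frequency view
Text(tokens).concordance("fish")          # keyword-in-context lines
```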

With the Dundee data we can see the topic modelling that Claire did for the articles – for instance clustering of cars, police, accidents etc.; there is a farming and agriculture topic; sports (golf etc.)… And you can look at the headlines from those topics and see how they reflect the identified topics.
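For a flavour of that topic modelling step, here is a minimal sketch with gensim’s LDA; the three toy “documents” stand in for thousands of real articles:

```python
# Minimal LDA topic-modelling sketch with gensim; toy documents only.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["car", "police", "accident", "road"],
    ["farm", "cattle", "harvest", "agriculture"],
    ["golf", "match", "players", "course"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

lda = LdaModel(corpus, num_topics=3, id2word=dictionary, passes=10, random_state=1)
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)  # top words per discovered topic
```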

So, next steps for this work will include: improving text analysis and geoparsing components; getting access to more recent newspapers – there is missing infrastructure for larger data sets but we are working on this; scaling up the system to process the whole data set and store text mining output; tools to summarise content; and tools for search – filtering by place, date, linguistic context – tools beyond the command line.

“Visualizing Cultural Collections as a Speculative Process” – Dr. Uta Hinrichs, Lecturer at the School of Computer Science (University of St Andrews)

My research focuses on visualisation and Human Computer Interaction. I am particularly interested in how interfaces can make visible digital collections. I have worked on a couple of projects with Bea Alex and others in the room to visualise texts. I will talk a little bit about LitLong, and the process in developing early visualisations for the project.

So, some background… Edinburgh is a UNESCO City of Literature, with lots of literature about and in the city. And we wanted to automate the discovery of Edinburgh-based literature from available digitised text. That included a large number of texts – about 380k – from collections including the BL 19th Century Books collection. And we wanted to make the results accessible to the public.

There were lots of people involved here, from Edinburgh University (PI, James Loxley), Informatics, St Andrews, and EDINA. And we worked both with out-of-copyright texts, and also – with special permission – with some in-copyright texts including Irvine Welsh. And a lot of work was done to geoparse the text – and assess its Edinburghyness. For each mention we had the author, the title, the year, and snippets of the text from around the mention. This led to visualisations – I worked on LitLong 1.0 and I’ll talk about this, but a further version (LitLong 2.0) launched last year.

So you can explore clusters of places mentioned in texts, you can explore the clustered words and snippets around the mentions. And you can zoom in to specific texts – again you can see the text snippets in detail. When you explore the snippets, you can see what else is there, to explore other snippets.

So, in terms of the design considerations, we wanted a multi-faceted interactive overview of the data – Edinburgh locations; books; extracted snippets; authors; keywords. Maps and lists are familiar, and we wanted this tool to be accessible to scholars but also the public. We took an approach that allowed “generous” explorations (Mitchell Whitelaw, 2015), so there are suggestions of how to explore further, parts of the data showing… Weighted tag clouds let you get a feel for the data, for instance.
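For a flavour of the weighted tag cloud idea, the Python wordcloud package can size terms by frequency. A minimal sketch with invented frequencies standing in for the LitLong data:

```python
# Minimal weighted tag cloud sketch; frequencies are invented placeholders.
from wordcloud import WordCloud
import matplotlib.pyplot as plt

freqs = {"Castle": 120, "Canongate": 80, "Leith": 60, "Grassmarket": 45}
cloud = WordCloud(width=600, height=300).generate_from_frequencies(freqs)

plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```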

As a process it wasn’t as if the text mining happened and then we magically had the visualisations… It was iterative. And actually we used visualisation tools to assess which texts were in scope, and which weren’t going to be relevant – and to mark a text to keep or rule out. This interface included information on where in a text the mentions occurred – to help identify how much a text actually was about Edinburgh.

We had a creative visualisation process… We launched the interface in 2015, and there was some iteration, and that also inspired LitLong 2.0 which is a much more public-friendly way to explore the material in different ways.

So, I think it is important to think about visualisation as a speculative process. This allows you to make early computational analysis approaches visible and to facilitate QA and curatorial processes. And to promote new interactions, transforming a print-based culture into something different – thinking about materiality rather than just content is important as we enable exploration. When I look back at my own work I see some similarities in interfaces… You can see the unique qualities of the collections in the data trends, but we are doing much more work on designing interfaces that surface the unique qualities of the collection in new ways.

Q&A

Q1) What did you learn about Edinburgh or literature in Edinburgh from this project?

A1) The literature scholars would be better able to talk about that but I know it has inspired new writers. Used in teaching. And also discovered some characteristics of Edinburgh, and women writers in the corpus… James Loxley (Edinburgh) and Tara Thompson (Edinburgh Napier University) could say more about how this is being used in new literary research.

“Public Private Digitisation Partnerships at the British Library” – Hugh Brown, British Library Digitisation Project Manager

I work as part of the Digital Scholarship team at the British Library, which was founded in 2010 to support colleagues and researchers to make innovative use of BL digital collections and data – recognising the gap in provision we had there. The team is led by Adam Farquhar, Head of Digital Scholarship, and Neil Fitzgerald, Head of Digital Research Team. We are cross-disciplinary experts in the areas of digitisation, librarianship, digital history and humanities, and computer and data science, and we look at how technology is transforming research and in turn our services. And we include the British Library Labs, Digital Curators, and the Endangered Archives Programme (EAP).

So, we help get content online and digitised, we support researchers, and we run a training programme to bridge skills so that researchers can begin to engage with digital resources. We expect that in 10-15 years time those will be core research skills so we might not exist – it will just be part of the norm. But we are a long way off that at the moment. We also currently run Hack and Yack events to experiment and discuss. And we also have a Reading Room to share what’s happening in the world, to share best practice.

In terms of our collections and partnerships, we have historically had a slightly piecemeal digitisation approach, so we now have a joined-up strategy that sits under our Living Knowledge strategy and includes partnership, commercial strategy and our own collection strategy. Our partnerships recognise that we don’t always have the skills we need to make content available, whilst our commercial strategy – where I work – allows us to digitise as much as possible, in a context where we don’t have infinite funding for digitisation.

We have various factors in mind when considering a potential partnership. The types of approach include partnerships based on whether materials are in or out of copyright – if in copyright then commercial partners have to clear rights. We do public/private partnerships with technology partners. We have non-commercial organisational and/or consortium funding. And we have philanthropic donor-funded work. Then we think about content – content strategy, asset ownership, digitisation location. We think about value – audience type/interest/geography, and topicality. We think about copyright – whether the British Library owns the rights, rights of reuse. We think about discoverability – the ability to identify and search, and access that maximises exposure. We look at the (BL) benefit – funding, access etc. We look at risk. And we look at the contract – whether it is non-exclusive, commercial/non-commercial.

So, we have had public-private digitisation partnerships with Gale Cengage Learning, Adam Matthew Digital, findmypast, Google Books, Microsoft books, etc. Looking at examples: Google Books has been 80m+ images digitised; Microsoft books was 25m images; findmypast has done 23m+ images of newspapers; Gale Cengage Learning has done 18th century collections – 22m images, 19th century online 2.2m+ images, and Arabic books, etc.

The process begins with liaison with key publishers. Then there is market and content research. Then we plan and agree a plan, including licensing of rights for a fixed term (5-10 years), and royalty arrangements and reading room access. Then digitisation takes place, funded by the partner – either by setting up a satellite studio, or using the BL studio. So our partners digitise content and give us that content; in exchange they get a 5-10 year exclusive agreement to use that content on their platform. And the revenue generated for the BL helps support what we do, and our curators work around digitisation.

So Findmypast was an interesting example. We had electoral registers and India Office Records – data with real commercial value. So, we put a tender out for a digitisation partner. Findmypast was selected… Part of that was to do with the challenges of the electoral registers, which were in inconsistent formats etc. so needed a lot of specific work. And we also needed historical country boundaries to be understood to make it work. There was also a lot of manual OCR work to do.

Gale Cengage tend to be education/university focused and they work with researchers. We worked with them to select 19th century materials to fit their themes and interests. They did the early Arabic books project – a really complex project. The Private Case collection consisted mainly of books that had been inaccessible on grounds of obscenity, from around 1600 to 1960.

With Adam Matthew Digital, we were approached to contribute material from the electoral registers and India Office Records, and materials on the East India Company.

Now these are exciting projects but we want 20-30% of content generated in these projects to be available as a corpus for research and that’s important to our agreements.

Challenges in the workflow include ensuring business partners and scanning vendors have a good understanding of the material the BL holds in our collections. We have to define and provide the metadata requirements the BL needs to supply to the partners. We have to get statistics and project plans from business partners. There are logistical challenges around understanding the impact of digitisation on the BL departments supporting the process. We have to manage partners’ business drivers versus BL curatorial drivers. We have to manage the partners’ digitisation vendors on site. And we have to ensure the final digital assets/metadata received meet BL requirements for sign-off and ingest.

Q&A

Q1) How can we actually access this stuff for research?

A1) For pure research that can be done. For example we have a company in Brighton who are doing research on the electoral roll. That’s not in competition with what the private partner is doing.

Comment from Melissa) My experience is “don’t ask, don’t get” – so if you see something you want to use in your research, do ask!

“The Future of BL Labs and Digital Research at the Library” – Ben O’Steen

I’ve handed out some personas for users of our digital collections – and a blank sheet on the back. We are trying to build up a picture of the needs of our users, their skills and interests, and that helps us illustrate what we do – that’s a thing to come back to (see: https://goo.gl/M41Pc4/)

So I want to talk about the future of BL Labs. We are a project and our funding is due to finish. Our role has been to engage with researchers and that is going to continue – maybe with the same brand, just not as a project. We need to learn what researchers want to do… We need to collect evidence of demand. And we are developing a business model and support process to make this “business as usual” at the BL. We want to help create a pathway to developing a “Digital Research Suite” at the BL by 2019. But we want to think about what that might be, and we are piloting ideas including small 2-person workrooms for digital projects. And we can control access – so that we can see how this works, and ensure that users understand what you can and cannot do with the data (that you can’t just download everything and walk out with it).

And many other places are being “inspired” by our model – take a look at the Library of Congress work in particular.

So, at this stage we are looking at our business model and how we can make these scalable services. Our model to date has been smaller scale, about capabilities to get started, etc. That is not scalable at the level we’ve been working. We need a more hands-off process and to be able to see more people. We also run the BL Labs Awards which, instead of working with people, recognise work people have already done. People submit and then in October our advisory board reviews the entries and looks for work that champions our content.

To develop our business model we are exploring, evaluating and implementing options, using the business model canvas. We have internal and external business model development, implementation and evaluation groups, exploring how this could work in practice. And we are testing, piloting and implementing our business model. That means:

  • developing a support service
    • Entry level – about the collection, documentation improvements, case studies that help show what is in there.
    • Baseline – basic enquiry service to enable researchers to understand if a BL project is the right path, any legal restrictions that need addressing, etc. We try to get you to the next stage of developing your idea.
    • Intermediate – consultation service, which will be written in as part of a bid.
    • Advanced – support for 10 projects per year through an application process
  • Augment data.bl.uk – that was a placeholder for a year, and now a tender has just gone out for a repository-type service for 12-18 months
    • e.g. sample datasets, tools, examples of use
    • Pilot use of Jupyter Notebooks / Docker / other tools for open and onsite data
  • Researcher access to BL APIs
  • Reading room services – onsite access/compute for digital collections – which means training our staff

This has come about as we’ve seen a pattern in approaches that start with an initial exploration phase, then transition into investigation and then some sort of completion phase. There had been a false assumption (on the data provider’s part) that data-based work must start at the investigation phase – with an idea of the project they want to do, knowing the data already, knowing the collections. What we are piloting is that essential exploratory stage, acknowledging that it happens. And that pattern shifts around – exploration and investigation stages can fork off in different directions, and that’s fine.

In terms of timescales, there seems to be a phase of quick initial work. A longer and variable transition takes place into investigation – probably months. Then investigation takes months to a year. And crucially there is that completion stage.

Exploration is about understanding the data in an open-ended fashion. It is about discovering the potential tools to work with the data. We want people to gain awareness of their capabilities and limitations – a reality check and an opportunity to understand the need for partners and/or new tools. And it’s about developing a firmer query, as that helps you to understand the cost, risk and time you might need. Exploration tools (e.g. the V&A Spelunker) let you get a sense of what’s there, which gives you a different way in from keyword or catalogue search. And then you have artists like Mario Klingemann – collating images of people looking sad… It’s artistic but says something about how women are portrayed in the 19th Century. He’s also done work on hats on the ground – and found it’s always a fight! This is showing cultural memes – an important question… An older example is the Cooper Hewitt collection – which lets you see all of the tags – including various types of similarity that show new ways into the data.

So, what should a digital exploration service look like? Which apps? Does Jupyter Notebook assume too much?

We’ve found that every time we present the data, it shapes the perception. For instance the On the Road manuscript is on a roll. If you print a book on a receipt roll it’s different and reads and is understood differently.

MIT have a Moral Machine survey (http://moralmachine.mit.edu/) which presents the classic trolley problem – crowdsourced for autonomous vehicles. But that presentation shapes and limits the questions, and that is biased. Some of the best questions we’ve seen have been from people who have asked very broad questions and haven’t engaged in exploration in other ways. They are hard to answer (e.g. all depictions of women) but they reveal more. Presenting results as a searchable list shapes how we interpret them… But, for instance, showing newspaper articles as if in a giant newspaper – not a list of results – changes what you do. And that’s why tools like IIIF seem useful.

So… We have things like Gender API. It looks good, it looks professional… If you try it with a Western name, does it work? If you try it with an Indian name, does it work? If you try it with a 19th Century name, does it work? Know that marketeers will use this. See also sentiment analysis. Some of these tools are trained on Twitter. I found a researcher working on 18th Century texts, looking for sentiment about war and conflict… through a tool developed and trained for tweets. We have to be transparent in what is happening, in understanding what you are doing… Hence thinking about personas.
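The Twitter-trained-tool problem is easy to demonstrate. For example, NLTK’s VADER analyser was built for social media text; applied to period prose, its scores should be treated with suspicion. A minimal sketch (requires the vader_lexicon data; the example sentences are invented):

```python
# Domain mismatch: a social-media sentiment analyser applied to period prose.
# Requires: nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
tweet = "great win today!!! so happy :)"
period_text = "The engagement was a most grievous affair, attended with terrible slaughter."

print(sia.polarity_scores(tweet))        # the domain VADER was designed for
print(sia.polarity_scores(period_text))  # treat these scores with suspicion
```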

We are trying to think about how we show what is missing from a collection, rather than what is present, so that data can be used in a more informed way. We are looking at what research environments we can provide – we know that people want to use their own, but we can sometimes be a bit stuffed by licensing based in a paper era. On-site tools can help. Should we enable research environments for open data that can be used off site too? We are thinking about focus – are the query, tooling and collections required well defined; is it feasible – legal, cost, ethical, source data quality, etc.; is it affordable – time, people, money; etc.

So, we have, on the BL Labs website, a form – it’s long so do send us feedback on whether that is the right format etc. – to help us understand demand and skills.

Those personas – please fill these in – and let us know the technical part: what you might want, and how technical the support you need should be. We are keen to discuss your needs, challenges and issues.

And with that we are done and moving on to lunch and discussion. Thanks to Ben, Hugh, Bea and Uta as well as Melissa and the Digital Scholarship Team!

 

Mar 232018
 

Today I am back at the Data Fest Data Summit 2018, for the second day. I’m here with my EDINA colleagues James Reid and Adam Rusbridge and we are keen to meet people interested in working with us, so do say hello if you are here too! 

I’m liveblogging the presentations so do keep an eye here for my notes, updated throughout the event. As usual these are genuinely live notes, so please let me know if you have any questions, comments, updates, additions or corrections and I’ll update them accordingly. 

Intro to Data Summit Day 2 – Maggie Philbin

We’ve just opened with a video on Ecometrica and their Data Lab supported work on calculating water footprints. 

I’d like to start by thanking our sponsors, who make this possible. And also I wanted to ask you about your highlights from yesterday. These include Eddie Copeland from Nesta’s talk, discussion of small data, etc. 

Data Science for Societal Good — Who? What? Why? How? –  Kirk Borne, Principal Data Scientist and Executive Advisor, Booz Allen Hamilton

Data science has a huge impact for the business world, but also for societal good. I wanted to talk about the 5 i’s of data science for social good:

  1. Interest
  2. Insight
  3. Inspiration
  4. Innovation
  5. Ignition

So, number one is Interest. Data can attract people to engage with a problem. Everything we do is digital now. And all this information is useful for something. No matter what your passion, you can follow it as a data scientist. I wanted to give an example here… My background is astrophysics and I love teaching people about the world, but my day job has always been other things. About 20 years ago I was working in data science at NASA and we saw an astronomical – and I mean it, we were NASA – growth in data. And we weren’t sure what to do with it, and a colleague told me about data mining. It seemed interesting but I just wasn’t getting what the deal was. We had a lunch talk from a professor at Stanford, and she came in and filled the board with equations… She was talking about the work they were doing at IBM in New York. And then she said “and now I’m going to tell you about our summer school” – where they take inner city kids who aren’t interested in school, and teach them data science. Deafening silence from the audience… And she said “yes, we teach the material in the context of what means most for these students, what matters most”. And she explained: street basketball. So IBM was working on software called IBM Advanced Scout, specifically predicting basketball strategy. And the kids loved basketball enough that they really wanted to work in math and science… And I loved that, but what she said next changed my life.

My PhD research was on colliding galaxies. It was so exciting… I loved teaching and I was so impressed with what she had done. These kids she was working with had peer pressure not to be academic, not to study. Their school had a graduation rate of less than 50%. The summer school’s mark of success was their students’ graduation rate – of 98%. I was moved by that. I felt that if data science has this much power to change lives, that’s what I want to do for the rest of my life. So my life, and those of my peers, has been driven by passion. My career has been as much about promoting data literacy as anything else.

So, secondly, we have Insight. Traditionally we collect some data points but we don’t share the data, we are not combining the signals… Insight comes from integrating all the different signals in the system. That’s another reason for applying data to societal good: to gain understanding. For example, at NASA, we looked at what could be combined to understand environmental science, and all the many applications, services and knowledge that could be delivered to drive insight from the data.

Number three on this list is Inspiration. Inspiration, passion, purpose, curiosity – these motivate people. Hackathons, when they are good, are all about that. When I was teaching, the group projects where the team members were all the same did the worst and least interesting work. When the team is diverse in the widest sense – including people who know nothing about Python, R, etc. – they can bring real insights. So, for example, my company runs the “Data Science Bowl” where we tackle topics like ocean health, heart health, lung cancer, drug discovery. There are prizes for the top ten teams; this year there is a huge computing prize as well as a cash prize. The winners of our heart health challenge were two Wall Street quants – they knew math! Get involved!

Next, Innovation: discovering new solutions and new questions. Generating new questions is hugely exciting. Think about the art of the possible. The XYZ of data science innovation is about precision data – precision for personalised medicine, etc.

And fifth, Ignition. Be the spark. My career came out of looking through a telescope back when I lived in Yorkshire as a kid. My career has changed, but I’ve always been a scientist. That spark can create change, can change the world. And big data, IoT and data scientists are partners in sustainability. How can we use these approaches to address the 17 Sustainable Development Goals? There are 229 Key Performance Indicators to measure performance – get involved. We can do this!

So, those are the five i’s. And I’d like to encapsulate this with the words of a poet… Data scientists – and that’s you, even if you don’t think you are one yet – come out of the womb asking questions of the world. Humans do this, we are curious creatures… That’s why we have all that data in the first place! We naturally do this!

“If you want to build a ship, don’t drum up people to gather wood and don’t assign them tasks and work, but rather teach them to yearn for the vast and endless sea.”

– Antoine de Saint-Exupery.

This is what happened with those kids. Teach people to yearn for the vast and endless sea, and then you’ll get the work done. Then we’ll do the hard work.

Slides are available here: http://www.kirkborne.net/DataFest2018/

Q&A

Comment, Maggie Philbin) I run an organisation, TeenTech, and that point you are making – start where the passion actually is – is so important.

KB) People ask me about starting in data science, and I tell them that you need to think about your life, what you are passionate about and what will fuel and drive you for the rest of your life. And that is the most important thing.

Q1) You touched on a number of projects, which is most exciting?

A1) That’s really hard, but I think the Data Science Bowl is the most exciting thing. A few years back we had a challenge looking at how fast you can measure heart ejection fraction – how fast the heart pumps blood out – because the way that is done, by specialists, could take weeks. Now that analysis is built into the MRI process and you can instantly re-scan if needed. Now, I’m an astronomer but I get invited to weird places… And I was speaking to a conference of cardiac specialists. A few weeks before, my doctor had diagnosed me with a heart issue… and said it would take a month to know for sure. I got a text giving me the all clear just before I was about to give that talk. I just leapt onto that stage to give that presentation.

The Art Of The Practical: Making AI Real – Iain Brown, Lead Data Scientist, SAS

I want to talk about AI and how it can actually be useful – because it’s not the answer to everything. I work at SAS, and I’m also a lecturer at Southampton University, and in both roles look at how we can use machine learning, deep learning, AI in practical useful ways.

We have the potential to use AI tools for good, to improve our lives – many of us will have an Alexa for instance – but we have to feel comfortable sharing our data. We have smart machines. We have AI revolutionising how we interact with society. We have a new landscape which isn’t about one new system, but a whole network of systems to solve problems. Data is a sellable asset – there is a massive competitive advantage in storing data about customers. But, especially with GDPR, how is our data going to be shared with organisations, and others? That matters for individuals, but also for organisations. As data scientists there is the “can” – how the data can be used; and the “should” – how the data should be used. We need to understand the reasons for and value of using data, and how we might do that.

I’m going to talk about some examples here, but I wanted to give an overview too. We’ve had neural networks for some time – AI isn’t new, it dates back to the 1950s. Machine learning came in in the 1980s, deep learning in the 2010s, and cognitive computing now. And we’ve had Moore’s Law changing what is theoretically possible, but also what is practically feasible, over that time. That brings us to a definition: “Artificial Intelligence is the science of training systems to emulate human tasks through learning and automation”. That’s my definition, you may have your own. But it’s about generating understanding from data – that’s how AI makes a difference. And it has to help the decision making process. It has to be something we can utilise.

Automation of process through AI is about listening and sensing, about understanding – that can be machine generated but it will have human involvement – and that leads to an action being taken. For instance we are all familiar with taking a picture, and that picture can be looked at and understood. With a bank, for example, you might take an image of paperwork and passports… Some large banks check the validity of clients with a big book of pictures of blacklisted people… Wouldn’t it be better to use systems to achieve that? Or it could be a loan application or contract – they use application scorecards. The issue here is interpretability – if we make decisions we need to know why, and the process has to be transparent so the client understands why they might have been rejected. You also see this in retail… Everything is about the segment of one. We all want to be treated as individuals… How does that work when you are one of millions of individuals? What is the next thing you want? What is the next thing you want to click on? Shop Direct, for instance, have huge ranges of products on their website. They have probably 500 pairs of jeans… Wouldn’t it be better to apply their knowledge of me to filter and tailor what I see? Another example is the customer complaint on webchat. You want to understand what has gone wrong. And you want to intervene – you may even want to do that before they complain at all. And then you can offer an apology.

There are lots of applications for AI across the board. So we are supporting our customers on the factors that will make them successful in AI: data, compute, skillset. And we embed AI in our own solutions, making them more effective and enhancing user experience. Doing that allows you to begin to predict what else might be looked at, based on what you are already seeing. We also provide our customers with extensible capabilities to help them meet their own AI goals. You’ll be aware of AlphaGo – it only works for one game, and that’s a key thing… AI has to be tailored to specific problems and questions.

For instance we are working on a system for optimising the experience of watching sports, eliminating the manual process of tagging in a game. This isn’t just in sport – we are also working in medicine, on lung cancer, applying AI in similar 3D imaging ways. When these images can be shared across organisations, you can start to identify insights and anomalies. It’s about collaborating, bringing data from different areas, places where an issue may exist. And that has social benefit for all of us. Another fun example – with something like Wargaming you can understand the gamer, the improvements in gameplay, ways to improve the mechanics of how game play actually works. There has to be an intrinsic and extrinsic agreement to use that data to make that improvement.

If you look at a car insurer, the process and stream of a claim typically goes through a call centre. But what if you could take a picture of the car as a way to quickly assess whether that claim will be worth making, and how best to handle it?

I value the application, the ways to bring AI into real life. How we make our experiences better. It’s been attributed to Voltaire, and also to Spiderman, that “with great power comes great responsibility”. I’d say “with great data power comes great responsibility” and that we should focus on the “should” not the “could”.

Q&A

Comment) A correction on AlphaGo: AlphaZero plays chess etc., without any further human interaction or change.

Q1) There is this massive opportunity for collaboration in Scotland. What would SAS like to see happen, and how would you like to see people working together?

A1) I think collaboration through industry, alongside academia. Kirk made some great points about not focusing on the same perspectives but on the real needs and interest. Work can be siloed but we do need to collaborate. Hack events are great for that, and that’s where the true innovation can come from.

Q2) What about this conference in 5 years time?

A2) That’s a huge question. All sorts of things may happen, but that’s the excitement of data science.

Socially Minded Data Science And The Importance Of Public Benefits – Mhairi Aitken, Research Fellow, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh

I have been working in data science and public engagement around data and data science for about eight years and things have changed enormously in that time. People used to think about data as something very far from their everyday lives. But things have really changed, and people are aware and interested in data in their lives. And now when I hold public events around data, people are keen to come and they mention data before I do. They think about the data on their phones, the data they share, supermarket loyalty cards. These may sound trivial but I think they are really important. In my work I see how these changes are making real differences, and differences in expectations of data use – that it should be used ethically and appropriately but also that it will be used.

Public engagement with data and data science has always been important but it’s now much easier to do. And there is much more interest from funders for public engagement. That is partly reflecting the press coverage and public response to previous data projects, particularly NHS data work with the private sector. Public engagement helps address concerns and avoid negative coverage, and to understand their preferences. But we can be even more positive with our public engagement, using it to properly understand how people feel about their data and how it is used.

In 2016 myself and colleagues undertook a systematic review of public responses to sharing and linking of health data for research purposes (Aitken, M. et al, 2016, BMC Medical Ethics, 17(1)). That work found that people need to understand how data will be used, and they particularly need to understand that there will be public benefit from their data. In addition to safeguards, secure handling, and a sense of control, they still have to be confident that their data will be used for public benefit. Indeed, they can even be supportive when the benefit is clear but those other factors are faulty. Trust is core to this. It is fundamental to think about how we earn public trust, and what trust in data science means.

Public trust is easy to define. But what about “public benefit”? When people talk about benefits from data, they often mention things like the Tesco Clubcard – there is a direct tangible benefit there in the form of vouchers. But what is public benefit in a broader and less direct sense? When we ask about public benefit in the data science community we often talk about economic benefits to society through creating new data-driven innovation. But that’s not what the public think about. For the public it can be things like improvements to public services. In data-intensive health research there is an expectation of data leading to new cures or treatments. Or that there might be feedback to individuals about their own conditions or lifestyles. But there may be undefined or unpredictable potential benefits to the public – it’s important not to define the benefits too narrowly, but still to recognise that there will be some.

But who is the “public” that should benefit from data science? Is that everyone? Is it local? National? Global? It may be as many as possible, but what is possible and practical? Everyone whose data is used? That may not be possible. Perhaps vulnerable or disadvantaged groups? Is it a small benefit for many, or a large benefit for a small group? Those who may benefit most? Those who may benefit least? The answers will be different for different data science projects, and will vary for different members of the public. But if we only have these conversations within the data science community we’ll only hear certain answers – we won’t hear from groups without a voice. We need to engage the public more with our data science projects.

So, closing thoughts… We need to maintain a social license for data science practices and that means continual reflection on the conditions for public support. Trust is fundamental – we don’t need to make the public trust us, we have to actually be trustworthy, and that means listening, understanding and responding to concerns, and being trustworthy in our use of data. Key to this is finding the public benefits of data science projects. In particular we need to think about who benefits from data science and how benefits can be maximised across society. Data scientists are good at answering questions of what can be done, but we need to focus on what should be done and what it is beneficial to do.

Q&A

Q1) How does private industry make sure we don’t leave people behind?

A1) Be really proactive about engaging people, rather than waiting for an issue to occur. Find ways to get people interested. Make it clear what the benefits are to people’s lives. There can be cautiousness about opening up debate being a way to open up risk. But actually we have to have those conversations, open up the debate, and learn from it.

Q2) How do we put in enough safeguards that people understand what they consent to, without giving them too much information or scaring them off with 70 checkboxes.

A2) It is a really interesting question of consent. Public engagement can help us understand that, and guide us around how people want to consent, and what they want to know. We are trying to answer questions where we don’t always have the answers – we have to understand what people need by asking them and engaging them.

Q3) Many in the data community are keen to crack on but feel inhibited. How do we take the work you are doing and move sooner rather than later?

A3) It is about how we design data science projects. You do need to take the time first to engage with the public. It’s very practical and valuable to do at the beginning, rather than waiting until we are further down the line…

Q3) I would agree with that… We need to do that sooner rather than later rather than being delayed deciding what to do.

Q4) You talked about concerns and preferences – what are key concerns?

A4) Things you would expect on confidentiality, privacy, how they are informed. But also what is the outcome of the project – is it beneficial, or could it be discriminatory, or have a negative impact on society? It comes back to public benefits – they want to see the outcomes and impact of a piece of work.

 

Automated Machine learning Using H2O’s Driverless AI – Marios Michailidis, Research Data Scientist, H2O.ai

I wanted to start with some of my own background, and I wanted to talk a bit about Kaggle. It is the world’s biggest predictive modelling competition platform with more than a million members. Companies host data challenges and competitors from across the world compete to solve them for prizes. Prizes can be monetary, or participation in conferences, or you might be hired by companies. And it’s a bit like tennis – you gain points and go up in the rankings. I was able to be ranked #1 out of the half million members there.

So, a typical problem is image classification. Can I tell a cat from a dog from an image? That’s very doable – you can get over 95% accuracy, and you can do that with deep learning and neural nets, differentiating and classifying features to enable that decision. Similarly, a typical problem may be classifying different bird songs from a sound recording – also very solvable. You also see a lot of text classification problems… You can identify texts from a particular writer by their style and vocabulary (e.g. Voltaire vs Molière). And you see sentiment analysis problems – particularly for marketing or social media use.
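As a rough illustration of the style-and-vocabulary idea, character n-grams plus a linear model are a common baseline for authorship classification. A minimal sketch with scikit-learn – the snippets below are invented placeholders, not real Voltaire or Molière text:

```python
# Minimal authorship-style classification sketch; training texts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["candide tended his garden", "the optimist argued all is well",
         "the miser counted his gold", "the imaginary invalid complained"]
authors = ["voltaire", "voltaire", "moliere", "moliere"]

# Character n-grams capture stylistic habits rather than topic words.
model = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
                      LogisticRegression())
model.fit(texts, authors)
print(model.predict(["he argued the garden was well tended"]))
```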

To win these competitions you need to understand the problem, and the metric you are being tested on. For instance there was an insurance problem where most customers were renewing, so there was more value in splitting the problem into two – one model for renewals, and one model for the others. You have to have a solid testing procedure – a really strong validation environment that reflects what you are being tested on. So if you are being tested on predictions for 3 months in the future, you need to test with past data, to check that the prediction is working, to have confidence that what you do will be appropriately generalisable.
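The point about validation reflecting the test setup can be made concrete with a time-based split: hold out the most recent months rather than a random sample. A minimal sketch with pandas (the data is invented):

```python
# Time-aware validation: fit on the past, validate on the held-out "future".
import pandas as pd

df = pd.DataFrame({"month": pd.period_range("2017-01", periods=12, freq="M"),
                   "target": range(12)})

cutoff = df["month"].max() - 3          # hold out the last 3 months
train = df[df["month"] <= cutoff]
valid = df[df["month"] > cutoff]
print(len(train), "training months,", len(valid), "validation months")
```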

You need to handle the data well. Your preprocessing and your feature engineering will let you get the most out of your modelling. You also need to know the problem-specific elements and algorithms – you need to know what works well, and you can look back for information to inform that. You of course need access to the right tools – the updated and latest software for best accuracy. You have to think about the hours you put in and how you optimise them. When I was #1 I was working 60 hours on top of my day job!

Collaborate – data science is a team sport! It’s not just about splitting the work across specialisms, it’s about uncovering new insights by sharing different approaches. You gain experience over time, and that lets you focus your efforts where you can get the best gain. And then use ensembling – combine the methods optimally for the best performance. And you can automate that…
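A common, simple form of ensembling is a weighted average of model predictions. A minimal sketch (the predictions and weights below are invented; in practice weights are tuned on validation data):

```python
# Weighted-average ensemble of three models' predicted probabilities.
import numpy as np

preds_a = np.array([0.2, 0.8, 0.6])   # e.g. a gradient-boosting model
preds_b = np.array([0.3, 0.7, 0.5])   # e.g. a neural network
preds_c = np.array([0.1, 0.9, 0.4])   # e.g. a linear model

weights = np.array([0.5, 0.3, 0.2])   # would normally be tuned on validation data
ensemble = weights @ np.vstack([preds_a, preds_b, preds_c])
print(ensemble)
```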

And that brings us to H2O’s Driverless AI, which automates AI – an AI that creates AI. It is built by a group of leading machine learning engineers, academics, data scientists, and Kaggle Grandmasters. It handles data cleaning and feature engineering. It uses cutting-edge machine learning algorithms. And it optimises and combines them. And this is all through a hypothesis-testing-driven approach. That is so important: if I try a new feature or a new algorithm, I need to test it… And you can exhaustively find the best transformations and algorithms for your data. This allows solving of many machine learning tasks, and it all runs in parallel to make it very fast.

So, how does it work? Well, you have some input data and a target variable. You set an objective or success metric. And then you need some allocated computing power (CPU or GPU). Then you press a button and H2O Driverless AI will explore the data, try things out, and provide predictions and model interpretability. You get a lot of insight, including the most predictive features. And the other thing is the feature engineering: you can extract this pipeline, these feature transformations, and then use them with your own modelling.
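Driverless AI is a commercial product, but H2O’s open-source AutoML exposes a similar press-a-button workflow, so a hedged sketch of that gives the flavour (the dataset file and target column name below are placeholders):

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file("train.csv")  # hypothetical dataset

# Give it a target, a success metric and a time budget; AutoML then
# tries algorithms and stacked ensembles automatically.
aml = H2OAutoML(max_runtime_secs=600, sort_metric="AUC", seed=1)
aml.train(y="target", training_frame=train)

print(aml.leaderboard)               # ranked candidate models
preds = aml.leader.predict(train)    # predictions from the best model
```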

Now, I have a minute long demo here…. where you upload data, and various features and algorithms are being tried, and you can see the most important features… Then you can export the scoring pipeline etc.

This work has been awarded Technology of the Year by InfoWorld, and it has been featured in the Gartner report.

You can find out more on our website: https://www.h2o.ai/driverless-ai/ and there is lots of transparency about how this works, how the model performs etc. You can download a free trial for 3 weeks.

Q&A

Q1) Do you provide information on the machine learning models as well?

A1) Once we finish with the scoring model, we build a second, simpler model to predict that score. The focus of that is to explain why we have shown this score, and you can see why you have this score with this model… That second interpretability model is slightly less automated. But I encourage others to look online for similar approaches – this is one kind of surrogate model.
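A minimal sketch of the surrogate-model idea (my illustration, not H2O’s actual implementation): fit a small, readable model to the black-box model’s scores, then inspect it. The data here is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic stand-ins: X is a feature matrix, `black_box` plays the
# role of the complex model whose scores we want to explain.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = X[:, 0] * 2 + X[:, 1] ** 2 + rng.normal(scale=0.1, size=1000)
black_box = RandomForestRegressor(n_estimators=50).fit(X, y)

# Surrogate: a shallow, readable tree fitted to the black box's
# *predictions*, printed as rules to explain the scores roughly.
surrogate = DecisionTreeRegressor(max_depth=3).fit(X, black_box.predict(X))
print(export_text(surrogate, feature_names=["f0", "f1", "f2", "f3"]))
```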

Q2) Can I reproduce the results from H2O?

A2) Yes. You can download the scoring pipeline; it will generate the code and environment to replicate this, see all the models and the data generated, and you can run that script locally yourself – it’s mainly Python.

Q3) That stuff is insane – probably very dangerous in the hands of someone just learning about machine learning! I’d be tempted to throw data in… What’s the feedback that helps you learn?

A3) There is a lot of feedback and also a lot of warnings – for instance if test data doesn’t look enough like training data. But the software itself is not educational on its own – you’d need to see webinars and look at online materials, but then you should be in a good position to learn what it is doing and how.

Q4) You talked about feature selection and feature engineering. How robust is that?

A4) It is all based on hypothesis testing. But you can’t test everything without huge compute power, so we have a genetic algorithm that generates combinations of features, tests them, and then tries something else if that isn’t working.
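A toy version of that idea in Python – a genetic search over candidate feature sets, where `evaluate` is a placeholder callable returning a cross-validation score – might look like this:

```python
import random

def genetic_feature_search(all_features, evaluate,
                           pop_size=20, generations=10, keep=5, subset=5):
    """Evolve feature subsets: score each candidate, keep the fittest,
    mutate survivors, repeat. `evaluate(features)` is assumed to return
    a cross-validation score for that feature subset."""
    population = [random.sample(all_features, k=subset)
                  for _ in range(pop_size)]
    for _ in range(generations):
        survivors = sorted(population, key=evaluate, reverse=True)[:keep]
        population = list(survivors)
        while len(population) < pop_size:
            child = random.choice(survivors).copy()
            # Mutation: swap one feature for a random alternative.
            child[random.randrange(len(child))] = random.choice(all_features)
            population.append(child)
    return max(population, key=evaluate)
```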

Q5) Can you output a model as e.g. a serialised JSON object? Or use it as an API?

A5) We have various outputs but not JSON. Best to look on the website as we have various ways to do these things.

 

Innovation Showcase

This next session showcases innovation in startups. 

Matt Jewell, R&D Engineer, Amiqus

I’m an R&D Engineer at Amiqus, and also a PhD student in Law at Edinburgh University. Firstly I want to talk about Amiqus: our mission is to make civil justice accessible to the world. We are engaged with GDPR as a data controller, but also as a trust and identity provider – where GDPR is an opportunity for us. We created amiqusID to enable people to more easily interact with the law – with data from Companies House, driving licences, etc.

As a PhD student in law there is some overlap between my job and my PhD research, and I was asked to talk about data ethics. So I wanted to note GDPR Article 22(3), which states that

“the data controller shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and to contest the decision.”

And that’s across the board. GDPR recommits us to privacy, but also embeds privacy as a public good. And we have to think about what that means in our own best practices, because our own practices will shape what happens – especially as GDPR is still quite uncertain, still untested in law.

Carlos Labra, CEO & Co-Founder, Particle Analytics

I come from a mechanical engineering background, so this work is about simulation – specifically fluid simulation in aircraft. Particle simulation is the next step for industry, and that’s because it has been incredibly difficult to do this simulation with computers. We can do basic computer models for large-scale materials, but these are not appropriate for particles. So at Particle Analytics we are trying to address this challenge.

So, a single simulation of a silo has to calculate the interactions between every single particle (in the order of millions), at very small time intervals. That takes huge computing power. For instance one of our clients, Astec, works on asphalt dryer/mixer technology and we are using Particle Analytics to enable them to establish and achieve new energy-based KPIs (Key Performance Indicators) that could make enormous savings per machine per year, purely by optimising to different analytics.
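To see why this is so expensive, here is a deliberately naive discrete-element-style time step in Python – a toy sketch, not Particle Analytics’ code: every pair of particles is checked, so the cost grows quadratically with particle count.

```python
import numpy as np

def naive_dem_step(pos, vel, radius, dt, k=1e4):
    """One toy DEM step with unit-mass particles and a linear contact
    spring; pos and vel are (n, 3) arrays. The pair loop is O(n^2),
    which is why millions of particles at tiny dt overwhelm machines."""
    n = len(pos)
    force = np.zeros_like(pos)
    for i in range(n):
        for j in range(i + 1, n):           # every pair: ~n^2/2 checks
            d = pos[j] - pos[i]
            dist = np.linalg.norm(d)
            overlap = 2 * radius - dist
            if 0 < dist and overlap > 0:     # particles in contact
                f = k * overlap * d / dist   # repulsive spring force
                force[i] -= f
                force[j] += f
    vel = vel + force * dt
    pos = pos + vel * dt
    return pos, vel
```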

So we look at spatial/temporal filters, multiscale analysis, and reducing data size/noise. The data operators generate new insights and KPIs. So the cost of simulation goes down, and the insights increase.

Steven Revill, CEO & Co-Founder, Urbantide

I’m here to talk to you about our platform USmart, which is making smart data. How do we do this? Well, when we started a few years ago we recognised that our businesses, organisations and places would be helped by artificial intelligence based on data. That requires increased collaboration around data and increasing reuse of data. Too often data is in silos, and we need to break it out and share it. But we also need to be looking at real time data from IoT devices.

So, our solution is USmart. It collects data from any source in real time, and we create value with automatic data pipelines that are analytics-, visualisation- and AI-ready. And that enables collaboration – either with partners in a closed way, or as open data.

So, I want to talk about some case studies. Firstly Smartline, which is taking housing data to identify people at risk of, or in, fuel poverty. We have 80m data points so far, and we expect to reach up to 700m+ soon. This data set is open and when it goes live we think it will be the biggest open data set in the UK.

Cycling Scotland is showing the true state of cycling, helping them to make their case for funding and gain insight.

And we are working with North Lanarkshire Council on business rates, which could lead to savings of £18k per annum, but can also identify incorrect rates of £100k+ value.

If you want to find out more do come and talk to me, take a look at USmart, and join the USmart community.

Martina Pugliese, Data Science Lead, Mallzee

I am data science lead for Mallzee – proudly established and run from Edinburgh. Mallzee is an app for clothes, allowing you to like or dislike a product. We show you 150+ brands. We’ve had 1.4m downloads, 500m ratings on products, and 3m products rated. The app allows you to explore products, but it also acts as a data collection method for us and for our B2B offering to retailers. So we allow retailers to product test, very swiftly, before products hit the market.

Why do this? Well, there are challenges that are two sides of the same coin: overstock, where you have to discount and waste money; and understock, where you have too little of the best stock and don’t have time to make the best return on your products.

As well as gathering data, we also monitor the market for trends in pricing, discounting, something new happening… So for instance only 50.8% of new products last quarter were sold at full price. We work to help design, buying and merchandising teams improve this rate by 6-10% through customer feedback.

So, data is our backbone. For the consumer we enable discovery, we personalise the tool to you – it should save you time and money. At the same time the data also enables performance prediction. We have granular user segmentation. And it goes back to you – the best products go on the market. And long term that should have a positive environmental impact in reducing waste.

Maggie Philbin: Thank you. I’m going to ask you to feedback on each others ideas and work.

Carlos: I’m new to the data science world, so for me I need to learn more – and these presentations are so useful for that.

Martina: This is really useful for me, and great to see lots of different things going on.

Matt: My work focuses on smart cities, so naturally interested in Steven’s presentation. Less keen on problematising the city.

Steven: Really interesting to discuss things backstage, but also exciting to hear Martina talking about how central data is for your business right now.

Maggie: And that is part of the wonderful things about being at Data Fest, that opportunity to learn from and hear from each other, to network and share.

We are back from lunch with a video on work in the Highlands and Islands using ambient technologies to predict likelihood of falls etc. 

Transforming Sectors With Data-Enabled Innovation – Orsola De Marco, Head of Startups, Open Data Institute

I’m going to talk about transforming sectors with data. The ODI, founded by Tim Berners-Lee and Nigel Shadbolt, focuses on data and what data enables. We think about data as infrastructure. If you think of data as roads, you see that the number of roads matters less than how they are connected… In the context of data we need data that can be combined, that is structured for connection and combination. And we look at data through open data and open innovation. What the ODI’s work has in common is that open innovation is at the core. This is not just about innovating, but also about making your organisation more porous, bringing in the outside. And I love the phrase “if you are the smartest person in the room, then you are in the wrong room”, because so often innovation comes from collaboration and from the outside.

Open innovation has huge potential value. McKinsey in 2013 predicted a $3–5 trillion impact from open data; Lateral Economics (2014) puts that at more like $20tn.

When we talk about open innovation and collaboration, we can talk about the corporate–startup marriage. We used to see linear solutions having good returns, but that is no longer the case. Problems are now much more complex, and startups are great at innovation, at thinking laterally, at finding new approaches. But corporates have scale, they have reach, and they have knowledge of their industries and markets. If you bring these two together, it’s clear you can bring a good opportunity to life.

An example I wanted to share here is Transport for London, who wanted to release open data to enable startups and SMEs to use it. CityMapper is one of the best known of the tools built on that data. Last year, after several years of open data, they commissioned a Deloitte report (2017), which found that this release had generated huge savings for TfL.

Another example is Arup. Historically their innovation had taken place in house. They embraced a more open approach, and worked with two of our startups, Macedon C and Smart Sensors. Macedon C helped Arup explore airport data so that Arup didn’t need to do that processing. Smart Sensors installed 200 IoT sensors, sharing approaches to those sensors, what it means to implement IoT in buildings, and how they could use this technology. And they rolled them out to some of their services.

Those are some examples. We’ve worked with 120 startups across the world, and they have generated over £37.2M in sales and investment. These are real businesses bringing real value – not just a guy in a shed. The major challenge is on the supply side of the data. A lot of companies are reluctant to share, mentioning three blockers: (1) it feels very risky to open data up – an issue that feels highly relevant this week; (2) it’s expensive to do, especially if you don’t know the value coming back; (3) a perceived lack of data literacy and skills. Those are all important… But if you lead and innovate, you get to set the tone for innovation in your sector.

The idea of disruption is raised a lot, but it is real. And to actually disrupt, a culture of open innovation is essential. It needs to be brought in at senior level and brought into the sector.

Data infrastructure can transform sectors. And joining forces between data suppliers and users is important there. For instance we are working on a project called Open Active, with Sport England. A lack of information on what was going on in different areas was an issue for people getting active. We were involved at the outset and could see that data was the blocker here… If you tried to aggregate information it was impossible. So, in the first year of the programme we brought providers into the room, agreed an open standard, and that enabled aggregation of data. We are now in the second phase and, now that the data is consistent and available, we are bringing startups in to engage and do things with that data. And those startups aren’t all in sports: some are in the healthcare sector – using sports data to augment information shared by medics – and some are leisure companies helping individuals find things to do with their spare time.

Another example is the Open Banking sector. Over 60% of UK banking customers haven’t changed their bank account in 5 years, and many of those haven’t changed in 20 years. So this initiative enables customers to grant secure access to their banking details to e.g. mortgage lenders, or to enable marketplaces offering energy switching. Our role in this programme was to facilitate these banks, and we took that experience of data portability forward… And now we are working with Mexico on a FinTech law that requires all banks to have an open API.

In order to innovate in sectors it’s important to widen access to data. This doesn’t mean not taking data privacy seriously, or losing competitive advantage.

And I wanted to highlight a very local programme. Last year we began a project in the peer to peer accommodation market. The Scottish expert advisory panel noted that whilst a lot of data is generated, no real work is looking at the impact of the sharing economy in accommodation. That understanding will enable policy decisions tied to real concerns. We will be making recommendations on this very soon. If you are interested, do get in touch and be part of this.

Q&A

Q1) You talked a lot about the value of data. How do you measure economic value like that?

A1) We base value on sales and investment generated, and/or time or money saved in processes. It’s not an exact science, but it looks for changes to the status quo.

Q2) What is the most important and valuable thing from your experience here?

A2) I think I’ll approach that answer in two ways. We do innovation work with data, but we often facilitate conversations between data providers and startups. For making data available we remove those blockers; for startups it’s about facilitating those conversations, helping them grow and develop, and tailoring that support.

Q3) What next?

A3) Our model is a sector transformation model. We talk to a sector about sharing and opening up, and then we have start ups in an accelerator so that data will find a use. That’s a huge difference from just publishing the data and wondering what will happen to it.

Designing Things with Spending Power – Chris Speed, Chair of Design Informatics, University of Edinburgh

I have a fantastic team of designers and developers, and brilliant students who ask questions, including what things will be like in Tomorrow’s World!  We look at all kinds of factors here around data. So I want to credit that team.

Many of you in the room will be aware that data is about value constellations, rather than value chains. These are complex markets with many players – which may be humans but may also be bots. That changes our capacity to construct value, since we now have agents that construct value. So I will talk about four objects, to look at the disruption that can be made and what that might mean as things gain agency and power. One of the things we asked was: what happens when we give things spending power?

See the diagram from the RAND organisation comparing centralised with decentralised and distributed – we see this model again and again… But things drift back occasionally (there’s only one internet banking platform now, right?). I’m going to show this 2014 bitcoin blockchain transaction video – they move too fast to screengrab these days! So… what happens when we have distributed machines with spending power? And when transactions go down to absolutely tiny amounts of money?

So, we run BlockExchange workshops, with lego, to work on the idea of blockchain, what it means to be a distributed transaction system.

Next we have the fun stuff… What happens when we have things like Ethereum… and smart contracts? What could you do with digital wallets? If the UN gives someone a digital passport, do they need sovereignty? So, we undertake bodily experiments with this stuff. We ran a physical experiment – body storming – with bitcoin wallets and smart contracts… A bit like Pokemon Go but with cash: if you hit a hotspot the smart contract assigns you money; if you enter a sink, you lose bitcoin. So, here is video of our GeoCoin app and also an experiment running in Tel Aviv.

Three banking volunteers set out to design a new type of cinema experience… They enter the cinema by watching two trailers that can be picked up in the street… Another colleague decides not to do this… They gain credit by tweeting about trailers… Bodystorming allows new ideas to be developed (confusingly, there is no cinema… This is, er, a cinema of the mind – right Chris?).

Next we have a machine with a bitcoin wallet. Programmable money allows us to give machines buying power… Blockchain attaches a history to things, adding values to value… So, we set up a coffee machine, Bitbarista, with an interface that asks the coffee drinker to make decisions about what kind of coffee they want, what values matter… mediating the space between values and value.
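A toy sketch of the machine-with-a-wallet idea (illustrative only, not the actual Bitbarista code; all prices and names are invented): the machine takes payment into its own balance, records the values its drinkers express, and spends its balance accordingly.

```python
class ToyBitbarista:
    """Illustrative coffee machine with its own wallet: drinkers'
    choices set both the price paid and what the machine buys next."""
    PRICES = {"cheapest": 0.30, "fair_trade": 0.45}  # invented values

    def __init__(self):
        self.balance = 0.0
        self.votes = {choice: 0 for choice in self.PRICES}

    def sell(self, choice):
        self.balance += self.PRICES[choice]
        self.votes[choice] += 1

    def restock(self, cost=5.0):
        # The machine, not a human, decides what beans to order,
        # based on the values its users have expressed.
        if self.balance < cost:
            return "saving up"
        preferred = max(self.votes, key=self.votes.get)
        self.balance -= cost
        return f"ordered {preferred} beans"
```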

We have hairdryers – these are new and have just gone to the Policy Unit this week. We have the Gigbliss Plus hairdryer… That allows you to buy and trade energy, and to dry your hair when energy is cheaper… What happens when you do involve the public in balancing energy? And we have another hairdryer… That asks whether you want unethical energy now, or whether you want to wait for an ethical source – the hairdryer switches on accordingly. And then we have Gigbliss Auto, which has no buttons. You don’t have control, only the bitcoin wallet has decision powers… You don’t know when it comes on… But it will. It changes control. Of those three hairdryers, which are we happy to move to? Where do we feel happy here?

And then we have KASH cups, with chips in them. You can only buy coffee when you put two cups down. So you get credit, through the cup’s digital wallet, to encourage networking and development. You don’t have to get coffee – you can build up credit. We had free coffee in the other room… But we had a very fancy barista for the KASH cups, and people queued for this for 20 minutes – coffee with social value.

Questions for us… We give machines agency, and credit… What does that mean for value, and for how we balance values?

Maggie: It’s at this point I wish Tomorrow’s World still existed!

Q&A

Q1) Where is this fascinating work taking you?

A1) I think this week has been so disruptive in terms of data and technology’s disruption of social, civic and political values. I think understanding that we can’t balance value, or fair trade, etc. on our own is helpful, and I’m really excited by what bots can offer here…

Q2) I was fascinated by the hairdryers… I’ve been in the National Grid’s secret control room, and seeing that – that thing where EastEnders finishes and we all make a cup of tea means bringing a whole power station on board… But waiting 10 minutes might avoid that need. It’s not trivial, it’s huge.

A2) Yes, and I think understanding how that waiting, or understanding consequences of actions would have a real impact. The British public are pretty conscious and ethical I think, when they have that understanding…

Q3) Have you thought about avoiding queues with blockchain?

A3) We don’t want to just play incentives to get people out of queues. People are there for different reasons, different values, some people enjoy the sociability of a queue… Any chance to open it up, smash it up, and offer the opportunity to co-construct is great. But we need to do that with people not just algorithms.

Maggie: At this point I should be introducing Cathy O’Neil, but she has been snowed in by 15 inches of snow on the East Coast of the US. So, she will come over at a later date and you’ll all be invited. So, in place of that we have a panel on the elephant in the room, the Facebook and Cambridge Analytica scandal, with a panel on data and ethics.

Panel session: The Elephant in the Room: What Next? – Jonathan Forbes (JF), CTO, Merkle Aquila (chair); Brian Hills (BH), Head of Data, The Data Lab; Mark Logan (ML), Former COO Skyscanner, Investor and Advisor to startups and scale ups; Mhairi Aitken (MA), Research Fellow, University of Edinburgh. 

JF: So, thinking of that elephant in the room… That election issue… That data use. I want to know: what could Facebook have done better?

ML: It has taken them a long time to respond, which seems strange… But I see it as a positive really. They see this as a much bigger issue rather than the transactional elements here. In that room you look at risk and you look at outrage. I think Facebook were trying to figure out why outrage was so high; I think that’s what has surprised them. I think they took time to think about what was happening to them. I don’t think it’s just about electing a game show host to president… The outrage is different. Cambridge Analytica is a bad actor, not just on data but in their advocacy for other problematic tactics. Facebook shouldn’t be bundled into that. Another aspect here is that you have a monopoly. Facebook is an advertising company – they need to generate data and pass it on to app developers. Those two things don’t totally align. And I think the outrage is about the trust and expectations of users.

JF: You are closest to the public in your research. The share price is dropping significantly right now… How, based on past experience, do you see this playing out?

MA: I’m used to talking to people about public sector use of data. Often people talk about Facebook data and make two points: firstly that they contribute their own data, control that, and know how it’s used; but they also have very high expectations of public sector organisations’ use of data and don’t have those for private sector organisations – they expect someone will generate ads and profit, but when data is used in politics that’s very different, and that changes expectations.

JF: I enjoyed your comment about the social license… and I think this may be a sign that the license is being withdrawn. The GDPR legislation certainly changes some things there. I was interested to see Tim Berners-Lee’s response, taking Mark Zuckerberg’s perspective… I was wondering, Brian, about the commercial pressures and the public pressures here. Are they balancing those well?

BH: No. When we look back I think this will be a pivotal moment. I kind of feel like the GDPR piece is like being in a medieval torture chamber… We have a countdown but the public don’t know much about it. With Facebook it’s like we have a firework in the sky and people are asking what on earth is going on… And we have an opportunity to have a discussion about the use of data. As we leave today we have a challenge around communicating our work with data, and around what our responsibilities are here. On the big data thing, many business cases seem like we’ve failed – we’ve focused on the technology and only the technology. And I feel we now have an opportunity and a window here.

JF: I’d like to take the temperature of the room… How many of you had Facebook on your phone, and don’t this week? None.

ML: I think that’s the point. The idea of not doing to others’ data what you wouldn’t want done to your own… But the reality is that legislation is playing catch-up to practice. Commercially it’s hard to do the right thing. I think Mark Zuckerberg has reasonably good intentions here… But we have this monopoly… The parallel here is banking, and monopoly legislation hasn’t kept pace with the monopolies we have. I think it would be great if you could export your data, friends’ data, etc. to another platform. But we can’t.

Comment: I think you asked the wrong question… Who here doesn’t have Facebook on their phone at all? Actually quite a lot. I think actually we have that sense that power corrupts and absolute power corrupts absolutely. And I don’t feel I’m missing out – I’m sure others feel that too. And I’m unsurprised about Facebook; I could see where it was going.

JF: OK, so moving towards what we can do: should we have a code of conduct, a Hippocratic oath for data, a “do no harm”?

BH: I don’t see ethics featuring in data models. I think we have to build that in. Cathy O’Neil talks about Weapons of Math Destruction… We have to educate our data science students in how to use these tools ethically, to think about who they will work with. Cathy was a quant and didn’t like that, so she walked away. We have to educate our students about the choices they make. We talk about optimisation, optimisation of marketing… In optimising for STEM we are missing things. I think we need to move towards STEAM, where A is for Arts. We have to be inclusive of arts and humanities working with these teams, to think about skills and diversity of skills.

JF: Particularly thinking about healthcare…

MA: There is an increasing drive to public engagement, to public response. That has to be much more at the heart of training for data scientists, and of how it relates to the society we want to create. There can be a sense that this slows momentum, but it’s fundamental to getting things right, and to shaping the directions of where we are going…

JF: Mark, you mentioned trust, and your organisation has been very focused on trust.

ML: These multifaceted networks are built on trust. For Skyscanner trust was so much more important than favouring particular clients. I think Facebook’s error has been to not be more transparent in what they do. We have had comments about machine learning as hype, but actually machine learning is about machines learning to do something without humans. We are moving to a place where decisions will be made by machines. We have to govern that, and to police machines with other machines. And we have to have algorithms to ensure that machine learning is appropriate and ethical.

JF: I agree. It was interesting to me that Weapons of Math Destruction is the top seller in algorithms and programming – a machine-generated category – but it is reassuring that those working in this space are reading about this. By show of hands, how many here working in data science are thinking about ethics? Some are. But it’s unclear who isn’t working with data, or who isn’t working ethically. So, to finish, I want your one takeaway for this week.

BH: I think it’s up to us to decide how to do things differently, and to make the change here. If we are true data warriors driving societal benefit then we have to make that change ourselves.

ML: We do plenty to mess up the planet. I think machine learning can help us sort out the problems we’ve created for ourselves.

MA: I think it’s been a wonderful event, particularly the variety and creativity being shared. And I’m really pleased to open up these conversations and look at these issues.

JF: I’m optimistic too. But don’t underestimate the ability of a small group of committed people to change the world. So, Data Warriors, all of you… You know what to do!

Maggie: Thank you all for your conversation, your enthusiasm. One message I really want to give you is that when you look at the use of data, the capacity to do good… The vast majority of young people are oblivious. They could miss out on an amazing career. But as the world changes, they could miss out on a decent career without these skills. Don’t underestimate your ability as one person with knowledge of that area to make a difference, to influence and to inspire. A few years back, in Greenock, we ran an event with Teen Tech and the support of local tech companies made all the difference… One team went to the finals in London, won and went to Silicon Valley… And that had enormous impact on that school and community, and now all S2 students do that programme, local companies come in for a Dragon’s Den type set up. Any moment that you can inspire and support those kids will make all the difference in those lives, and can make all the difference, especially if family, parents, community don’t know about data and tech.

Closing Comments – Gillian Docherty, CEO, The Data Lab

Firstly thank you to Maggie for being an amazing host!

I have a few thank yous to make. It has been an outstanding week. Thank you all for participating in this event. This has been just one event of fifty. We’ve had another 3000 data warriors, on top of the 450 of you here for Data Summit. Thank you to our amazing speakers, and exhibitors. The buzz has been going throughout the event. Thank you to our sponsors, and to Scottish Government and Scottish Enterprise. Thank you to our amazing volunteers, to Grayling who have been working with the press. To our venue, events team and caterers. Our designer from two fifths design. And the team at FutureX who helped us organise Data Talent and Data Summit – absolutely outstanding job! Well done!

And two final thank yous. Firstly the amazing Data Lab team. We have thousands of new people being trained, huge numbers of projects. I also want to specifically mention Craig Skelton who coordinated our Fringe events; Cecilia who runs our marketing team; and Fraser and John who were behind this week!

My final thank you is to all of you, including the teams across Scotland participating. It is a fantastic time to be working in Scotland! Now take that enthusiasm home with you!

Mar 22 2018
 

Today I am at the Data Fest Data Summit 2018, two days of data presentations, showcases, and exhibitors. I’m here with my EDINA colleagues James Reid and Adam Rusbridge and we are keen to meet people interested in working with us, so do say hello if you are here too! 

I’m liveblogging the presentations so do keep an eye here for my notes, updated throughout the event. As usual these are genuinely live notes, so please let me know if you have any questions, comments, updates, additions or corrections and I’ll update them accordingly. 

Intro to the Data Lab – Gillian Docherty, The Data Lab CEO

Welcome to Data Summit 2018. It’s great to be back – last year we had 25 events reaching 2000 people, but this year we’ve had 50 events and hope to reach over 3500 people. We’ve had kids downloading data from the space station, we’ve had events on smart meters, on city data… Our theme this year is “Data Warrior” – a data warrior is someone with a passion and a drive to make value from data. You are data warriors. And you’ll see some of our data warriors on screen here and across the venue.

Our whole event is made possible by our sponsors, by Scottish Enterprise and Scottish Government. So, let’s get on with it!

Our host for the next two days is the wonderful and amazing Maggie Philbin, who you may remember from Tomorrow’s World. She has had an amazing career in media, and she is also chair of UK Digital Skills and CEO of Teen Tech, which encourages young people to engage with technology.

Intro to the Data Summit – Maggie Philbin

Maggie is starting by talking to people in the audience to find out who they are and what they are here for… 

It will be a fantastic event. We have some very diverse speakers who will be talking about the impact of data on society. We have built in lots of opportunities for questions – so don’t hesitate! For any more information do look at the app or use the hashtag #datafest18 or #datasummit18.

I am delighted to introduce our speaker who is back by popular demand. She is going to talk about her new BBC Four series Contagion, which starts tonight.

The Pandemic – Hannah Fry

Last year I talked about data for social good. This year I’m going to talk about a project we’ve been doing to look at pandemics and how disease spreads. When we first started to think about this, we wanted to see how much pandemic disease is in people’s minds. And it turns out… not much.

Hannah’s talk was redacted from this post yesterday but, as Contagion! has now been broadcast, here we go: 

Influenza killed 100 million people in the 20th century. The Spanish Flu killed more people in one year than both World Wars. That seems surprising, but it may be partly because pandemic flu is very different from seasonal flu. Pandemic flu is where a strain of flu jumps from animals to humans and spreads so fast that we can’t vaccinate fast enough. For that reason pandemic flu is at the top of the UK Government’s Risk Register.

So, what we decided to do was essentially a TV stunt with a real purpose. We built a simple smartphone app. The app captures where people are, and how many people they are with. That allows us to see how disease might spread. Firstly to do that for TV of course, but secondly this is proper citizen science for real research. So, I spent a year calling in lots of favours, getting on all sorts of media, asking people to download an app.

But we also needed a patient zero, and a ground zero. We picked Haslemere in Surrey, which is a sort of Goldilocks town – just big enough, well connected… A beautiful English town… Just the type you’d like to destroy with an imaginary virus. And I was patient zero… So I went there, went to the gym, went to the shops, went to the pub… But unknown to me I also walked past others with the app… So when I stood next to one of these people for enough time, I infected that person… And so now there were two people, and then many more… A pharmacist got infected early on and kept infecting people…

These patterns are based on our best mathematical models for infection… And you can quickly see pockets of infection developing and growing. Spreading quickly to a whole town. But those dots on a map are all real people…

Looking at some real infection sites… In Petersfield there is a school where a few kids from Haslemere attend, commuting by train. Three kids were running our app… By day three, two were infected, one wasn’t. They went to the break room, and outside, and the third person got infected… And then infected their family…

I also wanted to talk about people from Haslemere who worked in London on day two. Two people from the town didn’t know each other, but they took the train home, and one infected the other…

Now, this is just the Haslemere experiment, but we also did a nationwide experiment…

We persuaded 30,000 people to download the app and take part… Again, it starts with me walking around Haslemere. A month in, London is swamped. Two months in, it sweeps Scotland. By three months it’s in Northern Ireland. Really, by then only the north of Scotland was safe! What is startling isn’t just the speed of the spread, but how many people get infected… This is the most accurate model we have to date. The most accurate estimate for a Spanish Flu type virus is a staggering 43,343,849 infections. A conservative fatality rate of 2% would mean 866,877 deaths. But that’s the worst case scenario… That’s no interventions… Which is why this data and this model are so important: they allow you to understand and trial interventions. Generally most people infect the same small number of people, but some super spreaders have a much bigger impact. If you target super spreaders with early vaccination – just vaccinating a targeted 10% – it makes a huge difference. It really slows the spread, giving you a fighting chance of overcoming the infection.
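A toy network model (my sketch, not the programme’s actual epidemiological model) shows why vaccinating the best-connected 10% matters so much: on a hub-heavy contact network, removing the hubs before seeding one infection collapses the outbreak.

```python
import random
import networkx as nx

def outbreak_size(g, p_infect=0.1, vaccinate_top=0.0):
    """Seed one infection and spread it over the network; optionally
    vaccinate (remove) the highest-degree fraction of nodes first."""
    g = g.copy()
    if vaccinate_top > 0:
        n_vax = int(vaccinate_top * g.number_of_nodes())
        hubs = sorted(g.degree, key=lambda kv: kv[1], reverse=True)[:n_vax]
        g.remove_nodes_from(node for node, _ in hubs)
    seed = random.choice(list(g.nodes))
    infected, frontier = {seed}, [seed]
    while frontier:
        person = frontier.pop()
        for contact in g.neighbors(person):
            if contact not in infected and random.random() < p_infect:
                infected.add(contact)
                frontier.append(contact)
    return len(infected)

g = nx.barabasi_albert_graph(10_000, 3)        # hub-heavy contact network
print(outbreak_size(g))                        # no intervention
print(outbreak_size(g, vaccinate_top=0.10))    # target the super spreaders
```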

We know these pandemics can and will happen, but it’s about what you plan for and how you intervene. The only way to answer those big questions and to know how to intervene, is to understand that data, to understand that spread. So we are anonymising this data set and releasing it to the academic community – as a new gold standard for understanding infection. Data really does save lives.

Q&A

Q1) So, Shetland is safe…. Unless the infection started there.

A1) When we spoke to one person about what they’d do in a pandemic, they said they’d get in a car with their kids and just…

Q2) I’m from the NHS and there has been a lot of work on super spreaders, closing schools… Has there been work on the most efficient, mathematically effective patterns to minimise infection?

A2) Schools are an interesting one… Closing schools sounds like it makes everything simple. But sometimes shutting schools means kids mix in an unpredictable manner, as they will go places too. And then you reopen schools and potentially reinfect… And that’s without the economic impact. These are all questions we are thinking about.

Q3) That’s awesome and scary. What about people developing immunity?

A3) Our model is no immunity, and no-one recovers. But you can build that data in later, adding richer assumptions. And some of the team working on this are looking at infection transmitted through the air – some viruses can stick around for a few hours.

Q4) I remember the SARS book. I’m very paranoid… Bought suits, gloves, bleach… In New Zealand you need a two week supply of stuff in your house… If we did that, how would that make a difference?

A4) Yes… So for instance the government always pushes messages about hand washing whenever flu is taking place. It doesn’t feel that that would make a big difference… But at a population level it really does…

Q5) My question is whether you will make the data available for other people – for epidemiology but also for transport, for infrastructure.

A5) Yes, absolutely. We wanted to make this as scientifically rigorous as possible. The BBC gives us the scale to get this work done. But we are now in the process of cleaning the data to share it. Julia Gog at Cambridge is the lead here so look out for this.

Q6) What about data privacy here?

A6) At a national level the data is accurate to 1 km squared, with one pin every 24 hours. Part of the work to clean the data is checking whether it can be reverse engineered, to make sure that privacy is assured. For Haslemere there is more detail… We are looking at skewing location, at just sharing distance apart rather than location, and at whether there is any way you can reverse engineer the dataset if you’ve seen the TV programme – so we are being really careful here.

Business Transformation: using the analytics value chain – Warwick Beresford-Jones, Merkle Aquila

I’ll be talking about the value chain. This is:

Data > Insight > Action > Value (and repeat)

The first two aspects are “generation” and the latter two are “deployment”. We are good at the first two, but not so much at the action and value aspects. So we take a different approach, thinking right to left, which allows faster change. Businesses don’t always start with an end in mind, but we do have accessible data, transformative insights, organisational action, and integrated technology. In many businesses much of the spend is on technology, rather than the stage where change takes place, where value is generated for the business – so it matters that a business understands why they are investing and what the purpose of it is.

I want to talk more about that, but first I want to talk about the NBA and the three point line, and how moving that changed the game by changing basket attempts… That was a tactical decision of whether to score more points, or concede fewer points, enabling teams to find the benefit in taking the long shot. Cricket and football similarly use the value chain to drive benefit, but the maths works differently in terms of interpreting that data into actions and tactics.

Moving back to business… That right to left idea is about thinking about the value you want to derive, the action required to do that, and the insights required to inform those actions, then the data that enables that insight to be generated.

Sony looked at data and customer satisfaction and wanted to reduce their range from 15 handsets down to 4. But the data showed the importance of camera technology – and many of you will now have Sony technology in the cameras in your phones – and they have built huge value for their business in that rationalisation.

BA wanted to improve check in experiences. They found business customers were frustrated at the wait, but also families didn’t feel well catered for. And they decided to trial a family check in at Heathrow – that made families happier, it streamlined business customers’ experience, and staff feedback has also been really positive. So a great example of using data to make change.

So, what questions should you be asking?

  • What are the big things that can change our business and drive value?
  • Can data analytics help?
  • How easy will it be to implement the findings?
  • How quickly can we do it?

Q&A
Q1) In light of the scandal with Facebook and Cambridge Analytica, do you think that will impact people sharing their data, how their data can be used?

A1) I knew that was coming! It’s really difficult… And everyone is also looking at the impact of GDPR right now. With Facebook and LinkedIn there is an exchange there, in terms of people and their data and the service. If you didn’t have that you’d get generic broadcast advertising… So it depends whether people would rather see targeted and relevant advertising. But then some of what Facebook and Cambridge Analytica did is not so good…

Q2) How important is it for the analysts in an organisation to be able to explain analytics to a wider audience?

A2) Communication is critical, and I’d say equally important as the technical work.

Q3) What are the classic things people think they can do with data for their business, but actually is really hard and unrealistic?

A3) A few years ago I was meeting with a company, and they gave an example of when Manchester United had a bad run, and Paddy Power had put up a statue of Alex Ferguson with a “do not break glass” sign, and they asked how you can have that game-changing moment. And that is really hard to do.

Q4) You started your business at your kitchen table… And now you have 120 people working for you. How do you do that growth?

A4) It’s not as hard as you think, but you have to find the right blend of raw talent with experience – lots of tricky learning.

Project Showcase

How will you make a difference? I’m going to talk about how I’ve made major change for one of Scotland’s biggest organisations. I was working for Aggreko, the leader in mobile modular power and temperature solutions. They provide power for the Olympics, the World Cup, the Superbowl… A huge range of events across the world.
We are now watching a short video on how Aggreko supplies large scale mobile power (30 MW set up in 17 days) to cover local demand in Machu Picchu when a hydroelectric plant has to be shut down for maintenance. 
In the dark old days Aggreko was a reactive organisation. A customer would ring with an issue, then Aggreko would send an engineer out. Then they moved to monitoring the mobile power kit, to help monitor equipment across the world on a 24/7 basis. My team built the software to undertake that monitoring, to respond to every alert, alarm, any issue customers might face – and in fact in many cases to fix an issue before a customer ever became aware of it. That meant far greater reliability and efficiency. And doing that, we wondered how we might be able to predict issues, to predict how equipment might fail. We didn’t know how to do that, and we weren’t afraid to ask…
So we went to the Data Lab, took my idea to their board, and they funded a year long pilot to work with the University of Strathclyde and Microsoft, as well as building a team of engineers, technicians and specialists to take this forward. This was a massively smart group, but also some big egos… A lot of what I had to do was to ensure there was good collaboration across those teams. The collaboration is really what made this project a real success. We created an advanced analytics team which allowed us to put models into use, some of which could predict an issue 2 weeks ahead, and to manage those issues for our customers.
The guys at Data Lab helped me to make a difference, they were brilliant and all that help is available to you too. So what are you waiting for?  
There are various ways to resolve this, but they are not easy. There is work for the 1% of large companies, but that leaves SMEs out. And 50k SMEs go out of business every year in the UK. So, what is the solution? Well, let me tell you about Previse and what we do. We think we have a unique solution. David Brown, one of our co-founders, had experience in the sector, and he didn’t want to accept the status quo. Accounting is one of the oldest processes and data sets a company has, but no-one is using it in this sort of way. So what do we do?
Previse finds data, engages with data, pulls in other data… and looks at what can work. We can look at all the data on every invoice from every supplier. We then determine a score, and a threshold… so that when invoices come in they can be prioritised and mostly approved and paid immediately. The process is the same for the buyer, but it makes a huge difference for the supplier. Placing an invoice through Previse, you can send and have invoices approved very swiftly, without chasing and additional work. That is a huge difference in cost and time. The large corporates we’ve been talking with – including 70% of large FTSE companies – are really enthusiastic and want us to help them.
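A hedged sketch of the score-and-threshold idea (illustrative only, not Previse’s actual model; the data here is synthetic): a classifier trained on historical invoices estimates the probability each new invoice will be approved, and anything above a threshold is paid immediately.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for historical invoices: features might encode
# supplier history, amount, buyer, etc.; y is whether the invoice was
# ultimately approved and paid. All names and values are invented.
rng = np.random.default_rng(0)
X_hist = rng.normal(size=(5000, 6))
y_paid = (X_hist[:, 0] + rng.normal(size=5000)) > 0

model = GradientBoostingClassifier().fit(X_hist, y_paid)

X_new = rng.normal(size=(100, 6))            # incoming invoices
scores = model.predict_proba(X_new)[:, 1]    # P(invoice will be approved)
THRESHOLD = 0.95                             # illustrative cut-off
pay_now = scores >= THRESHOLD                # pay these immediately
review = ~pay_now                            # send the rest for checks
```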
And our experience in Scotland has been incredible. The Data Lab helped us throughout, finding the right universities to work with. We work with Heriot Watt (Mike Chantler) and with MBN to find the right resources, and Scottish Enterprise have helped us make Scotland our hub for data science and software engineers. We’ve employed 5 people in the last 6 months, and we’ll double that by the end of the year. We can generate growth, but it’s also about making real change with data.
If SMEs are paid on time, that allows them to thrive and grow. It’s a huge problem and we think it can be resolved.
Our platform consists of four modules: sustainability; mapping; reporting; and advanced. But I’ll talk about our mapping module and some projects we’ve worked on:
  • Mapping the water footprint of your crops – a project with the University of Edinburgh, funded by the Data Lab. This brings together a wide range of crop data layers. We have an overlay based on water for crop growing, and overlays of gray water, or erosion potential – for instance there is high erosion potential on the west coast of Scotland, and mostly low erosion in the east of Scotland.
  • Forests 2020 is a Mexican application supported by the UK Space Agency, and we work with University of Edinburgh, University of Leicester, and Carbomap. Here we can see deforestation patterns, and particular crop areas.
  • Innovate UK: farm data, which is a collaboration with Rothamsted Research, Environment Systems, and Innovate UK – this is at an early stage looking at crop rotation data for UK and export markets. And you can also see the soil you are growing on, what can be planted, what sort of fertilisers to use.
  • Sustainability risk – supports  understanding of risks such as water depletion, and the various factors impacting and shifting that.
  • We also have tools for government to plan what type of power plants to build, and where.

So, in conclusion, layering data allows us to gain new insights and understanding.

After a good lunch and networking session we are now back in the main hall, starting with a video on the use of data in Heineken production process. And an introduction to Stefaan Verhulst, a former Glasgow graduate now based in New York.

Data Driven Public Innovation In Partnership With The Private Sector: The Emerging Practice Of Data Collaboratives – Stefaan Verhulst, Co-founder and Chief Research and Development Officer, The Gov Lab

I’m delighted to be back in Scotland for this event, looking at how data can help society. That is also the focus of The Gov Lab in New York. And we also look at how we can unleash data for good.

An example I want to give you is the earthquake in Nepal a few years ago. It was a terrible event, but it was also inspiring, because Ncell, a cell phone operator, and Flowminder (based in Sweden and the UK) worked together to map the flow of people, to intervene, to save lives. It is a great example of using data for the public good. And it’s an example of the growth of available data – including web crawling/scraping/search analysis, social media, retail data etc. – all collected by the private sector. But we also have new data science to address this data, to gain meaning from this data. And often that expertise to extract meaning is sitting in the private sector.

So, the real question is how we extract value and engage with the private sector around data they collect. That’s a whole different ballgame from open government data. It’s not just about data sharing, but about new kinds of public–private sharing around data for the public good. So we have set up new programmes of Data Collaboratives. The Data Collaboratives Explorer allows you to explore those collaborations taking place – there are over 100 in there already. From that collaborative work we have gained some insights that I will share today.

So, firstly, data collaboratives are important across the policy lifecycle:

  • That starts with situation analysis. Corporations in the US have worked together to understand the scale of the opioid epidemic, for instance.
  • Our second value proposition is about knowledge creation. For instance, post hurricane season, how does the mosquito population change, and how does that change mosquito-borne diseases?
  • Our third value proposition is prediction, for instance projects to predict suicide risk from search results – a project in Canada and also in India.
  • And then we have evaluation and impact assessment. An example here is Vision Zero Labs looking at traffic safety and experiments in spatial composition to influence and reduce risk of accidents.

In those collaboratives we see different models in use. These include: data pooling – enabling sharing and analysis across the collaboration; prizes and challenges – opening some data as a source of generating new insights through innovative ideas and projects that benefit both public and private sector, e.g. BBVA’s Innova challenge; research partnerships – with collaboration across private sector and public or academic sector – such as work on fake news on Twitter; intelligence products – JP Morgan Chase has an institute to extract insights from their own data and actually that can be hugely detailed and valuable; API – for instance Zillow allows you to access real time mortgage and housing market data; trusted intermediary – for instance Dalberg who acts between telecommunications companies and others.

So, there are many ways to set up a data collaborative. But why would the private sector want to do this? Well, they may be motivated by reciprocity – sharing data may lead to access to specialist expertise; research and insights; revenue; regulatory compliance; reputation and retention of talent – often corporations need to retain talent by solving harder or more interesting problems; and responsibility.

But there are challenges too. For instance, the taxi and limousine agency in New York regulates all taxi operations, including Uber. In their wisdom they shared the data… But that exposed some celebrity locations (and less salubrious locations). The harm here wasn’t huge, but that data in a different cultural context could present a much higher risk. So, some of the concerns around sharing data include:

  • privacy and security
  • generalisability and data quality (e.g. not everyone has a cell phone)
  • competitive concerns
  • cultural challenges – there is something of a culture of hoarding data within organisations.

So, to move towards data responsibility we really need risk and value assessment that recognises data as a process, and part of a wider value chain. We need fair information practices and processes – our principles are about 30 years out of date and we urgently need new principles and processes. GDPR helps, but doesn’t address all the challenges we may have. We need new methods and approaches. And that means having a decision tree across the data cycle.

There are risks in sharing data, but there are also risks in not sharing the data. If we had not used the Ncell data in Nepal, we would have had more deaths. So we have to respond not just to risks, but also to the opportunity cost of not sharing data. What is your responsibility as a corporation?

I’ve given lots of examples here… But how do we make data-driven public innovation systemic? We need data stewards in organisations, so there is someone who can sign off on data collaboratives; we need that profession in place in organisations to enable work with the public sector. We need methods – like the Unicef collaboratory around childhood obesity; that’s a new methodology. We also need new evidence of how data can be used and what impact it will have. And finally we need a movement – this all won’t happen without a movement to establish data collaboratives, and I’m delighted to be here today as part of this movement, ultimately using data to improve people’s lives.

Q&A

Q1) In light of Cambridge Analytica and Trump, aren’t we misusing data?

A1) I think use is part of that value chain and we have to have a debate about what kind of use we are comfortable with, and which we are not. And that case also raises questions about freedom of expression, and a need to regulate against deceptive behaviours.

Q1) Several years ago hashtags brought down governments in the Middle East, and now we have governments in those countries controlling the public through hashtags. It’s scary.

A1) I’ve been working in privacy for many years, and I really encourage a comparison of risks and value. And to do a cost-benefit analysis. We need to rebalance that.

Gillian is introducing our special guest… Minister Derek Mackay.

Message from the Scottish Government – Derek Mackay, MSP, Cabinet Secretary for Finance & Constitution, the Scottish Government

I’m not sure that I’ve thought of myself as a data warrior before, but I did teach the Social Security Minister how to use Instagram the other week! I say that partly as I have an appeal and a plea for you… The First Minister has a huge set of followers on Twitter, but I’m stuck just below 18k… Maybe you are the audience to take me over that line!

There’s a lot I want to cover in terms of the excitement of this event. We have a strong reputation and record in Scotland. With responsibility for the budget and internationalisation, this is really exciting. I’m particularly enthused by the international representation including Brazil, Singapore, USA, and Ireland too. This event allows us to put the spotlight on data science in Scotland. It is a natural place for people to come and do business. And this is a great event with business leaders here, with experience to share with others.

Our government, Scottish Enterprise and The Data Lab are working together to build innovation and business in Scotland. We are fortunate in Scotland to have world class data resources. Scotland has universities, 5 of which are in the top 100, and 70% of our research was rated as excellent in the last REF. You can feel that in this room. Data Driven Innovation has the potential to deliver £20bn of value to Scotland in the next five years. This buzz can be harnessed to make Scotland the Data Capital of Europe. I particularly support the growth in FinTech. Many people describe themselves as disruptors – that would once have been seen as a negative but is now a real positive, about opening new opportunities. And data helps us deliver our work, one example of which is the Cancer Challenge, which is helping us understand how best to use our resources for the best outcomes.

The Scottish Government Innovation Action Plan seeks to build a sustainable economy, with skills crucial to that, including funding for business growth, innovation, etc. We’ve also launched the Scottish Digital Academy and the Data Science Accelerator to look at how things are changing and to innovate working methods – such as CivTech’s innovative models. We are really serious about business growth, the economy and skills. We have invested in innovation, education and internationalisation. We are the strongest part of the UK outside London and the South East.

So, the Scottish Government supports your enthusiasm for data, for what can be done with data. High tech, low carbon is the future as we see it, and we want to be a country that is welcoming to Europe and the rest of the world – we don’t support the UK government’s view on Europe.

I commend your work and hope that you have a fruitful and enjoyable time here. And we hope the collaboration of our agencies helps to bear fruit now and in the future.

Improving Transparency In The Extractives Industry Using Data Science – Erin Akred, Lead Data Scientist, DataKind

I am a data scientist from DataKind, where we harness data for the benefit of humanity. We exist to use data to build the kind of world we want to see. The challenge we face is that many not-for-profits, charities, government agencies etc. do not have the resources to do the types of data science that the private sector (e.g. Netflix) can. So we link pro bono data scientists with organisations with a social mission.

Last year we did a project looking at automated detection of mines from earth observation imagery. We are used to using this data for other purposes, and this is a challenging problem. I will come back to that, but first I want to say more about DataKind.

Our founder, Jake, was working at the New York Times on data science, and saw people volunteering and attending hack events at the weekend, giving their talents back… So he thought: could I partner with a mission-driven organisation, organise a similar event, and make this happen? He started DataKind, and we’ve been developing what we can offer those mission-driven organisations who also want to benefit from data science. So we now pair data scientists with mission-driven projects. We have over 18k community members worldwide; 6 chapters in 5 countries (Bangalore, Dublin, London, San Francisco, Singapore, Washington DC); chapter applicants in 40+ global cities; 228 events worldwide; and we’ve worked on over 250 projects, generating about $20m of value in volunteer effort.

One example project has been with the Omidyar Network, looking at data science solutions that might enable social actors to operate more effectively and efficiently in their efforts to combat corruption in the extractives industry. Now, we don’t start with whatever data happens to be out there. Our funders really want impact, and we think of that as impact per dollar. So, anyway, the context of this work was illegal mining, which can cause conflict in the eastern Democratic Republic of Congo, as well as poor environmental outcomes and social challenges. As data scientists we partner with other organisations to ensure we know how to get value out of data insights.

To understand illegal mining we have to know where it is taking place. So we did work on machine learning from images. We worked with Global Forest Watch and IPIS.

Now, not all of our projects are successful… Usually projects fail because of issues in one of:

  • Problem statement – a well thought through problem statement is really important.
  • Datasets
  • Data Scientists
  • Funding
  • Subject Matter Expertise
  • Social Actors

Now, I spoke to someone last night who has run lots of Kaggle projects – crowdsourced data science challenges. In those projects you have data and data scientists, but you don’t have subject matter experts – and that’s crucial knowledge to have on board. For instance, when looking at malaria, there was a presumption that mosquito nets would be helpful, but to the communities involved the hung nets looked like a shrine, like death… and people didn’t want to sleep in them. So they used them as fishing nets instead.

When we work with an organisation we do want a data set, but we also want an organisation open to seeing what the data reveals, not trying to push a particular agenda. And we have subject matter experts that add crucial context and understanding of the data, of any risks or concerns with the data as well.

We start with, e.g.:

  • We want to create image classification models
  • Using publicly available earth satellite imagery
  • So that those working in the transparency sector can be made aware of irregular mining activity
  • So that they can improve environmental and conflict issues due to mining.

Some of the data we use is open – and a lot of the data I’ve worked with is open – but we also use closed data, data generated by mission-driven organisations’ apps, etc.

And the data scientists on these projects are at the top of their game – people these organisations could not previously afford to work with or recruit.

So, for this project we used a random forest classifier on the data to find mine locations. We generated training data for this project, and found that we can pick out where illegal mining has occurred with good accuracy.
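
As a purely illustrative sketch of that kind of pipeline – not DataKind’s actual code – the snippet below trains a random forest on labelled satellite image tiles. The file name and spectral-band features are hypothetical, and it assumes per-tile summary statistics have already been extracted from the imagery:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    # Hypothetical training set: one row per image tile, label 1 = mine visible.
    tiles = pd.read_csv("labelled_tiles.csv")
    X = tiles[["red_mean", "green_mean", "blue_mean", "nir_mean", "ndvi_mean"]]
    y = tiles["mine_present"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=42)

    # class_weight="balanced" because mine tiles are rare relative to background.
    model = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                                   random_state=42)
    model.fit(X_train, y_train)

    # Precision and recall matter more than accuracy on a rare-positive problem.
    print(classification_report(y_test, model.predict(X_test)))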

To find out more and get involved – and I’d encourage you to do that – go to: datakind.org/getinvolved

Q&A

Q1) Where do you see DataKind going?

A1) We do a lot with not a lot of money. I had assumed that DataKind was 100 people when I joined; it was fewer than 10. I would love to see this model replicated in other countries. And conferences… Bringing volunteer data scientists together with providers increases the opportunity for these things to happen. Bringing those people together at conferences creates rich experiences that amplify the impact of what we are doing.

Q2) For the mining project, can you access the data online?

A2) Yes – the US Federal Government is hosting the data, and we used Google Earth Engine in this work.

From Analytics To AI: Where Next For Government Use Of Data? – Eddie Copeland, Director of Government Innovation, Nesta

I’ve been talking to anyone who will listen over the last 5 years about the benefits of public sector data. We have been huge proponents of using open data, but often data has been released in the vague hope that someone else might do something with it. And we have the smart cities agenda generating even more data that often we have no idea how to use. But there is a missing link there… The idea that public organisations should be the main consumers of their own data, for improving their own practice.

Now, you’ll have read all those articles asking if data is the new “oil”, the new “fuel”, the new “soil”! I don’t much care about the analogy; the key thing is that data is valuable. Data enables the public sector to work better – doing more and better with less. But that’s hard to do. If I’m a public sector organisation with lots of amazing data on opportunities and challenges in my own area, but not the area next door, how can I understand the bigger picture? We can target resources to the most vulnerable, but we need data to tell us where they are. Without visibility across different organisations and parts of the public sector (e.g. in family and child services), how can that data be used to understand appropriate support and intervention?

Why do we focus on data issues? Well, there is a technology challenge, as so many public sector organisations have different IT systems. And you have outrageous private sector organisations who charge the public sector to access its own data – they should be named and shamed. Even when you get the data out, the format can be inconsistent and hard to use. Then there is what we can do with the data – we often err on the side of caution, not on the side of what is useful. Historically the main data person in public sector organisations was the “data protection officer” – the clue is in the title! It takes an organisational leap to collaborate on issues where that makes sense.

I used to work for a think tank and I got bored of that – I really wanted to be part of a “do tank”, to actually put things into action. And I found this great organisation called Nesta, where we have set up the London Office of Data Analytics. To make a project like that work you need:

  • an impactful problem – it takes time, backing and support, so you have to have a problem that matters
  • a clearly defined intervention – what would you do differently if you had all the information you could want about the problem you want to solve? (data science is not the innovation)
  • the information asset you would need to undertake that intervention
  • the intervention you need to undertake to solve that issue

So when we looked at London, the issue that seemed to fit these criteria was unlicensed Houses in Multiple Occupation (HMOs), and how we might predict them. We asked housing officers how they identified these properties, we looked at what was already known, and we looked at available information around those indicators. Then we developed machine learning to predict those unlicensed HMOs – we are now on the third version of that.
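
To make that concrete, here is a hedged sketch of the general approach – not the actual London Office of Data Analytics model – which scores properties for unlicensed-HMO risk from the kinds of indicators housing officers might suggest. All file and column names are invented for the example:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Hypothetical property-level indicators; all columns are illustrative.
    props = pd.read_csv("properties.csv")
    features = ["noise_complaints", "extra_bins_requested",
                "persons_on_electoral_roll", "housing_benefit_claims"]

    # Train on properties whose HMO status is already known from inspections.
    known = props[props["hmo_status"].notna()]
    model = LogisticRegression(max_iter=1000)
    model.fit(known[features], known["hmo_status"].astype(int))

    # Score the uninspected stock and hand officers a ranked visit list.
    unknown = props[props["hmo_status"].isna()].copy()
    unknown["risk"] = model.predict_proba(unknown[features])[:, 1]
    print(unknown.sort_values("risk", ascending=False).head(20))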

We have also worked on a North East Data Pilot, joining up data across the region to better understand alcohol harms. But we didn’t know what intervention might be used, which has made it harder to generate value from.

And we are now working on the Essex Centre for Data Analytics, looking at the issue of modern slavery.

Having now worked through many of these examples, we’ve found that data is the gateway drug to better collaboration between organisations. Just getting all the different players in the room, talking about the same problem in the same way, is hugely valuable. And we see collaborations being set up across the country as a result.

So, things we have learned:

  1. Public sector leaders need to create the space and culture for data to make a difference – there is no excuse for not analysing the data, and you’ll have staff who know that data and just need the excuse to focus and work on this.
  2. Local authorities need to be able to link their own data – place based and person based data.
  3. We need consistent legal advice across the public sector. Right now lots of organisations are all separately getting advice on GDPR when they face common issues…

So, what’s next? Nesta is an innovation organisation, and there is excitement about technologies of all types. For this audience AI is probably overhyped, but it nonetheless has big potential, particularly algorithmic decision making out in the field. Policy makers talk about evidence based decision making, and AI can enable us to take that out into the field. Of course algorithms can do great things, but we also have examples that are bad… Companies hiring based on credit records is not OK. Public sector bodies not understanding algorithmic bias is not OK. For my own part I published 10 principles for a code of conduct for public sector use of algorithmic decision making – I’d love your feedback at bit.ly/NestaCode.

It is not OK to use AI to inform a decision if the person using it could not reasonably understand its basic objectives, function and limitations. We would face a total collapse of trust that could set us back a decade. And we’ve seen over the last week what that could mean.

Q&A

Q1) Aren’t the problems you are talking about surely people problems?

A1) Public organisations are being asked to do more with less, and that makes it difficult for time to be carved out to focus on these challenges – that’s part of why you need buy-in and commitment at senior level. There is a real challenge here about finding the right people… The front line workers have so much knowledge, but you have organisations who…

Q2) On your comment that you have to understand the AI: GDPR requires a right to explanation for uses of data, and that’s very hard to do unless automated.

A2) Yes, that’s a really untested part of GDPR. If local authorities buy in data they have to understand where that data is from, what data is being used and what that means. In the HMO example, local front line staff can look at the flags from the prediction and add their own knowledge of the context – for instance, a local landlord’s prior record. But that understanding of how to use and action the data is key.

Data Driven Business. It’s Not That Hard. – Alex Depledge, Founder, Resi.co.uk, Former CEO, Hassle.com

That’s a deliberately provocative title – I knew that this would be a room full of intellectuals and I’m going to bring us back down to earth. I’m known for setting up hassle.com, and I think it’s fitting that I am following Eddie talking about the basics and the importance of getting them right. So many companies say they are running a data driven business, and they are not… Few are actually doing this.

I started my professional life at Accenture. I met my co-founder there. About 7 years into our friendship she emailed me and said “I’ve got it. I need a piano teacher, I’ve been Googling for four hours, we need a place to find music teachers”. And I said “that’s a rubbish idea”. And then I needed a wisteria trimmed… And we decided we wanted to build a marketplace for local services… We had a whole idea, a PowerPoint deck, and thought: great, we’ll get a team in India or Singapore to build it… Sounded great, but nothing happened.

And then Jules quit her well paid job and said “it’s OK, I’ve bought a book!” – and it was a Ruby on Rails book… She started coding… And she built a thing. And that led to us going through a Springboard process… We had some data, but I was trying to pull in money. We were attracting some customers, but not a lot of service providers… We were driven by intuition or single conversations… So one day I said I was quitting and going back to the day job… I was frustrated… And a colleague said “maybe we should look at what the data says?”… And so they looked. And they found that 1 in 4 people coming to the website wanted a cleaner. And we were like “holy shit!”, because we didn’t have any cleaners. So we threw away what we had and set up a three page site. We went all in: you could put a postcode in, find a cleaner, and book them. We got 27 bookings, then double that… And we raised some funding – £250k, just when we desperately needed it. We found cleaners, we scaled up, we got much bigger investment. And we scaled up to 100 people.

Then we really turned into a data driven business: building what people want, trying it, checking the data, iterating. Our VC at Accel pushed us to use mobile… We weren’t convinced. We checked the data, and actually people booked cleaners from their desks at lunchtime. At our peak we were moving 10k cleaners around London. We had to look at liquidity, and we needed cleaners to have an average of 30 hours of work per week – too few hours and cleaners weren’t happy, too many and jobs weren’t taken up. So at 31 hours we’d start recruiting.
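
That recruitment rule is simple enough to automate. A minimal sketch, assuming a hypothetical weekly bookings log (the file and column names are illustrative, not Hassle’s actual system):

    import pandas as pd

    # Hypothetical weekly bookings log: one row per job, with cleaner id and hours.
    bookings = pd.read_csv("bookings_this_week.csv")
    hours_per_cleaner = bookings.groupby("cleaner_id")["hours"].sum()

    avg_hours = hours_per_cleaner.mean()
    RECRUIT_AT = 31  # past ~30h/week on average, jobs start going unfilled

    if avg_hours >= RECRUIT_AT:
        print(f"Average {avg_hours:.1f}h/week - time to recruit more cleaners")
    else:
        print(f"Average {avg_hours:.1f}h/week - supply and demand in balance")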

From there we looked at expansion and what kind of characteristics were needed. We needed cities shaped like a donut – clients in the middle, cleaners on the outside. We grew, but then we got some unwanted attention and chose to sell. For £32 million. And the company that bought us had 80 engineers… And they migrated 16 countries onto our platform, which had been built by 8 engineers.

So, we sold our business… And I thought, I’m not going to do that again…

And then I wanted a new kitchen… So I had an architect in… spent £500… 45 days later I got plans… and 75 days later I had an illustration of how it would look so I could make a decision. And so I started Resi, the first online architect. It took me just 4 months to be convinced that this could be a business. We set up a page of what we thought we might do. I spent £10 per day on Facebook A/B testing ads. And we’ve had a huge amount of business… We wanted to find the sweet spot for architects and how long the work would take – again, we needed to know how much time was needed for each customer. 3 hours is our sweet spot. Our business is now turning over £1 million a year after one year. And only one person works with data – he also does marketing. He looked at our customers, when they convert, and how our activities overlaid onto that. After 10 days we weren’t following up, and adding an intervention (email/text etc.) tripled our conversions.
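
That kind of follow-up analysis needs nothing fancy – a few lines of pandas (or, equally, a spreadsheet) will do it. A minimal sketch, assuming a hypothetical leads log with creation, follow-up and conversion columns:

    import pandas as pd

    # Hypothetical leads log; rows with no follow-up have an empty date.
    leads = pd.read_csv("leads.csv", parse_dates=["created", "followed_up"])
    leads["days_to_follow_up"] = (leads["followed_up"] - leads["created"]).dt.days

    # Compare conversion rates for leads chased within 10 days vs left cold.
    chased_within_10 = leads["days_to_follow_up"] <= 10
    print(leads.groupby(chased_within_10)["converted"].mean())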

We’ve also been able to look at hotspots across the UK, and we can target our marketing in those areas, and also understand that word of mouth… We can take advantage of that.

I’m a total data convert. I still don’t like spreadsheets. Data informs our decisions – not quite every decision, as instinct matters too. But every piece of data analysis we did was doable in a spreadsheet by someone in high school… It doesn’t take machine learning, or AI, or big data. Even simple analysis can create tremendous results.

Q&A

Q1) What next?

A1) I always said I didn’t want to dine out on one story… like Hassle. But I don’t know the end for Resi yet… Invite me back in a few years!

Q2) The learning from a few hours of work was huge.

A2) Our entire business was based on a single piece of analysis – understanding what our customers were looking for led to £32m.

The AI Race: Who’s Going To Win? – Vicky Brock (VB – chairing), CEO, Get Market Fit; Alex Depledge (AD), Founder Resi.co.uk, Former CEO Hassle.com; Joel KO (JK), Founding CEO, Marvelstone Ventures; Chris Neumann (CN), Early Stage Investor

CN: I’m a recovering entrepreneur. As an investor I’ve had a global purview on what’s going on in the AI race. And I think it’s interesting that we see countries and areas which haven’t always been at the cutting edge of technology, really finding the opportunities here. Including Edinburgh.

JK: We are funders based in Singapore, investing in FinTech. AI technology has been on the rise… I’m hoping to invest in AI start ups and incubators.

AD: You already know who I am. In my brief hiatus between companies I was an entrepreneur in residence at Index Ventures, and I saw about 300 companies come in saying they were doing AI or machine learning, so I have some knowledge here. But also, knowing a leading professor in data ethics, I don’t care who wins – I care that Pandora isn’t let out of her box until governments have a handle on this, because the risks are great.

VB: I’m a serial entrepreneur around data. And machine learning or AI can be the magic words for getting investment. There is obvious hype here… Is it a disruptor?

CN: I’ve seen a lot of companies – as Alex said – claim they use ML or AI… In some ways it’s the natural progression from being data driven. I do think there will be an incredible impact on society over the next 10 years from AI. But I don’t think it will be the robots and tech of science fiction; it will probably show up in more everyday ways.

VB: Is AI the key word to get funding…

JK: I see many AI start ups… But often it’s actually a FinTech start up… They present themselves that way because funders like to hear it… There is so much data… And AI does now spread into our data lives… Entrepreneurs see AI as a way to sell themselves to investors.

VB: At one stage it was “big data”, then “AI”… but you’ve done a lot with little data… What did you see when you were entrepreneur in residence?

AD: No disrespect to investors, but they focus on financials and data; I’d often be asking about what was happening under the bonnet… So if they said they were using machine learning, ask about that, ask about data sets, ask where the data is coming from… Often they do interesting data work, but it’s a good algorithm or calculation… It’s not ML or AI. And that’s OK – that’s something I wanted to bring out in my presentation.

VB: What’s looking exciting now?

CN: We see really interesting organisations starting to do fascinating work with AI and ML. I focus on business to business work, which often looks less exciting to others. I am excited about an investment I’ve made in a company using blockchain to prove GDPR compliance. And I spoke with a cool company here using wearables and AI to prevent heart attacks, which is really amazing.

JK: I have been here almost a week and met start ups, and they were really practical. They have the sense to make a revenue stream from the technology. These very new start ups have been very interesting to me personally.

VB: You’ve started your next company – did you cross lots of ideas off first?

AD: Jules and I had a list of things we wouldn’t do… Chris talked about B2B… We talked about not doing large scale or consumer ideas. We whittled our list of 35 ideas down to 4 each, and they were all B2B… But they bored us. We like solving problems we’ve experienced. My third business I hope will be B2B, as getting to £10m is a bit more straightforward than in B2C.

VB: AI requires particular skillsets… How should we be thinking about our skillsets and our talent?

CN: Eddie talked earlier about needing to know what the point is. It can be easy to get lost in the data, to geek out… and lose that focus. So Alex just asking that question – finding out who gives a damn – is really important. You have to do something worthwhile for somebody; there’s no point doing it otherwise.

JK: With AI… in ten years there won’t be coding – AI can code itself. So my solution is that you should let your kids play outside. In Asia lots of parents send kids to coding schools… They won’t need to be engineers… Parents’ response to the trend is too early and not thought through…

AD: I totally agree. Free play and imagination and problem solving are crucial. There aren’t enough women in STEM, but you can over-focus on STEM. It’s data and digital literacy from any angle – it could be UX, marketing, product management, or coding… In London we have this idea that everyone should be coding, but actually digital literacy is the skills gap we need to close. And that comes down to basic literacy and numeracy. It’s back to basics for me.

VB: I’d like to make a shout out for arts and social sciences graduates. We learn to ask good questions…

AD: Looking at recent work on where innovation comes from, it comes from the intersectionality of disciplines. That’s when super exciting stuff happens…

Q&A

Q1) Mainly for Alex… I’m machine learning daft… And I love statistics. And I know the value of small scale statistics, and the value of machine learning and large scale data – not so much AI. How do you convey that to business people?

AD) We don’t have a stand out success in the UK. But with big corporates I tell them to start small… giving engineers space to play, to see what is interesting… That can yield some really interesting results. You can’t really just show people stuff; you need to try things.

VB) Are you trying to motivate people to use data in your company?

JK) Yes, with investors you see patterns… I tell kids to start start ups as early as possible, so they can fail earlier… Because failures then lead to successful businesses next time.

CN) A lot of folk won’t be aware that for many organisations there is a revenue stream around innovation… It’s a really difficult thing to bring innovative practices into big organisations, or to collaborate with them, without squishing that. There are VCs and multinationals who will charge you a lot of money to behave like a start up… But you can just start small and do it!

The Revolutionary World Of Data Science – Passing On That Tacit Knowledge! – Shakeel Khan, Data Science Capability Building Manager, HM Revenue & Customs

I’ve been quite fortunate in my role in that I’ve spent quite a lot of time working with both developed and developing economies around data science. There is huge enthusiasm across the world from governments. But there is also a huge fear factor around rogue players, and concerns about the singularity – machines exceeding human capabilities. But there are genuine opportunities there.

I’ve been doing work in Pakistan, for DFID, where they have a huge problem with Dengue Fever. They have tracked the spread with mobile phone data, enabling them to contain it at source. That is saving lives – a tremendous outcome. Closer to home, John Bell at Oxford University has described AI as the saviour of our health services, as AI can enable us to run our services more effectively and more economically.

In my day job at HMRC, it is hard to overstate what the work we do enables in terms of investment in the country and its services.

I want to talk about AI at three stages: Identify; Adopt; Innovate.

In terms of data science and what is being done around the world… The United Arab Emirates have set up a Ministry of AI and a 2031 Artificial Intelligence Strategy. We have the Alan Turing Institute looking at specific problems across many areas – some really interesting work there. In Edinburgh we have the amazing Data Lab, and the research they are doing, for instance with cancer, and we have the University of Edinburgh Bayes Centre. Lots going on in the developed world. But what about the developing world? I’ve just come back from Rwanda, which has a new Data Revolution Policy. I watched a TED talk a few weeks back that emphasised that what is needed in sub-Saharan Africa is not help, but the tools and means to do things themselves.

Rwanda is a hugely progressive country. They have a higher proportion of women in parliament (62.8%) than any other country in the world. Their GDP is $8.3bn. They have a Data Revolution Policy. They are at the start of their journey, but they are trying to bring tacit knowledge in, to leapfrog development… recognising the benefit of that tacit knowledge and of those face to face engagements.

For my role I am split about 50/50 between international development and work for HMRC. So I’ll say a bit more about the journey for developed economies…

Defining Data Science can be quite abstract. You have to make a benefits case, to support the vision, to share a framework and some idea of timeline, with quick wins, to build teams, to build networks. Having a framework allows organisations to build capabilities in a manageable way…

A new Data Science Centre going up in Kigali, Rwanda, will house 200 data scientists – that’s a huge commitment.

The data science strategic framework is about data; people and skills; and cultural understanding and acceptance – with senior buy-in crucial to that. Identifying is also about data ethics and skills development – we have been developing frameworks for years that we can now share. For Rwanda we think we can reduce the time to develop data capabilities from maybe 5 years to perhaps 3. Similarly in Pakistan.

When you move to the adopt phase… you really need to see migration across sectors. I started my career in finance. When I came to HMRC I did a review of machine learning and how it was being used, and how it was generating benefit. We managed to bring in £29bn that would otherwise be lost, partly through machine learning. One machine learning model can, effectively, bring in tens or hundreds of millions of pounds, so the models have to be well calibrated and tested. So I developed the HMRC Predictive Analytics Handbook (from June 2014), which we’ve shared across HMRC but also with DWP and colleagues across government.
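
To illustrate what “well calibrated” means in practice, here is a hedged sketch – with hypothetical data files, not HMRC’s – that compares a model’s predicted probabilities against observed outcomes on held-out cases, bin by bin:

    import numpy as np
    from sklearn.calibration import calibration_curve

    # Hypothetical held-out outcomes (0/1) and model probability scores.
    y_true = np.load("holdout_outcomes.npy")
    y_prob = np.load("holdout_scores.npy")

    # In each bin, a well calibrated model's mean predicted probability
    # should match the observed fraction of positive cases.
    frac_positive, mean_predicted = calibration_curve(y_true, y_prob, n_bins=10)
    for pred, obs in zip(mean_predicted, frac_positive):
        print(f"predicted {pred:.2f} -> observed {obs:.2f}")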

In terms of Innovate, it is about understanding the field and the latest developments. However, HMRC are risk averse, so we want to see where innovation has worked elsewhere. I did some work with Prof David Hand at Imperial College London about 20 years ago, so I got back in touch and we developed a programme of data science learning. Not Imperial providing training – a partnership between HMRC and Imperial. We looked closely at the curriculum, demonstrated value added, and looked at how we could innovate what we do.

University of Edinburgh Informatics is a really interesting one. I read a document a few years ago by the late Prof Jon Oberlander about the way the academic, public and private sectors working together could really benefit the Scottish economy. Two years of work led to a programme in natural language processing that was the result of close collaboration with HMRC. Jon Oberlander was hugely influential, passionate about conversational technology and tackling the scourge of isolation, and able to ask lots of questions about AI and when it will be truly conversational. I hope to continue that work with Bayes, but I also wanted to say thank you to Jon.

AI is increasingly touching our lives. Wherever we are in the world, sharing our tacit knowledge will be incredibly important.

Q&A

Q1) Rwanda has clearly made a deep impression. What were the most surprising things?

A1) People have stereotypes about sub-Saharan Africa that just aren’t true. For instance, when you get off the plane you cannot take plastic bags in – it is an incredibly environmentally conscious country. I saw no litter anywhere in the country. The people of Rwanda are truly committed to improving people’s lives.

Q2) Do you use the same machine learning methods for low income and high income tax payers/avoiders?

A2) There are some basic machine learning methods that are consistent, but we are also looking at more novel models like boosted trees.

Q3) I worked in Malawi and absolutely back up your comment about the importance of visiting. You talked about knowledge going from you to Rwanda – how was the knowledge exchange the other way?

A3) Great question. It wasn’t all learning from developed to developing – we learnt a great deal from our trip, including cultural aspects. In terms of the foundations of data science, we in the UK have used machine learning in financial services and retail for 30–40 years; that isn’t really achievable in these countries at the moment, and there the learning does go from developed to developing.

Closing comments – Maggie Philbin

I’ve been reflecting on the (less serious) ways data might influence my life. My son-in-law is in a band (White Lies) and that has given me such an insight into how the music industry uses data – the gender and age of people who access your music, whether they will go to gigs, etc. And in fact I was very briefly in a band myself during my Swap Shop days… We made a mock-up Top of the Pops… Kids started writing in… And then BBC Records decided to put it out… We had long negotiations about contracts… But I was sure no one would buy it… It reached number 15… So we went from parodying Top of the Pops to being on Top of the Pops. And thank you to Scotland – we made number 9 here! But I hadn’t negotiated hard – we just got 0.5%. And if we’d had the data understanding that White Lies have, who knows where we would have been.

So, day one has been great. Thank you to The Data Lab, and to all the sponsors. And now we adjourn for drinks.

Nov 17 2017

Today I am at the Scottish Government for the Digital and Information Literacy Forum 2017.

Introduction from Jenny Foreman, Scottish Government: Co-chair of community of practice with Cleo Jones (who couldn’t be here today). Welcome to the 2017 Digital and Information Literacy Forum!

Scottish Government Digital Strategy – Cat Macaulay, Head of User Research and Service Design, Scottish Government

I am really excited to speak to you today. For me libraries have never just been about books, but about information and bringing people together. At high school our library was split between a 3rd and 4th year section and a 5th and 6th year section, and from the moment I got there I was desperate to get into the 5th and 6th year section! It was about place and people and knowledge. My PhD later on was on interaction design and soundscapes, but in the context of the library and seeking information… And that morphed into a project on how journalists use information at The Scotsman – and the role of the library and the librarian in their clippings library. In Goffman’s terms it was this backstage space for journalists to rehearse their performances. There was talk of the clippings library shutting down and I argued against that, as it was more than just those clippings.

So, that’s the personal bit, but I’ll turn to the more formal bit here… I am looking forward to discussions later, particularly the panel on Fake News. Information is crucial to allowing people to meaningfully, equally and truly participate in democracy, and to be part of designing it. So the importance of digital literacy is crucial to participation in democracy. And for us in the digital directorate it is a real priority – for reaching citizens, and for librarians and information professionals to support that access to information and participation.

We first set out a digital strategy in 2011, but we have been refreshing that strategy, and it is about putting digital at the heart of what we do. Digital is not about technology; it’s a cultural issue. We moved before from an agrarian to an industrial society, and we are now moving from an industrial to a digital society. We are aiming to deliver inclusive economic growth, reform public services, tackle inequalities and empower communities, and prepare people for the future workplace. Digital and information literacy are core skills for understanding the world and the future.

So our first theme is the Digital Economy. We need to stimulate innovation and investment, we need to support the digital technologies industry, and we need to increase the digital maturity of all businesses. Scotland is so dependent on small businesses and SMEs that we need our librarians and information professionals to be able to support that maturity across all businesses.

Our second theme is Data and Innovation. For data, we need to increase public trust in data being held securely and used and shared appropriately. I have a long term medical issue, and the time it takes to get appointments set up, to share information between people so geographically close to each other – across the corridor… That lack of trust is core to why we still rely on letters and faxes in these contexts.

In terms of innovation, CivTech brings together public sector teams and tech start-ups to develop solutions to real problems, and to grow and expand services. We want to innovate and learn from the wider tech and social media context.

The third theme is Digital Public Services: the potential to simplify and standardise ways of working, finding common technologies/platforms built and procured once, and designing services with citizens to meet their needs. Information literacy skills and critical questioning are at the heart of this. You have to have that literacy to really understand the problems, to begin to address them, and to co-design.

The fourth theme is Connectivity. Improving superfast broadband, improving coverage in rural areas, increasing the 4G coverage.

The fifth theme is Skills. We need to build a digitally skilled nation. I spent many years in academia – no matter how “digital native” we might assume young people to be, we’ve essentially assumed that because someone can drive a car, they can build a car. We ALL need support for finding information, judging it and using it. We all need to learn and keep on learning. We also need to promote diversity – ensuring we have more disabled people, more BAME people, and more women working in these areas, building these solutions… We need to promote and enhance that, to ensure everyone’s needs are reflected. Friends working in the third sector in Dundee frequently talk about the importance of libraries to their service users; libraries are crucial to supporting people with differing needs.

The sixth theme is Participation. We need to enable everybody to share in the social, economic and democratic opportunities of digital. We need to promote inclusion and participation – and that means everyone participating.

And our final theme (seven) is Cyber Security. That is about Scotland’s global reputation as a secure place to work, learn and do business. That’s about security, but it is also about trust, and addressing some of those issues I talked about earlier.

So, in conclusion, this is a strategy for Scotland, not just Scottish Government. We want to be a country that uses digital to maximum effect, to enable inclusion, to build the economy, to positively deliver for society. It is a living document and can grow and develop. Collective action is needed to ensure nobody is left behind; we all remain safe, secure and confident about the future. We all need to promote that information and digital literacy.

Q&A
Q1) I have been involved in information literacy in schools – and I know that in schools and colleges there can be real inconsistency about how things are labelled as “information literacy”, “digital literacy” and “digital skills”. I’m slightly concerned there is only one strand there – digital skills can be about technology skills, not information literacy.

A1) I echo what you’ve just said. I spent a year in a Life Sciences lab in a post doc role studying their practice. We were working on a microscopy tool… And I found that the meaning of the word “image” was understood differently by life scientists and data scientists. Common terminology really matters. And indeed semantic technologies enable us to address that in new ways. But it absolutely matters.

Q2, Kate, SCVO) We are using a digital skills framework that I think is also really useful to frame that.

A2) I’m familiar with that work and I’d agree. Stripping away complexity and agreeing on common terms and approaches is a core focus of what we are doing.

Q3) We have been developing a digital skills framework for colleges and for the student lifecycle. I have been looking at the Welsh Government’s comprehensive strategy for schools and colleges… Are there plans for something similar?

A3) I know there has been work taking place but I will take that back.

Q4) I thought the “Participation” element was the most interesting here. Information literacy is key to enabling participation… Say what you like about Donald Trump, but he has made the role of information literacy in democracy very vital and visible. Scotland is in a good place to support information literacy – many in this room have done great work in this area – but it needs resourcing to support it.

A4) My team focuses on how we design digital tools and technologies so that people can use them. And we absolutely need to look at how best to support those that struggle. But it is not just about how you access digital services… It’s how we describe these things, how we reach out to people… I remember being on a bus in Dundee and hearing a guy saying “Oh, I’ve got a Fairer Scotland Consultation leaflet… What the fuck is a Consultation?!”. I’ve had some awkward conversations with my teenage boys about Donald Trump and Fake News. I will follow up with you afterwards – I really welcome a conversation about these issues. We are designing a whole new Social Security framework right now – not something most other governments have had to do – and so we really have to understand how to make it clear.

Health Literacy Action Plan Update – Blythe Robertson, Policy Lead, Scottish Government

The skills, confidence, knowledge and understanding to interact with the health system and maintain good health are essentially what we mean by Health Literacy. Right now there is a huge focus in health policy on “the conversation” – the conversation between policy makers, practitioners and people receiving health care. There is a model of health and care delivery called “More than Medicine” – a memorable house-shaped visual model that brings together organisational processes and arrangements, health and care professionals, etc. At the moment, though, the patient has to do at least as much as the medical professional, with hoops to jump through – as Cat talked about before…

Instructions can seem easy… But then we can all end up at different places [not blogged: an exercise with paper, folding, eyes closed].

Back when computers first emerged you needed to understand a lot more about computer languages, you had to understand how the machine worked… It was complex; there was training… What happened? Rather than training everyone, they simplified access – with the emergence of the iPad, for instance.

So, this is why we’ve been trying to address this with Making it easy: A health literacy action plan for Scotland. And there’s a lot of text… But really we have two images to sum this up. The first is a woman looking at a hurdle… We’ve tried to address this by creating a nation of hurdlers… But we think we should really remove those hurdles and let people walk through.

Some statistics for you: 43% of English working age adults will struggle to understand instructions for calculating a childhood paracetamol dose. There is a lot bound up here… Childhood health literacy is important. Another fact: half of what a person is told is forgotten, and half of what is remembered is incorrect [sources: several cited health studies which will be on Blythe’s slides]. At the heart of the issue is that a lot of information is transmitted… then you ask “Do you understand?” and of course people say “yes”, even if they don’t. So instead you need to check understanding… That can be as simple as rephrasing the question, e.g. “Just so I can check I’ve explained things clearly, can you tell me what you’ve understood?”.

We did a demonstrator programme in NHS Tayside to test these ideas… So, for instance, if you wander into Ninewells Hospital you’ll see a huge board of signs… That board is blue and white text… There is one section in yellow and blue… That’s for visual impairment, because that contrast is easier to see. We have the solution, but… people with visual impairment visit other areas of the hospital too. So why isn’t the whole board done in the same way, with high contrast lettering? We have the solution; why don’t we just provide it across the board? The same hospital sent out some appointment letters asking patients to comment and report any confusion… And there were many points where that happened. For instance, if you need the children’s ward… you need to know to follow signs for Paediatrics first… There isn’t a consistency of naming, or a consistency of colour. So, for instance, Maternity Triage is signed in red… It looks scary! Colours have different implications, so that really matters. You will already be anxious in hospital – consistency can help reduce those levels of anxiety.

Letters are also confusing… They are long. Some instructions are in bold, some are small notes at the bottom… That can mean a clinic running 20 minutes late. Changing what you emphasise has a huge impact, and allows health care provision to run more smoothly and effectively. We workshopped an example/mock-up letter with the Scottish Conference for Learning Disability. They came up with clear information and images: very clear about what is happening, including an image of where the appointment is taking place to help you navigate – with the full address. The time is presented in several forms, including a clock face. And always offer support, even if some will not need it. Always offer that… Filling in forms and applications is scary… for all of us… And there has to be contact information so that people can tell you things. When we looked at people not turning up to appointments, it was that they didn’t know how to contact people, didn’t know they could change the appointment, wanted to get in touch but didn’t want to make a phone call, or were already in for treatment and didn’t think they needed to explain why they weren’t at their outpatient appointment.

So, a new action plan is coming called “Making it easier”. That is about sharing the learning from Making it Easy across Scotland. To embed ways to improve health literacy in policy and practice. To develop more health literacy responsive organisations and communities. Design supports and services to better meet people’s health literacy levels. And that latter point is about making services more responsive and easier to understand – frankly I’d like to put myself out of a job!

So, one area I’d like to focus on is the idea of “Connectors” – the role of the human information intermediary is fundamental. How can we take those competencies and roll them out across the system… in ways people can understand… putting people in contact with digital skills and the digital skills framework… promoting understanding. We need to signpost with confidence, and to have a sense that people can use this kind of information – looking at librarians as a key source of information who can help support people’s confidence.

In terms of implementation… We have (1) a product design, and (3) “scaled up”. But what is step (2)? How do we get there? We need to think about the process differently… starting with (1) a need identified, then (2) resources planned, structured and co-developed for success, and then (3) the approach embedded in the system… I want to take the barriers out of the system.

And I’m going to finish with a poem: This is bad enough by Elspeth Murray, from the launch of the cancer information reference group of the South East Scotland Cancer Network 20 January 2016.

Q&A

Q1) I’m from Strathclyde, but I also work with older people, and I was wondering how much health literacy is part of health and social care integration?

A1) I think ultimately that integration will help, but with all that change it is challenging to signpost things clearly… But there is good commitment to working on that…

Q2) You talked about improving the information – the letters, for instance – but is there work more fundamentally questioning the kind of information that goes out? It seems archaic and expensive that appointments are made through posted physical letters… Surely it would be better to have an appointment that goes straight into your diary and includes the travel information/map…

A2) Absolutely, NHS Lothian are leading on some trial work in this area right now, but we are also improving those letters in the interim… It’s really about doing both things…

Cat) And we are certainly looking at online bookings, and making these processes easier, but we are working with older systems sometimes, and issues of trust as well, so there are multiple aspects to addressing that.

Q3) Some of those issues would be practically identical for educators… Teachers or lecturers, etc…

A3) I think that’s right. Research from the University of Maastricht mapped out 21 areas across the public and private sectors in which these skills should be embedded… And I think those three areas of work can be applied across those areas… We have to look at design around benefits; we have some hooks there.

Cat) Absolutely part of that design of future benefits for Scotland.

Panel Discussion – Fake News (Gillian Daly – chair; Lindsay McKrell (Strathclyde); Sean McNamara (CILIPS); Allan Lindsay (Young Scot))

Sean: CILIPS supports the library and information science community in Scotland, including professional development, skills and ethics. Some years ago “information literacy” would have been mostly a university library concern, but now it’s an issue for librarians across the board. Librarians are less the gatekeepers of information, and more about enabling those using their libraries to seek and understand information online – how to understand information and fake news, and how to assess the information they find, even if they are digitally confident with the tools they use to access it.

Allan: Young Scot is Scotland’s national information charity for young people. We work closely with young people to help them grow and develop, and to influence us in this area. Fake News crops up a lot. A big piece of work we are involved in is the 5Rights project, which is about rights online – that isn’t just for young people, but is significantly about their needs. Digital literacy is key to that. We’ve also worked on digital skills – recently with the Carnegie Trust and the Prince’s Trust. As an information agency we reach people through our website – and we ensure young people are part of creating content in that space.

Lindsay: I’d like to talk about digital literacy as well as Fake News. Digital literacy is absolutely fundamental to supporting citizens to be all that they can be. Accessing information without censorship, and a range of news, research, citizenship test information… That is all part of public libraries’ service delivery, and we need to promote that more. Public libraries are navigators for a huge and growing information resource, and we work with partners in government, in the third sector, etc. Our libraries reach outside of working hours and into remote areas (e.g. through mobile libraries), so we have unique value for policy makers through that range and volume of users. Libraries are also well placed to get people online – still around 20% of people are not online – and public libraries have the skills to support people to go online, gain access, and develop their digital literacy as well. We can help people find various sources of information, select between them, interpret them and compare them. We can grow that with our reading strategies, through study skills and after school sessions. Some libraries have run sessions on fake news, but I’m not sure how well supported these have been. We are used to displaying interesting books… But why aren’t our information resources similarly well designed and displayed – local filterable resources, for instance… Maybe we should do some of this at national level, not just at local council level. SLIC have done some great work; what we need now is digital information with a twist that will really empower citizens and their information literacy…

Gillian Daly: I was wondering, Allan, how do you tackle the idea of the “digital native” – this idea of the innate skills of young people?

Allan: It comes up all the time… This presumption that young people can just do things digitally… Some are great, but many young people don’t have all the skills they need… There are misconceptions from young people themselves about what they can and cannot do… They are on social media, they have phones… But do they have an understanding of how to behave, how to respond when things go wrong? There is a responsibility on all of us here: just because young people use these things doesn’t mean they understand them all. Those misconceptions apply across the board, though… Adults don’t always have this stuff sorted either. It’s dangerous to make assumptions about this stuff – much as it’s dangerous to assume that those from lower income communities are less well informed about these things, which is often not correct at all.

Lindsay: Yes, we find the same… Young people are confident with social media… but can’t attach a document, for instance…

Comment from HE org: Actually there can be learning in both directions at University. Young people come in with a totally different landscape to us… We have to have a dialogue of learning there…

Gillian: Dialogue is absolutely important… How is that being tackled here…

Sean: With school libraries, the skills that transfer from schools to higher education are crucial… But schools are lacking librarians and information professionals, and that can be a barrier… It’s not just about Fake News but wider misinformation on social media… It’s important that young people have those skills…

Comment: Fake News doesn’t happen by accident… It’s important to engage with the IFLA guide to spotting it… But I think we have to get into the territory of why Fake News is there, why it’s being done… And the idea of Media and Information Literacy – UNESCO brought those ideas together a few years ago. There is a vibrant GATNO organisation, which would benefit from more Scottish participation.

Allan: We run a Digital Modern Apprenticeship at Young Scot. We work with apprentices to build the skills, discernment and resilience to understand issues of fake news and its origins. A few weeks back a young person commented on something they had seen on social media… At school, for me, “Media Studies” was derided… I think we are eating our words now… imagine if people had those skills and were equipped to understand the media and its creation process. And the wider media issues… Fake News isn’t in some separate box… We have to be able to discern mainstream news as well as “Fake News”. It’s about those skills, confidence, and the ability to ask difficult questions to navigate through these issues…

Gillian: I read a very interesting piece by a journalist recently, looking to analyse Fake News and the background to it, the context of media working practice, etc. Really interesting.

Cat: To follow that up… I distinctly remember in 1994 at The Scotsman the number of times journalists requested clippings that were actually wrong… Once something wrong gets published, it stays there and repopulates… Misquotations happen that way, for instance. That sophisticated understanding isn’t about right and wrong; it’s more about the truthfulness of information. In some ways Trump is doing us a favour here, and my kids are much more attuned to accuracy now…

Gillian: I think one of the scariest things is that once the myth is out, it is so hard to dispel or get rid of that…

Comment: Glasgow University has a Glasgow Media Group and they’ve looked at these things for years… One thing they published years ago, “Bad News”, looked at for instance the misrepresentation of Trade Unionists in news sources, for a multitude of complex reasons.

Sean: At a recent event we ran we had The Ferret present – having those fact checking organisations, and journalists in those roles, reflects that.

Jenny: The Ferret has fact checking on a wonderful scale to reflect the level of fakeness…

Gillian: Maybe we need to recruit some journalists to the Digital and Information Literacy Forum.

And on that, with many nods of agreement, we are breaking for lunch.

Information Literacy & Syrian New Scots – Dr Konstantina Martzoukou, Postgraduate Programme Leader, Robert Gordon University

This project was supposed to be a scoping study of Syrian New Scots – Syrian refugees coming to Scotland. The background is the Syrian civil war, ongoing since 2011, which has led to an enormous number of refugees, mainly in the near region. Most research has been on asylum seekers in the camps near Syria, on basic survival and human rights, on their needs and how to respond to them. The aim of this project was different: a scoping study to examine the information related experiences and information literacy practices of Syrian New Scots during their resettlement and integration. So this is quite different, as the context is relatively settled, and it is about that resettlement process.

In September 2015 the Prime Minister announced an expansion of the refugee programme to take up to 2000 Syrian refugees, and the first place Syrian refugees came to was Glasgow. There have been a lot of changes since then, but the intent remains to resettle 2000 Syrian refugees by 2020.

Primary research was done with 3 refugee resettlement officers, as well as focus groups with Syrian New Scots. These groups were in both urban (1 group) and rural (2 groups) settings, and included 38 people from across Syria who had been in camps in Lebanon, Turkey, Iraq and Jordan. I didn’t know what to expect – these people had seen the worst horrors of war. In reality the focus groups were sometimes loud and animated, sometimes quiet and sad. And the group came from a huge range of professional backgrounds, though most of the women did not work.

So, the areas our work looked at included English language and community integration; information provision, cultural differences and previous experiences; and financial security. Today I want to focus on libraries and the role of libraries.

One of the most crucial aspects was the language and sociocultural barriers. The refugees were given ESOL classes; a welcome pack with key information for finding the resources in their neighbourhood; a 24 hour Arabic hotline, set up with the mosque for emergencies so that families could receive help outside core working hours; and in-house translation services. But one of the challenges across all the support given was literacy as a whole – not all of the refugees could read and write in any language. It was also about understanding supposedly interchangeable words – “doctor” has a meaning but “GP” not so much. There was also a perception that learning English would be really difficult.

The refugees wanted to know how to learn English, and they were anxious about that. The support officers had different approaches. The ESOL classes were there, but some officers were really proactive, taking refugees to the train station, having mock job interviews… That was really really valuable. But some groups, even a year after arriving, weren’t speaking English. But sometimes that was about the families… Some were confident and really well travelled, but some had lived in one place, and not travelled, and found communication and networking much more difficult. So the language learning was very tied to socio-cultural background.

Many of these families have complex health needs – they were often hand picked to come here because of this – and that causes its own challenges. Some had no experience of recycling or of how to correctly put their bins out. Someone felt the open plan kitchen was difficult – that her child was burned because of it. One man reported a neighbour telling him not to play with his son outside – the boundaries of danger and expectations of childhood were rather different from their new neighbours’. Doctors’ appointments were confusing. Making bus change was expensive – buying something unneeded because the buses don’t give change. Many wanted family reunion information and support.

Technology is used, but technology is not the key source of information. They used mobile phones with pay as you go sim cards. They used WhatsApp, and were sharing quite traumatic memories and news in this way.

The library is there… But actually libraries are perceived as being for books, and many refugees don’t go there. Community classes, meals etc. may be better routes. Computer classes can be useful, especially when refugees can participate in a meaningful way. And there are real challenges here – computer classes in the library didn’t work for this group, as there were too few computers and the internet connections were too slow.

For me the key thing is that we need to position the library as a key place for communication, learning and support for the families.

Q&A
Q1) Alamal(?) is running events in our libraries – we have an event with films and telling their story – and we have had huge interest in that.

A1) We really want to showcase that to the wider community. There are some great examples from England, and from other EU countries, but we want more Scottish examples so do please get in touch.

A User Study Investigating the Information Literacy of Scottish Teenagers – David Brazier, Research Assistant, Northumbria University

This is an ILG funded project looking at the information literacy of Scottish teenagers. I’ll introduce the concepts, go through some related work, and then the methodology we’d like to implement. So, information literacy is about the ability to seek, understand and assess information. Those skills are crucial to integrating with and participating in modern society, and we need to empower students to learn them.

As the panel talked about earlier, the idea of the “Digital Native” is misleading. Young people have a poor understanding of their information needs. That leads to young people taking the top ranked documents/sites and citing those. And that needs to be counteracted early in their learning so that it doesn’t carry through right into university (Rowlands 2008). In recent research (Brazier and Harvey 2017) ESOL postgraduates were unable to perceive their performance correctly, often judging their performance high when the opposite was true. The “Not Without Me” report highlighted this inability to self-assess across the wider range of young people. These groups are highly educated, so they should be able to be more reflective on their own practice.

So, in our research we are using a mixed methods approach: a quantitative analysis of secondary school-aged children’s information gathering behaviour, triangulated with qualitative assessments by the participants themselves. It is built around a simulated work task.

The search system is based on the TREC AQUAINT collection – a large set of over a million documents from three large news agencies, collected between 1996 and 2000 – with pre-defined search topics associated with the project. The initial 15 topics were reduced down to 4 topics selected by school representatives (a librarian and 2 teachers from Gracemount High School in Edinburgh).

So, we start with a pre-task questionnaire. The search task is “Tropical storms: What tropical storms (hurricanes and typhoons) have caused significant property damage and loss of life?”. The students can then search through a Google-style search of the documents, clicking on those sources that seem relevant. And then they get a post-task questionnaire to reflect on what they’ve done.

A pilot was conducted in December 2016. Tasks were randomly assigned, using a Latin Square design to ensure no 2 students had the same two tasks. In total 19 students were involved, from S3 (13-14 years old). The study was on PCs rather than handheld devices. No other demographic data was collected. The school representative did provide a (new) unique ID to match the tasks and the questionnaires; the ID was known only to the school rep, and no further personal data was taken.
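
As a quick illustration of how a Latin square balances task order – this is my own sketch, not the project’s actual allocation code, and all topic names other than tropical storms are placeholders:

```python
# Cyclic Latin square for task rotation: each topic appears in each
# position exactly once across the four orderings, so order effects
# are balanced. Topic names beyond "tropical storms" are placeholders.
topics = ["tropical storms", "topic 2", "topic 3", "topic 4"]

for student in range(len(topics)):
    order = topics[student:] + topics[:student]
    print(f"student {student + 1}: {order}")
```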

We could then look at the queries each student submitted, and were able to ask why they did that and why they selected the article they did.

This is a work in progress… We are interested in how they engage with the study as a whole. We have used the findings of the pilot to adapt the study design and interface, including relocating the task description to a more prominent location, and adding a physical instruction sheet covering e.g. the browser page and how to interpret the interface.

The main study takes place next week, with 100 students (none of whom were part of the pilot). From this we want to produce recommendations and guidelines for IL teaching; to inform professional practice; feedback to participants (a pamphlet) for reflective purposes; and academic publications in the fields of information literacy, information retrieval, education and pedagogy.

Q&A

Q1) Why was such a controlled space selected? Presumably students would normally use other places to search, ask friends, etc. So I wondered why you selected such a controlled space like this.

A1) In the previous study we allowed students to look anywhere on the web… But it is much harder to judge relevance in that setting… These documents have already been judged for relevance… Open search is a wide arc… It adds complexity to the whole process… And someone has to transcribe and mark that footage… For my previous study there were 29 students and it took 7 months; for 100 students that’s just too large. A test collection is also standardised and replicable.

The Digital Footprint MOOC – Nicola Osborne, Digital Education Manager, EDINA

This was me… No notes but slides to follow. 

Wikipedia & Information Literacy: the importance of reliable sources – Sara Thomas, Wikimedian in Residence, SLIC

Hi, I’m the Wikimedian in Residence at SLIC. The role of a Wikimedian in Residence is to work with cultural heritage organisations and Wikimedia, and bring the two together. In this role I am working with three local libraries right now, but we will be expanding to a wider Scottish context.

I am passionate about open knowledge and open data. Open data and open knowledge lead to a better society; they allow us to make better decisions – I am sick of us being asked to make big decisions about big issues without appropriate information.

Now, I want to introduce you to Bassel Khartabil, an open source software developer and advocate for open data and knowledge. Knowledge is power… He was detained by the Syrian government and, before he was killed by that government, he wrote very movingly about the impact of open knowledge – that it is so important, a matter of life and death in some contexts.

I want to talk about the production of knowledge and what that can teach us about information literacy. Jim Groom, at #OER16, said “Wikipedia is the single greatest Open Education Resource the world has ever known”, and he’s not wrong. Wikipedia is more accurate than you may think. There are groups who just edit and work on improving the quality of articles. Women in Red is a group dedicated to having more women’s biographies on Wikipedia: 17% of biographies are of women now, 2% more than was the case 2 years ago – and they also work on bringing those biographies up to “featured article” quality.

Articles sit on a quality and ratings scale. Vandalism is picked up quickly – by bots and by people. Wikipedia is neutral in its point of view. Nature, in 2005, found that Wikipedia was nearly as accurate as Britannica (2.92 errors per article for Britannica compared to 3.86 for Wikipedia). The Journal of Clinical Oncology, in 2010, found Wikipedia as accurate as Physician Data Query (a premium database). The medical readership is huge – 80% of medical students will use it; ~50% of GPs will use it as a first point in their search. It is the most popular health resource on the web.

Wikipedia is generally the seventh most popular site on the internet. And we have basic Notability guidance that means an article must be notable – there must be a reason for it being there. The information must be verifiable – it must come from credible, checkable sources. And we have to use reliable third party published sources with a reputation for fact checking and accuracy.

On the subject of media literacy… The Daily Mail didn’t like that Wikipedia doesn’t treat it as reliable – there is no outright ban, but you will get a prompt asking if that’s the right source. Brilliantly, they got loads of errors into their own outraged article.

Manipulation is really obvious… The community spots when people are trying to whitewash their own biographies, to promote their company, or to remove claims of misconduct. And Wikipedia gets it – there is an article called “Wikipedia is not a credible source” – we get it. We are a starting point, a jumping off and discovery point. And in fact we have Wiki Ed (https://wikiedu.org/) which works to combat fake news and to support information literacy. If you want to teach information literacy, Wikipedia can help you. We have the Wiki Education Dashboard – mainly in the US, but with lots in the UK too. Our guides include Instructor Basics and Case Studies for using Wikipedia in teaching. Some lovely projects here…

I did some work with Chris Harlow, at the University of Edinburgh, a few years ago… He found a medical term that wasn’t in Wikipedia, gave his students guidance on how to create a Wikipedia page, taught them how to use a medical database, and sent them away to write a section in simple language… Then we showed them how to edit an article. It’s really really easy to edit an article now… The students write their section, put it in… and the page goes live… Five minutes later it’s on the front page of Google. It is gratifying to find work so immediately valued, used and useful.
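
(As a side note on just how machine-readable Wikipedia is – this is my own quick sketch against the public MediaWiki API, not part of the teaching workflow described; the article title is purely an example:)

```python
import requests

# Fetch the plain-text introduction of an article via Wikipedia's
# public MediaWiki API (action=query with the TextExtracts extension).
resp = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={
        "action": "query", "format": "json", "prop": "extracts",
        "exintro": True, "explaintext": True,
        "titles": "Information literacy",  # example title only
    },
)
page = next(iter(resp.json()["query"]["pages"].values()))
print(page["extract"][:300])
```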

Translation Studies at UoE also use Wikipedia in the classroom. Queen Mary University of London use Wikipedia in their film classes – they trialled it, and it’s now a compulsory part of the programme. It’s a way to teach digital skills and information synthesis. Imperial College London are working to engage undergraduate students in synthesising and sharing knowledge. Greg Singh at Stirling University uses WikiBooks – a project that seeks to create collaboratively produced textbooks – getting students to produce a textbook, or a chapter, on what they’ve been doing… It’s about developing collaboration, tracking that, instilling that within a student…

So I have a video here of Aine Kavanagh from Reproductive Biology at the University of Edinburgh, who authored an article that has been read 20,000 times in the last year. Aine was looking for some extra work and wanted to develop her skills, so she asked Chris (Harlow) what she could do… She wrote about one of the most common sorts of cancers, on which there was very little information. Being able to see the value and the impact of that work has been hugely gratifying.

To conclude: open knowledge is important; open knowledge gives us a better society. Not just being able to find information, but also being able to produce knowledge, is hugely powerful. And Wikipedia is more accurate than you think!

Closing

Gillian: I just want to thank all of our speakers, to thank all of you for coming, and to thank the Scottish Government for hosting us.

Oct 042017
 

This afternoon I’m at the Keynote Session for Information Security Awareness Week 2017, where I’ll be speaking about Managing Your Digital Footprint in the context of security. I’ll be liveblogging the other keynotes this afternoon.

The event has begun with a brief introduction from Alistair Fenemore, UoE’s Chief Information Security Officer, and from his colleague David Creighton Offord, the organiser for today’s event.

Talk by John Whitehouse, PWC Cyber Security Director Scotland covering the state of the nation and the changing face of Cyber Threat

I work at PWC, working with different firms who are dealing with information security and cyber security. In my previous life I was at Standard Life. I’ve seen all sorts of security issues so I’m going to talk about some of the things I’ve seen, trends, I’ll explain a few key concepts here.

So, what is cybersecurity… People imagine people in basements with balaclavas… But it’s not that at all…

I have a video here…

(this is a Jimmy Kimmel comedy segment on the Sony hack where they ask people for their passwords, to tell them if it’s strong enough… And how they construct them… And/or the personal information they use to construct that…)


We do a lot of introductions for boards… We talk about technical stuff… But they laugh at that video and then you point out that these could all be people working in their companies…

So, there is technical stuff here, but some of the security issues are simple.

We see huge growth due to technology, and that speaks to businesses. We are going to see 1 billion connected devices by 2020, and that could go really, really wrong…

There is real concern about cyber security, and businesses have concerns about areas including cloud computing. The Internet of Things is also a concern – one study found that the average connected device has 25 security vulnerabilities. Dick Cheney had to have his pacemaker reprogrammed because it was vulnerable to hacking via Bluetooth. An NHS hospital in England had to pause a heart surgery when its software restarted. We have hotel rooms accessible via phones – that will come to homes… There are vulnerabilities in connected pet feeders, for instance.

Social media is used widely now… In the TalkTalk breach, news of the breach leaked via speculation just 20 seconds after the breach occurred – that’s a big challenge to business continuity planning, where one used to assume you’d have perhaps a day’s window.

Big data is coming with regulations and threats… Equifax lost over 140 million records – and executives dumped significant stock before the news went public, which brings a different sort of scrutiny.

Morrisons were sued by their employees over data leaked by a disgruntled member of staff – I predict that big data loss could be the new PPI, as mass claims for data loss take place: maybe £1000 per customer per data breach… We run a threat intelligence service, looking on the dark net for breached data, and we already see interest in that type of PPI-style class action approach.
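
To put that in perspective – a back-of-envelope calculation of my own, combining the speaker’s illustrative £1000 figure with an Equifax-scale breach:

```python
# Illustrative only: PPI-style payouts at breach scale, using the
# talk's figures (140 million records, ~£1000 per affected customer).
records = 140_000_000
payout_per_customer = 1_000  # GBP, the speaker's hypothetical figure
print(f"£{records * payout_per_customer:,}")  # £140,000,000,000
```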

The cyber challenge extends beyond the enterprise – onshore and offshore, 1st through to 4th parties. We’ve done work digging into technology components and where they are from… It’s a nightmare to know who all your third parties are, and a real challenge to address.

So, who should you be worried about? Threat actors vary… We have accidental loss, malware that is not targeted, and hacker hobbyists at the lowest levels of sophistication, through to state sponsored attacks at the highest level. Sony were allegedly breached by North Korea – that firm spends astronomical amounts on security and even that isn’t totally robust. Target lost 100 million credit card details through a third party air conditioning firm, which a hacker used to get into the network – that’s how the loss occurred. And when we talk organised crime we are talking about really organised crime… One Ukrainian organised crime group was offering a Ferrari as an employee of the month prize for malware writers. We are talking seriously organised, and serious financial gain. And it is extremely hard to trace that money once it’s gone. And we see breaches going on and on and on…

Equifax is a really interesting one. There are 23 class action suits already around that one, and that’s the tip of the iceberg. There has been a lot of talk of big organisations going under because of cyber security, and when you see these numbers for different companies, that looks increasingly likely. Major attacks lead to real drops in share prices and real impacts on the economy. And there are tangible and intangible costs to any attack… from investigation and remediation through to CEOs and CTOs losing their jobs or facing prison time – at that level you can be personally liable in the event of an attack.

In terms of the trends… 99% of exploited vulnerabilities (in 2014) had been identified for more than a year, some as far back as 1999. Wannacry was one of these – firms had 2 months notice and the issues still weren’t addressed by many organisations.

When we go in after a breach, typically the breach has been taking place for 200 days already – and that’s the breaches we find. That means the attacker has had access and has been able to explore the system for that long. This is very real and firms are dealing with this well and really badly – some real variance.

One example – the most successful bank robbery of all time – was the attack on the Bangladesh Central Bank in Feb 2016, through the SWIFT network. The fraudulent instructions totalled over US $900 million, mostly laundered through casinos in Macau. The analysis identified that the malware was tailored for the target organisation based on the printers they were using, and scrubbed all entry and exit points in the bank. The US Secret Service found that there were three groups involved – two inside the bank, one outside executing the attack.

Cyber security concerns are being raised, but how can we address this as organisations? How do we invest in the right ways? What risk is acceptable? One challenge for banks is that they are being asked to use fintechs and SMEs working in technology… But some of these startups are very small, and that’s a real concern for heads of security in banks.

We do a global annual survey on security, across about 10,000 people. We ask about the source of compromise – current employees are the biggest by some distance. And current customer data, as well as IPR, tend to be the data most at risk. We also see Health and Social Care adopting more technology, and having high concern, but spending very little to counter the risks. So, with Wannacry, the NHS was not well set up to cope and the press loved the story… but the NHS wasn’t the target in any way.

A few Mythbusters for you…

Anti-virus software… We create malware to test our clients’ set ups – we write malware that avoids AVs. Only 10-15% of malware will be caught by anti-virus software. There is an open source tool, Veil-Framework, that teaches you how to write that sort of malware so that you can understand the risks. You should be using AV, but you have to be aware that malware goes beyond it (and impacts Macs too)… There is a malware SaaS business model on the darknet – as an attacker you’ll get a guarantee for your malware’s success, and support to use it!

Myth 2: we still have time to react. Well, no, the lag from discovery to impacting you and your set up can be minutes.

Myth 3: well, it must have been a zero day that got us! True zero day exploits are extremely rare and valuable. An attacker won’t use one unless the target is very high value and they have no other option – they are hard to use. Even the NSA admits that persistence, not zero day exploits, is key to successful compromise. The NSA created EternalBlue – a zero day exploit – and even that was breached and leaked, ending up deployed against the “good guys” as Wannacry.

Passwords… They are a thing of the past, I think; 2-factor authentication is more where we are at. Passphrases, and the strength of passphrases, are key – complex strings with a number and a site name at the end are what’s recommended these days. Changing your password every 30 days isn’t that useful – a weak password is easy to bruteforce if it’s lost – much better to have a really strong passphrase in the first place.

Phishing email is huge – we think about 80% of cyber attacks start that way. Beware spoofed addresses, or extremely small changes to email addresses.

We had a client that had an email from their “finance director” about urgently paying money to an account, which was only spotted because someone in finance noticed the phrasing… “the chief exec never says “Thanks”!”

Malware trends: our strong view is that you should never, ever pay the ransom in a ransomware attack.

I have another video here…

(In this video we have people having their “mind read” for some TV show… It was uncanny… And included spending data… But it wasn’t psychic… It was data that they had looked up and discovered online… )


It’s not a nice video… This is absolutely real… This whole digital footprint. We do a service called Digital Footprinting for senior execs in companies, and you have to be careful, as they can give so much away by what they and those around them post… It’s only getting worse and more pointed. There are threat groups going for higher value targets, looking for disruption. We think that the Internet of Things will open up the attack surface in whole new ways… And NATS – the air traffic people – are thinking about drones and the issues there around fences and airspace… How do you prepare for this? Take the connected home… These fridges are insecure: you can detect whether the fridge is opened or not, and so whether the owners are at home or not… The nature of threats is changing so much…

In terms of trends, the attacks are moving up the value chain… Retail bank clients aren’t interesting compared to banks’ finance systems, and even more so exchanges or clearing houses. It’s about the value of data… Data is maybe $0.50 for email credentials; a driving license is maybe $25… and upwards the price goes, depending on value to the attackers…

So, a checklist for you and your work: (missed this but delighted that digital footprint was item 1)

Finally, go have a look at your phone and how much data is being captured about you… Check your iPhone’s Frequent Locations; on Android, check Google Location History. The two biggest companies in the world, Google and Facebook, are free, and they are free because of all the data that they have about you… But the terms of service… PayPal’s are longer than Hamlet. If you have a voice controlled TV from Samsung and you accept those terms, you agree to always-on listening, sharable with third parties…

So, that’s me… Hopefully that gave you something to ponder!

Q&A

Q1) What does PWC think about Deloitte’s recent attack?

A1) Every firm faces these threats, and we are attacked all the time… We get everything thrown at us… And we try to control those but we are all at risk…

Q2) What’s your opinion on cyber security insurance?

A2) I think there is a massive misunderstanding in the market about what it covers… Some policies just cover recovery – getting a response firm in… When you look at Equifax, what would that cover? Claims at that scale would put insurers out of business. I think we’ll see government backed insurance for things like that, with clarity about what is included and what is out of scope. So if, say, SQL injection is the cause, that’s probably negligence and out of scope…

Q3) What role should government have in protecting private industry?

A3) The national cyber security centre is making some excellent progress on this. Backing for that is pretty positive. All of my clients are engaging and engaged with them. It has to be at that level. It’s too difficult now at lower levels… We do work with GCHQ sharing information on upcoming threats… Some of those are state sponsored… They even follow working hours in their source location… Essentially there are attack firms…

Q4) (I’m afraid I missed this question)

A4) I think Microsoft in the last year have transformed their view… My honest view is that clients should be on Windows 10 – it’s a gamechanger for security. Firms will do analysis on patches and service impacts… but they delayed that a bit long. I have worked at a firm with a massively complex infrastructure: it sounds easy to patch, but it can be quite difficult to do in practice, and it can put big operational systems at risk. As a multinational bank, for instance, you might be rolling out to huge numbers of machines and applications.

Talk by Kami Vaniea (University of Edinburgh) covering common misconceptions around information security and how to avoid them

My research is on the usability of security and why some failings are happening from the point of view of an average citizen. I do talks to community groups – so this presentation is a mixture of that sort of content and proper security discussion.

I wanted to start with misconceptions we have as system administrators… So I have a graph here: the range where there is value in improving your password; the range in which rate limits on password attempts protect you; and the small area where there is real benefit to the user. Outside those benefits you are in the dead zone.

OK, a quick question about URL construction… http://facebook.mobile.com – is it Facebook’s website, Facebook’s mobile site, AT&T’s website, or mobile.com’s website? By construction it’s the last one (and both of the last two if you know AT&T own mobile.com). When you ask a big audience they mainly get it right, but only 8% can correctly differentiate http://facebook.profile.com vs http://profile.facebook.com. Many users tend to just pick a big company name regardless of its location in the URL; only a few know how to correctly read subdomains. We did this study on Amazon Mechanical Turk – so that’s a skewed sample of more technical people. And that URL (mis)understanding has huge problematic implications for phishing email.

We also tried http://twitter.com/facebook.com. Most people could tell that was Twitter (not Facebook). But if I used “@” instead of “/” people didn’t understand, thought it was an email…
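
(It’s easy to check how a browser actually reads these – a quick Python sketch of my own using the standard library, not something from the talk:)

```python
from urllib.parse import urlsplit

# The hostname reads right to left: the registrable domain comes last,
# so facebook.mobile.com belongs to mobile.com, not to Facebook.
# Note the "@" case: everything before "@" is userinfo, so the browser
# actually goes to facebook.com.
for url in ["http://facebook.mobile.com",
            "http://profile.facebook.com",
            "http://twitter.com/facebook.com",
            "http://twitter.com@facebook.com"]:
    print(f"{url:40} -> host: {urlsplit(url).hostname}")
```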

On the topic of email… Can we trust the “from” field? No. Can we trust a “this email has been checked for viruses…” box? No. Can you trust the source URL for a link in the email, as shown at the bottom of the browser? Yes.
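
(To see why the “from” field proves nothing, here’s a minimal sketch of my own – the addresses are made up – showing that it’s just an unchecked text header; any verification comes from layers like SPF/DKIM/DMARC on top:)

```python
from email.message import EmailMessage

# The From: header is plain text chosen by whoever builds the message;
# nothing at this layer verifies it.
msg = EmailMessage()
msg["From"] = "security@yourbank.example"  # fabricated example address
msg["To"] = "victim@example.org"
msg["Subject"] = "Urgent: verify your account"
msg.set_content("The From: line above was never checked.")
print(msg)
```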

What about this email – a Security alert for your linked Google account email? Well this is legitimate… Because it’s coming from accounts.google.com. But you knew this was a trick question… Phishing is really tricky…

So, a shocking percentage of my students think that “from” address is legitimate… Tell your less informed friends how easily that can be spoofed…

What about Google. Does Google know what you type as you type it and before you hit enter? Yes, it does… Most search engines send text to their servers as you write it. Which means you can do fun studies on what people commonly DON’T post to Facebook!

A very common misconception is that opening web pages, emails, pdfs, and docs is like reading physical paper… So why do they need patching?

Let’s look at an email example… I don’t typically get emails with “To protect your privacy, Thunderbird has blocked remote content in this message” from a student… This showed me that a 1 pixel invisible image had come with the email, which pinged the server if I opened it. I returned the email and said he had a virus. He said “no, I used to work in marketing and forgot that I had that plugin set up”.
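
(For the curious, a tracking pixel like that needs very little machinery – a minimal sketch of my own using Flask, with a hypothetical /pixel.gif route:)

```python
import base64
import datetime
import io

from flask import Flask, request, send_file

app = Flask(__name__)

# A 1x1 transparent GIF, decoded once at import time.
PIXEL = base64.b64decode(
    b"R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

@app.route("/pixel.gif")
def pixel():
    # Any request here means a mail client rendered the image,
    # i.e. someone opened the email that embedded it.
    print(f"{datetime.datetime.now()}: opened from {request.remote_addr}")
    return send_file(io.BytesIO(PIXEL), mimetype="image/gif")
```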

Websites are made of many elements from many sources. Mainly dynamically… And there are loads of trackers across those sites. There is a tool called Lightbeam that will help you track the sites you go to on purpose, and all the other sites that track you. That’s obviously a privacy issue. But it is also a security problem. The previous speaker spoke about supply chains at Target, this is the web version of this… That supply chain gets huge when you visit, say, six websites.

So, a quiz question… I go to Yahoo, I hit reload… Am I running the same code as a moment ago? Well, it’s complicated… I had a student run a study on this, on how much changes… In a week, about half of the top 200 sites had changed their JavaScript. I see trackers change between individual reloads… It might change, it might not…

So, as users we access a first party website, which then accesses third party sites… It accesses ad servers, the ad network sells that user, and an ad is returned, with an image (and sometimes code). Maybe one company bids, then bids out again… This is huge as a supply chain and tracking issue…

The Washington Post, for instance, covering the yahoo.com malware attack, showed that malicious payloads were being delivered to around 300k users per hour, but only about 9% (27k) of those users per hour were affected – they were the ones who hadn’t updated their systems. How did that attack take place? Rather than attacking Yahoo, the attackers just bought an ad and ran malware code in it.

There is a tool called Ghostery… It’s brilliant and useful… But it’s run by the ad industry and all the trackers are set the wrong way. Untick those all and then it’s fascinating… They tell you about page load and all the components involved in loading a page…

To change topic…

Cookies! Yes, they can be used to track you across websites. But they can’t give you malware as-is. So… I will be tackling the misconception that cookies are evil… and I’m going to try to convince you otherwise. Tracking can be evil… But cookies are kind of an early example of privacy by design…

It is 1994. The internet cannot remember anyone between page loads. You have an interaction with a web server that has absolutely no memory. Cookies help something remember between page loads and web pages… Somehow a server has to know who you are… But back in 1994 you just open a page and look at it, that’s the interaction point…

But companies wanted shopping baskets, and memory between two page reloads. There is an obvious technical solution… You just give every browser a unique identifier… Great! The server remembers you. But the problem is a privacy issue across different servers… So, Netscape implemented cookies – small text strings the server could ask the browser to remember and give back to it later…
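
(The mechanics are tiny – here’s a sketch of my own of that header exchange, using Python’s standard library; the site name is invented:)

```python
from http import cookies

# Server side: ask the browser to remember a small string.
jar = cookies.SimpleCookie()
jar["session"] = "abc123"
jar["session"]["domain"] = "shop.example"  # invented site
print(jar.output())  # -> Set-Cookie: session=abc123; Domain=shop.example

# Browser side: on every later request to shop.example it sends back
#   Cookie: session=abc123
# and that one string is all the "memory" the server gets between loads.
```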

Cookies have some awesome properties: they are client visible; third party tracking is client visible too; there is an opt out (delete) option on a per-site basis; a cookie is only readable by the site that set it; and they allow for public discussion of tracking…

… Which is why Android/iOS both went with the unique ID option. And that’s how you can be tracked. As a design decision it’s very different…

Now to some of the research I work on… I believe in getting people to touch stuff, to interact with it… We can talk at each other, or mystify, but we need to actually have people understand this stuff. So we ran an outreach activity to build a website, create a cookie, and then read the cookie out… Then I give them a second website… to let people work out how to change their name on one site but not the other… what happens when you view them in Incognito mode… and then to explore cookies across sites, and how that works…

Misconception: VPNs solve all privacy and security problems. Back at Indiana I taught students who couldn’t code… And that was interesting… They saw VPNs as magic fairy dust. And they had absorbed this idea that anyone can be hacked at any time… They got that… But that had resulted in “but what’s the point”. That worries me… In the general population we see media coverage of attacks on major companies… And the narrative that attacks are inevitable… So you end up with this problem…

So, I want to talk about encryption, why it can be broken, and what that means for VPNs. I’m not an encryption specialist; I care about how it works for the user.

In encryption we want (1) communication between you and the other party to be confidential and unchanged – no one can read what you sent and no one can change it; and (2) to know who we are talking to. And that second part is where things can be messed up. You can make what you think is a secure connection to the right person, but it could be a secure connection to the wrong person – a man in the middle attack. A real world example… You go to a coffee shop and use the wifi to request the BBC news site, but you get a wifi login page instead. That’s essentially a man in the middle attack. It’s not perhaps harmful – it’s normal operating procedure… And VPNs basically work like this…

So, an example of what really happened with a student… I set an exercise that just had them creating a very simple cookie page… I was expecting something simple… But one of them submitted a page with a bit of JavaScript in it – code injected by their VPN, so that anyone connecting to the page gets served an ad. The student had logged in to AnchorFree – magic fairy dust – and that injected ad code is what I saw when they submitted the page in Blackboard Learn…

VPNs are not magic fairy dust. The University runs an excellent VPN – far better for coffee shops etc!

So, I like to end with some common advice:

  • Install an anti-virus scanner. Don’t turn off the AV software automatically installed with Windows 8+… I ran a study where 50% of PhD students had switched off that software and their firewalls…
  • Keep your software updated – best way to stay safe
  • Select a strong passcode for the important things you use all the time
  • For less important things that you use rarely, use a password manager… Best to have different passwords between them…
  • Software I use:
    • Ad blockers – not just ads, reduce lots of extra content loading. The more websites you visit the more vulnerable you are
    • Ghostery and Privacy Badger
    • Lightbeam
    • Password managers (LastPass, 1Password and KeePass are the most recommended)
    • 2-factor like Yubikey – extra protection for e.g. Facebook.
    • If you are really serious: uMatrix and NoScript – BUT they will break lots of pages…

Q&A

Q1) It’s hard to get an average citizen to do everything… How do you get around that and just get the key stuff across…

A1) Probably it’s that common advice. The security community has gotten better at identifying the top 10 things. Google did a study with Black Hat infosec conference attendees about what they would do, and asked on Amazon Mechanical Turk what people would recommend to friends. About the only common answer amongst the Black Hat experts was “update your software”. But actually there is overlap… People know they should change passwords, and should use AV software… But AV software didn’t show on the Black Hat list – while 2-factor and password managers did…

Q2) What do you think about passwords… long or complex or?

A2) We did a study maybe 8 years ago on mnemonic passwords… And found that “My name is Inigo Montoya, you killed my father, prepare to die” was by far the most common. The issue isn’t length… It’s entropy. I think we need to think server side about how many other users have used the same password (based on encrypted version), and you need something that less than 3 people use…

Q2) So more about inability to remember it…

A2) And it depends on threat type… If someone knows you, your dog, etc… Then it’s easier… But if I can pick a password for a long time I might invest in it – but if you force people to change passwords they have to remember it. There was a study that people using passwords a lot use some affirmations, such as “I love God”… And again, hard to know how you protect that.
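
(On that entropy point – a rough sketch of my own of how the numbers fall out, under the generous assumption that characters are chosen independently, which human-chosen passwords never satisfy:)

```python
import math

# Upper-bound entropy estimate: length * log2(alphabet size) bits.
# Real human-chosen passwords sit far below this bound.
def entropy_bits(length: int, alphabet_size: int) -> float:
    return length * math.log2(alphabet_size)

print(entropy_bits(8, 94))   # 8 printable-ASCII chars: ~52 bits
print(entropy_bits(28, 27))  # 28-char lowercase passphrase with spaces: ~133 bits
```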

Q3) What about magic semantic email links instead of passwords…

A3) There is some lovely work on just how much data is in your email… That’s a poor man’s version of the OAuth idea of getting an identity provider to authenticate the user. It’s good for the user, but it makes that one login much higher stakes… And we see SMS also being a mixed bag, and subject to attack… Ask a user though… “there’s nothing important in my email”.

Q4) How do you deal with people saying “I don’t have anything to hide”?

A4) Well, I start with it not being about hiding… It’s more: why do you want to know? When I went to buy a car I didn’t dress like a professor, I dressed down… I wanted a good price… If I have a lot of time I refer them to Daniel Solove’s Nothing to Hide.

Talk by Nicola Osborne (EDINA) covering Digital Footprints and how you can take control of your online self

And that will be me… So keep an eye out for tweets from others on the event hashtag: #UoEInfoSec.

And with a very brief summing up from Alistair Fenemore, the day came to a close. Thanks to the lovely University Information Security team for organising this really interesting event (and inviting me to speak) as part of their awesome Information Security Awareness Week programme.

Aug 032017
 

Today I am at Repository Fringe which runs today and tomorrow in Edinburgh and is celebrating 10 years of Repofringe! I’m just here today – presenting a 10×10 on our recent Reference Rot in Theses: A HiberActive Pilot project work – and will be blogging whilst I’m here. As usual, as this is live, may include the odd typo or error so all comments, corrections, questions, additions, etc. are very much welcomed!

Welcome – Janet Roberts, Director of EDINA

My colleagues were explaining to me that this event came from an idea from Les Carr that there should be not just one repository conference, but also a fringe – and here we are at the 10th Repository Fringe, on the cusp of the Edinburgh Fringe.

So, this week we celebrate ten years of repository fringe, and the progress we have made over the last 10 years to share content beyond borders. It is a space for debating future trends and challenges.

At EDINA we established the OpenDepot to provide a space for those without an institutional repository… That has now migrated to Zenodo… and the challenges are changing, around the size of data, how we store and access that data, and what those next generation repositories will look like.

Over the next few days we have some excellent speakers as well as some fringe events, including the Wiki Datathon – so I hope you have all brought your laptops!

Thank you to our organising team from EDINA, DCC and the University of Edinburgh. Thank you also to our sponsors: Atmire; FigShare; Arkivum; ePrints; and Jisc!

Opening Keynote – Kathleen Shearer, Executive Director COAR: Raising our game – repositioning repositories as the foundation for sustainable scholarly communication

Theo Andrew: I am delighted to introduce Kathleen, who has been working in digital libraries and repositories for years. COAR is an international organisation of repositories, and I’m pleased to say that Edinburgh has been a member for some time.

Kathleen: Thank you so much for inviting me. It’s actually my first time speaking in the UK and it’s a little bit intimidating as I know that you folks are really ahead here.

COAR is now about 120 members. Our activities fall into four areas: presenting an international voice, so that repositories are part of a global community with diverse perspectives; being more active in training for repository managers, something especially important in developing countries; and value added services, which is where today’s talk on the repository of the future comes in. The vision here is about…

But first, a rant… The international publishing system is broken! And it is broken for a number of reasons – there is access, and the cost of access. The cost of scholarly journals goes up far beyond the rate of inflation. That touches us in Canada – where I am based – in Germany, in the UK… but much more so in the developing world. And then we have the “Big Deal”. A study of University of Montreal libraries by Stephanie Gagnon found that of 50k subscribed-to journals, only 5,893 were unique essential titles. But often those deals aren’t opted out of, as the key core journals bought separately would cost the same as the big deal.

We also have a participation problem… Juan Pablo Alperin’s map of authors published in Web of Science shows a huge bias towards the US and the UK, and seriously reduced participation in Africa and parts of Asia. Why does that happen? The journals are operated from the global North, and don’t represent the kinds of research problems in the developing world. And one Nobel Prize winner notes that the pressure to publish in “luxury” journals encourages researchers to cut corners and pursue trendy fields rather than areas where there are research gaps. That was the case with the Zika virus – you could hardly get research published on it until a major outbreak brought it to the attention of the dominant publishing cultures; then there was huge appetite to publish there.

Timothy Gowers talks about “perverse incentives” which are supporting the really high costs of journals. It’s not just a problem of how researchers publish, it’s also a problem of how we incentivise researchers to publish. So, this is my goats in trees slide… It doesn’t feel like goats should be in trees… Moroccan tree goats learn to climb the trees when there isn’t food on the ground… I think of the researchers able to publish in these high end journals as the lucky goats in the tree here…

In order to incentivise participation in high end journals we have created a lucrative publishing industry. I’m sure you’ve seen the recent Guardian article: “Is the staggeringly profitable business of science publishing bad for science?”. Yes – for those reasons of access and participation. We see very few publishers publishing the majority of titles, and there is a real…

My colleague Leslie Chan, funded by the International Development Council, talked about openness not just being about gaining access to knowledge but also about having access to participate in the system.

On the positive side… Open access has arrived. A recent study (Piwowar et al 2017) found that about 45% of articles published in 2015 were open access. And that is increasing every year. And you have probably seen the May 27th 2016 statement from the EU that all research they fund must be open by 2020.

It hasn’t been a totally smooth transition… APCs (Article Processing Charges) are very much in the mix and part of the picture… Some publishers are trying to slow the growth of open access, but they can see that it’s coming and want to retain their profit margins – they want to move to all APCs. There is discussion here… There is a project called OA2020 which wants to flip from subscription based to open access publishing. It has some traction but there are concerns, particularly about the sustainability of scholarly comms in the long term. And we are not sure that publishers will go for it… particularly one of them (Elsevier), which exited talks in The Netherlands and Germany. In Germany the tap was turned off for a while for Elsevier – and there wasn’t a big uproar from the community! But the tap has been turned back on…

So, what will the future of open access look like? If you look across APCs and their average value… and think about the relative value of journals, especially the value of high end journals… I don’t think we’ll see APC increases slowing in the future.

At COAR we have a different vision…

Lorcan Dempsey talked about the idea of the “inside out” library. Similarly the new MIT Future of Libraries Report – published by a broad stakeholder group that had spent 6 months working on a vision – came up with the need for libraries to be an open, trusted, durable, interdisciplinary, interoperable content platform. So, like the inside out library, it’s about collecting the output of your organisation and making it available to the world…

So, for me, if we just collect articles… we perpetuate the system and are not in a position to change it. So how do we move forward while being somewhat reliant on that system?

Eloy Rodrigues, at Open Repositories earlier this year, asked whether repositories are a success story. They are ubiquitous, adopted and networked… But they also use old, pre-web technologies; are mostly passive recipients; have limited interoperability, making value added services hard; and are not really embedded in researcher workflows. These are the kinds of challenges we need to address in the next generation of repositories…

So we started a working group on Next Generation Repositories to define new technologies for repositories. We want to position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication, on top of which we want to be able to add layers of value added services. Our principles include distributed control to guard against failure, change, etc. We want this to be inclusive, reflecting the needs of research communities in the global south. And we want intelligent openness – we know not everything can be open.

We also have some design assumptions, with a focus on the resources themselves, not just associated metadata. We want to be pragmatic, and make use of technologies we have…

To date we have identified major use cases and user stories, and shared those. We determined functionality and behaviours, and a conceptual model. At the moment we are defining specific technologies and architectures, and we will publish recommendations in September 2017. We then need to promote the recommendations widely and encourage adoption and implementation, as well as the upgrade of repositories around the world (a big challenge).

You can view our user stories online, but I’d like to talk about a few of them… We would like to enable peer review on top of repositories… to slowly, incrementally replace what researchers do. That’s not building peer review into repositories, but as a layer on top. We also want some social functionalities, like recommendations. And we’d like standard usage metrics across the world, to understand what is used and how – we are looking to the UK and the IRUS project there, as that has already been looked at here. We also need to address discovery… Right now we use metadata rather than indexing full text content, so content can be hard to get to unless the metadata is obvious. We also need data syncing, so that hubs, indexing systems, etc. reflect changes in the repositories. And we also want to address preservation – that’s a really important role that we should do well, and it’s something that can set us apart from the publishers: preservation is not part of their business model.

So, this is a slide from Petr Knoth at CORE – a repository aggregator – who talks about expanding the repository, and the potential to layer all of these additional services on top.

To make this happen we need to improve the functionality of repositories: to be of the web, and not just on the web. But we also need to step out of the article paradigm… The whole system is set up around the article; we need to think beyond that, deposit other content, and ensure those research outputs are appropriately recognised.

So, we have our (draft) conceptual model… It isn’t about siloed individual repositories, but about a whole network. And we have some draft recommendations for technologies for next generation repositories – a really early view… Things like: ResourceSync; Signposting; messaging protocols; message queues; the IIIF Presentation API; OAuth; Webmention; and more…
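
To give a flavour of one of those, Signposting works through ordinary HTTP Link headers – this is a sketch of my own with an invented repository URL, not COAR’s reference implementation:

```python
import requests

# A Signposting-enabled landing page advertises machine-readable
# relations (e.g. rel="cite-as" for the persistent identifier,
# rel="item" for content files) in its HTTP Link header.
r = requests.head("https://repository.example/item/123",
                  allow_redirects=True)
print(r.headers.get("Link"))
# e.g. <https://doi.org/10.1234/abc>; rel="cite-as",
#      <https://repository.example/item/123.pdf>; rel="item"
```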

Critical to the success of this is the widespread adoption of the behaviours and functionalities for next generation repositories. It won’t be a success if only one software platform or approach takes these on. So I’d like to quote a Scottish industrialist, Andrew Carnegie: “strength is derived from unity…”. We need to coalesce around a common vision.

And it isn’t just about a common vision: science is global and networked, and our approach has to reflect and connect with that. Repositories need to balance a dual mission: to (1) showcase and provide access to institutional research and (2) be nodes in a global research network.

To support better networking of repositories, in May in Venice we signed an International Accord for Repository Networks, with networks from Australasia, Canada, China, Europe, Japan, Latin America, South Africa and the United States. For us there is a question about how best we work with the UK internationally. We work with OpenAIRE, but maybe we need something else as well. The networks across those areas are advancing at different paces, but all have committed to move forward.

There are three areas of that international accord:

  1. Strategic coordination – to have a shared vision and a stronger voice for the repository community
  2. Interoperability and common “behaviours” for repositories – supporting the development of value added services
  3. Data exchange and cross regional harvesting – to ensure redundancy and preservation. This has started but there is a lot to do here still, especially as we move to harvesting full text, not just metadata. And there is interest in redundancy for preservation reasons.

So we need to develop the case for a distributed, community-managed infrastructure that will better support the needs of diverse regions, disciplines and languages. Redundancy will safeguard against failure, and there is less risk of commercial buy out. It places the library at the centre… But… I appreciate it is much harder to sell a distributed system… We need branding that really attracts researchers to take part and engage in the system…

And one of the things we want to avoid… Yesterday it was announced that Elsevier has acquired bepress. bepress is mainly used in the US, and there will be much thinking about the implications for their repositories. So not only should institutional repositories be distributed, they should be on different platforms – and different open source platforms at that…

Concluding thoughts here… Repositories are a technology, and technologies change. What it’s really promoting is a vision in which institutions, universities and their libraries are the foundational nodes in a global scholarly communication system. This is really the future of libraries in the scholarly communication community. This is what libraries should be doing; this is what our values represent.

And this is urgent. We see Elsevier consolidating, buying platforms, trying to control publishers and the research cycle; we really have to move forward and move quickly. I hope the UK will remain engaged with this, and I look forward to your participation in our ongoing dialogue.

Q&A

Q1 – Les Carr) I was very struck by that comment about the need to balance the local and the global I think that’s a really major opportunity for my university. Everyone is obsessed about their place in the global university ranking, their representation as a global university. This could be a real opportunity, led by our libraries and knowledge assets, and I’m really excited about that!

A1) I think the challenge around that is trying to support common values… If you are competing with other institutions, it’s not always an incentive to adopt systems with common technologies, measures and approaches. So there needs to be a benefit for institutions in joining this network. It is a huge opportunity, but we have to show the value of joining that network. It’s maybe easier in the UK, Europe and Canada; in the US they don’t see that value as much… They are not used to collaborating in this way, and have been one of the hardest regions to bring onboard.

Q2 – Adam Field) Correct me if I’m wrong… You are talking about a Commons… In some way the benefits are watered down as part of the Commons, so how do we pay for this system, how do we make this benefit the organisation?

A2) That’s where I see that challenge of the benefit. There has to be value… That’s where value added systems come in… So a recommender system is much more valuable if it crosses all of the repositories… That is a benefit and allows you to access more material and for more people to access yours. I know CORE at the OU are already building a recommender system in their own aggregated platform.

Q3 – Anna Clements) At the sharp end this is not a problem for libraries, but a problem for academia… If we are seen as librarians doing things to or for academics that won’t have as much traction… How do we engage academia…

A3) There are researchers keen to move to open access… But it’s hard to represent what we want to do at a global level when many researchers are focused on that one journal or area and making that open access… I’m not sure what the elevator pitch should be here. I think if we can get to that usage statistics data, that will help… If we can build an alternative system that even research administrators can use in place of the impact factor or Web of Science, that might move us forward in terms of showing this approach has value. Administrators are still stuck having to evaluate the quality of research based on journals and impact factors. This stuff won’t happen in a day, but having standardised measures across repositories will help.

So, one thing we’ve done in Canada with the U15 (top 15 universities in Canada)… They are at the top of what they can do in terms of the cost of scholarly journals so they asked us to produce a paper for them on how to address that… I think that issue of cost could be an opportunity…

Q4) I’m an academic, and we are looking for services that make our lives better… Here at Edinburgh we can see that the library is naturally the consistent point of connection with the repository. Does that translate globally?

A4) It varies globally. Libraries are fairly well recognised in Western countries. In the developing world there are funding and capacity challenges that make that harder… There is also a question of whether we need repositories for every library… Can we do more consortial repositories or similar?

Q5 – Chris) You talked about repositories supporting all kinds of materials… And how they can “wag the dog” of the article…

A5) I think with research data there is so much momentum there around making data available… But I don’t know how well we are set up with research data management to ensure data can be found and reused. We need to improve the technology in repositories. And we need more resources too…

Q6) Can we do more to encourage academics, researchers, students to reuse data and content as part of their practice?

A6) I think the more content we have at the Commons level, the more it can be reused. We have to improve discoverability, and improve the functionality to help that content be reused… There is huge machine reuse of content – I was speaking with Petr Knoth about this – but that isn’t easy to do with repositories…

Theo) It would be really useful to see Open Access buttons more visible, using repositories for document delivery, etc.

Chris Banks, Director of Library Services, Imperial College: Focusing upstream: supporting scholarly communication by academics

Gavin MacLachlan: I’d just like to welcome you again to Edinburgh, our beautiful city and our always lovely weather (note for remote followers: it’s dreich and raining!). I’m here to introduce Chris, whose work with LIBER and LERU will be well known to you.

Chris: This is my first fringe and I find it quite terrifying that I’m second up! Now, I’m going to go right back to basics and policy…

The Finch report in 2012 and Research Councils UK: we had RCUK policy; funding available for immediate Gold OA (including hybrid); embargo limits apply where Green OA chosen. Nevertheless the transition across the world is likely to take a number of years. For my money we’ve moved well on repositories, partly as the UK has gone it alone in terms of funding that transition process.

In terms of REF we had the Funding Council REF policy (2013), which is applicable to all outputs that are to be submitted to the post-2014 REF exercise – effectively covering all researchers. No additional funding was available; where Green OA is selected, there is a requirement to use repositories. There were also two paragraphs (15 and 26) shaping what we have been doing…

That institutions are encouraged to go beyond the minimum (and will receive credit for doing so) – and the visibility of that is where we see the rise of University presses. And the statement that repositories do not need to be accessible for reuse and text mining, but that, again, there will be credit for those that are. Those two paragraphs have driven what we’ve been doing at Imperial.

At the moment UK researchers face the “policy stack” challenge. There are many funder policies; the REF policy differs substantially from other policies and applies to all UK research academics – you can comply with RCUK policy and fall foul of REF; many publisher policies…

So how can the REF policy help? Institutions recognise that IP, copyright and open access policies are not necessarily supporting funder compliance – something needs to be done. There is a variety of approaches to academic IP observed in UK institutions. Legally in the UK the employer is the first copyright holder… subject to any other agreements and unless the individual is a contractor, etc.

Publishers have varying approaches to copyright, from licence to first publish, to outright copyright transfer. Licences are not read by academics. And it's not just in publishing… It's social media… It's a big problem.

For the library we want to create frictionless services. We need to upscale services to all researchers – REF policy requirements. We can't easily give an answer to researchers on their OA options. So we started our work at Imperial to address this, and to ensure our own organisational policy aligned with funder policies. We also wanted to preserve academic choice over publishing, and the ability to sign away rights when necessary (though encouraging scrutiny of licences). We have a desire to maximise the impact of publication. And there is a desire to retain some re-use rights for us in teaching etc., including rights to diagrams etc.

The options we explored with academics ranged from doing as we do at the moment – with academics signing over copyright – through to the institution claiming all copyright on all academic outputs. And we wanted to look at two existing models in between: the SPARC model (the academic signs copyright over to the publisher but licenses it back); and the Harvard model – which we selected.

The Harvard model is implemented as part of the university OA policy. Academics deposit Author Accepted Manuscripts (AAMs) and grant a non-exclusive licence to the university for all journal articles. It is a well established policy and has been in use (elsewhere) since 2008. Where a journal seeks a waiver, that can be managed by exception. And this is well tested in Ivy League colleges but also much more widely, including universities in Kenya.

The benefit here is that academia retains rights: authors have the right to make articles open access – and open access articles have higher citations than closed ones. Authors can continue to publish in the journal of choice irrespective of whether it allows open access or not. It is a single means by which authors can comply with green open access policies. We are minimising reliance on hybrid open access – reducing "double dipping", paying twice through subscriptions and APCs – a complex and costly process. I think we and publishers see money for hybrid OA models drying up in the future, as the UK has pretty much been the one place doing that. Instead funding is typically used for pure gold OA models and publications.

We have made some changes to the Harvard model policy to make it work in the context of UK law, and also to ensure it facilitated funder deposit compliance and REF eligibility. The next step here is that 60 institutions overall are interested and we have a first mover group of around 12 institutions. We are discussing with publishers. And we have had wider engagement with the researcher, library, research office and legal office communities. We have a website and advocacy materials under development. We are also drafting boilerplate texts for authors, collaboration agreements etc., especially for large international collaborative projects. We have a steering committee established and that includes representatives from across institutions, including a publisher.

At the moment we are addressing some publisher concerns and perceptions. Publishers are taking a very particular approach to us. We have a real range of responses. Some are very positive – including the pure gold (e.g. PLoS) and also learned society (e.g. Royal Society) publishers. Other publishers have raised concerns and are in touch with the steering group, and with ALPSP.

Summary of current concerns:

  • that it goes beyond the requirements of Finch. We have stated that UK-SCL is to support REF and other funder policies
  • AAMs will be made available on publication. Response: yes, as per Harvard model around since 2008
  • Administrative burden on UK authors/institutions, as publishers would have to ask for waivers in 70-80% of cases. We have responded that in other Harvard-model experiences it has been less than 5% and we can't see why UK authors would be treated differently.
  • They noted that only 8% of material submitted to the REF was green OA compliant. We have noted that that is only 8% of what was submitted, not 8% of all material eligible for submission.

Researchers have also raised concerns:

  • the need to seek agreement from co-authors, especially in collaborations. Can be addressed through a phased/gradual implementation
  • Fear that a publisher will refuse to publish. Institutions using the Harvard model report no instances of this happening
  • Learned Societies – fear loss of income. No reliable research evidence to back up this fear.
  • Don’t like the CC-BY-ND Licence. That is to comply with RCUK but warrants further discussion.

Our next step is a further meeting with the PA/ALPSP to take place during the summer. We have encouraged proposals to deliver more than simply minimum REF eligibility, which would resolve the current funder/publisher policy stack complexity. We will finalise the website, waiver system, advocacy materials and boilerplate texts. To gain agreement on early mover institutions and on the date of first adoption. And to notify publishers.

Another bit of late breaking news… Publishers recently went to HEFCE to ask about policy statements and, as a result of that, HEFCE will be clarifying that it is pushing for minimum compliance and encouraging more than that. One concern of the REF policy had been that only material submitted to the REF would have been deposited…

Last time my institution submitted 5k items, more than half of which were not monographs. We submitted 95% of our researchers. Out of that, four items per researcher were looked at; now it would be two. And from that our funding is decided. And you can see, from that, why that bigger encouragement for the open scholarly ecosystem is so important.

I wanted to close by sharing some useful further materials and to credit others who have been part of this work.

One important thing to note is that we are trying to help researchers and universities to comply as policies from funders and publishers evolve. I would like to see that result in discussion with publishers, and a move to all gold OA… The AAM is not the goal here, it's the published article. Now that could see the end of repositories – something I am cautious of raising with this audience.

Q&A

Q1) The elephant in the room here is Sci-Hub… They are making 95% of published content available for free. You have AAMs out there… And we haven't seen subscriptions drop.

A1) So our initiative is about legal sharing. And we also need to note that the UK is just one scholarly community, and others have not moved towards mandates and funding. I think it is a shame that the fights being picked are with institutions, when we have that elephant in the room…

Q2) Congratulations on the complex and intricate discussions you have been holding… Almost a legal game of Twister, where all the participants hate each other! This is a particular negotiation at the end of a process, at the end of the scholarly publishing chain. How might you like your experience to feed into the training of researchers and their own understanding of copyright, and ownership of their own outputs?

A2) The challenge that we observe is that we have many younger researchers and authors who are very passionate and ethically minded about openness. They are under pressure from supervisors who say they will not get a tenured position if they don't have a "good" journal on their CV. And they are frustrated by the slow movement on the San Francisco research assessment declaration. Right now the quality journals remain those subscription high impact journals. But we have research showing the higher use of open access journals. But we still have that debate within academe that is slowing down that environment. So training researchers about their IP and what copyright means matters. I also think it is interesting that Sir Mark Walport is in charge of UKRI, as he has written before about the evolving scholarly record, and the scattering of articles and outputs, instead of building online around research projects. He gave a talk at LIBER in 2015, and wrote an article for THE. He was also at Wellcome when they first introduced their mandate, so I think we really do have someone who understands that complexity and the importance of openness.

10×10 presentations (Chair: Ianthe Sutherland, University Library & Collections)

  1. v2.juliet – A Model For SHERPA’s Mid-Term Infrastructure. Adam Field, Jisc

I’m here from SHERPA HQ at Jisc! I’m going to go back to 2006… We saw YouTube celebrating it’s first year… Eight out of Ten Cats began… The Nintendo WII appeared… And… SHERPA/JULIET was launched (SHERPA having been around in 2001). So, when we set up Sherpa REF as a beta service in 2016 we had to build something new, as JULIET hadn’t been set up for APIs and interoperability in that kind of way.

So, we set about a new SHERPA/JULIET based around a pragmatic, functional data model; to move data into a platform; to rebrand to Jisc norms; a like-for-like replacement; and a precedent for our other services as we update them all…

So, a quick demo… We now have the list of funders – as before – including an overview of open access policies. So if we choose Cancer Research UK… You can see the full metadata record, with headings for more information. You can see which groups it is part of… We have a nice open API where you can retrieve information.

So, whilst it was a like-for-like rebuild we have snuck in new features, including FundRef DOIs – added automatically where possible, to be added to with manual input too. More flexible browsing. And a JSON API – really easy to work with. And in the future we'd like funders to be able to add to their own records, and other useful 3rd party editorial features. We want to integrate Elasticsearch. And we want to add microservices…
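
For anyone who wants to poke at that JSON API, a minimal sketch from me of the sort of query you might make – note that the endpoint path, parameter names and response fields here are my own assumptions for illustration, so do check the beta service's documentation for the real details:

```python
# Sketch of querying a JSON API like the new SHERPA/JULIET's.
# The endpoint URL, parameters and response fields are assumptions
# for illustration - see the beta service's own API documentation.
import requests

BASE = "https://v2.sherpa.ac.uk/cgi/retrieve"  # hypothetical endpoint

resp = requests.get(BASE, params={
    "item-type": "funder",   # assumed parameter name
    "format": "Json",        # ask for JSON rather than HTML
})
resp.raise_for_status()

# Print the name of each funder record returned
for item in resp.json().get("items", []):
    print(item.get("name"))
```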

In terms of our process here… The hard part was analysing the existing data, structuring it into a more appropriate shape… The next part was much easier… We configured EPrints, imported data, and added some bespoke service requirements.

Right now we have a beta of SHERPA/JULIET – do take a look! We are now working on OpenDOAR. And then SHERPA/ROMEO is expected in early 2018.

We now want your feedback! Email sherpa@jisc.ac.uk with your comments and feedback. We’ll have feedback sessions later today that you can join us for and share your thoughts, ask questions about the API. And myself and Tom Davey our user interface person, are here all day – come find us!

  2. CORE Recommender: a plug-in suggesting open access content. Nancy Pontika, CORE

I want to talk about discoverability of content in repositories… Salo 2008, Konkiel 2012 and Acharya 2017 talk about the challenges of discoverability in repositories. So, what is needed? Well, we need recommender systems in repositories so that we can increase the number of incoming links to relevant resources…

For those of you new to repositories: CORE is an aggregation service; we are global, and we have also started harvesting gold OA journals… We have services at various levels, including for text mining and data science. We have a huge collection of 8 million full text articles and 77 million metadata records… They are all in one place… So we can build a good recommendation system.

What effect can we have? Well it will increase accessibility, meaning more engagement and a higher Click-Through Rate (CTR); people access resources on CORE via its recommender system twice as often as via search. And that additional engagement increases the time spent in your repositories – which is good for you. And you open another way to find research…

For instance you can see within White Rose Research Online that suggested articles are presented that come from all of the collections of CORE, including basic geographic information… We would like crowd sourced feedback here. The more users that engage in feedback, the more the recommender will improve. We also get feedback from our community. At the moment the first tab is CORE recommendations, the second tab is institutional recommendations. We've had feedback that institutions would prefer it the other way… We have heard that… Although we note that CORE recommendations are better as it's a bigger data set… We want to make sure the institutional tab appears first unless there are few recommendations/poor matches… We are working on this…

CORE Recommender has been installed at St Mary’s; LSHTM; the OU; University of York; University of Sheffield; York St John; Strathclyde University… and others with more to follow.

How does it work? Currently it's an article-to-article recommender system. There is preprocessing to make this possible. What is unique is that the recommendations are based on full text, and the full text is open access.
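
To give a flavour of what an article-to-article recommender over full text involves, here is a generic sketch from me using TF-IDF and cosine similarity – illustrative only, and not CORE's actual algorithm or preprocessing pipeline:

```python
# Generic article-to-article recommendation over full text using
# TF-IDF and cosine similarity. Illustrative only - this is not
# CORE's actual algorithm or preprocessing pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = {
    "article-1": "full text of the first open access article ...",
    "article-2": "full text of a closely related article ...",
    "article-3": "full text of an unrelated article ...",
}

ids = list(documents)
tfidf = TfidfVectorizer(stop_words="english").fit_transform(documents.values())
sims = cosine_similarity(tfidf)  # pairwise similarity matrix

def recommend(doc_id, top_n=2):
    """Return the top_n documents most similar to doc_id."""
    i = ids.index(doc_id)
    ranked = sorted(enumerate(sims[i]), key=lambda p: p[1], reverse=True)
    return [ids[j] for j, _ in ranked if j != i][:top_n]

print(recommend("article-1"))
```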

What is the CORE recommender not? It is not always right – but which recommendation system is? And it does not compare the “quality” of the recommended articles with the “quality” of the initial paper…

  3. Enhancing two workflows with RSpace & Figshare: Active Data to Archival Data and Research to Publication. Rory Macneil, Research Space, and Megan Hardeman, Figshare

Rory: Most of the discussion so far has been on publications, but we are talking about data. I think it's fair to say that Figshare in the data field, and RSpace in the lab notebooks world, have been totally fixated on interoperability!

Right now most data does not make it into repositories… Some shouldn't be shared, but even the data that should be shared is not. One way to increase deposit is to make it easy to deposit data. Integrating with RSpace notebooks allows easy and quick deposit.

So, in RSpace you can capture metadata of various types. There are lots of ways to organise the data… And to use that you just need to activate the Figshare plugin. Then you select the data to deposit – exporting one or many documents… You select what you want to deposit, and the format to deposit in. You can export all of your work, or all of your lab's work – whatever level of granularity you want to share… You deposit to Figshare… And over to Megan!

Megan: Figshare is a repository where users can make all of their research outputs available in citable, accessible ways (all marked up for Google Scholar). You can upload any file type (we support over 1000 types); we assign a DOI at the item level; store items in perpetuity (backed up in DPN); track usage stats and Altmetrics (more exposure); and you can collaborate with researchers inside and outside your institution.

Figshare has an open API and integrations with RSpace and other organisations and tools…
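
A quick aside from me on what that open API looks like in practice – a minimal sketch of creating a draft item via Figshare's v2 REST API (the token is a placeholder and the metadata is trimmed right down; consult Figshare's API documentation for the full set of fields and the separate file-upload steps):

```python
# Minimal sketch of creating a draft item via the Figshare v2 REST API.
# FIGSHARE_TOKEN is a placeholder personal token, and the metadata is
# trimmed to the basics - see Figshare's API docs for the full details.
import requests

FIGSHARE_TOKEN = "your-personal-token-here"
headers = {"Authorization": f"token {FIGSHARE_TOKEN}"}

resp = requests.post(
    "https://api.figshare.com/v2/account/articles",
    headers=headers,
    json={"title": "Example lab notebook export"},
)
resp.raise_for_status()
print(resp.json())  # the response includes the URL of the new draft item
```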

For an example… You can see an electronic lab notebook from RSpace which can be browsed and explored in the browser!

  4. Thesis digitisation project. Gavin Willshaw, University of Edinburgh

I’m digital curator here, and manager of the PhD digitisation project. This project sees a huge amount of content going into ERA, our repository. In the last three years we’ve moved from having two photographers to having two teams of photographers and cataloguers across two sites – we are investing heavily.

We have 17,000 PhD theses and they will all be online by the end of 2018. This will provide global access to the entire PhD collection. We have obtained some equipment. We are creating metadata records, and also undertaking some preservation work where required.

The collection is largely standardised… But we have some Latin and handwritten theses. We have awkward objects – like slices of lungs!

For 10k theses we have duplicates and those are scanned destructively. 3,000 unique theses are scanned non-destructively in house. And 4,000 unique theses are outsourced. All are OCRed. And they are all catalogued, with data protection checks made before determining which can be shared online in full and which cannot.

In terms of copyright and licensing, that is still with the author. We have contacted some and had positive feedback. It's a risk but a low risk. In any case we can't assert the copyright or change licences on our own. And we already have over 2,500 theses live.

And these theses are not just text… We have images that are rare and unusual. We share some of these highlights in our blog: https://libraryblogs.is.ed.ac.uk/phddigitisation/ and we use, on Twitter, the hashtag #UoEPhD. We have some notable theses… Alexander McCall Smith's PhD is there; Isabel Emslie Hutton, a doctor in the first world war in the Balkans – so noted she was on a stamp in Serbia last year; Helen Pankhurst; and of course members of staff from the university too!

Impact wise, the theses on ERA have been downloaded 2 million times since 2012. Those digitised in the project are seeing around 3,000 downloads per month. Oddly our most popular thesis right now is on the differentiation of people in Norwich. We are also looking at what else we can do… Linking theses to Wikipedia; adding a thesis to Wikisource (and getting 10x the views); and now looking at what else… text and data mining etc.

  5. 'Weather Cloudy & Cool Harvest Begun': St Andrews output usage beyond the repository. Michael Bryce, University of St Andrews

I didn’t expect it to actually be cloudy today…!

Our repository has been going since 2006, and use has been growing steadily…

Some of the highlights from our repository have included research on New Caledonian crows and collaborative tool use. We also have farming diaries in our repository under a Creative Commons license… Pushing that out into the community in blog posts and posters… So going beyond traditional publications and use. Our material on Syria has seen significant usage, driven partly by use in OJS journals.

Our repository isn’t currently OpenAIRE compliant, but we have some content shared that way, which means a bigger audience… For instance material on virtual learning environments associated with a big EU project.

We’ve also been engaging in publishing engagement. The BBC asked us to digitise a thesis at the time of broadcasting Coast which added that work to our repository.

When we reached our 10,000th item we had cake! And helped publicise the student and their work to a wider audience…

Impact and the REF panel session

Brief for this session: How are institutions preparing for the next round of the Research Excellence Framework #REF2021, and how do repositories feature in this? What lessons can we learn from the last REF and what changes to impact might we expect in 2021? How can we improve our repositories and associated services to support researchers to achieve and measure impact with a view to the REF? In anticipation of the forthcoming announcement by HEFCE later this year of the details of how #REF2021 will work, and how impact will be measured, our panel will discuss all these issues and answer questions from RepoFringers.

Chair: Keith McDonald (KM), Assistant Director, Research and Innovation Directorate, Scottish Funding Council

The panel here includes Pauline Jones, REF Manager at University of Edinburgh, and a veteran of the two previous REFs – she was at Napier University in 2008, and was working at the SFC (where I work) for the previous REF, where she was involved in the introduction of Impact.

Catriona Firth (CF), REF Deputy Manager, HEFCE

I used to work in universities, now I am a poacher-turned-gamekeeper I suppose!

Today I want to talk about Impact in REF 2014. Impact was introduced and assessed for the first time in REF 2014. After extensive consultation Impact was defined in an inclusive way. So, for REF 2014, impact was assessed in four-page case studies describing impacts that had occurred between January 2008 and July 2013. The submitting university must have produced high quality research since 1993 that contributed to the impacts. Each submitting unit (usually subject area) returned one case study, plus an additional case study for every 10 staff.

At the end of REF 2014 we had 6,975 case studies submitted. On average across submissions, 44% of impacts were judged outstanding (4*) – assessed by over 250 external users of research, working jointly with the academic panels. There was a global spread of impact, and those impacts were across a wealth of areas of life: policy, performance and creative practice, etc. There was, for instance, a case study of drama and performance that had an impact on nuclear technology. The HEFCE report on impact is highly recommended reading.

In November 2015 Lord Nicholas Stern was commissioned by the Minister for Universities and Science to conduct an independent review of the REF. He found that the exercise was excellent, and had achieved what was desired. However there were recommendations for improvement:

  • lowering the burden on institutions
  • less game-playing and use of loop holes
  • less personalisation, more institutionally focused – to take pressure off individuals but also recognise and reward institutional investment in research
  • recognition for investment
  • more rounded view of research activity – again avoiding distortion
  • interdisciplinary emphasis – some work could…
  • broaden impact – and find ways to capture, reward, and promote the ways UK research has a benefit on and impacts society.

If you go to the HEFCE website you’ll see a video of a webinar on the Stern Review and specifically on staff and outputs, including that all research active staff should be included, that outputs be determined at assessment level, and that outputs should not be portable.

In terms of impact there was keenness to broaden and deepen the definition of impact and provide additional guidance. Policy impact was seen as the safer kind of case study before. The Stern Review emphasised a need for more focus on public engagement and impact on curricula and/or pedagogy; to reduce the number of required case studies to a minimum of one; and to include impact arising from research, research activity, or a "body of work". And having a quality threshold for underpinning research based on rigour – not just originality. And the opportunity to resubmit case studies if the impact was ongoing.

We have been receiving feedback – over 400 responses – which are being summarised. That feedback includes positive feedback on broadening impact and on aligning definitions of impact and public engagement across funding bodies. There were some concerns about a sub-profile based on one case study – especially in small departments. In those cases you'd know exactly whose work and case study was 4* (or not). There have been concerns about how you separate rigour from originality and significance. There was a lot of support for a broader basis of research, but challenges in drawing boundaries in practice – in terms of timing and how far back you go… For scholarly career assessment do you go back further? And there was broad support for resubmission of 2014 case studies but questions about "additionality" – could it be the same sort of impact or did it need to be something new or additional? So, we are working on those questions at the moment.

The other suggestion from the Stern Review was the idea of an institutional level assessment of impact, giving universities opportunities to show case studies that didn't fall neatly elsewhere: case studies arising from multi- and interdisciplinary and collaborative work, and that that should be 10-20% of total impact case studies, with a minimum of one. But feedback has been unclear here, particularly the conflation of interdisciplinary research with institutional profiles. There is concern also that the University might take over a case study that would otherwise sit in another unit.

So, the next step is communications in summer/autumn 2017. There will be a REF initial decisions document. A summary of consultation responses. And there will be sharing of full consultation responses (with permission).  And there will be a launch for our REF 2021 website and Twitter account.

Anne-Sofie Laegran (ASL), Knowledge Exchange Manager, College of Arts, Humanities and Social Sciences, University of Edinburgh

KM: Is resubmission better for some areas than others?

ASL: I think it depends on what you mean by resubmission.. We have some good case studies arising from the same research as in 2014, but they are different impacts.

So… I will give you a view from the trenches. To start I draw your attention to the University strapline that we have been "Influencing the world since 1583". But we have to demonstrate and evidence that of course.

There has been an impact of impact in academia… When I started in 2008 it was about having conversations about the importance of having an impact, and now it is much more about how you do this. There has been a culture change – all academic staff must consider the potential impact of their research. The challenge is not only to create impact but also to demonstrate impact. There is also an incentive to show impact – it is part of career progression, part of recruitment, and part of promotion.

Impact of impact in academia has also been about training – how to develop pathways as well as how to capture and evidence impact. And there has been more support – expert staff as well as funding from funders and from the university.

In terms of developing pathways to impact we have borrowed questions that funders ask:

  • who may benefit from your research?
  • what might the benefits be?
  • what can you do to ensure potential beneficiaries and decision makers have the opportunity to engage and benefit?

And it is also – especially when capturing impact – about developing new skills and networks.

For instance… If you want to impact the NHS, who makes decisions, makes changes… If you are working with museums and galleries the decision makers will vary depending on where you can find that influence. And, for instance, you rarely partner with the Scottish Government, but you may influence NGOs who then influence Scottish Government.

Whatever the impact, it starts from excellent research; which leads to knowledge exchange – public engagement, influencing policy, informing professional practice and service delivery, technology transfer; and that results in impact. You don't "do" impact: your work is used and has influence, and that then effects a change and an impact.

REF impact challenges include demonstrating change/benefit as opposed to reporting engagement activity. Attributing that change to research. And providing robust evidence. In 2014 that was particularly tricky as the guidance only came in 2012 so people had to dig back… That should be less of an issue now; we've been collecting evidence along the way…

Some cases that we think did well, and/or on which we had feedback that they were doing well:

  • A College of Art scholar, who has a dual appointment at the National Galleries of Scotland, curated the Impressionism Scotland show with over 100k visitors. There was good feedback that also generated debate. It changed how the gallery curates shows. And on the market the works displayed went up in value – it had a real economic impact.
  • In law, two researchers have been undertaking longitudinal work on young people – their lives, careers, and criminal careers – funded by the Scottish Government. That research led to a new piece of policy based on its findings. And there was a quote from the Scottish Government showing a decline in youth crime, attributing that to the policy change, which was based on the research – showing that clear line of impact.
  • In sociology, a researcher wrote about the impact of research on the financial crisis for the London Review of Books, it was well received and he was named one of the most influential thinkers on the crisis; his work was translated to French; it was picked up in parliament; and Stephanie Flanders – then BBC economics editor – tweeted that this work was the most important on the financial crisis.
  • In music, researchers developed the Skoog, an instrument for disabled students to engage in music. They set up a company, and they had investment. At the time of the REF they had 6 employees, and they were selling to organisations – so reaching many people. And they were also used in the Cultural Olympiad during the Olympics in 2012, showing that wider impact.

So for each of these you can see there was both activity, and impact here.

In terms of RepoFringe areas I was asked to talk about the role of repositories and open access. It is potentially important. But typically we don’t see impact coming from the scholarly publication, it’s usually the activities coming from the research or from that publication. Making work open access certainly isn’t enough to just trigger impact.

Social media can be important but it needs to have a high level of engagement, reach and/or significance to demonstrate more than mere dissemination. That Stephanie Flanders example wouldn't be enough on its own; it works as part of demonstrating another impact, and as a good way to track impact, to understand your audience… And to follow up and see what happened next…

Metrics – there is no doubt that numeric evidence was important. Our head of research said last time that "numbers speak louder than adjectives", but they have to be relevant and useful. You need context. Standardised metrics/altmetrics don't work – a big report recently concluded the same. Altmetrics are alternative metrics that can be tracked online, using DOIs. A company called Altmetric gathers that data, which can be useful to track… And can be manipulated by friends with big Twitter followings… It won't replace case studies, but may be useful for tracking…

In terms of the importance of impact… It relates to 20% of the REF score, and determined 26% of the funding in Scotland. Funding attracted per annum for the next 7 years:

  • 4* case study brings in £45-120k
  • 3* £9-25k
  • 2* £0
  • A 4* output, for comparison, is worth £7-15k…

The question that does come up is “what is impact” and yes, a single Tweet could be impact that someone has read and engaged with your work… But those big impact case studies are about making a real change and a range of impacts.

Pauline Jones (PJ), REF Manager and Head of Strategic Performance and Research Policy, University of Edinburgh

Thank you to Catriona and Anne-Sofie for introducing impact. I wanted to reinforce the idea that this is what we are doing anyway, making an impact on society, so it is important anyway, not only because of the REF.

Catriona suggested we had a “year off” but actually once REF happened we went into an intense period of evaluation and reflection, then of course the Stern review, consultation, general election… It has been quite non-stop. But actually even if that wasn’t all going on, we’d need our academics to be aware of the REF and of open access. I think open access is incredibly important, people are looking for it… Research is publicly funded… But it has required a lot of work to get up and running.

Although we are roughly at mid point between REFs, we are up and running, gathering impact, preparing to emphasise our impact. In terms of collecting evidence, depositing papers… That will happen in most universities. I think many will be doing the sort of Mock REFs/REF readiness exercises that we have been undertaking. We are also already thinking about drafting our case studies. As we get nearer to submission we’ll take decisions on inclusion… and getting everything ready.

So for REF 2021 we have a long time period over which submission is prepared. There is no period over which outputs, impacts, environment don’t count. Academics thinking now about what to include: 2017 REF readiness exercise to focus on open access and numbers; 2018 Mock REF to focus on quality. And we all have to have a CRIS system now to make that work.

What’s new here? We are still waiting for the draft to understand what’s happening. There are open access journal articles/conference proceedings. There are probably the challenges of submitting all research staff; decoupling the one-to-four staff-to-outputs ratio. That break is quite a good thing… Some researchers might struggle to create four key outputs – part time staff, staff with maternity leave, etc. But we want a sense of what that looks like from our mock/readiness work. That non-portability requirement seems useful and desirable, but speaking personally I think the researcher invests a lot – not just an institution – making that complex. Taking all those together I’m not sure the Stern idea of less complexity or burden here, not alongside those changes.

And then we have the institutional impact case studies – we had a number of interdisciplinary examples of work, so we are comfortable with that possibility. The institutional environment is largely shared, so doing that once for the whole university could be a really helpful reduction in workload. And each new element will have implications for how CRIS systems support REF submissions.

And as we prepare for REF 2021 we also have to look to REF 2028. We think open data will be important given the Concordat on Open Data Research (signed by HEFCE; RCUK; Universities UK; Wellcome) so we can get ready now, ready for when that happens. I’m pretty confident that open access monographs will be part of the next REF (following Monographs and Open Access HEFCE report). Then there is institutional impact – may not happen here but may be back. And then there are metrics. We have The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment Management.

In terms of responsible metrics, we haven't heard the last of them… There is the Forum for Responsible Metrics; data and metrics should support decisions, not be the sole driver; but the conversation will not end with The Metric Tide. Metrics are alluring but to date they haven't worked well versus other types of evidence.

So, how do we prepare?

  • For REF 2021 we need to be agile, support research managers to help academics deposit work, and help lobby CRIS system designers to provide fit-for-purpose systems.
  • For REF 2028 we have to understand the benefits and challenges of making more research open
  • Be part of the conversation on responsible metrics – any bibliometrics experts in the room will stay busy.
  • And we want to have interoperability in systems…

Q&A/Discussion

Q1) How can we do something useful in terms of impact for case studies as our repository remit expands to different materials, different levels of openness, etc.

A1 – ASL) I think being easily accessible on University websites, making them findable… Then also perhaps improved search functionality, and some way to categorise what comes out… If creating things other than peer reviewed publications – "what is this?" type information. I might have been too negative about repositories because historically our data wasn't in those… I think actually the sciences find that even more important…

Q1) For collecting evidence?

A1 – ASL) Yes, for collecting… Some have metrics that help us see how those impacts have worked.

A1 – PJ) We’ve been talking about how best to use our CRIS to improve join up and understand those impacts…

A1 – CF) I think it’s also about getting that rounded view of the researcher – their outputs, publications, etc. being captured as impacts alongside the outputs… That could be useful and valuable…

Q2) A common theme was the burden of this exercise… But could be argued that it drives positive changes… How can the REF add to the sector?

A2 – CF) Wearing my personal and former job hat, as impact officer, I did see REF drive strategic investment in universities, including public engagement, that rewards, recognises, and encourages more engagement with the community. There is real sharing of knowledge brought about by impact and the REF.

A2 – ASL) Totally agree.

A2 – PJ) More broadly the REF and RAE… They recognise the importance of research and supporting researchers. For us we get £75M a year through the research excellence funding. And we see the quality of research publications going up…

Q3) Do you have any comments on the academic community and how that supports the REF, particularly around data.

A3 – PJ) At Edinburgh we are very big – we submitted 1,800 staff, and we could have submitted up to 2,500. In my previous role we had much smaller numbers of research staff… So there are different challenges and different systems… We have spoken to our Informatics colleagues to see what we can do. There are definitely benefits at the level of building a system to manage this…

Q3) In an academic environment we have collegiate working practice, and need systems that work together.

A3 – PJ) We have a really distributed set up at Edinburgh, so we are constantly having that conversation, and looking for cross cutting areas, exchanging information…

Q4) the relationship with the researcher changes here… In previous years universities talked about “their research” but it was actually all structured around the individual. In this new model that shift is big, and the role and responsibility of the organisation, the ways that schools interact with their researcher…

A4 – ASL) You do see that in pre-funding application activity with internal peer review processes that build that collegiality within the organisation…

Q5) I was intrigued by the comment that lots of impact isn't associated with outputs… So that raises questions about the importance of outputs in the REF. Should we rebalance the value of the output and how it is valued?

A5 – ASL) Perhaps. For example, when colleagues are providing evidence to government and parliament it is rare for publications to be referenced, and rare for publications to be read… I don't think those references matter in themselves… But the publications include the methodology, the rigour, the evidence of quality of work. And that then becomes briefing papers etc… Otherwise you and I could just write a paper – but that would be opinion. So you need that (hard to read) academic publication, and you have to acknowledge that those are different things with different roles – and that has to be demonstrated in the case studies.

A5 – CF) I think it’s an interesting question, especially thinking ahead to REF 2021… We are considering how those impacts o the field and impact on wider society are represented – some blue skies research won’t have impact for many years to come…

Q6) I think lay summaries of a piece of work are so crucial. ScienceOpen and Jon Tennant are putting up lay summaries, and you have Kudos and other things there contributing to that… The public want to understand what they are reading. I have personally sat on panels as a lay member and I know how valuable that kind of lay summary is for understanding what has taken place.

A6 – ASL) You do need that lay summary of work, or briefing paper, or expert communities which are not lay people… You have to think about audiences and communicating your work widely, and target it… I think repositories are useful to access work, but it’s not enough to put it there – just as it isn’t enough to put an article out there – you have to actively reach out to your audiences.

A6 – CF) I would agree and I would add that there is such a key need to help academics to do that, to support skills for writing lay summaries… Getting it clearer benefits the researcher, their thinking, and how they tell others about their work – that truly enables knowledge exchange.

A6 – PJ) And it benefits the academic audience too. I was listening to a podcast where academics from across disciplines compared which papers were most valuable, and being readable to a lay audience was a key factor in how those papers did.

10×10 presentations (Chair: Ianthe Sutherland, University Library & Collections)

  1. National Open Data and Open Science Policies in Europe. Martin Donnelly, DCC

I’m talking about some work we’ve done at DCC with SPARC Europe looking at Open Data and Policies across Europe.

The DCC is a centre of expertise in digital curation and data management. We maintain a watching brief on funders' research data policies (largely focused on the UK). SPARC Europe is a membership organisation comprising academic institutions, library consortia, funding bodies, research institutes and publishers. Their goal is advocating change in scholarly communications for the benefit of research and society. And we have been collaborating since 2016, looking at open data and open science policies across Europe.

So, what is a policy? Well the dictionary definition works, it’s a set of ideas or a plan of what to do in particular situations that has been agreed to officially by a group of people or an organisation.

In this work we looked at national policies – in some regions with a single research funder that could be the funder policy, but in the UK the AHRC wouldn't count here as that is not a national policy across the whole country. And the last known analysis of this sort dates back to 2013, and much has changed in that time.

We began by compiling and briefly describing a list of national policies in the EU and some ERA states (IS, NO, CH). We circulated that list for comment and additions. We also sought intelligence from contacts from DCC European projects to ask about the status of national approaches, forthcoming or existing policies, etc. We then attempted to classify the policies.

Across the thirteen countries we found: 6 funder policies; 4 national plans or roadmaps; 2 concordat type documents; 2 laws; and one working paper. There are more than 13 documents there as some countries have parallel documents. Identifying the lead, ranking or sponsoring organisation was not always straightforward; sometimes documents were co-signed by partners or groups. All of the policies discussed research data; 7 addressed open access publication explicitly; 6 addressed software, code, tools or models; 5 addressed methods, workflows or protocols; and one addressed physical (non-digital) samples.

Most policies were prescriptive or imperative. Monitoring of compliance and/or penalties are not that common. And these are new – only 2 policies pre-date 2014, though there are preceding open access policies. And new policies keep appearing as a result of our work… And two policies have been translated to English specifically because of this work (Estonia, Cyprus). The EC's Open Research Data Pilot for Horizon 2020 was cited in multiple policy documents. And we hope that Brexit won't diminish our role or engagement in European open data policy.

  2. IIIF: you can keep your head while all around are losing theirs! Scott Renton, University of Edinburgh

IIIF is the International Image Interoperability Framework, which enables you to use images in your cultural heritage resources. IIIF works through two APIs. You bring in images through the Image API via IIIF-compliant URLs – long URLs that include the region of the image, instructions for display, etc. The other API is the Presentation API, which is much more about curation, including the ability to curate collections of content – so you can structure these as, say, an image of a building that is related to images of the rooms in that building.
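
To illustrate those Image API URLs, here is a short sketch from me – the region/size/rotation/quality.format path pattern is the standard IIIF Image API structure, though the server and identifier below are hypothetical:

```python
# Building IIIF Image API URLs. The server and identifier here are
# hypothetical; the {identifier}/{region}/{size}/{rotation}/{quality}.{format}
# path pattern is the standard IIIF Image API structure.
base = "https://images.example.ac.uk/iiif"  # hypothetical IIIF server
identifier = "some-image-id"

# Full image, scaled to 600px wide, no rotation, default quality, as JPEG:
full_view = f"{base}/{identifier}/full/600,/0/default.jpg"

# A 200x200 pixel region starting at (100,100), at full size:
detail_view = f"{base}/{identifier}/100,100,200,200/full/0/default.jpg"

print(full_view)
print(detail_view)
```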

We have images in Luna and we pushed Luna to support IIIF, and we did get success there – we implemented IIIF in December. We made a lot of progress and have IIIF websites online. The workflows are really complex but it allows us to maintain one set of images and metadata through these embedded images, rather than having to copy and duplicate work. And those images are zoomable, draggable, etc. Metadata Games is also IIIF compliant. And it is feeding into our websites, including the new St Cecilia's Hall museum website.

Our next implementation was the Coimbra virtual implementation – which includes other people's images. For our images, and those from other IIIF-compliant organisations, that was easy, but we had to set up our own IIIF image server (Cantaloupe) to manage those images from others.

The next challenge was the Mahabharata Scroll. It is a huge document but the IIIF spec and Luna allow us to programme a sequence of viewers…

And our main achievement has been Polyanno, which allows annotation that can then be stored in manifests, and lets users upload and discuss annotations. It's proving very popular with the IIIF community. We have a huge number of images to convert to IIIF but lots of plans, lots of ideas, and lots to do…

We are also collaborating with the NLS around their content, and are up for talking with others about IIIF!

  3. Reference Rot in theses: a HiberActive pilot. Nicola Osborne, EDINA, University of Edinburgh

This was my presentation – so notes from me here but some links to Site2Cite, a working demo/pilot tool for researchers to proactively archive their web citations as they are doing their research, to ensure that by the time they submit their PhD, have their work published, or begin follow up work, they still have access to those important resources.

Introducing Site2Cite: http://hiberactive.edina.ac.uk/

Try out the Site2Cite tools for yourself here: http://hiberactive.edina.ac.uk/site2cite/

You can view my full slides (slightly updated to make more sense for those who didn’t hear the accompanying talk) from the 10×10 here:

This ISG Innovation Funded pilot project builds upon our previous Andrew W. Mellon-funded Hiberlink project, a collaboration between EDINA, Los Alamos National Laboratory, and the University of Edinburgh School of Informatics. The Hiberlink project built on and worked with Herbert Van de Sompel's Memento work.
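
For a flavour of the Memento side of this, a rough sketch from me of looking up archived copies of a cited URL via the Memento Time Travel aggregator – the response structure shown is my assumption from memory, so do check the API documentation:

```python
# Sketch of looking up archived copies of a URL via the Memento
# Time Travel aggregator. The JSON response structure here is an
# assumption from memory - check the API's own documentation.
import requests

cited_url = "http://www.example.com/some/cited/page"
date_prefix = "20170801"  # find mementos near this date (YYYYMMDD)

resp = requests.get(
    f"http://timetravel.mementoweb.org/api/json/{date_prefix}/{cited_url}"
)
if resp.ok:
    closest = resp.json()["mementos"]["closest"]  # assumed field names
    print("Closest archived copy:", closest["uri"])
else:
    print("No memento found for", cited_url)
```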

  4. Lifting the lid on global research impact: implementation and analysis of a Request a Copy service. Dimity Flanagan, London School of Economics and Political Science

Apologies for missing the first few minutes of Dimity’s talk…

LSE have only recently implemented the “request a copy” button in the repository but, having done that Dimity and colleagues have been researching how it is used.

We’ve had about 500 requests so far. The most popular requests have been for international relations, law and media areas. And we see demand from organisations and governments – including requests explicitly stating that they do not subscribe to the journal and they felt it was crucial to their work. There is that potential impact here being revealed in requests for articles ahead of key meetings and events, etc.

And these requests show huge reach, from organisations locally and around the world.

One thing we have noticed is that we get a lot of requests from students who can definitely access the version of record through journals subscribed to by their university – they don’t realise and that causes avoidable delay. We have also seen academics linking from reading lists to restricted items in repositories. But, on a more positive note, we’ve had lots of requests from our alumni – 70% of our alumni are international and that shows really positive impact for our work.

Overall this button and the evidence that requests provide has been really positive.

  5. What RADAR did next: developing a peer review process for research plans. Nicola Siminson, Glasgow School of Art

RADAR captures performances, exhibitions, as well as traditional articles, monographs etc. It is hosted on EPrints. And we encourage staff to add as much metadata as possible. But increasingly it is being used internally, with staff developing annual research plans (ARPs) and that feeding into allocations in the year ahead.

These ARPs arose in part from the outcome of the REF 2014 assessment. They are peer reviewed (but not openly available). ARPs aim to enable research time to be allocated more effectively with a view to maximising the number of high quality submissions to the next REF. RADAR houses the template as it played a key role in the GSA REF 2014 submissions, and staff already use and know the system.

The template went live in 2015, and was tweaked, tried and relaunched in February 2015. The ARP template captures the research, the researcher's details, and the expected impact of their work – plus a submit process. The process was initially really quite manual, so we thought carefully about how this should work… Once submitted, the digital ARP went into a manual process. Once piloted, we built the peer review process into RADAR, including access management that allows the researcher sole access until submitted, and then manages access back and forth as required.

We discussed this work with EPrints in Autumn 2016 and development commenced in Spring 2017. This was quite an involved process. The system was live in time for ARP panel chairs to send feedback and results.

So the process now sees researchers submit ARPs; RADAR admin provides the Head of Research with a report of all ARPs submitted. Then it goes through a series of review and feedback stages.

So administrators can view ARPs, panels, status, etc. and there is space for reviews to be captured and the outcome to be shared.

Lessons learned here… No matter how much testing you have done, you'll still need to tweak and flag things – it's useful to have a keen researcher to test it and feed back on those tweaks. We still need to increase the prominence of the summary and decision for the researcher, with more differentiated fields for peer reviews, etc. In conclusion, the ARP peer review process has been integrated into RADAR and will be fully tested next year. The continued development of RADAR is bearing fruit – researchers are using the repository and adding more outputs, and it is offering greater visibility and downloads for GSA.

Explore our repository at http://radar.gsa.ac.uk

  6. Edinburgh DataVault: Local implementation of Jisc DataVault: the value of testing. Pauline Ward, EDINA

I am Pauline Ward from the Research Data Service at the University of Edinburgh, and I am based at EDINA, which is part of the University. Jisc commissioned UoE's Library and University Collections (L&UC) team to design a service for researchers to store data for the long term with the Jisc Data Vault. And we've now implemented a version of this at Edinburgh – using that software from L&UC, specified and managed by EDINA.

The DataVault allows safe data storage in the University’s archival storage option, which links this data to a metadata record in Pure without having to re-enter any of the data. And, optionally, to receive a DOI for the data which can be used in publications and other outputs – depending on the context and appropriate visibility of the data. That allows preservation of data at the University. The DataVault is not for making data public – we have a service called DataShare for that.

So, let’s talk about metadata… We push that metadata to Pure and keep DataVault metadata as concise as possible. We need metadata that is usable and have some manual intervention to check and curate that.

We had a fairly extensive user testing process, to ensure documentation works well, then we also recruited academics from across the University to bring us their data and test the system to help us ensure it met their needs.

So, the interim version is out there, and we are continuing to develop and improve it.

  7. Data Management & Preservation using PURE and Archivematica at Strathclyde. Alan Morrisson, University of Strathclyde

We are governed and based in the research department. We wanted to look at both research data management and long term preservation, including reflecting on whether Pure is the right tool for the job here. Pure was already in use at Strathclyde when our Research Data Deposit Policy was being developed, so we deliberately made the policy as open as possible. Also Strathclyde is predominantly a STEM university, and we started off by surveying what else was out there… We knew the quantity and type of data coming in…

And since we opened up the service, in terms of data deposits to date we have seen a steady increase from about 200 to 400 data sets over the last year.

In terms of our preservation and curation systems, we have Pure in place and that does a lot – data storage, metadata, DOIs etc. But we've also recently implemented Archivematica – it's free, it's open source, and it's compatible with Jisc DataVault. So the workflow right now is that data, metadata and related outputs are added to Pure, and a DOI is minted. This feeds the knowledgebase portal. In parallel the data from Pure goes to Archivematica where it is ingested and processed for preservation, and the AIP METS file is cleaned using METSFlask before being stored.

The benefits of this set up are that Pure is familiar to researchers, does a good job of metadata management and related content, and has a customised front end (Knowledgebase). Archivematica is well supported, open source, and designed for archiving. But those systems don't work together – we are manually moving data across. Pure is designed for storage and presentation, not curation. And Archivematica only recognises about 40% of the data.

So, in the future we are reviewing our system, perhaps using Pure for metadata only. We are keeping an eye on Jisc RDSS and considering possible Arkivum-like storage. And we are generally looking at what is possible and most appropriate moving forward for curation and archiving.

  8. Open Access… From Oblivion… To the Spotlight? Dawn Hibbert, University of Northampton

I’ll be looking back over the last ten years… And actually ten years back I was working here in Accommodation Services, so not thinking about repositories at all!

Looking back at 2007/8, in the repository world we had our NECTAR repository. Then in 2011 a Jisc-funded project enabled an author deposit tool for NECTAR. At that time we had a carrot/incentive for deposit, but no stick. Which was actually a nice thing, as we've now slipped towards it all being about the REF.

By 2012/13 we engaged with our researchers around open access, who had feedback such as "it's in the library – you can get a copy from there", or "it's only £30 to buy the journal I publish in; if I make my article free the journal will go under", or "my work is not funded by RCUK so why should my work be open access?". We wanted everything open… But by 2014/15 (and the HEFCE announcement) we were still getting "I don't have to give you anything until 2016" and similar… And we get that idea of "it's all about the REF". And it is not. Using the REF in that way, and the repository in that way, overlooks the other benefits of open access.

So in 2016/17 HEFCE compliance started. Attitudes have shifted. But the focus has all been about gold APCs and the idea of the university paying, when actually we are using the HEFCE deposit and (later) open access green OA route. And for us, we really want researchers to deposit much more than the open access part (we can do that later on).

So, in 2017 and beyond we are looking at emphasising the benefits, sharing that information, and being positive about the opportunities, not just using the HEFCE stick. And for open access work we are looking at improving acceptance, extending open access to other outputs, and focusing on the visibility of research outputs – Kudos-type tools. And we are shifting the focus to digital preservation.

We are looking at datasets being open access too, with RDM and digital preservation gaining ground. And when work is deposited, shared, tweeted, etc., that can really shift attitudes and show the benefits of engagement for academic colleagues.

But we still see lots of money spent on APCs and journal subscriptions. And we have yet to see what happens with RCUK and REF compliance.

  1. Automated metadata collection from the researcher CV Lattes Platform to aid IR ingest. Chloe Furnival, Universidade Federal de São Carlos

I am pleased to present work by myself and my colleagues from São Paulo in Brazil. Back in 1999 all Brazilian universities were required to share CVs of their research and academic staff on a platform (Curriculo Lattes), which now has over 2 million records.

However, our university’s repository was only launched in 2016. Unlike many universities, which use Web of Science or Scopus to capture their researchers’ work, we saw that the Lattes CV Platform held the key and most up-to-date metadata – it is kept extremely current, as funders require. It is a really useful stepping stone to identify our staff publications for the initial repository.

Two very well-known researchers, Mena-Chalco and Cesar Jr (2013), developed ScriptLattes for this kind of extraction. But then the CNPq decided to implement a CAPTCHA, which blocks this script. They alleged this was for security reasons, but it created an uproar as the CVs were seen as “our data”… So, this has all been very complicated and impacted on our plans to identify our own researchers’ work… We went for SOAP (Simple Object Access Protocol) instead. We also developed a proxy server, based on the OpenResty platform, to share access to the Lattes SOAP web services and deal with CNPq limits. That lets us manage our local IP address and manage load/avoid going over capacity.
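
To make that concrete, here is a minimal sketch of what such a SOAP client might look like in Python, using the zeep library and routing requests through a local proxy. The WSDL URL, proxy address and operation name are illustrative assumptions rather than CNPq’s documented values.

```python
# Sketch: call the Lattes SOAP web service through a local OpenResty proxy.
# WSDL URL, proxy address and operation name below are hypothetical.
import requests
from zeep import Client
from zeep.transports import Transport

session = requests.Session()
# Route all calls through the institutional OpenResty proxy, which presents a
# single IP address to CNPq and throttles requests to stay within their limits.
session.proxies = {"http": "http://proxy.example.ufscar.br:8080",
                   "https": "http://proxy.example.ufscar.br:8080"}

client = Client("http://servicosweb.cnpq.br/ws/WSCurriculo?wsdl",  # hypothetical
                transport=Transport(session=session, timeout=30))

# Hypothetical operation: fetch one researcher's CV as XML by their Lattes ID.
xml_cv = client.service.getCurriculoXml("0123456789012345")
```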

We extract data in XML format, then process it in Python to generate Dublin Core. Then we use another script to eliminate duplicates using the Jaccard measure, which helps detect near-duplicate records… Then, once processed, the data is held in DSpace. Each record in Lattes has a unique identifier, as that site uses an ID number that all Brazilians are required to have to access e.g. a bank account.
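
The Jaccard step can be sketched in a few lines of Python. The tokenisation and threshold below are illustrative choices of mine, not the values used in this project.

```python
# Sketch of the de-duplication step: compare records by the Jaccard measure
# on their title token sets and flag pairs above a similarity threshold.
import re
from itertools import combinations

def tokens(title: str) -> set:
    return set(re.findall(r"\w+", title.lower()))

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

records = [
    {"id": 1, "title": "Automated metadata collection from Lattes"},
    {"id": 2, "title": "Automated metadata collection from the Lattes platform"},
    {"id": 3, "title": "Open access repositories in Brazil"},
]

THRESHOLD = 0.7  # assumed cut-off for treating two records as duplicates
for r1, r2 in combinations(records, 2):
    score = jaccard(tokens(r1["title"]), tokens(r2["title"]))
    if score >= THRESHOLD:
        print(f"Likely duplicates: {r1['id']} and {r2['id']} (J={score:.2f})")
```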

So now the CVs of 1,166 teaching staff and researchers working at our HEI have been retrieved, in just 11 minutes, including metadata for 78K journal articles and proceedings papers. We had the specific objective of gaining direct and official access to public metadata held in Lattes CVs.

  1. The Changing Face of Goldsmiths Research Online. Jeremiah Spillane, Goldsmiths, University of London

JS: Goldsmiths Research Online started as a vanilla install of EPrints, and it has become more and more customised over time. Important to that development have been several projects. The Jisc Kultur project created a transferable and sustainable institutional repository model for research output in the creative and applied arts, creating a facility for capturing multimedia content in repositories.

Kultur led to the Jisc Kaptur project, led by VADS working with various art colleges including Goldsmiths and GSA.

Then in 2009 we had the Defiant Objects project which looked to understand what makes some objects more difficult to deposit than others.

Jeremiah’s colleague: Our RAE/REF work has looked at policy versus the open access ethos – and striking the right balance there. So, the Goldsmiths website now includes content brought in from the repository, and that is now organised depending on the needs of different departments. We are also redesigning the website to better embed content and enable exploration of visual content. The new design should be in place by autumn this year.

Speaking of design… We have been working with OJS but have been wanting to more thoroughly design OJS journals, so we have a new journal coming, Volupte, which runs on OJS in the background but uses SquareSpace at the front end – that’s a bit of an experiment at the moment.

JS: So, the repository continues to develop, whilst our end users primarily focus on their research.

Take a look at: research.goldsmiths.ac.uk

And with that Day One, and my visit to Repository Fringe 2017, is done. 

Aug 022017
 

As we reach the end of the academic year, and I begin gearing up for the delightful chaos of the Edinburgh Fringe and my show, Is Your Online Reputation Hurting You?, I thought this would be a good time to look back on a busy recent few months of talks and projects (inspired partly by Lorna Campbell’s post along the same lines!).

This year the Managing Your Digital Footprint work has been continuing at a pace…

We began the year with funding from the Principal’s Teaching Award Scheme for a new project, led by Prof. Sian Bayne: “A Live Pulse”: Yik Yak for Teaching, Learning and Research at Edinburgh. Sian, Louise Connelly (PI for the original Digital Footprint research), and I have been working with the School of Informatics and a small team of fantastic undergraduate student research associates to look at Yik Yak and anonymity online. Yik Yak closed down this spring, which has made this even more interesting as a cutting-edge research project. You can find out more on the project blog – including my recent post on addressing the ethics of research in anonymous social media spaces; student RA Lilinaz’s excellent post giving her take on the project; and Sian’s fantastic keynote from #CALRG2017, giving an overview of the challenges and emerging findings from this work. Expect more presentations and publications to follow over the coming months.

Over the last year or so Louise Connelly and I have been busy developing a Digital Footprint MOOC, building on our previous research, training and best practice work to share this with the world. We designed a three-week MOOC (Massive Open Online Course) that runs on a rolling basis on Coursera – a new session kicks off every month. The course launched this April and we were delighted to see it get some fantastic participant feedback and press coverage (including a really positive experience of being interviewed by The Sun).

The MOOC has been going well and building interest in the consultancy and training work around our Digital Footprint research. Last year I received ISG Innovation Fund support to pilot this service and the last few months have included great opportunities to share research-informed expertise and best practices through commissioned and invited presentations and sessions including those for Abertay University, University of Stirling/Peer Review Project Academic Publishing Routes to Success event, Edinburgh Napier University, Asthma UK’s Patient Involvement Fair, CILIPS Annual Conference, CIGS Web 2.0 & Metadata seminar, and ReCon 2017. You can find more details of all of these, and other presentations and workshops on the Presentations & Publications page.

In June an unexpected short-notice invitation came my way to do a mini version of my Digital Footprint Cabaret of Dangerous Ideas show as part of the Edinburgh International Film Festival. I’ve always attended EIFF films, and also spent years reviewing films there, so it was lovely to perform as part of the official programme, working with our brilliant CODI compère Susan Morrison and my fellow mini-CODI performer, mental health specialist Professor Steven Lawrie. We had a really engaged audience with loads of questions – an excellent way to try out ideas ahead of this August’s show.

Also in June, Louise and I were absolutely delighted to find out that our article (in Vol. 11, No. 1, October 2015) for ALISS Quarterly, the journal of the Association of Librarians and Information Professionals in the Social Sciences, had been awarded Best Article of the Year. Huge thanks to the lovely folks at ALISS – this was wonderful recognition for our article, which you can read in full in the ALISS Quarterly archive.

In July I attended the European Conference on Social Media (#ecsm17) in Vilnius, Lithuania. In addition to co-chairing the Education Mini Track with the lovely Stephania Manca (Italian National Research Council), I was also there to present Louise and my Digital Footprint paper, “Exploring Risk, Privacy and the Impact of Social Media Usage with Undergraduates“, and to present a case study of the EDINA Digital Footprint consultancy and training service for the Social Media in Practice Excellence Awards 2017. I am delighted to say that our service was awarded 2nd place in those awards!

Social Media in Practice Excellence Award 2017 - 2nd place - certificate

My Social Media in Practice Excellence Award 2017 2nd place certificate (still awaiting a frame).

You can read more about the awards – and my fab fellow finalists Adam and Lisa – in this EDINA news piece.

On my way back from Lithuania I had another exciting stop to make at the Palace of Westminster. The lovely folk at the Parliamentary Digital Service invited me to give a talk, “If I Googled you, what would I find? Managing your digital footprint” for their Cyber Security Week which is open to members, peers, and parliamentary staff. I’ll have a longer post on that presentation coming very soon here. For now I’d like to thank Salim and the PDS team for the invitation and an excellent experience.

The digital flyer for my CODI 2017 show - huge thanks to the CODI interns for creating this.

The digital flyer for my CODI 2017 show (click to view a larger version) – huge thanks to the CODI interns for creating this.

The final big Digital Footprint project of the year is my forthcoming Edinburgh Fringe show, Is Your Online Reputation Hurting You? (book tickets here!). This year the Cabaret of Dangerous Ideas has a new venue – the New Town Theatre – and two strands of events: afternoon shows; and “Cabaret of Dangerous Ideas by Candlelight”. It’s a fantastic programme across the Fringe and I’m delighted to be part of the latter strand with a thrilling but challengingly competitive Friday night slot during peak fringe! However, that evening slot also means we can address some edgier questions so I will be talking about how an online reputation can contribute to fun, scary, weird, interesting experiences, risks, and opportunities – and what you can do about it.

QR code for CODI17 Facebook Event

Help spread the word about my CODI show by tweeting with #codi17 or sharing the associated Facebook event.

To promote the show I will be doing a live Q&A on YouTube on Saturday 5th August 2017, 10am. Please do add your questions via Twitter (#codi17digifoot) or via this anonymous survey and/or tune in on Saturday (the video below will be available on the day and after the event).

So, that’s been the Digital Footprint work this spring/summer… What else is there to share?

Well, throughout this year I’ve been working on a number of EDINA’s ISG Innovation Fund projects…

The Reference Rot in Theses: a HiberActive Pilot project has been looking at how to develop the fantastic prior work undertaken during the Andrew W. Mellon-funded Hiberlink project (a collaboration between EDINA, Los Alamos National Laboratory, and the University of Edinburgh School of Informatics), which investigated “reference rot” (where URLs cease to work) and “content drift” (where URLs work but the content changes over time) in scientific scholarly publishing.

For our follow up work the focus has shifted to web citations – websites, reports, etc. – something which has become a far more visible challenge for many web users since January. I’ve been managing this project, working with developer, design and user experience colleagues to develop a practical solution around the needs of PhD students, shaped by advice from Library and University Collections colleagues.

If you are familiar with the Memento standard, and/or follow Herbert Van de Sompel and Martin Klein’s work, you’ll be well aware of how widespread the challenge of web citations changing over time can be, and the seriousness of the implications. The Internet Archive might be preserving all the (non-R-rated) gifs from Geocities, but without preserving government reports, ephemeral content, social media etc. we would be missing a great deal of the cultural record and, in terms of where our project comes in, crucial resources and artefacts in many modern scholarly works. If you are new to the issue of web archiving I would recommend a browse of my notes from the IIPC Web Archiving Week 2017 and papers from the co-located RESAW 2017 conference.

A huge part of the HiberActive project has been working with five postgraduate student interns to undertake interviews and usability work with PhD students across the University. My personal and huge thanks to Clarissa, Juliet, Irene, Luke and Shiva!

Still from the HiberActive gif featuring Library Cat.

A preview of the HiberActive gif featuring Library Cat.

You can see the results of this work at our demo site, http://hiberactive.edina.ac.uk/, and we would love your feedback on what we’ve done. You’ll find an introductory page on the project as well as three tools for archiving websites and obtaining the appropriate information to cite them – hence adopting the name one of our interviewees suggested, Site2Cite. We are particularly excited to have a tool which enables you to upload a Word or PDF document, have all URLs detected, and which then returns a list of URLs and the archived citable versions (as a csv file).
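
For the curious, the core of that document-to-citations idea can be approximated in a short Python script: pull URLs out of a plain-text manuscript and look up archived copies via the Internet Archive’s public Wayback “availability” API. This is a rough sketch of the workflow, not Site2Cite’s actual code (which also handles Word and PDF input and can archive pages on demand).

```python
# Sketch: find URLs in a text file and look up archived snapshots, saving a CSV.
import csv
import re
import requests

URL_PATTERN = re.compile(r"https?://\S+")

def archived_version(url: str) -> str:
    """Return the closest archived snapshot URL, or '' if none is recorded."""
    resp = requests.get("https://archive.org/wayback/available",
                        params={"url": url}, timeout=30)
    closest = resp.json().get("archived_snapshots", {}).get("closest", {})
    return closest.get("url", "")

with open("thesis.txt", encoding="utf-8") as fh:   # assumed plain-text input
    text = fh.read()

# De-duplicate and strip trailing punctuation picked up by the simple regex.
urls = sorted({u.rstrip(".,;)]\"'") for u in URL_PATTERN.findall(text)})

with open("citations.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["url", "archived_url"])
    for url in urls:
        writer.writerow([url, archived_version(url)])
```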

Now that the project is complete, we are looking at what the next steps may be so if you’d find these tools useful for your own publications or teaching materials, we’d love to hear from you.  I’ll also be presenting this work at Repository Fringe 2017 later this week so, if you are there, I’ll see you in the 10×10 session on Thursday!

To bring HiberActive to life our students suggested something fun, and my colleague Jackie created a fun and informative gif featuring Library Cat, Edinburgh’s world-famous sociable on-campus feline. Library Cat has also popped up in another EDINA ISG Innovation-funded project, Pixel This, which my colleagues James Reid and Tom Armitage have been working on. This project has been exploring how Pixel Sticks could be used around the University. To try them out properly I joined the team for a fun photography night in George Square, with the Pixel Stick loaded with images of notable University of Edinburgh figures. One of my photos from that night, featuring the ghostly image of the much-missed Library Cat (1.0), went a wee bit viral over on Facebook:

James Reid and I have also been experimenting with Tango-capable phone handsets in the (admittedly daftly named) Strictly Come Tango project. Tango creates impressive 3D scans of rooms and objects and we have been keen to find out what one might do with that data, how it could be used in buildings and georeferenced spaces. This was a small exploratory project but you can see a wee video on what we’ve been up to here.

In addition to these projects I’ve also been busy with continuing involvement in the Edinburgh Cityscope project, which I sit on the steering group for. Cityscope provided one of our busiest events for this spring’s excellent Data Fest – read more about EDINA’s participation in this exciting new event around big data, data analytics and data-driven innovation here.

I have also been working on two rather awesome Edinburgh-centric projects. Curious Edinburgh officially launched for Android, and released an updated iOS app, for this year’s Edinburgh International Science Festival in April. The app includes History of Science, Medicine, Geosciences and Physics tours, and a brand new Biotechnology tour that lets you explore Edinburgh’s fantastic scientific legacy. The current PTAS-funded project is led by Dr Niki Vermeulen (Science, Technology & Innovation Studies), with tours written by Dr Bill Jenkins, and will see the app used in teaching around 600 undergraduate students this autumn. If you are curious about the app (pun entirely intended!), visiting Edinburgh – or just want to take a long-distance virtual tour – do download the app, rate and review it, and let us know what you think!

Image of the Curious Edinburgh History of Biotechnology and Genetics Tour.

A preview of the new Curious Edinburgh History of Biotechnology and Genetics Tour.

The other Edinburgh project which has been progressing at a pace this year is LitLong: Word on the Street, an AHRC-funded project which builds on the prior LitLong project to develop new ways to engage with Edinburgh’s rich literary heritage. Edinburgh was the first city in the world to be awarded UNESCO City of Literature status (in 2008) and there are huge resources to draw upon. Prof. James Loxley (English Literature) is leading this project, which will be showcased in some fun and interesting ways at the Edinburgh International Book Festival this August. Keep an eye on litlong.org for updates or follow @litlong.

And finally… Regular readers here will be aware that I’m Convener for eLearning@ed (though my term is up and I’ll be passing the role onto a successor later this year – nominations welcomed!), a community of learning technologists and academic and support staff working with technologies in teaching and learning contexts. We held our big annual conference, eLearning@ed 2017: Playful Learning this June and I was invited to write about it on the ALTC Blog. You can explore a preview and click through to my full article below.

Playful Learning: the eLearning@ed Conference 2017

Phew! So, it has been a rather busy few months for me, which is why you may have seen slightly fewer blog posts and tweets from me of late…

In terms of the months ahead there are some exciting things brewing… But I’d also love to hear any ideas you may have for possible collaborations as my EDINA colleagues and I are always interested to work on new projects, develop joint proposals, and work in new innovative areas. Do get in touch!

And in the meantime, remember to book those tickets for my CODI 2017 show if you can make it along on 11th August!

Jul 042017
 

Today I am again at the Mykolo Romerio Universitetas in Vilnius, Lithuania, for the European Conference on Social Media 2017. As usual this is a liveblog so additions, corrections etc. all welcome… 

Keynote presentation: Daiva Lialytė, Integrity PR, Lithuania: Practical point of view: push or pull strategy works on social media 

I attended your presentations yesterday, and you are going so far into detail in social media. I am a practitioner and we can’t go into that same sort of depth because things are changing so fast. I have to confess that a colleague, a few years ago, suggested using social media and I thought “Oh, it’s all just cats” and I wasn’t sure. But it was a big success, we have six people working in this area now. And I’m now addicted to social media. In fact, how many times do you check your phone per day? (various guesses)…

Well, we are checking our smartphones 100-150 times per day. And some people would rather give up sex than smartphones! And we have this constant flood of updates and information – notifications that pop up all over the place… And there are a lot of people, organisations, brands, NGOs, etc. who all want our attention on social media.

So, today, I want to introduce three main ideas here as a practitioner and marketer…

#1 Right Mindset

Brands want to control everything, absolutely everything… The colour, the font, the images, etc. But now social media means that you have to share your brand in other spaces, to lose some control. And I want to draw on Paul Holmes, a PR expert (see www.holmesreport.com); he says that when he fell in love with social media, there were four key aspects:

  • Brands (in)dependency
  • Possibilities of (non)control
  • Dialogue vs monologue
  • Dynamic 24×7

And I am going to give some examples here. So Gap, the US fashion brand, looked at updating their brand. They spent a great deal of money to do this – not just the logo but all the paperwork, branded items, etc. They launched it, it went to the media… And it was a disaster. Gap thought about it for a few days, then said “Thank you, brand lovers; we appreciate that you love our brand and we are going to stick with the old one”. And this raises the question of to whom a brand belongs… Shareholders or customers? Perhaps now we must think about customers as owning the brand.

Yesterday I saw a presentation from Syracuse on university traditions – and some of the restrictions of maintaining a brand – but in social media that isn’t always possible. So, another example… Lagerhaus (like a smaller-scale Ikea) were launching a new online store and wanted to build community (see videos), so they targeted six interior design blogs and created “pop-up online stores” – bloggers could select products from the store’s range and promote them as they liked. That gained media attention, and gained Facebook likes for the store’s Facebook page. And there was then an online store launch, with invitees approached by bloggers, and their pop-up stores continue. So this is a great example of giving control to others, and building authentic interest in your brand.

In terms of dialogue vs monologue I’d quote Michael Dell here, on the importance of engaging in honest, direct conversations with customers and stakeholders. This is all great… But the reality is that of the many who talk about this, many are never actually doing it… Indeed some just shut down spaces when they can’t engage properly. However, Dell has set up a social media listening and command centre: 22k+ posts are monitored daily, engaging 1000+ customers per week. This is tightly integrated with the @dellcares Twitter/Facebook team. They have managed to convert “ranters” to “ravers” in 30% of cases, and have seen a decrease in negative commentary since engaging in this space. Posts need quick responses – a few minutes or hours is great; any longer and it becomes less and less useful…

Similarly we’ve seen Scandinavian countries and banks engaging, even when they have been afraid of negative comments. And this is part of the thing about being part of social media – the ability to engage in dialogue, to be part of and react to the conversations.

Social media is really dynamic, 24×7. You have to move fast to take advantage. So, Lidl… They heard about a scandal in Lithuania about the army paying a fortune for spoons – some were €40 each. So Lidl ran a promotion around being able to get everything, including spoons, cheaper there. It was funny, clever, creative and worked well.

Similarly Starbucks vowing to hire 10,000 refugees in the US (and now in EU) following Trump’s travel ban, that was also being dynamic, responding quickly.

#2 Bold Actions

When we first started doing social media… we faced challenges… Because the future is uncertain… So I want to talk about several social media apps here…

Google+ launched claiming to be bigger than Facebook, to do it all better. Meanwhile WhatsApp… did great… but is disappearing as a brand, at least in Lithuania. SnapChat has posts disappearing quickly… Young people love it. The owner has said that it won’t be sold to Facebook; meanwhile Facebook is trying desperately to copy its functionality. We have clients using SnapChat – fun, but challenging to do well… Instagram has been a big success story… And it is starting to be bigger than Facebook in some demographics.

A little history here… If you look at a world map of social networks from December 2009, quite a lot of countries had their own social networks which were much more popular locally. By 2013 it’s much more Facebook, but there are still some national social media networks in Lithuania and Latvia. And then by 2017 we see uptake of Twitter and Instagram in Africa; still a lot of Facebook. My point here is that things move really quickly. For instance young people love SnapChat, so professionally we need to be there too. You can learn new spaces quickly… But it doesn’t matter that you don’t retain that for long – everything changes fast. For instance in the US I have read that Facebook is banning posts by celebrities where they promote items… That is good; it means they are not sharing other content…

I want to go into depth on Facebook and LinkedIn. Of course the most eminent social media platform is Facebook. They are too big to be ignored: 2 billion monthly active Facebook users (June 2017); 1.28 billion people logging onto Facebook daily; 83 million fake profiles. Users aged 25 to 34 are the biggest age group, at 29.7% of users. For many people, Facebook is the first thing they check in the morning when they wake up. 42% of marketers report that Facebook is very important to their business. And we now have brands approaching us to set up a Facebook presence no matter what their area of work.

What Facebook does well is very precise targeting – the more precise, the more you pay, but that’s OK. That’s based on geolocation, demographic characteristics, social status, interests, even real-time location. It works well, but remember that there are 83 million fake profiles too.

So that’s push – what about pull? Well there are the posts, clicks, etc. And there is Canvas, which works for mobile users: story-driven ads (mini landing pages), creative stories, generating better results and click-through rates. (We are watching a Nespresso mobile Canvas demo.) Another key tool is Livestream – free of charge, with notifications for your followers, and live discussion. But you need to be well prepared and tell a compelling story to make proper use of this. And you can do it from anywhere in the world. For instance, I once saw a livestream of Barack Obama’s farewell – that only had 15k viewers though, so it’s free but you have to work to get engagement.

No matter which tool, “content is king!” (Bill Gates, 1996). Clients want us to create good stories here, but it is hard to do… So what makes the difference? The Content Marketing Institute (US, 2015) suggests:

  1. Content
  2. Photos
  3. Newsletters
  4. Video
  5. Article
  6. Blogs
  7. Events
  8. Infographics
  9. Mobile applications
  10. Conferences and Livestreams

So, I will give some examples here… I’ll show you the recent winner of the Cannes Lions 2017 social media and digital category. This is “Project Graham” – a public driver-safety campaign about how humans are not designed to survive a crash… Here is how we’d look if we were – and this was promoted heavily on social media.

There is help with push from Facebook – the algorithms prioritise content that does well, and auctions to reach your audience mean that it is cheaper to run good content that really works for your audience.

And LinkedIn, meanwhile, is having a renaissance. It was quite dull, but they changed their interface significantly a few months back, and now we see influencers (in Lithuania) using LinkedIn, sharing content there. For instance lawyers have adopted the space. Some were predicting LinkedIn would die, but I am not so sure… It is the biggest professional social network – 467 million users in 200 countries. And it is the biggest network of professionals – a third have a LinkedIn profile. Users spend 17 minutes per day there, 40% use it every day, and 28% of all internet users use LinkedIn. And it really functions as a public CV, for recruitment, and for ambassadorship – you can share richer information here.

I wanted to give a recent example – it is not a sexy-looking case study, but it worked very well. This was work with Ruptela, a high-tech company that provides fleet management based on GPS tracking and real-time vehicle monitoring and control. They needed to rapidly hire 15 new sales representatives via social media. That’s a challenge, as young people – especially in the IT sector – are leaving Lithuania, or working in Lithuania-based expertise centres for UK, Danish, etc. brands.

So we ran a campaign on a tiny budget (incomparable with headhunters, for instance), around “get a job in 2 days”, and successfully recruited 20 sales representatives. LinkedIn marketing is expensive, but very targeted and much cheaper than you’d otherwise pay.

#3 Right Skills

In terms of the skills for these spaces:

  • copywriter (for good storytelling)
  • visualist (graphics, photo, video)
  • community manager (to maintain appropriate contact) – the skills for that cannot be underestimated.
  • And… Something that I missed… 

You have to be like a one man band – good at everything. But then we have young people coming in with lots of those skills, and can develop them further…

So, I wanted to end on a nice story/campaign… An ad for Budweiser about not drinking and driving.

Q&A

Q1) Authenticity is the big thing right now… But do you think all that “authentic” advertising content may get old and less effective over time?

A1) People want to hear from their friends, from people like them, in their own words. Big brands want that authenticity… But they also want total control which doesn’t fit with that. The reality is probably that something between those two levels is what we need but that change will only happen as it becomes clear to big brands that their controlled content isn’t working anymore.

Q2) With that social media map… What age group was that? I didn’t see SnapChat there.

A2) I’m not sure, it was a map of dominant social media spaces…

Q3) I wanted to talk about the hierarchy of content… Written posts, visual content etc… What seemed to do best was sponsored video content that was subtitled.

A3) Facebook itself prioritises video content – it is cheaper to use this in your marketing. If you do video then yes, you have to have subtitles, so that people can read rather than listen to the videos… And video, especially “authentic video”, will be heavily prioritised by Facebook. So we are doing a lot of video work.

Introduction to ECSM 2018 Niall Corcoran, Limerick Institute of Technology, Ireland

I wanted to start by thanking our hosts; Vilnius has been excellent this year. Next year we’ll be a bit earlier in the year – late June – and we’ll be at the Limerick Institute of Technology, Ireland. We have campuses around the region with 7000 students and 650 staff, teaching from levels 6 to 10. The nearest airport is Shannon, and we are within easy distance of Cork or Dublin airports.

In terms of social media we do research on the Social Media Interactive Learning Environment, the Limerick Interactive Storytelling Network, social media for teaching and research, and social media for cancer recovery.

In terms of Limerick itself, 80-90% of Europe’s contact lenses are manufactured there! There is a lot of manufacturing in Limerick, with many companies having their European headquarters there. So, here is a short video made by one of our students to give you a sense of the town. And we hope to see you there next year!

Social Media Competition Update

The top three placed entries are: Developing Social Paleontology – Lisa Lundgren; EDINA Digital Footprint Consulting and Training Service – Nicola Osborne (yay!); Traditions Mobile App – Adam Peruta.

Stream A: Mini track on Ethical use of social media data – Chair: Dragana Calic

The Benefits and Complications of Facebook Memorials – White Michelle, University of Hawai’i at Manoa, USA

I wanted to look at who people imagine their audience is for these memorials. This research came about because a death made me look at these pages, and I decided to look into them in more depth.

So, I’m using danah boyd’s definition of social networking here. We are talking Facebook, Twitter, SnapChat etc. So, a Facebook memorial is a group that is created specifically to mark the death of a friend or family member – or a public figure (e.g. Michael Jackson).

Robert Zebruck and Brubecker talk about the imagined audience and the flattening of realities. So, right now I can see people in the room: I can see who you are, how you react, how to modify my tone or style to meet you, to respond to you. But it is hard to do that on social media – we see context collapse. And we can be sat there alone at our computer and not have that sense of being public. Sometimes with memorials we will say things for that audience, but in other cases perhaps it is sharing memories of drinking together, or smoking weed with someone… Memories that may jar with others.

It was a long road to get to this research. My review board were concerned about the emotional distress of interviewees. I agreed in the end to interview via Skype or Facebook, and to check everything was OK after every question, to make it easier to see and review their state of mind. I had to wait over a year to interview people, the death had to not be by suicide, and the participants had to be over 18 years old. So I conducted qualitative research over Skype and Facebook… And I found interviewees by looking at memorial pages that are out there – there are loads, not all labelled as memorials.

So, my data… I began by asking who people thought they were talking to… Many hadn’t thought about it. They talked about family members, friends… Even in a very controlled group you can have trolls and haters who get in… But often people assumed that other people were like them. A lot of people would write to the deceased – as if visiting a grave, say. I asked if they thought the person could hear or understand… But they hadn’t really thought about it; it felt like the right thing to do… And they wanted family and friends to hear from them. They felt likes, shares, etc. were validating and therapeutic, and that sense of connection was therapeutic. Some even made friends – through going out drinking, or family gatherings – with friends of friends who they hadn’t met before…

This inability to really think about or understand the imagined audience led to context collapse. Usually family is in charge of these pages… And that can be challenging… For instance an up-and-coming football star died suddenly, and then it became evident that it was the result of a drug overdose… And that was distressing for the family, who tried to remove that content. There is an idea of alternative narratives. Fake news or alternative facts have a particular meaning right now… But we are all used to presenting ourselves in a particular way to different friends, etc. On one memorial site the deceased had owed money to a friend, who still felt owed that money and was posting about it – like a fight at the funeral… It’s very hard to monitor ourselves and other people…

And there was fighting about who owned the person… Some claiming that someone was their best friend, fights over who was more important or more influential. It happens in real life… But not quite as visibly or with all involved…

So, in conclusion… There are a lot of benefits to Facebook memorials. Psychologists talk of the benefit of connecting, grieving, not feeling alone, getting support. Death happens. We are usually sad when it happens… Social networking sites provide another way to engage and connect. So if I’m in Lithuania and there is a funeral in Hawaii that I can’t travel to, I can still connect. It is changing our social norms, and how we connect. But we can do more to make it work better – safety and security need improving. Facebook have now added the ability to will your page to someone. And now if someone dies you can notify Twitter – it changes the account slightly, birthday reminders no longer pop up, and it acts as a memorial. There are new affordances.

Personally, doing this research was very sad, and it’s not an area I want to continue looking at. It was emotionally distressing for me to do this work.

Q&A

Q1) I am old enough to remember LiveJournal and remember memorials there. They used to turn a page into a memorial, then were deleted… Do you think Facebook should sunset these memorials?

A1) I personally spoke to people who would stare at the page for a month, expecting posts… Maybe you go to a funeral, you mourn, you are sad… But that page sticking around feels like it extends that… But I bet Mark Zuckerberg has some money making plan for keeping those profiles there!

Q2) What is the motivation for such public sharing in this way?

A2) I think young people want to put it out there, to share their pain, to have it validated – “feel my pain with me”. One lady I spoke to, her boyfriend was killed in a mass shooting… Eventually she couldn’t look at it, it was all debate about gun control and she didn’t want to engage with that any more…

Q3) Why no suicides? I struggle to see why they are automatically more distressing than other upsetting deaths…

A3) I don’t know… But my review board thought it would be more distressing for people…

Q4) How do private memorials differ from celebrity memorials?

A4) I deliberately avoided celebrities, but also my IRB didn’t want me to look at any groups without permission from every member of that group…

Comment) I’ve done work with public Facebook groups, my IRB was fine with that.

A4) I think it was just this group really… But there was concern about publicly identifiable information.

Online Privacy: Present Need or Relic From the Past? – Aguirre-Jaramillo Lina Maria, Universidad Pontificia Bolivariana, Colombia

In their influential essay “The Right to Privacy” in the Harvard Law Review (1890), Warren and Brandeis defined privacy as “the right to be let alone”. But in the last ten years or so we have seen sharing of information that not long ago would have been seen, and expected, to be private. Earl Warren was a famous US judge, and he said “the fantastic advances in the field of electronic communication constitute a greater danger to the privacy of the individual”.

We see privacy particularly threatened by systematic data collection. Mark Zuckerberg’s claim that “privacy is no longer a social norm” (widely reported in 2010) has been used as evidence of disregard toward users’ rights and data – the manner in which data is stored, changed and used, and the associated threats. But we also see counter-arguments, such as the American Library Association’s Privacy Revolution campaign.

So, this is the context for this work in Colombia. It is important to understand the literature in this area, particularly around data use, data combinations, and the connection between privacy concerns and behaviours online (Joinson et al 2008). And we also refer to the work of Sheehan (2002) on the characterisation of online users. Particularly we are interested in new privacy concerns and platforms, especially Facebook. The impact of culture on online privacy has been studied by Cho, Rivera Sanchez and Lim (2009).

The State of the Internet from the OxII found that Colombia has between 40 and 60% of people online. Internet uptake is, however, lower than in e.g. the US. And in Colombia, 46% of our population is aged 25-54.

So, my study is currently online. A wider group is also engaging in personal and group interviews. Our analysis will focus on: what background knowledge, risk and privacy awareness there is amongst participants; what self-efficacy level is revealed by participants – their knowledge and habits; and what interest and willingness there is to acquire more knowledge and gain more skills to manage privacy. At a later stage we will be building a prototype tool.

Our conclusions so far… Privacy is hard to define and we need to do more to define it. Privacy is not a concept articulated in one single, universally accepted definition. Different groups trade off privacy differently. Relevant concepts here include background knowledge, computer literacy, privacy risk, and self-efficacy.

And finally… Privacy is still important, but its importance is often ignored in the wider culture. Privacy is not a relic but a changing necessity…

Q&A

Q1) Did age play a role in privacy? Do young people care as much as older people?

A1) They seem to care when they hear stories of peers being bullied or harassed, or hear stories of hacked Instagram accounts. But their idea of privacy is different. There is information that they do not want to have public or stolen. So we are looking more at that, and also at the need to understand how they want to engage with privacy. As my colleague Nicola Osborne from Edinburgh said in her presentation yesterday, we have to remember students already come in with a long internet/social media history and presence.

Q2) I was wondering about cultural aspect… Apps used and whether privacy is important… For instance SnapChat is very exhibitionist but also ephemeral…

A2) I don’t have full answers yet but… Young people share on SnapChat and Instagram to build popularity with peers… But almost none of them are interested in Twitter… At least that’s the case in Colombia. But they do know some content on Facebook may be more vulnerable than on SnapChat and Instagram… It may be that they have the idea of SnapChat as a space they can control, perhaps…

Q3) I often feel more liberal with what I share on Facebook than students who are 10 or 15 years younger… I have some privacy settings but don’t think about the long story of that… From my experience students are a lot more savvy in that way… When they first come in, they are very aware of that… They don’t want a bigger footprint there…

A3) That is not exactly true in Colombia. The idea of a digital footprint affecting your career is not a thing in the same way… They are just becoming aware of it… But that idea of exhibitionism… I have found that most of the students in Colombia seem quite happy to share lots of selfies and images of their feet… That became a trend in other countries about three years ago… They don’t want to write much… Just to say “I’m here”… And there has been some interesting research in terms of the selfie generation and ideas of expressing yourself and showing yourself… It may be partly to do with other issues… In Colombia many young women have plastic surgery – this came out of the 1980s and 1990s… Many women, young women, have cosmetic surgery and want to share that… More on Instagram than Pinterest – Pinterest is for flowers and little girlie things…

Q4) You were talking about gender, how do privacy attitudes differ between males and females?

A4) The literature review suggests women tend to be more careful about what they publish online… They may be more careful selecting networks and where they share content… More willing to double check settings, and to delete content they might have difficulty explaining… Also more willing to discuss issues of privacy… Things may change over time… Suggestion that people will get to an age where they do care more… But we also need to see how the generation that have all of their images online, even from being a baby, will think about this… But generally seems to be slightly more concern or awareness from women…

Comment) I wanted to just follow up the Facebook comment and say that I think it may not be age but experience of prior use that may shape different habits there… Students typically arrive at our university with hundreds of friends having used Facebook since school, and so they see that page as a very public space – in our research some students commented specifically on that and their changing use and filtering back of Facebook contacts… For a lot of academics and mid career professionals Facebook is quite a private social space, Twitter plays more that public role. But it’s not age per se perhaps, it’s that baggage and experience.

Constructing Malleable Truth: Memes from the 2016 U.S. Presidential Campaign – Wiggins Bradley, Webster University, Vienna, Austria, Austria

Now, when I wrote this… Trump was “a candidate”. Then he was the nominee. Then president-elect… And now President. And that’s been… surprising… So that’s the context.

I look at various aspects in my research, including internet memes. So, in 2008 Obama’s campaign was great at using social media, at getting people out there sharing and campaigning for them on a voluntary and enthusiastic basis. 2016 was the meme election, I think. Now, people researching memes feel they must refer to Richard Dawkins talking about memes. He meant ideas… That’s not the same as internet memes… So what are the differences between Dawkins’ memes and internet memes? Well, honestly, they are totally different EXCEPT that they require attention and have to be reproducible…

Mikhail Bakhtin wrote about the Carnivalesque as something that subverts the dominant mode or perspective, it turns the world on its head… The king becomes the jester and the jester becomes the king. So the Trump tie memes… We need no text here, the absurd is made more absurd. It is very critical. It has that circus level laugh… He’s a clown or a buffoon… You know about it and how to reproduce this.

In terms of literature… There is work on memes, but I think to understand memes with millennials, but also baby boomers, even people in their 70s and 80s… we have to go back to major theorists, concepts and perspectives – Henry Jenkins, Erving Goffman, etc. This is a new mode of communication, I think – not a new language, but a new mode.

So, method-wise… I wanted to do a rhetorical-critical analysis of selected internet memes from the Facebook page Bernie Sanders’ Dank Meme Stash, which had over 420k members when I wrote this slide – more now. It was founded by a college student in October 2015. And there are hundreds of thousands of memes there. People create and curate them.

Two months before and one month after the US election I took two sets of samples: memes that received 1000 or more likes/retweets, and memes that received at least 500 likes/reactions and at least 100 shares. As an unexpected side note, I found that I needed to define “media narrative”. There doesn’t seem to be a good definition. I spoke to Brooke Gladstone of WNYC, I spoke with colleagues in Vienna… We don’t usually take time to think about media narrative… For instance the shooting at Pulse nightclub has a narrative on the right around gun control; for others it is around it being a sad and horrible event…

So, media narrative I am defining as:

  1. Malleable depending upon the ability to ask critical questions;
  2. Able to shape opinion as well as perceptions of reality and a person’s decision-making process; and
  3. Linguistic and image-based simulations of real-world events which adhere and/or appeal to ontologically specific perspectives, which may include any intentional use of stereotyping, ideology, allegory, etc.

Some findings… The relational roles between image and text are interchangeable because of the relationship to popular culture. Barthes (1977) talks about the text loading the image, burdening it with a culture, a moral, an imagination. And therefore the text in internet memes fluctuates depending on the intended message and the dependence on popular culture.

So, for instance, we have an image from Nightmare at 20,000 Feet, a classic Twilight Zone image… You need to know nothing here, and if I replace the monster’s face with Donald Trump’s face… it’s instantly accessible and funny. But you can put any image there, depending on the directionality of the intended meaning. So the mytheme or function of the monster/devil/etc. can be replaced by any other monster… It doesn’t matter; the reaction will depend on your audience.

Back to Barthes (1977) again – I find him incredibly salient to the work I’ve done here. One thing emerging from this, and from earlier work on Russian memes, is the idea of polysemic directionality: a meme has one direction and intentionality… no matter what version of the image you use…

So, here’s a quick clip from The Silence of the Lambs. Here is Buffalo Bill, who kills women and skins them… A very scary character… We have him being a disturbing advisor in memes. If you get that reference it has more weight, but you don’t need to know the reference.

We have the image of Hillary as Two-Face, and we have Donald as The Joker… And a poster saying “Choose”. The vitriol directed at Clinton was far worse than that at Trump… Perhaps because Sanders supporters were disappointed at not getting the nomination.

We have intertextuality, and we also have inter-memetic references… For example the “Hillary deletes the electoral college” meme, which plays on the “grandma on the internet” memes… You also have Superman vs Trump – particularly relevant to immigrant populations (Jenkins 2010).

So, conclusions… The construction of a meme is affected by, and dependent on, the media around it… That is crucial… We have heard about fake news, and we see memes in support of that fake news… And you may see that on all sides here. Intertextual references rely on popular culture, and inter-memetic references assume knowledge – a new form of communication. And I would argue that memes are a digital myth – I think Lévi-Strauss might agree with me on that…

And to close, for your viewing pleasure, the Trump Executive Order meme… The idea of a meme, an idea that can be infinitely replaced with anything really…

Q&A

Q1) This new sphere of memes… Do you think that Trump represents a new era of presidency… Do you think that this will pass? With Trump posting to his own Twitter account…

A1) I think that it will get more intense… And offline too… We see stickers in Austrian elections with meme-like images… These are tools for millennials. They are hugely popular in Turkey… Governments in Turkey, Iran and China are using memes as propaganda against other parties… I’m not sure it’s new, but we are certainly more aware of it… Trump is a reality TV star with the nuclear keys… That should scare us… But memes won’t go away…

Q2) In terms of memes in real life… What about bumper stickers… ? They were huge before… They are kind of IRL memes…

A2) I am working on a book at the moment… And one of the chapters is on pre-digital memes. In WWII soldiers used to write “Kilroy was here”. Is Magritte’s Ceci n’est pas une pipe a meme? There is definitely a legacy of that… So yes, but it depends on national and regional context…

Q3) So… In Egypt we saw memes about Trump… We were surprised at the election outcome… What happened?

A3) Firstly, there is that bias of a reinforcing narrative… If you looked at the Sanders meme page you might have had the idea that Clinton could not win because, for whatever reason, these people hated Hillary. Real rage and hatred towards her… And Trump as a clown Hitler… It won’t happen… Then it did… Then rage against him went up… after the Muslim ban, the women’s march, etc…

Q4) There are some memes that seem to be everywhere – Charlie and the Chocolate Factory, Sean Bean, etc… Why are we picking those specific particular memes of all things?

A4) Like the Picard WTF meme… KnowYourMeme.com is a great resource… In the scene that Picard image is from, he’s reciting Shakespeare to get Lwaxana Troi away from the aliens… It doesn’t matter… But it just fits, it has a meaning.

Q5) Gender and memes: I wondered about the aspect of gender in memes, particularly thinking about Clinton – many of those reminded me of the Mary Beard memes and trolling… There are trolling memes – the frog for Trump… the semi-pornographic memes against women… Is there more to that than just (with all her baggage) Clinton herself?

A5) Lisa Silvestri from Gonzaga, Washington State, and Limor Shifman in Tel Aviv do work in that area. Shifman looks at online jokes of all types and has done some work on gender.

Q6) Who makes memes? Why?

A6) I taught a course on internet memes and cultures – one of the best-attended courses ever. My students concluded that the author didn’t matter… But look at 4chan and Reddit or Know Your Meme… and you can sometimes tell who created a meme… But does that matter? It’s almost a public good. Who cares who created the Trump tie meme? With United Airlines you can see how that video turned into a meme… and the company lost millions in stock value.

Stream B: Mini track on Enterprise Social Media – Chair: Paul Alpar

The Role of Social Media in Crowdfunding – Makina Daniel, University of South Africa, Pretoria, South Africa

My work seeks to find the connection between social media and finance, specifically crowdfunding. The paper introduces the phenomenon of crowdfunding, and how the theory of social networking underpins social media. The theory around social media is still developing… It is underpinned by theories of information systems and technology adoption, though what happens on social media has different characteristics.

So, a definition of crowdfunding. Crowdfunding is essentially an aspect of crowdsourcing, spurred by ubiquitous web 2.0 technologies. And “Crowdfunding refers to the efforts of entrepreneurial individuals and groups – cultural, social and for-profit – to fund their ventures by drawing on relatively small contributions from a relatively large number of individuals using the internet, without standard financial intermediaries” (Mollick 2014).

Since 2010 there have been growing amounts of money raised globally through crowdfunding. Forbes estimates $34 billion in 2015 (compared to $16 billion in 2014, and $880 million in 2010). The World Bank estimates that crowdfunding will raise $93 billion annually by 2025. This growth could not have been achieved in the absence of internet technology, and social media are critical in promoting this form of alternative finance.
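
As a rough sanity check of those figures (my own back-of-the-envelope arithmetic, not the speaker’s), the World Bank projection implies a much slower compound growth rate than the 2014-15 jump:

```latex
% Growth implied by the figures quoted above (illustrative arithmetic only):
% 2014 -> 2015: 34/16 - 1 = 1.125, i.e. roughly 112% growth in one year;
% 2015 -> 2025: a compound annual growth rate of
\[
\text{CAGR} = \left(\frac{93}{34}\right)^{1/10} - 1 \approx 0.106 \approx 10.6\% \text{ per year.}
\]
```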

Cheung and Lee (2010) examined social influence processes in determining collective social action in the context of online social networks. Their model shows intentional social action, with users considering themselves part of the social fabric. And they explain three processes of social influence: subjective norm – the self outside of any group; group norm – self-awareness as a member of a group; and social identity – the self in context. Other authors explain social media’s popularity through a lack of trust in traditional media, with people wary of information that emanates from people they do not know personally. Kaplan and Haenlein (2010) define social media as “a group of internet-based applications that build on the ideological and technological foundations of web 2.0 applications that allow the creation and exchange of user generated content”. So it is a form of online interaction that enables people to create, comment on, share and exchange content with other people.

So, how does social media facilitate finance, or crowdfunding? Since social media assists in maintaining social ties, this should in turn aid the facilitation of crowdfunding campaigns. We can draw on Linus’s Law: “given enough eyeballs, all bugs are shallow”. Large groups are more adept at detecting potential flaws in a campaign than individuals (alone), thus preventing fraudulent campaigns from raising money for crowdfunding projects. Facebook, Twitter, etc. provide spaces for sharing and connection, and are therefore suitable for crowdfunding campaigns. Studies have shown that 51% of Facebook users are more likely to buy a product after becoming a fan of the product’s Facebook page (Knudsen 2015).

Brossman (2015) views crowdfunding as existing in two phases: (i) brand awareness, and (ii) targeting people to support/back one’s campaign. Crowdfunding sites such as Kickstarter and IndieGoGo allow project creators to publish pertinent information and updates, as well as to link to social media. Those connections help deal with the relative lack of social networking functionality within the platforms themselves, where creators are able to publish project descriptions, draw on a community of users, and utilise web 2.0 technologies that allow users to comment on projects and attract money.

A study by Moisseyez (2013) of 100 Kickstarter projects found a connection between social media approval and success in funding. Mollick (2014) observed that crowdfunding success is associated with having a large number of friends in online social networks: a founder with ten Facebook friends would have a 9% chance of succeeding; one with 100 friends a 20% chance; and one with 1000 friends a 40% chance. He cited a film industry example where more friends mapped to a much higher potential success rate.
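
As a hedged aside (my own reading of those three figures, not a model from Mollick’s paper), the reported rates sit almost exactly on a straight line in log-odds against the logarithm of the friend count F:

```latex
% Back-of-the-envelope fit to the three quoted success rates (illustrative only).
\[
\operatorname{logit}(p) = \ln\frac{p}{1-p}:\qquad
\operatorname{logit}(0.09) \approx -2.31,\quad
\operatorname{logit}(0.20) \approx -1.39,\quad
\operatorname{logit}(0.40) \approx -0.41
\]
\[
\text{so, roughly,}\qquad \operatorname{logit}(p) \approx -3.3 + 0.95 \log_{10} F .
\]
```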

So, in conclusion, we don’t have many studies in this area yet. But social media is observed to aid crowdfunding campaigns through its ability to network disparate people through the internet. One notable feature is that although there are many forms of social media, crowdfunding utilises a limited number of spaces, primarily Facebook and Twitter. Furthermore, future research should examine how the expertise of the creator (the requestor of funds), project type, social network, and online presence influence motivations.

Q&A

Q1) I was wondering if you see any connection between the types of people who back crowdfunding campaigns and the particular patterns of social media use, or popularity, that are being found. For instance, anecdotally, the people who back lots of crowdfunding campaigns – not just one-offs – tend to be young men in their 30s and 40s. So I was wondering about that profile of backers and what that looks like… And if that profile of backer is part of what makes those social media approaches work.

A1) The majority of people using social media are young people… But young people as sources of finance for, say, small businesses… They are mainly likely to be either studying or starting a professional career… Not accumulating money to give out… So we see a disconnect between who is on social media – on Twitter, Facebook, etc. – and who can provide finance… You end up raising funding from people who cannot give much… If people in mid-career were the heaviest social media users, one would expect more money coming from crowdfunding… One aspect of crowdfunding is that you are asking for small amounts… And young people are able to spare that much…

Q2) So most people giving funding on crowdfunding sites are young people, and they give small amounts…

A2) Yes… That comes from Mollick’s data… combined with evidence about who is using Facebook…

Q2) What about other specialised crowdfunding networks… ?

A2) There is more work to be done. But even small crowdfunding networks will connect to supporters through social media…

Q3) Have you looked at the relative offerings of the crowdfunding campaigns?

A3) Yes, technology products are more successful on these platforms than other projects…

Using Enterprise Social Networks to Support Staff Knowledge Sharing in Higher Education – Niall Corcoran, Limerick Institute of Technology, Ireland and Aidan Duane, Waterford Institute of Technology, Ireland

This work is rooted in knowledge management, which is the basis for the whole study. So I wanted to start with Ikujiro Nonaka: “in an economy where the only certainty is uncertainty…” And Lew Platt, former CEO of Hewlett-Packard, said “If HP knew what HP knows it would be three times more productive” – highlighting the crucial role of knowledge sharing.

Organisations can gain competitive advantage through encouraging and promoting knowledge sharing – that’s the theory at least. It is particularly important in knowledge-intensive organisations, such as public HEIs. HEIs need to compete in a global marketplace… We need to share knowledge… Do we do this?

And I want to think about this in the context of social media. We know that social media enable the creation, sharing and exchange of information, ideas and media in virtual communities and networks. And organisational applications are close to some of the ideals of knowledge management: they support group interaction towards establishing communities; enable the creation and sharing of content; can help improve collaboration and communication within organisations; and have distinct technological features that are ideally suited to knowledge sharing. Some see this as a fundamental disruption of knowledge management, and social media is reinvigorating knowledge management as a field.

We do see Enterprise Social Networks (ESN). But if you just bring one into an organisation, people don’t necessarily just go and use it. People need a reason to share. So another aspect is communities of practice (Lave and Wenger 1991), an important and increasingly used knowledge management strategy. These are groups of people who share a passion for something – loose and informal social structures, largely voluntary, and about sharing tacit knowledge. So Communities of Practice (CoP) tend to meet from time to time – in person or virtually.

ESN can be used to create virtual communities. This is particularly suitable for distributed communities – our university has multiple campuses for instance.

So, knowledge sharing in HEIs… Well, many don’t do it. A number of studies have shown that KM implementation and knowledge sharing in HEIs is at a low level. Why? Organisational culture, organisational structures, and bureaucratic characteristics. And there is a well documented divide/mistrust between faculty and support staff (silos) – particularly in work from Australia, the US and the UK. So, can CoP and ESN help? Well, in theory they can bypass the structures that reinforce silos. That’s an ideal picture; whether we get there is a different thing.

So our research looked at what the antecedents for staff knowledge sharing are, and what the dominant problems in the implementation of ESN and CoP are. The contextual setting here is Limerick Institute of Technology. I used to work in IT services and this work came significantly from that interest. There is a significant practical aspect to the research, so action research seemed like the most appropriate approach.

So we had a three-cycle action research project. We looked at Yammer. It has all the features of social networking you’d expect – you can engage in conversations, and content can be tagged, shared and uploaded. It lends itself well to setting up virtual communities, with very flexible and powerful tools for them. We started from scratch and grew to 209 users.

Some key findings… We found culture and structure are major barriers to staff knowledge sharing. We theorised that and found it to be absolutely the case. The faculty–staff divide in HEIs exacerbates the problem. Management have an important role to play in shaping a knowledge sharing environment. The existence of CoP is essential to building a successful knowledge sharing environment, and community leaders and champions are required for the ESN. Motivation to participate is also crucial. If people feel motivated, and they see benefit, that can be very helpful. And those benefits can potentially lead to culture change, which then affects motivation…

We found that our organisation has a strong hierarchical model – very bureaucratic and rigid. Geographic dispersal doesn’t help. To fix this we need to move away from a transactional culture. The current organisational structure contributes to the faculty–staff divide, and limits opportunities and motivations for staff and faculty to work together. But we also found examples where they were working well together. And in terms of the role of management, they have significant importance, and have to be involved to make this work.

Virtual communities as a knowledge management strategy have the potential to improve collaboration and interaction between staff, but they have to be seen as valued, relevant, a valid work activity. On staff motivation, there are some highly motivated people, but not all. Management have to understand that.

So management need to understand the organisational culture; recognise the existence of structural and cultural problems; etc. Some of the challenges here are the public sector hierarchical structures – public accountability, media scrutiny, transactional culture, etc.

Q&A

Q1) On a technical level, which tools are most promising for tacit knowledge sharing…

A1) The whole ability to have a conversation. Email doesn’t work for that – you can’t branch threads… That is a distinctive feature of Yammer groups, as is the ability to like/view/be an onlooker in a conversation. We encourage people to like something if they read it, to show that it is useful. But the key things are the ability to have a proper conversation, and to organise meetings and conversations in real time.

Q2) What kind of things are they sharing?

A2) We’ve seen some communities that are large, and they have a real sense of collaboration. We’ve had research coming out of that, some really positive outcomes.

Q3) Have you seen any evidence of use in different countries… What are barriers across different regions, if known?

A3) I think the barriers are similar to the conceptual model (in the proceedings) – both personal and organisational barriers… People are largely afraid to share stuff… They are nervous of being judged… Also, engagement on this platform might make managers think that they are not working. Age is a limiting factor too – economic issues mean we haven’t recruited new staff for almost 10 years, so we are older as a staff group.

Q3) It might be interesting to compare different cultures, with Asian cultures more closed, I think…

A3) Yes, that would be really interesting to do…

Q4) I am trying to think how and what I might share with my colleagues in professional services, technical staff, etc.

A4) The way this is constructed is in communities… We have staff interested in using Office 365 and Classroom Notebook, and so we set up a group to discuss that. We have champions who lead that group and guide it. So what is posted there would be quite specific… But in Yammer you can also share to all… We monitor and also train our users in how and where to post… You can sign up for groups or create new groups… And it is moderated. But it is not limited to specifically work-related groups – sports and social groups are there too. And that helps grow the user base and helps people see benefits.

Q5) Have you looked at Slack at all? Or done any comparison there?

A5) We chose Yammer because of price… We have it in O365 – a very practical reason… We have looked at Slack but have done no direct comparison.

Finalists in the Social Media in Practice Excellence Competition present their Case Histories

EDINA Digital Footprint Consulting and Training Service – Nicola Osborne

No notes for this one…

Developing Social Paleontology – Lisa Lundgren

This is work with a software development company, funded by the National Science Foundation. It was a project to develop a community of practice around paleontology… People often think “dinosaurs” but actually it’s about a much wider set of research and studies of fossils. For our fossil project to meet its goal – to develop and support that community – we needed to use social media. So we have a My Fossil community, which is a closed community space, but also a Facebook group and Twitter presence. We wanted to use social media in an educative way to engage the community with our work.

We began with design studies which looked at what basic elements contribute to engagement with social media, and how to engage. We were able to assess practical contributions and build an educative and evidence-based social media plan. So we wanted to create daily posts using social paleontology, e.g. #TrilobiteTuesday; design branded, image-focused posts that are practice-specific, meet design principles, and often hyperlink to vetted paleontological websites; and respond to members in ways that encourage chains of communication. There is a theoretical contribution here as well. And we think there are further opportunities to engage more with social paleontology and we are keen for feedback and further discussion. So, I’m here to chat!
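
As a minimal sketch of what a themed daily-posting plan of this kind might look like in code – only #TrilobiteTuesday comes from the talk; the other hashtag and all the names here are hypothetical placeholders:

# Hypothetical weekday -> themed-hashtag schedule of the kind described above.
# Only #TrilobiteTuesday is from the talk; the rest are invented placeholders.
import datetime
from typing import Optional

WEEKLY_THEMES = {
    "Tuesday": "#TrilobiteTuesday",   # mentioned in the talk
    "Friday": "#FossilFriday",        # placeholder
}

def todays_theme(today: Optional[datetime.date] = None) -> Optional[str]:
    """Return the scheduled hashtag for a given date, if any."""
    today = today or datetime.date.today()
    return WEEKLY_THEMES.get(today.strftime("%A"))

print(todays_theme(datetime.date(2017, 7, 4)))  # a Tuesday -> #TrilobiteTuesday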

Traditions Mobile App – Adam Peruta.

When new university students come to campus they have lots of concerns, like: what is this place, where do I fit in, how can I make new friends? That is particularly the case at small universities, which want to ensure students feel part of the community and want to stay around. This is where the Traditions Challenge app comes in – it provides challenges and activities to engage new students in university traditions and features. This was trialled at Ithaca College. So, for instance, we encourage students to go along to events and meet other new students. We encourage students to meet their academic advisors outside of the classroom, to explore notable campus features, and to explore the local community more – like the farmers market. So we have a social feed – you can like and comment – plus an event calendar, a history of the school, etc. And the whole process is gamified: you gain points through challenges, and you can go on the leaderboard, so there are incentives to gain status… And there are prizes too.

Looking at the results this year… We had about 200 students who collectively completed over 1400 challenges; the person who completed the most (and won a shirt) completed 53 challenges. There are about 100 challenges in the app, so it’s good they weren’t all done in one year. And we saw over 50k screen views, so we know that the app is getting attention whether or not people engage in the challenges. Student focus groups raised themes of: enjoyment of the challenge list; motivation for participation (which varied); app design and user experience – if there’s one key takeaway, it is that this demographic has really high expectations for user interface, design and tone; and contribution to identity… There is lots of academic research showing that the more students are engaged on campus, the more likely they are to remain at that university and to remain engaged through their studies and as alumni. So there is loads of potential here, and opportunity to do more with the data.
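
For a rough sense of scale, the pilot numbers quoted above work out as follows (simple arithmetic only; the “over 1400” and “over 50k” figures are treated as exact for illustration):

# Rough per-student averages from the pilot figures quoted above.
students = 200
challenges_completed = 1400   # "over 1400", treated as exact here
top_user_challenges = 53
total_challenges = 100        # "about 100" challenges in the app
screen_views = 50_000         # "over 50k"

print(f"challenges per student: {challenges_completed / students:.0f}")                         # 7
print(f"top user's share of the challenge list: {top_user_challenges / total_challenges:.0%}")  # 53%
print(f"screen views per student: {screen_views / students:.0f}")                               # 250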

So: the digital experience is preferred; mobile development is expensive and time-consuming; good UI/UX is imperative to success; universities are good at protecting their brands; and we learned that students really want to augment their on-campus academic experiences.

Conference organiser: Those were the finalists from yesterday, so we will award the prizes for first, second and third… and the PhD prize…

Third place is Lisa; Second place is me (yay!); First place is Adam and the Traditions mobile app.

I’m going to rely on others to tweet the PhD winners…

The best poster prize went to IT Alignment through Artificial Intelligence – Amir – this was mainly based on Amir’s performance: his poster went missing, so he had to present from an A4 version of it, and he did a great job of presenting.

Thank you to our hosts here… And we hope you can join us in Limerick next year!

Thanks to all at ECSM 2017.