Today I am at Repository Fringe which runs today and tomorrow in Edinburgh and is celebrating 10 years of Repofringe! I’m just here today – presenting a 10×10 on our recent Reference Rot in Theses: A HiberActive Pilot project work – and will be blogging whilst I’m here. As usual, as this is live, this may include the odd typo or error, so all comments, corrections, questions, additions, etc. are very much welcomed!
Welcome – Janet Roberts, Director of EDINA
My colleagues were explaining to me that this event came from an idea from Les Carr that there should be not just one repository conference, but also a fringe – and here we are at the 10th Repository Fringe on the cusp of the Edinburgh Fringe.
So, this week we celebrate ten years of repository fringe, and the progress we have made over the last 10 years to share content beyond borders. It is a space for debating future trends and challenges.
At EDINA we established the OpenDepot to provide a space for those without an institutional repository… That has now migrated to Zenodo… and the challenges are changing, around the size of data, how we store and access that data, and what those next generation repositories will look like.
Over the next few days we have some excellent speakers as well as some fringe events, including the Wiki Datathon – so I hope you have all brought your laptops!
Thank you to our organising team from EDINA, DCC and the University of Edinburgh. Thank you also to our sponsors: Atmire; FigShare; Arkivum; ePrints; and Jisc!
Opening Keynote – Kathleen Shearer, Executive Director, COAR: Raising our game – repositioning repositories as the foundation for sustainable scholarly communication
Theo Andrew: I am delighted to introduce Kathleen, who has been working in digital libraries and repositories for years. COAR is an international organisation of repositories, and I’m pleased to say that Edinburgh has been a member for some time.
Kathleen: Thank you so much for inviting me. It’s actually my first time speaking in the UK and it’s a little bit intimidating as I know that you folks are really ahead here.
COAR now has about 120 members. Our activities fall into four areas: presenting an international voice so that repositories are part of a global community with diverse perspectives. We are being more active in training for repository managers, something which is especially important in developing countries. And the other area is value added services, which is where today’s talk on the repository of the future comes in. The vision here is about…
But first, a rant… The international publishing system is broken! And it is broken for a number of reasons – there is access, and the cost of access. The cost of scholarly journals goes up far beyond the rate of inflation. That touches us in Canada – where I am based, in Germany, in the UK… But much more so in the developing world. And then we have the “Big Deal”. A study of University of Montreal libraries by Stephanie Gagnon found that of 50k subscribed-to journals, really there were only 5,893 unique essential titles. But often those deals aren’t opted out of as the key core journals separately cost the same as that big deal.
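To put those Big Deal numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the two figures quoted in the talk; the “50k” is a rounded number, so treat the result as approximate.

```python
# Figures as quoted in the talk (University of Montreal study).
subscribed_titles = 50_000   # "50k" in the talk, so an approximate figure
essential_titles = 5_893     # unique essential titles reported

# Share of the subscribed bundle that was found to be essential.
essential_share = essential_titles / subscribed_titles

print(f"Essential titles: {essential_share:.1%} of the bundle")   # 11.8%
print(f"Everything else:  {1 - essential_share:.1%}")             # 88.2%
```

So on these figures only around one title in eight or nine in the bundle was essential.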
We also have a participation problem… Juan Pablo Alperin’s map of authors published in Web of Science shows a huge bias towards the US and the UK, a seriously reduced participation in Africa and parts of Asia. Why does that happen? The journals are operated from the global North, and don’t represent the kinds of research problems in the developing world. And one Nobel Prize winner notes that the pressure to publish in “luxury” journals encourages researchers to cut corners and pursue trendy fields rather than areas where there are those research gaps. That was the case with Zika virus – you could hardly get research published on that until a major outbreak brought it to the attention of the dominant publishing cultures, then there was huge appetite to publish there.
Timothy Gowers talks about “perverse incentives” which are supporting the really high costs of journals. It’s not just a problem for researchers and how they publish, it’s also a problem of how we incentivise researchers to publish. So, this is my goats in trees slide… It doesn’t feel like goats should be in trees… Moroccan tree goats are taught to climb the trees when there isn’t food on the ground… I think of the researchers able to publish in these high end journals as being the lucky goats in the tree here…
In order to incentivise participation in high end journals we have created a lucrative publishing industry. I’m sure you’ve seen the recent Guardian article: “is the staggeringly profitable business of science publishing bad for science”. Yes. For those reasons of access and participation. We see very few publishers publishing the majority of titles, and there is a real
My colleague Leslie Chan, funded by the International Development Council, talked about openness not just being about gaining access to knowledge but also about having access to participate in the system.
On the positive side… Open access has arrived. A recent study (Piwowar et al 2017) found that about 45% of articles published in 2015 were open access. And that is increasing every year. And you have probably seen the May 27th 2016 statement from the EU that all research they fund must be open by 2020.
It hasn’t been a totally smooth transition… APCs (Article Processing Charges) are very much in the mix and part of the picture… Some publishers are trying to slow the growth of access, but they can see that it’s coming and want to retain their profit margins. And they want to move to all APCs. There is discussion here… There is a project called OA2020 which wants to flip from subscription based to open access publishing. It has some traction but there are concerns here, particularly about sustainability of scholarly comms in the long term. And we are not sure that publishers will go for it… Particularly one of them (Elsevier) which exited talks in The Netherlands and Germany. In Germany the tap was turned off for a while for Elsevier – and there wasn’t a big uproar from the community! But the tap has been turned back on…
So, what will the future be around open access? If you look across APCs and the average value… If you think about the relative value of journals, especially the value of high end journals… I don’t think we’ll see smaller increases in APCs in the future.
At COAR we have a different vision…
Lorcan Dempsey talked about the idea of the “inside out” library. Similarly a new MIT Future of Libraries Report – published by a broad stakeholder group that had spent 6 months working on a vision – came up with the need for libraries to be an open, trusted, durable, interdisciplinary, interoperable content platform. So, like the inside out library, it’s about collecting the output of your organisation and making it available to the world…
So, for me, if we collect articles… We just perpetuate the system and we are not in a position to change the system. So how do we move forward at the same time as being kind of reliant on that system.
Eloy Rodrigues, at Open Repository earlier this year, asked whether repositories are a success story. They are ubiquitous, they are adopted and networked… But then they are also using old, pre-web technologies; mostly passive recipients; limited interoperability making value added systems hard; and not really embedded in researcher workflows. These are the kinds of challenges we need to address in next generation of repositories…
So we started a working group on Next Generation Repositories to define new technologies for repositories. We want to position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication. And on top of which we want to be able to add layers of value added services. Our principles include distributed control to guard against failure, change, etc. We want this to be inclusive, and reflecting the needs of the research communities in the global south. We want intelligent openness – we know not everything can be open.
We also have some design assumptions, with a focus on the resources themselves, not just associated metadata. We want to be pragmatic, and make use of technologies we have…
To date we have identified major use cases and user stories, and shared those. We determined functionality and behaviours; and a conceptual model. At the moment we are defining specific technologies and architectures. We will publish recommendations in September 2017. We then need to promote it widely and encourage adoption and implementation, as well as the upgrade of repositories around the world (a big challenge).
You can view our user stories online. But I’d like to talk about a few of these… We would like to enable peer review on top of repositories… To slowly incrementally replace what researchers do. That’s not building peer review in repositories, but as a layer on top. We also want some social functionalities like recommendations. And we’d like standard usage metrics across the world to understand what is used and how… We are looking to the UK and the IRUS project there as that has already been looked at here. We also need to address discovery… Right now we use metadata, rather than indexing full text content… So content can be hard to get to unless the metadata is obvious. We also need data syncing so that hubs, indexing systems, etc. reflect changes in the repositories. And we also want to address preservation – that’s a really important role that we should do well, and it’s something that can set us apart from the publishers – preservation is not part of their business model.
So, this is a slide from Petr Knoth at CORE – a repository aggregator – who talks about expanding the repository, and the potential to layer all of these additional services on top.
To make this happen we need to improve the functionality of repositories: to be of and not just on the web. But we also need to step out of the article paradigm… The whole system is set up around the article, but we need to think beyond that, deposit other content, and ensure those research outputs are appropriately recognised.
So, we have our (draft) conceptual model… It isn’t around siloed individual repositories, but around a whole network. And some of our draft recommendations for technologies for next generation repositories. These are a really early view… These are things like: ResourceSync; Signposting; Messaging protocols; Message queue; IIIF presentation API; OAuth; Webmention; and more…
Critical to the widespread adoption of this process is the widespread adoption of the behaviours and functionalities for next generation repositories. It won’t be a success if only one software or approach takes these on. So I’d like to quote a Scottish industrialist, Andrew Carnegie: “strength is derived from unity…. “. So we need to coalesce around a common vision.
And it isn’t just about a common vision, science is global and networked and our approach has to reflect and connect with that. Repositories need to balance a dual mission to (1) showcase and provide access to institutional research and (2) be nodes in a global research network.
To support better networking between repositories, in May in Venice we signed an International Accord for Repository Networks, with networks from Australasia, Canada, China, Europe, Japan, Latin America, South Africa, United States. For us there is a question about how best we work with the UK internationally. We work with OpenAIRE but maybe we need something else as well. The networks across those areas are advancing at different paces, but have committed to move forward.
There are three areas of that international accord:
- Strategic coordination – to have a shared vision and a stronger voice for the repository community
- Interoperability and common “behaviours” for repositories – supporting the development of value added services
- Data exchange and cross regional harvesting – to ensure redundancy and preservation. This has started but there is a lot to do here still, especially as we move to harvesting full text, not just metadata. And there is interest in redundancy for preservation reasons.
So we need to develop the case for a distributed community-managed infrastructure, that will better support the needs of diverse regions, disciplines and languages. Redundancy will safeguard against failure. With less risk of commercial buy out. Places the library at the centre… But… I appreciate it is much harder to sell a distributed system… We need branding that really attracts researchers to take part and engage in the system…
And one of the things we want to avoid… Yesterday it was announced that Elsevier has acquired bepress. bepress is mainly used in the US and there will be much thinking about the implications for their repositories. So not only should institutional repositories be distributed, but they should be different platforms, and different open source platforms…
Concluding thoughts here… Repositories are a technology and technologies change. What it’s really promoting is a vision in which institutions, universities and their libraries are the foundational nodes in a global scholarly communication system. This is really the future of libraries in the scholarly communication community. This is what libraries should be doing. This is what our values represent.
And this is urgent. We see Elsevier consolidating, buying platforms, trying to control publishers and the research cycle, we really have to move forward and move quickly. I hope the UK will remain engaged with this. And I look forward to your participation in our ongoing dialogue.
Q1 – Les Carr) I was very struck by that comment about the need to balance the local and the global. I think that’s a really major opportunity for my university. Everyone is obsessed about their place in the global university ranking, their representation as a global university. This could be a real opportunity, led by our libraries and knowledge assets, and I’m really excited about that!
A1) I think the challenge around that is trying to support common values… If you are competing with other institutions it’s not always an incentive to adopt systems with common technologies, measures, approaches. So there needs to be a benefit for institutions in joining this network. It is a huge opportunity, but we have to show the value of joining that network. It’s maybe easier in the UK, Europe, Canada. In the US they don’t see that value as much… They are not used to collaborating in this way and have been one of the hardest regions to bring onboard.
Q2 – Adam Field) Correct me if I’m wrong… You are talking about a Commons… In some way the benefits are watered down as part of the Commons, so how do we pay for this system, how do we make this benefit the organisation?
A2) That’s where I see that challenge of the benefit. There has to be value… That’s where value added systems come in… So a recommender system is much more valuable if it crosses all of the repositories… That is a benefit and allows you to access more material and for more people to access yours. I know CORE at the OU are already building a recommender system in their own aggregated platform.
Q3 – Anna Clements) At the sharp end this is not a problem for libraries, but a problem for academia… If we are seen as librarians doing things to or for academics that won’t have as much traction… How do we engage academia…
A3) There are researchers keen to move to open access… But it’s hard to represent what we want to do at a global level when many researchers are focused on that one journal or area and making that open access… I’m not sure what the elevator pitch should be here. I think if we can get to that usage statistics data there, that will help… If we can build an alternative system that even research administrators can use in place of impact factor or Web of Science, that might move us forward in terms of showing this approach has value. Administrators are still stuck in having to evaluate the quality of research based on journals and impact factors. This stuff won’t happen in a day. But having standardised measures across repositories will help.
So, one thing we’ve done in Canada with the U15 (top 15 universities in Canada)… They are at the top of what they can do in terms of the cost of scholarly journals so they asked us to produce a paper for them on how to address that… I think that issue of cost could be an opportunity…
Q4) I’m an academic and we are looking for services that make our life better… Here at Edinburgh we can see that libraries are naturally the consistent point of connection with the repository. Does that translate globally?
A4) It varies globally. Libraries are fairly well recognised in Western countries. In the developing world there are funding and capacity challenges that make that harder… There is also a question of whether we need repositories for every library… Can we do more consortia repositories or similar.
Q5 – Chris) You talked about repository supporting all kinds of materials… And how they can “wag the dog” of the article
A5) I think with research data there is so much momentum there around making data available… But I don’t know how well we are set up with research data management to ensure data can be found and reused. We need to improve the technology in repositories. And we need more resources too…
Q6) Can we do more to encourage academics, researchers, students to reuse data and content as part of their practice?
A6) I think the more content we have at Commons level, the more it can be reused. We have to improve discoverability, and improve the functionality to help that content to be reused… There is huge machine reuse of content – I was speaking with Petr Knoth about this – but that isn’t easy to do with repositories…
Theo) It would be really useful to see Open Access buttons more visible, using repositories for document delivery, etc.
Chris Banks, Director of Library Services, Imperial College – Focusing upstream: supporting scholarly communication by academics
Gavin MacLachlan: I’d just like to welcome you again to Edinburgh, our beautiful city and our always lovely weather (note for remote followers: it’s dreich and raining!). I’m here to introduce Chris, whose work with LIBER and LERU will be well known to you.
Chris: This is my first fringe and I find it quite terrifying that I’m second up! Now, I’m going to go right back to basics and policy…
The Finch report in 2012 and Research Councils UK: we had RCUK policy; funding available for immediate Gold OA (including hybrid); embargo limits apply where Green OA chosen. Nevertheless the transition across the world is likely to take a number of years. For my money we’ve moved well on repositories, partly as the UK has gone it alone in terms of funding that transition process.
In terms of REF we had the Funding Council REF policy (2013) which is applicable to all outputs that are to be submitted to the post 2014 REF exercise – effectively covers all researchers. No additional funding available. Where Green OA selected, requirement for use of repositories. There were also two paragraphs (15 and 26) shaping what we have been doing…
That institutions are encouraged to go beyond the minimum (and will receive credit for doing so) – and the visibility of that is where we see the rise of University presses. And the statement that repositories do not need to be accessible for reuse and text mining, but that, again, there will be credit for those that are. Those two paragraphs have driven what we’ve been doing at Imperial.
At the moment UK researchers face the “policy stack” challenge. There are many funder policies; the REF policy differs substantially from other policies and applies to all UK research academics – you can comply with RCUK policy and fall foul of REF; many publisher policies…
So how can the REF policy help? Institutions recognise IP, copyright and open access policies are not necessarily supporting funder compliance – something needs to be done. There is a variety of approaches to academic IP observed in UK institutions. Legally in the UK the employer is the first copyright holder… subject to any other agreements and unless the individual is a contractor etc.
Publishers have varying approaches to copyright, from licence to first publish, to outright copyright transfer. Licences are not read by academics. It’s not just in publishing… It’s social media… It’s a big problem.
For the library we want to create frictionless services. We need to upscale services to all researchers – REF policy requirements. We can’t easily give an answer to researchers on their OA options. So we started our work at Imperial to address this, and to ensure our own organisational policy aligned with funder policies. We also wanted to preserve academic choice over publishing, and ability to sign away rights when necessary (though encouraging scrutiny of licences). We have a desire to maximise impact of publication. And there is a desire to retain some re-use rights for us in teaching etc, including rights to diagrams etc.
The options we explored with academics ranged from doing as we do at the moment – with academics signing over copyright – through to the institution claiming all copyright on all academic outputs. And we wanted to look at two existing models in between, the SPARC model (academic signs copyright over to publisher but licenses rights back); and the Harvard model – which we selected.
The Harvard model is implemented as part of the university OA policy. Academics deposit Author Accepted Manuscripts (AAMs) and grant a non-exclusive licence to the university for all journal articles. It is a well established policy and has been in use (elsewhere) since 2008. Where a journal seeks a waiver that can be managed by exception. And this is well tested in Ivy League colleges but also much more widely, including universities in Kenya.
The benefits here are that academia retains rights, authors have the right to make articles open access – open access articles have higher citations than closed ones. Authors can continue to publish in their journal of choice irrespective of whether it allows open access or not. Single means by which authors can comply with green open access policies. We are minimising reliance on hybrid open access – reducing “double dipping”, paying twice through subscriptions and APC – a complex and costly process. I think we and publishers see money for hybrid OA models drying up in the future, as the UK has pretty much been the one place doing that. Instead funding is typically used for pure gold OA models and publications.
We have made some changes to the Harvard model policy to make it work in the context of UK law, also to ensure it facilitated funder deposit compliance and REF eligibility. The next step here is that 60 institutions overall are interested and we have a first mover group of around 12 institutions. We are discussing with publishers. And we have had wider engagement with the researcher, library, research office and legal office communities. We have a website and advocacy materials under development. We are also drafting boilerplate texts for authors, collaboration agreements etc. especially for large international collaborative projects. We have a steering committee established and that includes representatives from across institutions, and including a publisher.
At the moment we are addressing some publisher concerns and perceptions. Publishers are taking a very particular approach to us. We have a real range of responses. Some are very positive – including the pure gold (e.g. PLoS) and also learned society (e.g. Royal Society). Other publishers have raised concerns and are in touch with the steering group, and with ALPSP.
Summary of current concerns:
- that it goes beyond requirements of Finch. We have stated that UK-SCL is to support REF and other
- AAMs will be made available on publication. Response: yes, as per Harvard model around since 2008
- Administrative burden on UK authors/institutions as publishers would have to ask for waivers in 70-80% of cases. We have responded that in the experience of others using the Harvard model it has been less than 5% and we can’t see why UK authors would be treated differently.
- They noted that only 8% of material submitted to the REF were green OA compliant. We have noted that only 8% submitted were green OA, not 8% of all eligible for submission.
Researchers have also raised concerns
- the need to seek agreement from co-authors, especially in collaborations. Can be addressed through a phased/gradual implementation
- Fear that a publisher will refuse to publish. Institutions using Harvard model report no instances of this happening
- Learned Societies – fear loss of income. No reliable research evidence to back up this fear.
- Don’t like the CC-BY-ND Licence. That is to comply with RCUK but warrants further discussion.
Our next step is a further meeting with PA/ALPSP to take place during the summer. We have encouraged proposals to deliver more than simply minimum REF eligibility which would resolve current funder/publisher policy stack complexity. We will finalise the website, waiver system, advocacy materials and boilerplate texts. To gain agreement on early mover institutions and on the date of first adoption. And to notify publishers.
Another bit of late breaking news… Publishers recently went to HEFCE to ask about policy statements and, as a result of that, HEFCE will be clarifying that it is pushing for minimum compliance and encouraging more than that. One concern of the REF policy had been that only material submitted to the REF would have been deposited…
Last time my institution submitted 5k items, more than half were not monographs. We submitted 95% of our researchers. Out of that four items were looked at, now would be 2. And from that our funding is decided. And you can see, from that, why that bigger encouragement for the open scholarly ecosystem is so important.
I wanted to close by sharing some useful further materials and to credit others who have been part of this work.
One important thing to note is that we are trying to help researchers and universities to comply as policies from funders and publishers evolve. I would like to see that result in discussion with publishers, and a move to all gold OA… The AAM is not the goal here, it’s the published article. Now that could see the end of repositories – something I am cautious of raising with this audience. Now in the
Q1) The elephant in the room here is Sci Hub… They are making 95% of published content available for free. You have AAMs out there… And we haven’t seen subscriptions drop.
A1) So our initiative is about legal sharing. And also need to note that the UK is just one scholarly community. And others have not moved towards mandates and funding. I think it is a shame that the fights that have been picked are with institutions, when we have that elephant in the room…
Q2) Congratulations on the complex and intricate discussions you have been holding… Almost a legal game of Twister, where all the participants hate each other! This is a particular negotiation at the end of a process, at the end of the scholarly publishing chain. How might you like your experience to feed into training of researchers and their own understanding of copyright, ownership of their own outputs?
A2) The challenge that we observe is that we have many younger researchers and authors who are very passionate and ethically minded about openness. They are under pressure from supervisors who say they will not get a tenured position if they don’t have a “good” journal on their CV. And they are frustrated by the slow movement on the San Francisco Declaration on Research Assessment. Right now the quality journals remain those subscription high impact journals. But we have research showing the higher use of open access journals. But we still have that debate within academe that is slowing down that environment. But training researchers about their IP and what copyright means is part of that. I also think it is interesting that Sir Mark Walport is in charge of UKRI as he has written before about the evolving scholarly record, and the scattering of articles and outputs, instead building online around research projects. He gave a talk at LIBER in 2015, and wrote an article for THE. He was also at Wellcome when they first introduced their mandate so I think we really do have someone who understands that complexity and the importance of openness.
10×10 presentations (Chair: Ianthe Sutherland, University Library & Collections)
- v2.juliet – A Model For SHERPA’s Mid-Term Infrastructure. Adam Field, Jisc
I’m here from SHERPA HQ at Jisc! I’m going to go back to 2006… We saw YouTube celebrating its first year… Eight out of Ten Cats began… The Nintendo Wii appeared… And… SHERPA/JULIET was launched (SHERPA having been around since 2001). So, when we set up SHERPA REF as a beta service in 2016 we had to build something new, as JULIET hadn’t been set up for APIs and interoperability in that kind of way.
So, we set about a new SHERPA/JULIET based around a pragmatic, functional data model; to move data into a platform; to rebrand to Jisc norms; a like-for-like replacement; and a precedent for our other services as we update them all…
So, a quick demo… We now have the list of funders – as before – including an overview of open access. So if we choose Cancer Research UK… You can see the full metadata record, headings for more information. Can see which groups it is part of… We have a nice open API where you can retrieve information.
So, whilst it was a like for like rebuild we have snuck in new features, including FundRef DOIs – added automatically where possible, will be added to with manual input too. More flexible browsing. And a JSON API – really easy to work with. And in the future we’d like funders able to add to their own records and other useful 3rd party editorial features. We want to integrate ElasticSearch. And we want to add microservices…
In terms of our process here… The hard part was analysing the existing data, structuring it into a more appropriate shape… the next part was much easier… We configured EPrints, imported data, and added some bespoke service requirements.
Right now we have a beta of SHERPA/JULIET. All take a look please! We are now working on OpenDOAR. And then SHERPA/ROMEO is expected to be in early 2018.
We now want your feedback! Email firstname.lastname@example.org with your comments and feedback. We’ll have feedback sessions later today that you can join us for and share your thoughts, ask questions about the API. And myself and Tom Davey our user interface person, are here all day – come find us!
- CORE Recommender: a plug in suggesting open access content. Nancy Pontika, CORE
I want to talk about discoverability of content in repositories… Salo 2008, Konkiel 2012 and Acharya 2017 talk about the challenges of discoverability in repositories. So, what is needed? Well, we need recommender systems in repositories so that we can increase the number of incoming links to relevant resources…
For those of you new to repositories, CORE is an aggregation service, we are global and focused, and we have started harvesting gold OA journals… We have services at various levels, including for text mining and data science. We have a huge collection of 8 million full text articles, 77 million metadata records… They are all in one place… So we can build a good recommendation system.
What effect can we have? Well it will increase the accessibility meaning more engagement, higher Click-Through Rate (CTR); people access resources on CORE via its recommender system twice as often as via search. And that additional engagement increases the time spent in your repositories – which is good for you. And you can open another way to find research…
For instance you can see within White Rose Research Online that suggested articles are presented that come from all of the collections of CORE, including basic geographic information… We would like crowd sourced feedback here. The more users that engage in feedback, the more the recommender will improve. We also get feedback from our community. At the moment the first tab is CORE recommendations, the second tab is institutional recommendations. We’ve had feedback that institutions would prefer it the other way… We have heard that… Although we note that CORE recommendations are better as it’s a bigger data set…. We want to make sure the institutional tab appears first unless there are few recommendations/poor matches… We are working on this…
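The tab-ordering rule described there (institutional first, unless there are few recommendations or poor matches) could be sketched roughly as below. This is only my illustration of the stated intent, not CORE’s code: the function name `order_tabs`, the `(title, score)` shape of a recommendation, and both thresholds are hypothetical.

```python
def order_tabs(institutional_recs, min_count=3, min_score=0.2):
    """Decide which recommendation tab to show first.

    institutional_recs: list of (title, score) pairs, score in [0, 1].
    min_count and min_score are hypothetical thresholds for
    "few recommendations" and "poor matches" respectively.
    """
    # Count only the institutional matches that are good enough.
    good_matches = [score for _, score in institutional_recs if score >= min_score]

    if len(good_matches) >= min_count:
        return ["institutional", "core"]
    # Too few or too weak: fall back to the larger CORE-wide set first.
    return ["core", "institutional"]


print(order_tabs([("Paper A", 0.8), ("Paper B", 0.6), ("Paper C", 0.4)]))
print(order_tabs([("Paper A", 0.8)]))
```

The first call has three strong institutional matches so the institutional tab leads; the second has only one, so the CORE tab leads.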
CORE Recommender has been installed at St Mary’s; LSHTM; the OU; University of York; University of Sheffield; York St John; Strathclyde University… and others with more to follow.
How does it work? Currently it’s an article-to-article recommender system. There is preprocessing to make this possible. What is unique is that recommendations are based on full text, and the full text is open access.
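The talk did not go into how CORE computes its recommendations, so the following is only a minimal sketch of one common way to build an article-to-article recommender over full text: TF-IDF weighting with cosine similarity. The technique, the function names and the sample documents are all assumptions for illustration, not CORE’s actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for doc_id, term_counts in counts.items():
        vectors[doc_id] = {
            # Smoothed inverse document frequency, so no weight is zero.
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in term_counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(doc_id, docs, k=3):
    """Return up to k (other_id, score) pairs most similar to doc_id."""
    vectors = tfidf_vectors(docs)
    target = vectors[doc_id]
    scored = [
        (other_id, cosine(target, vector))
        for other_id, vector in vectors.items()
        if other_id != doc_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical full texts, stand-ins for harvested open access articles.
    sample = {
        "a": "Tool use in New Caledonian crows",
        "b": "Collaborative tool use by crows and other birds",
        "c": "Monetary policy after the financial crisis",
    }
    print(recommend("a", sample, k=2))
```

A real service working across millions of full texts would need an index rather than pairwise comparison, but the sketch shows why open full text matters: the similarity is computed from the words of the articles themselves, not only from their metadata.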
What is the CORE recommender not? It is not always right – but which recommendation system is? And it does not compare the “quality” of the recommended articles with the “quality” of the initial paper…
- Enhancing Two workflows with RSpace & Figshare: Active Data to Archival Data and Research to Publication. Rory Macneil, Research Space and Megan Hardeman of Figshare
Rory: Most of the discussion so far has been on publications, but we are talking about data. I think it’s fair to say that FigShare, in the data field, and RSpace, in the lab notebooks world, have been totally fixated on interoperability!
Right now most data does not make it into repositories… Some shouldn’t be, but even the data that should be shared is not. One way to increase deposit is to make it easy to deposit data. Integrating with RSpace notebooks allows easy and quick deposit.
So, in RSpace you can capture metadata of various types. There are lots of ways to organise the data… And to use that you just need to activate the FigShare plugin. Then you select the data to deposit – exporting one or many documents… You select what you want to deposit, and the format to deposit in. You can export all of your work, or all of your lab’s work – whatever level of granularity you want to share… You deposit to Figshare… And over to Megan!
Megan: Figshare is a repository where users can make all of their research outputs available in citable, accessible ways (all marked up for Google Scholar). You can upload any file type (we support over 1000 types); we assign a DOI at item level; we store items in perpetuity (backed up in DPN); you can track usage stats and Altmetrics (more exposure); and you can collaborate with researchers inside and outside your institution.
Figshare has an open API and integrations with RSpace and other organizations and tools…
For an example… You can see an electronic lab notebook from RSpace which can be browsed and explored in the browser!
- Thesis digitisation project. Gavin Willshaw, University of Edinburgh
I’m digital curator here, and manager of the PhD digitisation project. This project sees a huge amount of content going into ERA, our repository. In the last three years we’ve moved from having two photographers to having two teams of photographers and cataloguers across two sites – we are investing heavily.
We have 17,000 PhD theses and they will all be online by the end of 2018. This will provide global access to the entire PhD collection. We have obtained some equipment. We are creating metadata records, and also undertaking some preservation work where required.
The collection is largely standardised… But we have some Latin and handwritten theses. We have awkward objects – like slices of lungs!
For 10k theses we have duplicates and they are scanned destructively. 3,000 unique theses are scanned non-destructively in house. And 4,000 unique theses are outsourced. All are OCRed. And they are all catalogued, with data protection checks made before determining what can be shared online in full and which cannot.
In terms of copyright and licensing, that is still with the author. We have contacted some and had positive feedback. It’s a risk but a low risk. In any case we can’t assert the copyright or change licences on our own. And we already have over 2500 theses live.
And these theses are not just text… We have images that are rare and unusual. We share some of these highlights in our blog: https://libraryblogs.is.ed.ac.uk/phddigitisation/ and we use, on Twitter, the hashtag #UoEPhD. We have some notable theses… Alexander McCall Smith’s PhD is there; Isabel Emslie Hutton, a doctor in the first world war in the Balkans – so noted she was on a stamp in Serbia last year; Helen Pankhurst; and of course members of staff from the university too!
Impact wise the theses on ERA have been downloaded 2 million times since 2012. Those digitised in the project are seeing around 3000 downloads per month. Oddly our most popular thesis right now is on the differentiation of people in Norwich. We are also looking at what else we can do… Linking theses to Wikipedia; adding a thesis to Wikisource (and getting 10x the views); and now looking at what else… text and data mining etc.
- ‘Weather Cloudy & Cool Harvest Begun’: St Andrews output usage beyond the repository. Michael Bryce, University of St Andrews
I didn’t expect it to actually be cloudy today…!
Our repository has been going since 2006, and use has been growing steadily…
Some of the highlights from our repository have included research on New Caledonian crows and collaborative tool use. We also have farming diaries in our repository under a Creative Commons license… Pushing that out into the community in blog posts and posters… So going beyond traditional publications and use. Our material on Syria has seen significant usage driven partly by use in OJS journals.
Our repository isn’t currently OpenAIRE compliant, but we have some content shared that way, which means a bigger audience… For instance material on virtual learning environments associated with a big EU project.
We’ve also been engaging in publishing engagement. The BBC asked us to digitise a thesis at the time of broadcasting Coast which added that work to our repository.
When we reached our 10,000th item we had cake! And helped publicise the student and their work to a wider audience…
Impact and the REF panel session
Brief for this session: How are institutions preparing for the next round of the Research Excellence Framework #REF2021, and how do repositories feature in this? What lessons can we learn from the last REF and what changes to impact might we expect in 2021? How can we improve our repositories and associated services to support researchers to achieve and measure impact with a view to the REF? In anticipation of the forthcoming announcement by HEFCE later this year of the details of how #REF2021 will work, and how impact will be measured, our panel will discuss all these issues and answer questions from RepoFringers.
Chair: Keith McDonald (KM), Assistant Director, Research and Innovation Directorate, Scottish Funding Council
The panel here include Pauline Jones, REF Manager at University of Edinburgh, and a veteran of the two previous REFs – she was at Napier University in 2008, and was working at the SFC (where I work) for the previous REF and was involved in the introduction of Impact.
Catriona Firth (CF), REF Deputy Manager, HEFCE
I used to work in universities, now I am a poacher-turned-gamekeeper I suppose!
Today I want to talk about Impact in REF 2014. Impact was introduced and assessed for the first time in REF 2014. After extensive consultation Impact was defined in an inclusive way. So, for REF 2014, impact was assessed in four-page case studies describing impacts that had occurred between January 2008 and July 2013. The submitting university must have produced high quality research since 1993 that contributed to the impacts. Each submitting unit (usually subject area) returned one case study, plus an additional case study for every 10 staff.
At the end of the REF 2014 we had 6,975 case studies submitted. On average across submissions 44% of impacts were judged outstanding (4*) by over 250 external users of research, working jointly with the academic panel. There was global spread of impact, and those impacts were across a wealth of areas of life: policy, performance and creative practice, etc. There was, for instance, a case study of drama and performance that had an impact on nuclear technology. The HEFCE report on impact is highly recommended reading.
In November 2015 Lord Nicholas Stern was commissioned by the Minister of Universities and Science to conduct an independent review of the REF. He found that the exercise was excellent, and had achieved what was desired. However there were recommendations for improvement:
- lowering the burden on institutions
- less game-playing and use of loopholes
- less personalisation, more institutionally focused – to take pressure off institutions but also recognise and reward institutional investment in research
- recognition for investment
- more rounded view of research activity – again avoiding distortion
- interdisciplinary emphasis – some work could
- broaden impact – and find ways to capture, reward, and promote the ways UK research has a benefit on and impacts society.
If you go to the HEFCE website you’ll see a video of a webinar on the Stern Review and specifically on staff and outputs, including that all research active staff should be included, that outputs be determined at assessment level, and that outputs should not be portable.
In terms of impact there was keenness to broaden and deepen the definition of impact and provide additional guidance. Policy was a safer kind of case study before. The Stern Review emphasised a need for more focus on public engagement and impact on curricula and/or pedagogy. It suggested reducing the number of required case studies to a minimum of one, and including impact arising from research, research activity, or a “body of work”. It also suggested a quality threshold for underpinning research based on rigour – not just originality – and the opportunity to resubmit case studies if the impact was ongoing.
We have been receiving feedback – over 400 responses – which are being summarised. That feedback includes positive feedback on broadening impact and on aligning definitions of impact and of public engagement across funding bodies. There were some concerns about a sub-profile based on one case study – especially in small departments. And in those cases you’d know exactly whose work and case study was 4* (or not). There have been concerns about how you separate rigour from originality and significance. There was a lot of support for a broader basis of research, but challenges in drawing boundaries in practice – in terms of timing and how far back you go… For scholarly career assessment do you go back further? And there was broad support for resubmission of 2014 case studies but questions about “additionality” – could it be the same sort of impact or did it need to be something new or additional? So, we are working on those questions at the moment.
The other suggestion from the Stern Review was the idea of an institutional level assessment of impact, giving universities opportunities to show case studies that didn’t fall neatly elsewhere. These would be case studies arising from multi- and interdisciplinary and collaborative work, and they should be 10-20% of total impact case studies, with a minimum of one. But feedback has been unclear here, particularly the conflation of interdisciplinary research with institutional profiles. There is concern also that the University might take over a case study that would otherwise sit in another unit.
So, the next step is communications in summer/autumn 2017. There will be a REF initial decisions document. A summary of consultation responses. And there will be sharing of full consultation responses (with permission). And there will be a launch for our REF 2021 website and Twitter account.
Anne-Sofie Laegran (ASL), Knowledge Exchange Manager, College of Arts, Humanities and Social Sciences, University of Edinburgh
KM: Is resubmission better for some areas than others?
ASL: I think it depends on what you mean by resubmission… We have some good case studies arising from the same research as in 2014, but they are different impacts.
So… I will give you a view from the trenches. To start I draw your attention to the University strapline that we have been “Influencing the world since 1583”. But we have to demonstrate and evidence that of course.
There has been impact of impact in academia… When I started in 2008 it was about having conversations about the importance of having an impact, and now it is much more about how you do this. There has been a culture change – all academic staff must consider the potential impact of research. The challenge is not only to create impact but also to demonstrate impact. There is also an incentive to show impact – it is part of career progression, it is part of recruitment, and it is part of promotion.
Impact of impact in academia has also been about training – how to develop pathways as well as how to capture and evidence impact. And there has been more support – expert staff as well as funding from funders and from the university.
In terms of developing pathways to impact we have borrowed questions that funders ask:
- who may benefit from your research?
- what might the benefits be?
- what can you do to ensure potential beneficiaries and decision makers have the opportunity to engage and benefit?
And it is also – especially when capturing impact – about developing new skills and networks.
For instance… If you want to impact the NHS, who makes decisions, makes changes… If you are working with museums and galleries the decision makers will vary depending on where you can find that influence. And, for instance, you rarely partner with the Scottish Government, but you may influence NGOs who then influence Scottish Government.
Whatever the impact, it starts from excellent research, which leads to knowledge exchange – public engagement, influencing policy, informing professional practice and service delivery, technology transfer – and that results in impact. You don’t “do” impact; your work is used and has influence, which then effects a change and an impact.
REF impact challenges include demonstrating change/benefit as opposed to reporting engagement activity. Attributing that change to research. And providing robust evidence. In 2014 that was particularly tricky as the guidance was in 2012 so people had to dig back… That should be less of an issue now, we’ve been collecting evidence along the way…
Some cases that we think did well, and/or where we had feedback that they did well:
- College of art scholar, who has a dual appointment at the National Galleries of Scotland. She curated the Impressionism Scotland show with over 100k visitors. There was good feedback that also generated debate. It changed how the gallery curates shows. And on the market the works displayed went up in value – it had a real economic impact.
- In law two researchers have been undertaking longitudinal work on young people, their lives, careers, and criminal careers. That is funded by Scottish Government. That research led to a new piece of policy based on the findings of that research. And there was a quote from Scottish Government showing a decline in youth crime, attributing that to the policy change, and which was based on research – showing that clear line of impact.
- In sociology, a researcher wrote about the impact of research on the financial crisis for the London Review of Books, it was well received and he was named one of the most influential thinkers on the crisis; his work was translated to French; it was picked up in parliament; and Stephanie Flanders – then BBC economics editor – tweeted that this work was the most important on the financial crisis.
- In music, researchers developed the Skoog, an instrument for disabled students to engage in music. They set up a company, they had investment. At the time of the REF they had 6 employees, they were selling to organisations – so reaching many people. And in the Cultural Olympiad during the Olympics in 2012 they were also used, showing that wider impact.
So for each of these you can see there was both activity, and impact here.
In terms of RepoFringe areas I was asked to talk about the role of repositories and open access. It is potentially important. But typically we don’t see impact coming from the scholarly publication, it’s usually the activities coming from the research or from that publication. Making work open access certainly isn’t enough to just trigger impact.
Social media can be important but it needs to have a high level of engagement, reach and/or significance to demonstrate more than mere dissemination. That Stephanie Flanders example wouldn’t be enough on its own; it works as part of demonstrating another impact, and as a good way to track impact, to understand your audience… And to follow up and see what happened next…
Metrics – there is no doubt that numeric evidence was important. Our head of research said last time “numbers speak louder than adjectives” but they have to be relevant and useful. You need context. Standardised metrics/Altmetrics don’t work – a big report recently concluded the same. Altmetrics are alternative metrics that can be tracked online, using DOIs. A company called Altmetric gathers that data, which can be useful to track… And it can be manipulated by friends with big Twitter followings… It won’t replace case studies, but may be useful for tracking…
In terms of importance of impact… It relates to 20% of REF score; determined 26% of the funding in Scotland. Funding attracted per annum for the next 7 years:
- 4* case study brings in £45-120k
- 3* £9-25k
- 2* £0
- 4* output, for comparison, is worth £7-15k…
The question that does come up is “what is impact” and yes, a single Tweet could be impact that someone has read and engaged with your work… But those big impact case studies are about making a real change and a range of impacts.
Pauline Jones (PJ), REF Manager and Head of Strategic Performance and Research Policy, University of Edinburgh
Thank you to Catriona and Anne-Sofie for introducing impact. I wanted to reinforce the idea that this is what we are doing anyway, making an impact on society, so it is important anyway, not only because of the REF.
Catriona suggested we had a “year off” but actually once REF happened we went into an intense period of evaluation and reflection, then of course the Stern review, consultation, general election… It has been quite non-stop. But actually even if that wasn’t all going on, we’d need our academics to be aware of the REF and of open access. I think open access is incredibly important, people are looking for it… Research is publicly funded… But it has required a lot of work to get up and running.
Although we are roughly at mid point between REFs, we are up and running, gathering impact, preparing to emphasise our impact. In terms of collecting evidence, depositing papers… That will happen in most universities. I think many will be doing the sort of Mock REFs/REF readiness exercises that we have been undertaking. We are also already thinking about drafting our case studies. As we get nearer to submission we’ll take decisions on inclusion… and getting everything ready.
So for REF 2021 we have a long time period over which submission is prepared. There is no period over which outputs, impacts, environment don’t count. Academics thinking now about what to include: 2017 REF readiness exercise to focus on open access and numbers; 2018 Mock REF to focus on quality. And we all have to have a CRIS system now to make that work.
What’s new here? We are still waiting for the draft to understand what’s happening. There are open access journal articles/conference proceedings. There are probably the challenges of submitting all research staff; decoupling the one-to-four staff-to-outputs ratio. That break is quite a good thing… Some researchers might struggle to create four key outputs – part time staff, staff with maternity leave, etc. But we want a sense of what that looks like from our mock/readiness work. That non-portability requirement seems useful and desirable, but speaking personally I think the researcher invests a lot – not just an institution – making that complex. Taking all those together I’m not sure the Stern idea of less complexity or burden holds here, not alongside those changes.
And then we have the institutional impact case studies – we had a number of interdisciplinary examples of work, so we are comfortable with that possibility. Institutional environment is largely shared so doing that once for the whole university could be a really helpful reduction in work load. And each new element will have implications for how CRIS systems support REF submissions.
And as we prepare for REF 2021 we also have to look to REF 2028. We think open data will be important given the Concordat on Open Research Data (signed by HEFCE; RCUK; Universities UK; Wellcome) so we can get ready now, ready for when that happens. I’m pretty confident that open access monographs will be part of the next REF (following the Monographs and Open Access HEFCE report). Then there is institutional impact – may not happen here but may be back. And then there are metrics. We have The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment and Management.
In terms of responsible metrics, we haven’t heard the last of them… There is the Forum for Responsible Metrics. Data and metrics should support decisions, not be the sole driver; but the conversation will not end with The Metric Tide. Metrics are alluring but to date they haven’t worked well versus other types of evidence.
So, how do we prepare?
- For REF 2021 we need to be agile, support research managers to help academics deposit work, and help lobby CRIS system designers to have fit-for-purpose systems.
- For REF 2028 we have to understand the benefits and challenges of making more research open
- Be part of the conversation on responsible metrics – any bibliometrics experts in the room will stay busy.
- And we want to have interoperability in systems…
Q1) How can we do something useful in terms of impact for case studies as our repository remit expands to different materials, different levels of openness, etc.
A1 – ASL) I think being easily accessible on University websites, making them findable… Then also perhaps improved search functionality, and some way to categorise what comes out… If creating things other than peer reviewed publications – what is this? type information. I might have been too negative about repositories because historically our data wasn’t in those… I think actually sciences find that even more important…
Q1) For collecting evidence?
A1 – ASL) Yes, for collecting… Some have metrics that help us see how those impacts have worked.
A1 – PJ) We’ve been talking about how best to use our CRIS to improve join up and understand those impacts…
A1 – CF) I think it’s also about getting that rounded view of the researcher – their outputs, publications, etc. being captured as impacts alongside the outputs… That could be useful and valuable…
Q2) A common theme was the burden of this exercise… But could be argued that it drives positive changes… How can the REF add to the sector?
A2 – CF) Wearing my personal and former job hat, as impact officer, I did see REF drive strategic investment in universities, including public engagement, that rewards, recognises, and encourages more engagement with the community. There is real sharing of knowledge brought about by impact and the REF.
A2 – ASL) Totally agree.
A2 – PJ) More broadly the REF and RAE… They recognise the importance of research and supporting researchers. For us we get £75M a year through the research excellence funding. And we see the quality of research publications going up…
Q3) Do you have any comments on the academic community and how that supports the REF, particularly around data.
A3 – PJ) At Edinburgh we are very big – we submitted 1800 staff, we could have submitted up to 2500. In my previous role we had much smaller numbers of research staff… So they are different challenges and different systems… We have spoken to our Informatics colleagues to see what we can do. There are definitely benefits at the level of building a system to manage this…
Q3) In an academic environment we have collegiate working practice, and need systems that work together.
A3 – PJ) We have a really distributed set up at Edinburgh, so we are constantly having that conversation, and looking for cross cutting areas, exchanging information…
Q4) the relationship with the researcher changes here… In previous years universities talked about “their research” but it was actually all structured around the individual. In this new model that shift is big, and the role and responsibility of the organisation, the ways that schools interact with their researcher…
A4 – ASL) You do see that in pre-funding application activity with internal peer review processes that build that collegiality within the organisation…
Q5) I was intrigued with the comment that lots of impact isn’t associated with outputs… So that raises questions about the importance of outputs in the REF. Should we rebalance the value of the output and how it is valued.
A5 – ASL) Perhaps. For example when colleagues are providing evidence to government and parliament it is rare for publications to be referenced, and rare for publications to be read… I don’t think those matter… But those include methodology, rigour, evidence of quality of work. But that then becomes briefing papers etc… Otherwise you and I could just make a paper – but that would be opinion. So you need that (hard to read) academic publication, and you have to acknowledge that those are different things and have different roles – and that has to be demonstrated in the case studies.
A5 – CF) I think it’s an interesting question, especially thinking ahead to REF 2021… We are considering how those impacts on the field and impact on wider society are represented – some blue skies research won’t have impact for many years to come…
Q6) I think lay summaries of a piece of work are so crucial. ScienceOpen and Jon Tennant are putting up lay summaries, you have Kudos and other things there contributing to that… The public want to understand what they are reading. I have personally sat on panels as a lay member and I know how hard it is, without that kind of lay summary, to understand what has taken place.
A6 – ASL) You do need that lay summary of work, or briefing paper, or expert communities which are not lay people… You have to think about audiences and communicating your work widely, and target it… I think repositories are useful to access work, but it’s not enough to put it there – just as it isn’t enough to put an article out there – you have to actively reach out to your audiences.
A6 – CF) I would agree and I would add that there is such a key need to help academics to do that, to support skills for writing lay summaries… Getting it clearer benefits the researcher, their thinking, and how they tell others about their work – that truly enables knowledge exchange.
A6 – PJ) And it benefits the academic audience too. I was listening to a podcast where academics from across disciplines looked at which papers were most valuable, and being readable to a lay audience was a key factor in how those papers did.
10×10 presentations (Chair: Ianthe Sutherland, University Library & Collections)
- National Open Data and Open Science Policies in Europe. Martin Donnelly, DCC
I’m talking about some work we’ve done at DCC with SPARC Europe looking at Open Data and Policies across Europe.
The DCC is a centre of expertise in digital curation and data management. We maintain a watching brief on funders’ research data policies (largely focused on the UK). SPARC Europe is a membership organisation comprising academic institutions, library consortia, funding bodies, research institutes and publishers. Their goal is advocating change in scholarly communications for the benefit of research and society. And we have been collaborating since 2016 looking at open data and open science policies across Europe.
So, what is a policy? Well the dictionary definition works, it’s a set of ideas or a plan of what to do in particular situations that has been agreed to officially by a group of people or an organisation.
In this work we looked at national policies – in some regions with a single research funder that could be the funder policy but, in the UK the AHRC wouldn’t count here as that is not a national policy across the whole country. And the last known analysis of this sort dates back to 2013 and much has changed in that time.
We began by compiling and briefly describing a list of national policies in the EU and some ERA states (IS, NO, CH). We circulated that list for comment and additions. We also sought intelligence from contacts from DCC European projects to ask about the status of national approaches, forthcoming or existing policies, etc. We then attempted to classify the policies.
Across the thirteen countries we found: 6 funder policies; 4 national plans or roadmaps; 2 concordat type documents; 2 laws; and one working paper. There are more than 13 there as there are some parallel documents. Identifying the lead, ranking or sponsoring organisation was not always straightforward; sometimes documents were co-signed by partners or groups. All of the policies discussed research data; 7 addressed open access publication explicitly; 6 addressed software, code, tools or models; 5 addressed methods, workflows or protocols; and one addressed physical (non-digital) samples.
Most policies were prescriptive or imperative. Monitoring of compliance and/or penalties are not that common. And these are new – only 2 policies pre-date 2014, but there are often preceding open access policies. And new policies keep appearing as a result of our work… And two policies have been translated to English specifically because of this work (Estonia, Cyprus). The EC’s Open Research Data Pilot for Horizon 2020 was cited in multiple policy documents. And we hope that Brexit won’t diminish our role or engagement in European open data policy.
- IIIF: you can keep your head while all around are losing theirs! Scott Renton, University of Edinburgh
IIIF is the International Image Interoperability Framework which enables you to use images in your cultural heritage resources. IIIF works through two APIs. You bring in images through the Image API through IIIF compliant URLs, which have long URLs that include the region of the image, instructions for display, etc. The other API is the Presentation API which is much more about curation, including the ability to curate collections of content – so you can structure these as, say, an image of a building that is related to images of the rooms in that building.
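As a rough illustration of those “long URLs”, the sketch below assembles an Image API request from its parts. The path order (identifier, region, size, rotation, quality, format) follows the published IIIF Image API; the server address and image identifier are made-up placeholders, and the helper name `iiif_image_url` is hypothetical. Note that `max` as a size value belongs to later versions of the API; version 2.0 servers use `full` instead.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # Identifiers must be percent-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return (
        f"{base.rstrip('/')}/{encoded_id}/"
        f"{region}/{size}/{rotation}/{quality}.{fmt}"
    )


if __name__ == "__main__":
    # Hypothetical server and identifier, for illustration only.
    base = "https://images.example.org/iiif"

    # The whole image at the largest size the server offers.
    print(iiif_image_url(base, "scroll-001"))

    # A 1024x768 pixel region from the top-left corner, scaled to 512 wide.
    print(iiif_image_url(base, "scroll-001", region="0,0,1024,768", size="512,"))

    # The full image rotated by 90 degrees, in greyscale PNG.
    print(iiif_image_url(base, "scroll-001", rotation="90", quality="gray", fmt="png"))
```

Because the region and display instructions live in the URL, a zoomable viewer can request just the tiles it needs from a single master image, which is what makes it possible to maintain one set of images rather than many derivatives.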
We have images in Luna and we pushed on Luna to support IIIF and we did get success there. We implemented IIIF in December. We made a lot of progress and have IIIF websites online. The workflows are really complex but it allows us to maintain one set of images and metadata through these embedded images, rather than having to copy and duplicate work. And those images are zoomable, draggable, etc. And Metadata Games is also IIIF compliant. And it is feeding into our websites including the new St Cecilia’s Hall museum website.
Our next implementation was the Coimbra virtual implementation – which includes other people’s images. For our images, and other IIIF compliant organisations that was easy, but we had to set up our own server (named Cantaloupe) to manage those images from others.
The next challenge was the Mahabharata Scroll. It is a huge document but the IIIF spec and Luna allow us to programme a sequence of viewers…
And our main achievement has been Polyanno that allows annotation that can then be stored in manifests, to upload and discuss annotations. It’s proving very popular with the IIIF community. We have huge amount of images to convert to IIIF but lots of plans, lots of ideas, and lots to do…
We are also collaborating with NLS around their content, and are keen to talk with others about IIIF!
- Reference Rot in theses: a HiberActive pilot. Nicola Osborne, EDINA, University of Edinburgh
This was my presentation – so notes from me here but some links to Site2Cite, a working demo/pilot tool for researchers to proactively archive their web citations as they are doing their research, to ensure that by the time they submit their PhD, have their work published, or begin follow up work, they still have access to those important resources.
Introducing Site2Cite: http://hiberactive.edina.ac.uk/
Try out the Site2Cite tools for yourself here: http://hiberactive.edina.ac.uk/site2cite/
You can view my full slides (slightly updated to make more sense for those who didn’t hear the accompanying talk) from the 10×10 here:
This ISG Innovation Funded pilot project builds upon our previous Andrew W. Mellon-funded Hiberlink project, a collaboration between EDINA, Los Alamos National Laboratory, and the University of Edinburgh School of Informatics. The Hiberlink project built on and worked with Herbert Van de Sompel’s Memento work.
- Lifting the lid on global research impact: implementation and analysis of a Request a Copy service. Dimity Flanagan, London School of Economics and Political Science
Apologies for missing the first few minutes of Dimity’s talk…
LSE have only recently implemented the “request a copy” button in the repository but, having done that, Dimity and colleagues have been researching how it is used.
We’ve had about 500 requests so far. The most popular requests have been for international relations, law and media areas. And we see demand from organisations and governments – including requests explicitly stating that they do not subscribe to the journal and they felt it was crucial to their work. There is that potential impact here being revealed in requests for articles ahead of key meetings and events, etc.
And these requests show huge reach from organisations locally and around the world.
One thing we have noticed is that we get a lot of requests from students who can definitely access the version of record through journals subscribed to by their university – they don’t realise and that causes avoidable delay. We have also seen academics linking from reading lists to restricted items in repositories. But, on a more positive note, we’ve had lots of requests from our alumni – 70% of our alumni are international and that shows really positive impact for our work.
Overall this button and the evidence that requests provide has been really positive.
- What RADAR did next: developing a peer review process for research plans. Nicola Siminson, Glasgow School of Art
RADAR captures performances, exhibitions, as well as traditional articles, monographs etc. It is hosted on EPrints. And we encourage staff to add as much metadata as possible. But increasingly it is being used internally, with staff developing annual research plans (ARPs) and that feeding into allocations in the year ahead.
These ARPs arose in part from the outcome of the REF 2014 assessment. These peer reviewed (but not openly available) ARPs aim to enable research time to be allocated more effectively with a view to maximising the number of high quality submissions to the next REF. RADAR houses the template as it played a key role in the GSA REF 2014 submissions, and staff already use and know the system.
The template went live in 2015, and was tweaked, tried and relaunched in February 2015. The ARP template captures the research, the researcher’s details, and the expected impact of their work – and a submit process. The process was really quite manual so we thought carefully about how this should work… So once submitted the digital ARP went into a manual process. Once piloted we built the peer review process into RADAR, including access management that allows the researcher sole access until submitted, and then manages access back and forth as required.
We discussed this work with EPrints in Autumn 2016 and development commenced in Spring 2017. This was quite an involved process. The system was live in time for ARP panel chairs to send feedback and results.
So the process now sees ARPs submitted; the RADAR admin provides the Head of Research with a report of all ARPs submitted. Then it goes through a series of review stages and feedback stages.
So administrators can view ARPs, panels, status, etc. and there is space for reviews to be captured and the outcome to be shared.
Lessons learned here… No matter how much testing you have done, you’ll still need to tweak and flag things – it’s useful to have a keen researcher to test it and feed back on those tweaks. We still need to increase the prominence of the summary and decision for the researcher, with more differentiated fields for peer reviews, etc. In conclusion the ARP peer review process has been integrated into RADAR and will be fully tested next year. The continued development of RADAR is bearing fruit – researchers are using the repository and adding more outputs, and offering greater visibility and downloads for GSA.
Explore our repository at http://radar.gsa.ac.uk
- Edinburgh DataVault: Local implementation of Jisc DataVault: the value of testing. Pauline Ward, EDINA
I am Pauline Ward from the Research Data Service at the University of Edinburgh, and I am based at EDINA which is part of the University. Jisc commissioned UoE’s Library and University Collections (L&UC) team to design a service for researchers to store data for the long term with the Jisc Data Vault. And we’ve now implemented a version of this at Edinburgh – using that software from L&UC and specified and managed by EDINA.
The DataVault allows safe data storage in the University’s archival storage option, which links this data to a metadata record in Pure without having to re-enter any of the data. And, optionally, to receive a DOI for the data which can be used in publications and other outputs – depending on the context and appropriate visibility of the data. That allows preservation of data at the University. The DataVault is not for making data public – we have a service called DataShare for that.
So, let’s talk about metadata… We push that metadata to Pure and keep DataVault metadata as concise as possible. We need metadata that is usable and have some manual intervention to check and curate that.
We had a fairly extensive user testing process, to ensure documentation works well, then we also recruited academics from across the University to bring us their data and test the system to help us ensure it met their needs.
So, the interim version is out there, and we are continuing to develop and improve it.
- Data Management & Preservation using PURE and Archivematica at Strathclyde. Alan Morrisson, University of Strathclyde
We are governed and based in the research department. We wanted to look at both research data management and long term preservation, including reflecting on whether Pure is the right tool for the job here. Pure was already in use at Strathclyde when our Research Data Deposit Policy was being developed, so we deliberately made the policy as open as possible. Also Strathclyde is predominantly a STEM university, and we started off by surveying what else was out there… We knew the quantity and type of data coming in…
And since we opened up the service, in terms of data deposits to date we have seen a steady increase from about 200 to 400 data sets over the last year.
In terms of our preservation and curation systems we have Pure in place and that does a lot – data storage, metadata, DOI etc. But we’ve also recently implemented Archivematica – it’s free, it’s open source, it’s compatible with Jisc DataVault. So the workflow right now is that data, metadata and related outputs are added to Pure, and a DOI minted. This feeds the Knowledgebase portal. In parallel the data from Pure goes to Archivematica where it is ingested and processed for preservation, and the AIP METS file is cleaned using METSflask before being stored.
The benefits of this set up are that Pure is familiar to researchers, does a good job of metadata management and related content and has a customised front end (Knowledgebase). Archivematica is well supported, open source, and designed for archiving. But those systems don’t work together, so we are manually moving data across. Pure is designed for storage and presentation, not curation. Archivematica only recognises about 40% of the data.
So, in the future we are reviewing our system, perhaps using Pure for metadata only. We are keeping an eye on Jisc RDSS and considering possible Arkivum like storage. And generally looking at what is possible and most appropriate moving forward for curation and archiving.
- Open Access… From Oblivion… To the Spotlight? Dawn Hibbert, University of Northampton
I’ll be looking back over the last ten years… And actually ten years back I was working here in Accommodation Services, so not thinking about repositories at all!
Looking back at 2007/8 in the repository world we had our NECTAR repository. Then in 2011, a Jisc funded project enabled an author deposit tool for NECTAR. At that time we had a carrot/incentive for deposit, but no stick. Which was actually a nice thing as we’ve now slipped more towards it all being about the REF.
By 2012/13 we engaged with our researchers around open access, who had feedback such as “it’s in the library – you can get a copy from there” or “it’s only £30 to buy the journal I publish in, if I make my article free the journal will go under” or “My work is not funded by RCUK so why should my work be open access”. We wanted everything open… But by 2014/15 (and the HEFCE announcement) we were still getting “I don’t have to give you anything until 2016” and similar… And we get that idea of “it’s all about the REF”. And it is not. Using the REF in that way, and the repository in that way, overlooks the other benefits of open access.
So in 2016/17 HEFCE compliance started. Attitudes have shifted. But the focus has all been about gold APCs and the idea of the university paying. When actually we are using the HEFCE deposit and (later) open access green OA route. And for us we really want researchers to deposit much more than the open access part (we can do that later on).
So, in 2017 and beyond we are looking at emphasising the benefits, sharing that information, being positive about the opportunities, not just using the HEFCE stick. And for open access work we are looking at improving acceptance, extending open access to other outputs, and focusing on visibility of research outputs – the Kudos type tool. And we are shifting the focus to Digital Preservation.
We are looking at datasets being open access too. RDM and Digital Preservation gaining ground. And when work is deposited, shared, tweeted, etc. that can really shift attitudes and show benefits and engagement for academic colleagues.
But we still see lots of money spent on PA and journal subscriptions. And we have yet to see what happens with RCUK and REF compliance.
- Automated metadata collection from the researcher CV Lattes Platform to aid IR ingest. Chloe Furnival, Universidade Federal de São Carlos
I am pleased to present work by myself and my colleagues from Sao Paulo in Brazil. Back in 1999 all Brazilian universities were required to share CVs of their research and academic staff on a platform (Curriculo Lattes) which now has over 2 million records.
However, our University’s repository was only launched in 2016. Unlike many universities, which use Web of Science or Scopus to capture their researchers’ work, we saw that the Lattes CV Platform was the key and most up to date source of metadata – it is always kept very current, as that is required for funding. It is a really useful stepping stone to identify our staff publications for the initial repository.
So we have very well known researchers, Mena-Chalco and Cesar Jr (2013), who developed ScriptLattes for this extraction. But then the CNPq decided to implement a CAPTCHA which inhibits this script. They alleged this was for security reasons but it created an uproar as it was seen as “our data”… So, this has all been very complicated and impacted on our plans to identify our own researchers’ work… So we went for SOAP (Simple Object Access Protocol). We also developed a proxy server to deal with CNPq limits. This is based on the OpenResty platform to share access to the Lattes SOAP webservices. That lets us manage our local IP address and manage load/avoid going over capacity.
We extract data in XML format, then process it in Python to generate Dublin Core. Then we use another script to eliminate duplicates using the Jaccard measure, which helps detect differences… Then, once processed, it is held in DSpace. Each record in Lattes has a unique identifier as that site uses an ID number that all Brazilians are required to have to access e.g. a bank account.
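As a rough illustration of the duplicate-elimination step, here is a minimal sketch of Jaccard-based deduplication on publication titles. The Jaccard measure compares two sets as the size of their intersection divided by the size of their union. Everything else here is an assumption rather than the team’s actual script: comparing titles only, splitting them into lowercase word tokens, the function names, and the 0.8 threshold.

```python
import re


def tokens(title):
    """Lowercase a title and split it into a set of word tokens."""
    return set(re.findall(r"\w+", title.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # treat two empty sets as identical
    return len(a & b) / len(a | b)


def deduplicate(titles, threshold=0.8):
    """Keep the first title of each group whose similarity reaches the threshold.

    The threshold value is an illustrative assumption, not a figure from the talk.
    """
    kept, kept_tokens = [], []
    for title in titles:
        current = tokens(title)
        if any(jaccard(current, seen) >= threshold for seen in kept_tokens):
            continue  # near-duplicate of a record we already kept
        kept.append(title)
        kept_tokens.append(current)
    return kept


records = [
    "Open access in Brazil",
    "Open Access in Brazil.",
    "Metadata harvesting with SOAP",
]
print(deduplicate(records))
```

This pairwise comparison is quadratic in the number of records, so for tens of thousands of articles a real pipeline would probably group records first, for example by author ID or year, before comparing titles.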
So now the CVs of 1,166 teaching staff and researchers working at our HEI have been retrieved in just 11 minutes, including metadata for 78K journal articles and proceedings papers. We had the specific objective of gaining direct and official access to public metadata held in Lattes CV.
- The Changing Face of Goldsmiths Research Online. Jeremiah Spillane, Goldsmiths, University of London
JS: Goldsmiths Research Online started as a vanilla install of EPrints, and it has become customised more and more over time. Important to that development have been several projects. The Jisc Kultur project created a transferable and sustainable institutional repository model for research output in the creative and applied arts, and created a facility for capturing multimedia content in repositories.
Kultur led to the Jisc Kaptur project, led by VADS working with various art colleges including Goldsmiths and GSA.
Then in 2009 we had the Defiant Objects project which looked to understand what makes some objects more difficult to deposit than others.
Jeremiah’s colleague: RAE/REF work has looked at policy versus the open access ethos – and striking the right balance there. So, the Goldsmiths website now includes content brought in from the repository. And that is now organised depending on the needs of different departments. We are also redesigning the website to better embed content to enable exploration of visual content. And the new design should be in place by autumn this year.
Speaking of design… We have been working with OJS but have been wanting to more thoroughly design OJS journals, so we have a new journal coming, Volupte, which runs on OJS in the background but uses SquareSpace at the front end – that’s a bit of an experiment at the moment.
JS: So, the repository continues to develop, whilst our end users primarily focus on their research.
Take a look at: research.goldsmiths.ac.uk
And with that Day One, and my visit to Repository Fringe 2017, is done.