This collection of case studies was prepared by Rob Johnson and me to aid Jisc in promoting RDM and its benefits. In addition, the material we gathered can be used to support advocacy efforts in HEIs.
We presented this work at the Jisc Research Data Network event on 28 June 2017 at the University of York. Please find below an introduction to our RDM case studies and get in touch if you have any questions. Note that there exists an online version of the case studies including abstracts and you can find it here.
Research Data Management (RDM) is an overarching term encompassing the organisation, storage, and documentation of data generated during research projects. RDM deals with the organisation and curation of active research data, with its day-to-day management and use, and with its long-term preservation.
RDM is an important practice for both institutions and individual researchers. Data supporting results should be made available and preserved so as to allow its reuse and the verification of published research. Several other benefits can arise from the implementation of RDM, including increased citations, increased research collaborations, or increased visibility. Today, data has to be managed not only for preservation purposes, but also to fulfil the requirements of most research funders.
Although RDM has been around for a while, the above benefits are usually described qualitatively and the lack of a solid body of evidence makes advocacy difficult. We have sought to fill this gap by using case studies to present a rich and varied picture of the impacts underpinned by RDM.
Spread of disciplines
The case studies assembled here come from a wide range of research fields. Due to inherent differences between disciplines, the benefits of RDM become apparent in different ways. They are more tangible in certain fields, and more abstract in others. Nonetheless, the case studies demonstrate that RDM is a worthwhile activity for all institutions and researchers.
The examples below mostly involve large data management initiatives, as these are more likely to show the wide reach of the benefits of RDM. However, we would like to stress that even smaller data management efforts can have an impact. Unfortunately, this impact can be very difficult to track, as an individual researcher reusing data from other individual researchers is often lost in a sea of information. Similarly, impact sometimes cannot be traced to a specific source: in some studies, clear evidence of the impact of RDM is available, but it points to a whole repository rather than to a single study or dataset.
Enablers of impact from RDM
The effective implementation of RDM requires both cultural change and specific data skills. This makes its dissemination and practical realisation difficult and is the main obstacle to the above-mentioned benefits. It is, therefore, desirable to examine the RDM environment to investigate its enablers and what has worked historically to encourage future data curation and reuse.
Our research into the benefits of RDM led us to discover some of the circumstances and situations that facilitate it, along with some of the reasons why this practice should be pursued. A summary of our findings is as follows:
- Open licensing (e.g., in the case of computer code and algorithms) is essential to allow crowd-sourced improvements.
- Data repositories and infrastructures are among the most significant enablers of impact: without them, very few of the impact case studies below would have been possible.
- Collaborations between international bodies or organisations strongly promote data reuse, especially in fields where it was not possible for a single player to take charge. These collaborations create the right environment for sharing and reuse of research data: cultural change is encouraged along with the use of joint infrastructures at a national or international level.
- The impact of RDM is normally seen only after a long time, i.e., after data has been produced, curated, maintained, and reused. Thus, there is a need for sustained investment in this field, as benefits cannot be seen immediately.
- Aggregation of data and digitisation of documents are key to encouraging the development of digital humanities. These initiatives often arise from the collaboration between museums, research libraries, and universities. When data that was spread between several sources (e.g., many different books/articles) or held in obsolete formats was organised and analysed through sound RDM, hidden findings could be uncovered.