What is the Problem with Medievalists and Open Access to Research Data? Some (rather uncomfortable) Reflections from the CARMEN 2019 Data Workshop

After the successful Open Science Workshop at last year’s CARMEN (Worldwide Medieval Network) Annual Meeting, which focussed on Scholarly Communication in general (Workshop Report), I decided to focus this year’s training on Research Data.

  • Resource: Wuttke, Ulrike. (2019, September). Make your digital research more sustainable and visible: Data Sharing and Data Management Techniques & Tools for Digital Medievalists. Zenodo.  

Workshops like the CARMEN workshops are always very exciting because the audience is very mixed with respect to their “digital skills” but also fairly homogeneous because all are medievalists, which makes it easy for me to relate to their research questions, methods etc. Also, the CARMEN people are especially welcoming and not shy of controversial discussions (as you will see below).

This year’s workshop “Make your digital research more sustainable and visible: Data Sharing and Data Management Techniques & Tools for Digital Medievalists” drew quite a large group, not only because of word-of-mouth recommendations from last year’s participants, but also because it is a hot topic for medievalists.

After outlining the aims of the workshop and the code of conduct that underlined that the workshop should be a safe space for open discussion (see my slides for the code of conduct, I got great comments on it!), we opened the space to hear about participants’ backgrounds and interests. Then I introduced the basics of the concept of research data and key concepts related to humanities research data. We ended the first round with an activity in which the participants were asked to put on their “magic data glasses” and ponder over the question “What is your research data?”. 

Besides the request to explain in more detail “What is research data, so what counts as data?”, the answers of the participants showed the great variety of digital methods in use. They ranged from:

  • digital notes and annotations, mainly in Word,
  • the use of spreadsheets,
  • “free” collections of images of manuscripts,
  • to more structured data such as transcriptions in TEI, or
  • the use of Omeka or other database systems and collecting data with, e.g., the metadata standard Dublin Core.

The question “What is research data? What counts as research data?”, however, made me think that most definitions of research data are very broad and that they also include digital versions of articles, in fact all “digital stuff”. So there seems to be a broad overlap between digital scholarly publishing and research data management, which makes the concept a bit confusing. For the session we agreed that all the “digital stuff” researchers use and produce is research data, but with varying degrees of structure and machine readability (which we deemed important).

After this already heated discussion during the first group activity, I introduced the key concepts and good practices related to sustainable and visible humanities research data. I especially focussed on technical and intellectual sustainability (think of documentation, research data management, and (FAIR) data publication). Then we dove into the second group activity, a discussion of challenges and needs for Data Sharing. For this part of the discussion, Torsten Roeder, Digital Humanities Coordinator at the Leopoldina (the German National Academy of Sciences), joined me.

I had asked Torsten to join me because I expected a lot of detailed technical questions about digital methods related to sustainability or maybe some anxiety about sharing, but surprisingly, the main point of the discussion was:

  • What if I would love to share my data, but cannot because of costs, often imposed by libraries?

Now I had a whole group of medievalists in a heated discussion about how often – in their experience – costs are the main obstacle to publishing Open Access and publishing their research data! 

Of course it depends on what your main research area is, but for this group, apparently their main research data (sources) are digitized manuscripts, that is, pictures (not TEI encoded texts, to make that clear). It came down to the point that they often have to pay (by themselves or out of dwindling budgets) for publishing rights imposed on them by the holding institutions (often libraries), and the costs for these rights for online, Open Access publication are much higher than for print (and/or closed access). The licensing models of libraries and other institutions and the legal restrictions imposed on archival sources make it almost impossible to publish them, or even to contribute to community sourced collections, if the default for pictures is set to OA.
This situation leads to a thriving “black market” of sharing via Facebook groups etc. because legally it is not possible to share them publicly. Of course, there are quite a few libraries and other institutions that provide Open Access to digitized medieval sources, but they are not always easy to find. Often you are looking for a very specific manuscript, and chances are apparently still very high that the one you need is not digitally available in Open Access.

After the session I started to look for starting points for finding these resources. These are some useful links I came across (please let me know if I missed something very obvious!):

After the session we continued the discussion over a cup of coffee, and from this I would like to share some more “pieces of food for thought”:

  • Publication costs charged by libraries are really a “deal breaker” for researchers like medievalists who work a lot with pictures; this has been outlined in detail, with an example, by Kate Rudy in an article (2019). Of course it’s not the holding institutions alone, but I think she is right when she writes:
    “Image-holding institutions should rethink their purpose. They can never have enough in-house expertise to fully research all of their holdings. They should be grateful to scholars who are applying their expertise to their collections. The least they can do is to make images available for free. They should also allow researchers to make study photographs and produce high-resolution images for publication at low costs.” 
  • If only the “rich”, i.e. the few who have a research budget, can pay for pictures and picture rights, this causes a problem for the diversity of the field of medieval studies
  • Especially given that libraries are often active promoters and facilitators of Open Access / Open Science (e.g. see LIBER’s Open Science Roadmap on Zenodo), these “black sheep” are undermining the credibility of these efforts
  • These obstacles also make it very difficult to comply with strict Open Access requirements of national funders

Additionally, Laura Morreale (Center of Medieval Studies, Fordham University) pointed me to her project Digital Documentation Process (DDP) that has developed: “a set of best practices for cataloguing and preserving digital projects. The DDP makes digital humanities (DH) scholarship findable and citable for all scholars, stores and makes available durable versions of digital objects created in DH work, and facilitates a suite of documentary products for DH practitioners to communicate the value of their work to DH- and non-DH scholars alike”. She underlines that subject specialists, IT specialists, and information specialists have to work hand in hand, but the motivation has to come from the subject specialist, who cannot “just throw data at the end of the project at the librarian” (couldn’t agree more). The proposed process, especially the so-called Archiving Dossier Narrative, is aimed at making the data easier to understand and suggests how to add value to it so that it becomes a part of the scholarly record. She invites discussion around the question of whether this approach, which comes very much from a subject specialist’s perspective, makes sense from the perspective of Data Management.

Last but not least, I am happy to share with you the “Research Data Management Treasure Hunt Map” drawn by Torsten Roeder, which was not quite ready in time to be shown during the workshop.

“Research Data Management Treasure Hunt Map” by Torsten Roeder, 2019. CC BY 4.0

Thank you for reading! As always, I am very keen to hear your thoughts, additions etc. Discuss with me by leaving a comment below, or on Twitter (@uwuttke).