Fellowship of the Data

“Here be dragons”: Open Access to Research Data in the Humanities

Slightly modified reading version of my talk at the conference “Innovative Library in Digital Era” (ILIDE) 2019, Jasná, Slovakia, 9.04.2019

Cite as: Ulrike Wuttke, “Here be dragons”: Open Access to Research Data in the Humanities, Blogpost, 09.04.2019, CC-BY 4.0. Link: https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/.

In this rather lengthy blog post (which is more like a pre-print of a future article), I discuss what the paradigm shift to reusable, machine-readable data as one pillar of Open Science or Open Scholarship (the latter being a more inclusive term for the Arts and Humanities) means for Humanities and Heritage researchers. I address key challenges and perspectives of Humanities Research Data Management with a focus on educational aspects and available tools, especially the Research Data Management Organiser (RDMO) tool (https://rdmorganiser.github.io/).
This blog post was awarded a travel bursary and a poster presentation at the DARIAH Annual Event 2019 in Warsaw in the Open Humanities Tools and Methods Blog Competition.

Introduction

#1 Stefan Winkler-Nees (of the Deutsche Forschungsgemeinschaft, DFG) estimated in 2011 that 90% of all digital research data is lost.[1] We don’t know how much of this data belonged to the Humanities, and hopefully these numbers are better today, but we can assume that a lot of Humanities data (and other data) is still lost, because of missing infrastructures or because no one has taken care of the long-term availability of this data in time.[2]

Are Your Data FAIR?

But even if data is not lost: does the available data spark joy, to borrow a term from the ubiquitous Marie Kondo? Are these datasets accessible for research, well documented using standards, available in interoperable formats etc.? In short: is it FAIR[3] data, too?

#2 The digital transformation of research has dramatically increased the creation, gathering, and use of research data in all disciplines. It has also led to the new paradigm of Open Science, which promotes not only open access to published results but also openness of the underlying data (for example in the H2020 Open Research Data Pilot).[4] Together, these developments have resulted in a heightened demand for the publication of research data, promoted by national and international funding agencies[5] and research institutions, and often directly reflected in (funding) guidelines, which can be more or less binding.[6] Among the many reasons given for Open Research Data, efficiency (reuse), reproducibility (transparency), impact, and public trust are prominent.[7] Slowly, the responsible handling and FAIR publication of research data is becoming an integral part of good scientific practice.[8]

The vision goes beyond that: we are talking here about the grand vision of the European Open Science Cloud (EOSC)[9] and an even bigger vision for a global Open Science Ecosystem. To realise this vision, and for humanities data to play a considerable role in it, our “efforts should be guided by the twin aims of ensuring that data meets the FAIR principles, and that it is effectively preserved in trusted, certified repositories”[10]. Since their publication in 2016[11], the FAIR principles have gained great acceptance as guiding principles. One reason is that they take into account that not all data can be open[12], which has led, for example, the European Research Council to formulate the principle “as open as possible, as closed as necessary”[13]. Another is that they put great emphasis on “the attributes data need to have to enable and enhance reuse, by humans and machines”, in a nutshell: metadata.[14]
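To make the point about metadata a little more tangible, here is a minimal sketch in Python of what a machine-readable metadata record for a humanities dataset might look like, together with a simple completeness check. The field names are loosely inspired by common descriptive metadata and are my assumptions for illustration, not a schema prescribed by the FAIR principles; the DOI, title, and creator are hypothetical placeholders.

```python
import json

# Hypothetical minimal set of descriptive fields; an assumption made for
# illustration, not an official FAIR checklist.
REQUIRED_FIELDS = ("identifier", "title", "creator", "license", "format", "description")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical example record for a small text corpus.
record = {
    "identifier": "https://doi.org/10.1234/example-corpus",  # placeholder DOI
    "title": "Letters of a medieval scriptorium (transcriptions)",
    "creator": "Jane Doe",
    "license": "CC-BY-4.0",
    "format": "application/tei+xml",
    "description": "",  # left empty on purpose: documentation is still missing
}

if __name__ == "__main__":
    # Serialising to JSON keeps the record readable by humans and machines alike.
    print(json.dumps(record, indent=2))
    print("Missing fields:", missing_fields(record))
```

A check like this obviously does not make data FAIR by itself, but it illustrates why documentation has to be captured in a structured form that both people and software can read.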

Vision for Humanities Data

In the following, I will first outline key challenges related to Open Access to Research Data in the humanities and then discuss some perspectives to improve the current situation. I will focus on educational aspects and on tools for research data management as keystones for Open & FAIR Research Data. As the German situation is best known to me, examples are mainly drawn from a German context.

“Here be dragons”

#4 The phrase “Hic sunt dracones” (transl. “Here be dragons”) is used on some old maps of the world to describe an area that was unknown to the cartographer. I found it quite appropriate to summarize the ambivalence of humanists towards data and all these “fancy” concepts discussed by “infrastructure people”, like FAIR Data, the EOSC, or Research Data Management. First of all, it must be said that humanities researchers tend to be ambivalent about the concept of ‘data’[15] and that “[t]here are issues surrounding […] the acceptance of the ‘research data concept’”[16]. In short, they just don’t use the word “data”, but talk about “sources”, “research materials” etc., so the whole “data talk” doesn’t appeal to them. Additionally, an expeditionary survey conducted by PARTHENOS in 2017 among researchers in the domains of digital humanities, language studies, and cultural heritage showed that the FAIR Principles and the EOSC, concepts and recommendations that thrive among “infrastructure folks”, are relatively little known in the research communities themselves.[17] Often, the publication of research data only comes as an afterthought (if at all).[18] However, at the end of a project, it is often too late to publish the data in a meaningful way because of the lack of documentation and the lack of resources to prepare the data properly for publishing.

#5 Let me discuss some reasons that currently seem to form barriers to establishing a culture of open sharing in the humanities. In general, “issues surrounding incentivisation”[19] can be observed. Given the strong competition and the humanities reputation system, which is based on traditional ‘long story scientific publication formats’ (monographs, book chapters, or articles as significant scientific publications) as opposed to ‘data’, there is low motivation to publish research data.[20] This lack of incentives goes hand in hand with researchers’ (perceived) fear of being scooped, that is, that someone else will be the first to publish something based on their data, e.g. an exciting manuscript or object they have found.[21] A research tradition that has been based on (and has rewarded) secretiveness is not easy to change when nothing substantial is offered as a reward in return. Other reservations often brought up are: nobody will understand the data, nobody will need the data, someone will sell the data and, last but not least, a (perceived) lack of technical skills.[22]

“The inherent controversy in the meaning of “data” and the importance of personal interpretation on data for humanities researchers is not conducive to sharing.”[23]

#6 Even those who are willing to publish data face obstacles; it must be duly acknowledged that there is a long and thriving tradition of collecting and publishing humanities corpora, continued in the digital realm, e.g. at the academies. Legal issues, or doubts about possible legal issues, are mentioned especially often in this context: the legal regulations concerning research data are complicated and not internationally aligned, and many actors are involved in the production of humanities research data, not only humanists. As a result, humanities research is often based on data under copyright restrictions (from cultural heritage institutions or other actors), which makes it difficult to publish them as Open Research Data (ORD).[24]

#7 Additionally, we are dealing with issues around the availability and sustainability of specialist support structures for humanities research data as well as the lack of practical guidance.[25] Humanities data centers and other data services for the humanities are often dependent on (project-based) third-party funding.[26] This leads to issues of trust on the side of the researchers, which may result in these services not being in high demand[27] (and may even result in an unwillingness to “Go Digital” at all). It also leads to the problem of sustaining “living systems”[28] (which need to be frequently updated, migrated, and curated). However, current efforts, especially from the Digital Humanities community, have also led to positive developments:

“For example, the emergence of linked open data over the past decade has been supported both by the establishment of effective standards for modeling and disseminating such data, and the growth of practices and social expectations supporting its creation. These developments have meant that expertly modeled data from specific domains can be accessed and combined flexibly, rather than remaining in isolation or striving for self-sufficiency.”[29]

Fellowship of the Data

#8 To sum up my observations so far: Humanities research data, in general, is rather heterogeneous, idiosyncratic, and complex[30], and humanists are ambivalent about the term “data”. Digital practices are already part of the research activities of many humanists, especially in the Digital Humanities, but they are not equally well developed everywhere. As a result, the potential of digital research data and methods is not fully exploited, because the digital research process is not carefully planned. In other words, many research data already exist in digital form, but they are not findable, quality controlled, and reusable.[31] All in all, the land of FAIR Research Data is still unknown territory for many humanists, or at least as scary as if dragons did indeed live there. In the next part, therefore, I will argue for increased efforts in awareness raising and skills building and for a “fellowship of the data”, a support system to facilitate the quest for FAIR data in the humanities.

Perspectives

#9 Naturally, I cannot offer immediate solutions, but I would like to point out some paths that need to be pursued with increased intensity in the near future to facilitate FAIR data in the humanities (and beyond). These paths can focus on different stakeholders in the FAIR ecosystem, such as research institutions, funding bodies, publishers, or individual researchers and research communities. In my opinion, special attention has to be paid to the researchers and research communities themselves, so that recommendations, policies, services, etc. are aligned with disciplinary practices and cultures and are known within them.[32] If the researchers are not on our side in this quest, we are likely to lose the battle or at least experience a delay in realising the goals.

#10 The most urgent points in my opinion are the following. We need to work on incentives for the FAIR publication of research data, e.g. wider adoption of DORA (Declaration on Research Assessment).[33] We also need to invest in the development of beneficial environments for aggregation (think EOSC, German NFDI):

“The interdisciplinary bundling of humanities data repositories and the development of adequate research tools and services for linked data represents a great opportunity for humanities research.”[34]

Another highly important task is educating the next (and this) generation of (digital) humanities researchers[35], but also “infrastructure people” in discipline-specific contexts, to deal with the datafication[36] of research and education practices. Humanists need to be aware of the limits of publication tools and how they expose (or do not expose) the underlying data model. We need to strive to offer more idiosyncratic options and take time for critical consideration of how early choices have an impact on how data is published and can be used (curatorial perspective). It is our task to prevent research data from becoming trapped (in specific formats or hardware). In an ideal world we “design our data without tool dependencies”, that is, we follow a “tool-agnostic approach” rather than a “tool-dependent approach”[37]. The progressing digital transformation of humanities research, along with the increasing importance of digital research infrastructures, calls not only for a certain level of “data literacy”[38] but even for an expansion of this concept to a certain level of “data infrastructure literacy”, a term recently coined by Gray et al. 2018[39].
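As a rough illustration of the tool-agnostic idea, the following Python sketch keeps a few records in a plain data structure and serialises them to two open, widely readable formats (CSV and JSON) using only the standard library. The records, field names, and function names are hypothetical and chosen for illustration only; real projects would follow the standards of their discipline (for texts, for instance, TEI).

```python
import csv
import io
import json

# Hypothetical records describing manuscripts; the field names are assumptions.
records = [
    {"shelfmark": "MS A 1", "language": "Latin", "century": 13},
    {"shelfmark": "MS B 7", "language": "Middle Dutch", "century": 15},
]


def to_csv(rows):
    """Serialise a non-empty list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows to JSON, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(to_csv(records))
    print(to_json(records))
```

Because the data lives in open formats rather than inside one particular application, it can be reloaded, checked, and converted later, whatever happens to the tool that originally produced it.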

Much discussed in the context I have just outlined is Research Data Management:

“Research Data Management describes the process to curate (or manage) research data along the research data lifecycle and includes various activities such as planning, producing, selection, analysis, archiving, and preparation for reuse. Because data are very heterogeneous, discipline and data specific solutions can be required.”[40]

Given this admittedly not very appealing definition, it may not come as a surprise that researchers in general still consider research data management[41] as an extra, tedious, time-consuming task that diverts them from “real research”, and that humanists especially consider it as almost opposed to the hermeneutic humanistic research practice[42]. However, acceptance by the researchers is the key success factor for establishing standards in Research Data Management.[43] Therefore, we need to show humanists the added value of Research Data Management for the planning process of digital projects. Lemaire (2018) has recently argued convincingly that RDM is a process already inherent in the research process itself (although at the moment rather implicit) and that it can be an instrument for reviewing the research concept.[44]

#12 The digital turn of humanities research and the requirement for FAIR research data, that is, sustainable and quality-controlled handling of research data, require thorough planning of the digital research process before the start, a process that is guided and documented in a Research Data Management Plan.[45] Given all the advantages of Research Data Management planning, efforts should be increased regarding awareness raising and skills building[46], ideally already as part of university curricula[47], and regarding tools for consistent data management planning (and handling)[48], with the end in mind, that is (if possible) the publication of FAIR research data.

RDMO[49]

#13 Given that the management of research data is increasingly regarded as a process of active support and care during the whole research process (and not only of producing a mere static document), tools are needed that support active Research Data Management Planning[50], e.g. by providing different and up-to-date status information to different participants of the research process. I am currently part of a project that is developing such a tool, the Research Data Management Organiser (RDMO).[51]

#14 In a nutshell, with RDMO the research data management process can be organised as a collaborative effort involving its different stakeholders: besides researchers, especially infrastructure partners such as libraries or computing centres. One of the use scenarios of RDMO is library staff using RDMO’s question catalogue to work out the data management strategy for projects together with researchers and other partners/experts.[52] RDMO can be adapted to the requirements of communities or organisations (e.g. institutional or discipline-specific guidelines) and has multilingual capabilities. At the moment we are working with a range of very different institutions and communities in Germany to further improve this tool.

Conclusion

#15 On the one hand, researchers need to be aware of the issues at hand and take responsibility (take the issues of digital research practices seriously!).[53] On the other hand, we need to work on the institutional availability and sustainability of research data (management) and support (e.g. via data centres and local experts)[54] and on a clever and efficient connection between initiatives on different levels. We need to provide adequate RDM tools that support researchers in preparing their data for publication from the very beginning. Libraries already have an important role in this ecosystem, and I dare to say that they are capable of and destined for a leading role in this field in the future if they invest in it: in heads and infrastructure(!).[55] I would especially like to refer here to their role in the creation of vast digital corpora of open research data, a task which cannot be left to researchers alone.[56] Last but not least, we also need to make research data management less scary: it’s not a scientific revolution and doesn’t mean that all skills learned so far in a typical humanities curriculum are to be thrown overboard. Quite the opposite is true.[57]

#16 There is a lot at stake for the humanities, maybe the very question of what we want the future of the humanities to be. When it comes to Open and FAIR research data in the humanities, I can only say it with Queen: “I want it all, and I want it now!”

“I want it all, I want it all, I want it all, and I want it now.” Queen (1989).


To create this broad culture of FAIR data sharing in the humanities we have to roll up our sleeves, team up, and distribute hats:

  1. Embrace Open principles,
  2. Bridge the gap between the digital and the humanities and see what we can learn from the Digital Humanities and other more data-savvy disciplines.[58]

What are your thoughts and suggestions on this topic? Do you agree? Do you have additional hints for me that lead to more discipline-specific information and insight about the handling of research data, data sharing, and the FAIR principles in the humanities? Please leave your thoughts below or contact me via Twitter or e-mail. I look forward to discussing this topic further with you!

Notes

[1] See Stefan Winkler-Nees, Vorwort, In: Büttner, Stephan; Hobohm, Hans-Christoph; Müller, Lars (ed.): Handbuch Forschungsdatenmanagement, Bad Honnef 2011, p. 5.

[2] See Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum, Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten, Version 1.0, DHd AG Datenzentren, 2018, p. 9. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[3] See Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. Accessed from https://doi.org/10.1038/sdata.2016.18, 25.03.2019. FORCE11 website: https://www.force11.org/group/fairgroup/fairprinciples.

[4] See European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Council under Horizon 2020, Version 1.1., 21. April 2017. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. See also from the UK document Concordat on Open Research Data, 2016. Accessed from https://www.ukri.org/files/legacy/documents/concordatonopenresearchdata-pdf/, 25.03.2019 or the recommendation from the Steuerungsgremium der Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen (2017), Den digitalen Wandel in der Wissenschaft gestalten: Die Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen, Leitbild 2018 – 2022, 2017, Accessed from: http://doi.org/10.2312/allianzoa.015.

[5] See for example EC Directorate-General for Research & Innovation, H2020 Programme: Guidelines on FAIR Data Management in Horizon 2020, Version 3.0, 26. July 2016, Accessed from: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf, 25.03.2019; Deutsche Forschungsgemeinschaft (DFG), Leitlinien zum Umgang mit Forschungsdaten, 30.09.2015. Accessed from: http://www.dfg.de/download/pdf/foerderung/antragstellung/forschungsdaten/richtlinien_forschungsdaten.pdf, 25.03.2019.

[6] While German funders give recommendations, other funders have already issued more binding guidelines that include the demand for a DMP, such as the Schweizerische Nationalfonds (SNF). See Schweizerische Nationalfonds, Open Research Data, (n.d.). Accessed from: http://www.snf.ch/en/theSNSF/research-policies/open_research_data/Pages/default.aspx#FAIR%20Data%20Principles%20for%20Research%20Data%20Management, 25.03.2019.

[7] “Open research data (ORD) have the potential not only to deliver greater efficiencies in research, but to improve its rigour and reproducibility, to enhance its impact, and to increase public trust in its results.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 3. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[8] See recommendation 7 “Sicherung und Aufbewahrung von Primärdaten” of the DFG-Denkschrift zur Sicherung der guten wissenschaftlichen Praxis, in: Deutsche Forschungsgemeinschaft (DFG), Sicherung guter wissenschaftlicher Praxis: Empfehlungen der Kommission „Selbstkontrolle in der Wissenschaft“, Weinheim 2013, p. 21-22. Accessed from: http://doi.org/10.1002/9783527679188.oth1, 25.03.2019.

[9] https://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud.

[10] Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019.

[11] The term FAIR was launched in 2014. See https://www.force11.org/group/fairgroup/fairprinciples.

[12] Principles for Open Data in Science have been formulated in the Panton Principles, demanding that data should be placed in the public domain. See Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John; (19 Feb 2010). Retrieved 31.03.2019 from https://pantonprinciples.org/. This demand has been criticized as difficult to realize for two main reasons: 1) the legal system of some countries, including Germany, does not really allow complete renunciation of rights by the right holder (i.e. public domain); 2) it removes all obligations to quote, which removes an important incentive, see Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 126, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019.

[13] European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Council under Horizon 2020, Version 1.1., 21. April 2017, p. 6. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. Not all data needs to be published. A selection of what data is relevant and interesting for scientific reuse, like archives have always done, is necessary. For guidelines see for example  Angus Whyte & Andrew Wilson, How to appraise and select research data for curation, Digital Curation Centre How-to Guides, Edinburgh 2010. Accessed from: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data, 25.03.2019.

[14] European Commission Expert Group on FAIR Data, Turning FAIR into reality: Final Report and Action Plan from the European Commission Expert Group on FAIR Data, Brussels 2018, p. 18, accessed from: https://ec.europa.eu/info/publications/turning-fair-reality_en, 25.03.2019. The FAIR principles provide guidance on “how to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows” Quote from: https://www.force11.org/group/fairgroup/fairprinciples, accessed: 25.03.2019.

[15] See for example Christof Schöch, Big? Smart? Clean? Messy? Data in the Humanities, in: Journal of Digital Humanities, 2 (2013): 3. Accessed from: http://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/, 25.03.2019; Miriam Posner, Humanities Data: A Necessary Contradiction | Miriam Posner’s Blog, 25.06.2015. Accessed from: http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/, 25.03.2019. Insightful discussions of the complexity of talking about data in the Humanities are also offered for example by Jennifer Edmond, Georgina Nugent-Folan, D2.1 Redefining what data is and the terms we use to speak of it, KPLEX (Knowledge Complexity) Deliverable D2.1, 2018. Accessed from: https://kplexproject.files.wordpress.com/2018/07/d2-1-redefining-what-data-is-and-the-terms-we-use-to-speak-of-it.pdf, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften. o-bib 5 (2018): 2, p. 142–162. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[16] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[17] See PARTHENOS, The FAIR principles and the EOSC concept in the research community of Digital Humanities, Language Studies and Cultural Heritage: An expeditionary survey, 2017, esp. p. 7-8, Accessed from: http://www.parthenos-project.eu/Download/PARTHENOS_FAIR_EOSC_survey.pdf, 25.03.2019.

[18] See Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 3. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019.

[19] See for example Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[20] The humanities are no exception here. See for example Benedikt Fecher, Cornelius Puschmann, Über die Grenzen der Offenheit in der Wissenschaft: Anspruch und Wirklichkeit bei der Bereitstellung und Nachnutzung von Forschungsdaten, in: Information – Wissenschaft & Praxis 66 (2015): 2-3, p. 146-150, p. 147. Accessed from: https://doi.org/10.1515/iwp-2015-0026, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften. o-bib 5 (2018): 2, p. 142–162, p. 148. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[21] See for example Christine L. Borgman, Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge, Mass; London, 2015, p. 177-179, where she characterises humanities data as often being considered as “club goods” (p. 177), meaning access is only granted to very specific individuals, such as local researchers. Using the example of the Dead Sea Scrolls, she describes this practice of local control (“hoarding”, p. 178), which stems from the fact that “[o]nce scholars obtain access to materials, they may wish to mine them in private until they are ready to publish.” (p. 178).

[22] See for general observations about the (not) sharing of research data for example Carol Tenopir et al., Data Sharing by Scientists: Practices and Perceptions, in: PLOS ONE 6 (6), 29.06.2011, p. e21101, https://doi.org/10.1371/journal.pone.0021101; Veerle van den Eynden & Libby Bishop, Incentives and motivations for sharing research data, a researcher’s perspective. A Knowledge Exchange Report, 2014, accessed from: http://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf, 25.03.2019; Benedikt Fecher et al., What Drives Academic Data Sharing? PLOS ONE, 10 (2015): 2, p. e0118053. Accessed from: https://doi.org/10.1371/journal.pone.0118053, 25.03.2019; Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 134-143, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019; Ben Kaden, Warum Forschungsdaten nicht publiziert werden, in: LIBREAS.Dokumente, LIBREAS.Projektberichte, 13.03.2018, accessed from https://libreas.wordpress.com/2018/03/13/forschungsdatenpublikationen/, 25.03.2019.

[23] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[24] Concerning prevalent legal issues see for example Madeleine de Cock Buning et al., The legal status of research data in the Knowledge Exchange partner countries, 2011. Accessed from: http://repository.jisc.ac.uk/6280/, 25.03.2019; Bastian Drees, Text und Data Mining: Herausforderungen und Möglichkeiten für Bibliotheken. Perspektive Bibliothek, 5 (2016): 1, p. 49–73, esp. p. 59-61. Accessed from: http://dx.doi.org/10.11588/pb.2016.1.33691; Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 8-11. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019; Anne Lauber-Rönsberg, Philipp Krahn, Paul Baumann, Gutachten zu den rechtlichen Rahmenbedingungen des Forschungsdatenmanagements im Rahmen des DataJus-Projekts (Kurzfassung), 2018. Accessed from: https://tu-dresden.de/gsw/jura/igewem/jfbimd13/ressourcen/dateien/publikationen/DataJus_Kurzfassung_Gutachten_12-07-18.pdf?lang=de&set_language=de, 25.03.2019.

[25] “The time and effort required to make research data open and accessible in accordance with the FAIR principles (Findable, Accessible, Interoperable, Re-usable) can be considerable; and those researchers who are keen to adopt ORD practices may find themselves stymied by a lack of practical guidance and specialist support.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019. This report acknowledges that some materials have already been developed (p. 24), which are, from the author’s (UW) perspective, often too general, and calls for increased efforts in training and education (p. 24).

[26] See for example Rat für Informationsinfrastrukturen (RfII), Leistung aus Vielfalt: Empfehlungen zu Strukturen, Prozessen und Finanzierung des Forschungsdatenmanagements in Deutschland, 2016, p. 37-39, Accessed from http://www.rfii.de/?wpdmdl=1998, 25.03.2019, DHd AG Datenzentren, Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum:  Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten (Version 1.0). DHd AG Datenzentren, 2018, p. 24. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[27] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[28] “Lebende Forschungsanwendungen spielen in den Geisteswissenschaften eine zunehmend große Rolle in der digitalen Ergebnissicherung und -präsentation. Im Gegensatz zur Buchpublikation ist jedoch die dauerhafte Erhaltung, Betreuung und Bereitstellung dieser lebenden Systeme eine technische und organisatorische Herausforderung. Während es vergleichsweise einfach möglich ist reine Forschungsdaten in Datenarchiven für die Nachwelt zu konservieren, sind lebende Systeme Teil eines digitalen Ökosystems und müssen sich diesem, z.B. in Form von Updates, regelmäßig anpassen.“ Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, p. 111. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. See also the website of the Project SustainLife at Cologne University: https://dch.phil-fak.uni-koeln.de/sustainlife.html, accessed 25.03.2019.

[29] Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20.

[30] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[31] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 237-238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[32] See Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[33] The DORA declaration recommends giving credit for more than articles alone, for example for data sets and software, see: https://sfdora.org/read/, accessed 25.03.2019.

[34] Original: “In der interdisziplinären Bündelung geisteswissenschaftlicher Datenrepositorien und der Entwicklung adäquater Forschungswerkzeuge und -dienste für verknüpfte Daten liegt eine große Chance für die geisteswissenschaftliche Forschung.“ Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12.2018. Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[35] In recent years, increasing attention has been paid to Digital Humanities pedagogy and the development of specific Digital Humanities curricula. For Digital Humanities pedagogy see for example Brett D. Hirsch (ed.), Digital Humanities Pedagogy: Practices, Principles and Politics, 2012. Accessed from: http://www.openbookpublishers.com/reader/161, 25.03.2019, or Matthew K. Gold, Debates in the Digital Humanities. Minneapolis, 2012, Section V. Accessed from: http://dhdebates.gc.cuny.edu/debates/1. For curricula see for example Patrick Sahle, DH studieren! Auf dem Weg zu einem Kern- und Referenzcurriculum der Digital Humanities. Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2013. Accessed from http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2013-1.pdf, or IANUS, Statement zu minimalen IT-Kenntnissen für Studierende der Altertumswissenschaften, 2017. Accessed from https://www.ianus-fdz.de/projects/ausbildung_qualifizierung/wiki/Empfehlungen_zu_minimalen_IT-Kenntnissen, 25.03.2019. The need not only to be able to use tools and modeling systems, but also to be able “to intervene in this ecology by designing more expressive modeling systems, more effective tools, and a compelling pedagogy through which colleagues and new scholars can gain an expert purchase on these questions as well” has been underlined recently by Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20. See also ibid. p. 23. Data literacy, including data modelling literacy, is indispensable to exercise control “over our data during its entire life cycle”, ibid. p. 12.

[36] The term datafication seems to have been coined in the publication by Kenneth Neil Cukier & Viktor Mayer-Schoenberger, The Rise of Big Data: How It’s Changing the Way We Think About the World. Foreign Affairs (2013). Accessed from https://www.foreignaffairs.com/articles/2013-04-03/rise-big-data, 25.03.2019.

[37] For this aspect see Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, esp. p. 13-15, quotes p. 14 and p. 15.

[38] Translated from the German: “The increasing digitality of the humanities makes it indispensable to build up data literacy, that is, a basic data competence among learners, teachers, and researchers.” Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12.2018. Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[39] See Jonathan Gray, Carolin Gerlitz & Liliana Bounegru, Data infrastructure literacy. Big Data & Society (2018), p. 1–13. Accessed from: https://doi.org/10.1177/2053951718786316, 25.03.2019.

[40] Translated from: AG Forschungsdaten der Schwerpunktinitiative “Digitale Information” der Allianz der deutschen Wissenschaftsorganisationen, Forschungsdatenmanagement: Eine Handreichung, 2018, p. 4. Accessed from: http://doi.org/10.2312/allianzoa.029, 25.03.2019

[41] Sometimes the term data curation seems to be used (wrongly) as a synonym.

[42] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[43] See Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 106. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019.

[44] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 244-245. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019. See also: “The point here is not that these costs are prohibitive or unjustified, but rather that good strategic planning involves balancing the costs and benefits, and focusing the effort in areas that offer a clear advantage.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, p. 8.

[45] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245 (accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019), who describes the biggest difference of the digitized research process in the humanities: researchers need to plan the research process in more detail at an earlier stage and describe their methods more explicitly in order to arrive at machine-readable data (processes).

[46] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246, accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[47] See for example Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 113. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. The authors report that Research Data Management is already established in the curriculum of the humanities faculty of Cologne University.

[48] “There is a need for new guidance and exemplars to ensure that data meets appropriate quality standards; for tools to standardise and automate data management, documentation and curation processes; and for an increased focus on improving research software, and on recruiting and retaining software engineers.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[49] About RDMO see esp. the detailed article Heike Neuroth et al., Aktives Forschungsdatenmanagement. ABI Technik, 38(2018):1, p. 55–64. Accessed from https://doi.org/10.1515/abitech-2018-0008, 25.03.2019. See also the RDMO project website: https://rdmorganiser.github.io/, accessed 25.03.2019.

[50] Several tools are available to help create a Research Data Management Plan (DMP), but none of them focuses on active data management.

[51] My home organisation, the University of Applied Sciences Potsdam (FHP), is currently developing such a tool, the Research Data Management Organiser (RDMO), together with the project partners AIP (Leibniz-Institut für Astrophysik Potsdam) and KIT (Karlsruhe Institute of Technology). The project is funded by the Deutsche Forschungsgemeinschaft (DFG).

[52] Research Data Management should not be the task of researchers alone, but they definitely have to be on board.

[53] “As data creators, academics have a different, more knowing relationship to their data: they create data that is going to be a persistent part of the research environment, and they act as both its creators, managers, and consumers. The stakes of the modeling decisions for research data are thus much higher, and to the extent that these decisions are mediated through tools, there is significant value—even a burden of responsibility—in understanding that mediation. And within the academy, the stakes for digital humanists are highest of all, since their research concerns not only the knowing and critical use of data models, media, and tools, but also their critical creation.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 11-12.

[54] For the recommendation of local contact persons see for example Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, esp. p. 106, 115.

[55] See Paul Ayris et al., LIBER Open Science Roadmap, 2018, p. 18-19. Accessed from: http://doi.org/10.5281/zenodo.1303002, 25.03.2019.

[56] This point especially relates to the creation of corpora that are digitized and made accessible in meaningful ways for research purposes, e.g. HathiTrust Digital Library (https://www.hathitrust.org/) or the Deutsches Textarchiv (DTA) (http://www.deutschestextarchiv.de/). See for example the recommendations in Lisa Klaffki, Stefan Schmunk, Thomas Stäcker, Stand der Kulturgutdigitalisierung in Deutschland: Eine Analyse und Handlungsvorschläge des DARIAH–DE Stakeholdergremiums „Wissenschaftliche Sammlungen“, Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2018 (DARIAH-DE Working Papers, 26). Accessed from: http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2018-26.pdf, 04.04.2019.

[57] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246; Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 3; Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12.2018. Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[58] See for example Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 5.

The text of this blog post is published under the license CC-BY 4.0.