

“Here be dragons”: Open Access to Research Data in the Humanities

Version of my talk, slightly modified for reading, given at the conference “Innovative Library in Digital Era” (ILIDE) 2019, Jasná, Slovakia, 9.04.2019

Cite as: Ulrike Wuttke, “Here be dragons”: Open Access to Research Data in the Humanities, Blogpost, 09.04.2019, CC-BY 4.0. Link: https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/.

In this rather lengthy blog post (which is more like a pre-print of a future article), I discuss the paradigm shift to reusable, machine-readable data as one pillar of Open Science or Open Scholarship (the latter being a more inclusive term for the Arts and Humanities) and what it means for Humanities and Heritage researchers. I address key challenges and perspectives of Humanities Research Data Management with a focus on educational aspects and available tools, especially the Research Data Management Organiser (RDMO) tool (https://rdmorganiser.github.io/).
This blog post won a travel bursary and a poster presentation at the DARIAH Annual Event 2019 in Warsaw in the Open Humanities Tools and Methods Blog Competition.


Introduction

#1 In 2011, Stefan Winkler-Nees (from the Deutsche Forschungsgemeinschaft, DFG) estimated that 90% of all digital research data is lost.[1] We don’t know how much of this data belonged to the Humanities, and hopefully these numbers are better today, but we can assume that a lot of Humanities data (and other data) is still lost, because of missing infrastructures or because no one has taken care of the long-term availability of this data in time.[2]

Are Your Data FAIR?

But even if data is not lost: does the available data spark joy, to borrow a phrase from the ubiquitous Marie Kondo? Are these datasets accessible for research, well documented using standards, available in interoperable formats, etc.? In short: is it FAIR[3] data, too?

#2 The digital transformation of research has dramatically increased the creation, gathering, and use of research data in all disciplines. It has also led to the development of the new paradigm of Open Science, which holds not only that published results should be open access, but also that the underlying data should be open (for example in the H2020 Open Research Data Pilot).[4] Together, these developments have resulted in a heightened demand for the publication of research data, promoted by national and international funding agencies[5] and research institutions, and often directly reflected in (funding) guidelines, which can be more or less binding.[6] Among the many reasons given for the demand for Open Research Data, efficiency (reuse), reproducibility (transparency), impact, and public trust are prominent.[7] Slowly, the responsible handling and FAIR publication of research data are becoming an integral part of good scientific practice.[8]

The vision goes beyond that: we are talking here about the grand vision of the European Open Science Cloud (EOSC)[9] and an even bigger vision for a global Open Science Ecosystem. To realise this vision, and for humanities data to play a considerable role in it, our “efforts should be guided by the twin aims of ensuring that data meets the FAIR principles, and that it is effectively preserved in trusted, certified repositories”[10]. Since their publication in 2016[11], the FAIR principles have gained great acceptance as guiding principles. This is not only because they take into account that not all data can be open[12] (leading, for example, the European Research Council to formulate the principle “as open as possible, as closed as necessary”[13]), but also because they put great emphasis on “the attributes data need to have to enable and enhance reuse, by humans and machines”, in a nutshell: metadata.[14]
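To make the emphasis on metadata a little more tangible, here is a minimal sketch in Python of how one might check whether a dataset description carries some of the attributes that reuse depends on. The field names, their grouping under the four FAIR letters, and the example record (including its DOI) are assumptions invented for this illustration. They are not an official FAIR checklist and not part of any tool mentioned in this post.

```python
# Hypothetical field names, loosely grouped by FAIR principle.
# This grouping is an assumption for illustration, not an official checklist.
FAIR_FIELDS = {
    "findable": ["identifier", "title"],
    "accessible": ["access_url"],
    "interoperable": ["format"],
    "reusable": ["license", "description"],
}


def missing_fair_fields(record):
    """Return, per principle, the fields that are absent or empty in the record."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [
            name for name in fields
            if not str(record.get(name) or "").strip()
        ]
        if missing:
            gaps[principle] = missing
    return gaps


# Invented example record for a small text corpus (the DOI is a placeholder).
record = {
    "identifier": "doi:10.1234/example",
    "title": "Letters of an invented correspondent, 1820-1830",
    "format": "TEI-XML",
    "license": "",
}

print(missing_fair_fields(record))
# → {'accessible': ['access_url'], 'reusable': ['license', 'description']}
```

A record that passes such a check is of course not automatically FAIR; the sketch only shows that many of the required attributes are plain, documentable facts about a dataset that can be recorded from the start of a project.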

Vision for Humanities Data

In the following, I will first outline key challenges related to Open Access to Research Data in the humanities and then discuss some perspectives for improving the current situation. I will focus on educational aspects and on tools for research data management as keystones for Open & FAIR Research Data. As the German situation is best known to me, examples are mainly drawn from a German context.

“Here be dragons”

#4 The phrase “Hic sunt dracones” (transl. “Here be Dragons”) is used on some old maps of the world to describe an area that was unknown to the cartographer. I found it quite appropriate to summarize the ambivalence of humanists towards data and all these “fancy” concepts discussed by “infrastructure people”, like FAIR Data, the EOSC, or Research Data Management. First of all, it must be said that humanities researchers tend to be ambivalent about the concept of ‘data’[15] and that “[t]here are issues surrounding […] the acceptance of the ‘research data concept’”[16]. In short, they just don’t use the word “data”, but talk about “sources”, “research materials” etc., so the whole “data talk” doesn’t appeal to them. Additionally, an expeditionary survey conducted by PARTHENOS in 2017 among researchers in the domains of digital humanities, language studies, and cultural heritage showed that the FAIR Principles and the EOSC, concepts and recommendations that thrive among “infrastructure folks”, are relatively little known in the research communities themselves.[17] Often, the publication of research data only comes as an afterthought (if at all).[18] However, at the end of a project, it is often too late to publish the data in a meaningful way, because documentation is lacking and so are the resources to prepare the data properly for publishing.

#5 Let me discuss some reasons that currently seem to form barriers to establishing a culture of open sharing in the humanities. In general, “issues surrounding incentivisation”[19] can be observed. Given the strong competition and the traditional humanities reputation system, which is based on ‘long story scientific publication formats’ (monographs, book chapters, or articles as significant scientific publications) as opposed to ‘data’, there is low motivation to publish research data.[20] This lack of incentives goes hand in hand with researchers’ fear of being scooped, that is, that someone else will be the first to publish something based on their data, e.g. an exciting manuscript or object they have found.[21] A research tradition that has been based on (and has rewarded) secretiveness is not easy to change when nothing fundamental is offered as a reward in return. Other reservations often brought up are: nobody will understand the data, nobody will need the data, someone will sell the data and, last but not least, a (perceived) lack of technical skills.[22]

“The inherent controversy in the meaning of “data” and the importance of personal interpretation on data for humanities researchers is not conducive to sharing.”[23]

#6 Even those who are willing to publish data face obstacles. (It must be duly acknowledged that there is a long and thriving tradition of collecting and publishing humanities corpora, which continues in the digital realm, e.g. at the academies.) Legal issues, or doubts about possible legal issues, are mentioned especially often in this context: the legal regulations concerning research data are complicated and not internationally aligned, and many actors are involved in the production of humanities research data, not only humanists. As a result, humanities research is often based on data under copyright restrictions (from cultural heritage institutions or other actors), which makes it difficult to publish them as Open Research Data (ORD).[24]

#7 Additionally, we are dealing with issues around the availability and sustainability of specialist support structures for humanities research data as well as the lack of practical guidance.[25] Humanities data centers and other data services for the humanities are often dependent on project-based third-party funding.[26] This leads to issues of trust on the side of the researchers, which may result in these services not being in high demand[27] (and may even result in an unwillingness to “Go Digital” at all). It also leads to the problem of sustaining “living systems”[28] (which need to be frequently updated, migrated, and curated). However, current efforts, especially from the Digital Humanities community, have also led to positive developments:

“For example, the emergence of linked open data over the past decade has been supported both by the establishment of effective standards for modeling and disseminating such data, and the growth of practices and social expectations supporting its creation. These developments have meant that expertly modeled data from specific domains can be accessed and combined flexibly, rather than remaining in isolation or striving for self-sufficiency.”[29]

Fellowship of the Data

#8 To sum up my observations so far: Humanities research data, in general, is rather heterogeneous, idiosyncratic, and complex[30], and humanists are ambivalent about the term “data”. Digital practices are already part of the research activities of many humanists, especially in the Digital Humanities, but they are not equally well developed everywhere. As a result, the potential of digital research data and methods is not fully exploited, because the digital research process is not carefully planned. In other words, much research data already exists in digital form, but it is not findable, quality controlled, and reusable.[31] All in all, the land of FAIR Research Data is still unknown territory for many humanists, or at least as scary as if dragons did indeed live there. In the next part, therefore, I will argue for increased efforts in awareness raising and skills building and for a “fellowship of the data”, a support system to facilitate the quest for FAIR data in the humanities.

Perspectives

#9 Naturally, I cannot offer immediate solutions, but I would like to point out some paths that need to be pursued with increased intensity in the near future to facilitate FAIR data in the humanities (and beyond). These paths can focus on different stakeholders in the FAIR ecosystem, such as research institutions, funding bodies, publishers, or individual researchers and research communities. In my opinion, special attention has to be paid to the researchers and research communities themselves, so that recommendations, policies, services, etc. are aligned with disciplinary practices and cultures and are known within them.[32] If the researchers are not on our side in this quest, we are likely to lose the battle or at least experience a delay in realising the goals.

#10 The most urgent points in my opinion are the following. We need to work on incentives for the FAIR publication of research data, e.g. wider adoption of DORA (Declaration on Research Assessment).[33] We also need to invest in the development of beneficial environments for aggregation (think EOSC, German NFDI):

“The interdisciplinary bundling of humanities data repositories and the development of adequate research tools and services for linked data represents a great opportunity for humanities research.”[34]

Another highly important task is educating the next (and this) generation of (digital) humanities researchers[35] to deal with the datafication[36] of research and education practices, but also educating “infrastructure people” in discipline-specific contexts. Humanists need to be aware of the limits of publication tools and how they expose (or do not expose) the underlying data model. We need to strive to offer more idiosyncratic options and take time for critical consideration of how early choices have an impact on how data is published and can be used (curatorial perspective). It is our task to prevent research data from becoming trapped (in specific formats or hardware). In an ideal world we “design our data without tool dependencies”, that is, we follow a “tool-agnostic approach” rather than a “tool-dependent approach”[37]. The progressing digital transformation of humanities research, along with the increasing importance of digital research infrastructures, calls not only for a certain level of “data literacy”[38] but even for an expansion of this concept to a certain level of “data infrastructure literacy”, a term recently coined by Gray et al. 2018[39].

Much discussed in the context I have just outlined is Research Data Management:

“Research Data Management describes the process to curate (or manage) research data along the research data lifecycle and includes various activities such as planning, producing, selection, analysis, archiving, and preparation for reuse. Because data are very heterogeneous, discipline and data specific solutions can be required.”[40]

Given this admittedly not very appealing definition, it may not come as a surprise that researchers in general still consider research data management[41] an extra, tedious, time-consuming task that diverts them from “real research”, and that humanists especially consider it almost opposed to the hermeneutic humanistic research practice[42]. However, acceptance by the researchers is the key success factor for establishing standards in Research Data Management.[43] Therefore, we need to show humanists the added value of Research Data Management for the planning process of digital projects. Lemaire (2018) has recently argued convincingly that RDM is a process already inherent in the research process itself (although at the moment rather implicit) and that it can be an instrument for reviewing the research concept.[44]

#12 The digital turn of humanities research and the requirement for FAIR research data, that is, the sustainable and quality-controlled handling of research data, require thorough planning of the digital research process before the start, a process that is guided and documented in a Research Data Management Plan.[45] Given all the advantages of Research Data Management planning, efforts should be increased regarding awareness raising and skills building[46], already as part of university curricula[47], and regarding tools for consistent data management planning (and handling)[48] with the end in mind, that is (if possible) the publication of FAIR research data.

RDMO[49]

#13 Given that the management of research data is increasingly regarded as a process of active support and care during the whole research process (and not only of producing a mere static document), tools are needed that support active Research Data Management Planning[50], e.g. by providing different and up-to-date status information to different participants of the research process. I am currently part of a project that is developing such a tool, the Research Data Management Organiser (RDMO).[51]

#14 In a nutshell, with RDMO the research data management process can be organised as a collaborative effort encompassing its different stakeholders, besides researchers, especially infrastructure partners such as libraries or computing centres. One of the use scenarios of RDMO is library staff using RDMO’s question catalogue to work out the data management strategy for projects with researchers and other partners/experts.[52] RDMO can be adapted to the requirements of communities or organisations (e.g. institutional or discipline-specific guidelines) and has multilingual capabilities. At the moment we are working with a range of very different institutions and communities in Germany to further improve this tool.

Conclusion

#15 On the one hand, researchers need to be aware of the issues at hand and take responsibility (take the issues of digital research practices seriously!).[53] On the other hand, we need to work on the institutional availability and sustainability of research data (management) and support (e.g. via data centres and local experts)[54] and on a clever and efficient connection between initiatives on different levels. We need to provide adequate RDM tools that support researchers in preparing their data for publication from the very beginning. Libraries already have an important role in this ecosystem, and I dare to say that they are capable of, and well placed for, taking a leading role in this field in the future if they invest in it: in heads and infrastructure(!).[55] I would especially like to refer here to their role in the creation of vast digital corpora of open research data, which cannot be left as a task to researchers alone.[56] Last but not least, we also need to make research data management less scary: it’s not a scientific revolution and doesn’t mean that all skills learned so far in a typical humanities curriculum are to be thrown overboard; quite the opposite is true.[57]

#16 There is a lot at stake for the humanities, maybe the very question of what we want the future of the humanities to be. When it comes to Open and FAIR research data in the humanities, I can only say it with Queen: “I want it all, and I want it now!”


“I want it all, I want it all, I want it all, and I want it now.” Queen (1989).


To create this broad culture of FAIR data sharing in the humanities we have to roll up our sleeves, team up, and distribute hats:

  1. embrace Open principles,
  2. bridge the gap between the digital and the humanities and see what we can learn from the Digital Humanities and other more data-savvy disciplines.[58]

What are your thoughts and suggestions on this topic? Do you agree? Do you have additional hints for me that lead to more discipline-specific information and insight about the handling of research data, data sharing, and the FAIR principles in the humanities? Please leave your thoughts below or contact me via Twitter or e-mail. I look forward to discussing this topic further with you!

Notes

[1] See Stefan Winkler-Nees, Vorwort, In: Büttner, Stephan; Hobohm, Hans-Christoph; Müller, Lars (ed.): Handbuch Forschungsdatenmanagement, Bad Honnef 2011, p. 5.

[2] See Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum, Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten, Version 1.0, DHd AG Datenzentren, 2018, p. 9. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[3] See Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. Accessed from https://doi.org/10.1038/sdata.2016.18, 25.03.2019. FORCE11 website: https://www.force11.org/group/fairgroup/fairprinciples.

[4] See European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Research Council under Horizon 2020, Version 1.1., 21. April 2017. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. See also the UK Concordat on Open Research Data, 2016. Accessed from https://www.ukri.org/files/legacy/documents/concordatonopenresearchdata-pdf/, 25.03.2019, or the recommendation from the Steuerungsgremium der Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen, Den digitalen Wandel in der Wissenschaft gestalten: Die Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen, Leitbild 2018 – 2022, 2017. Accessed from: http://doi.org/10.2312/allianzoa.015.

[5] See for example EC Directorate-General for Research & Innovation, H2020 Programme: Guidelines on FAIR Data Management in Horizon 2020, Version 3.0, 26. July 2016. Accessed from: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf, 25.03.2019; Deutsche Forschungsgemeinschaft (DFG), Leitlinien zum Umgang mit Forschungsdaten, 30.09.2015. Accessed from: http://www.dfg.de/download/pdf/foerderung/antragstellung/forschungsdaten/richtlinien_forschungsdaten.pdf, 25.03.2019.

[6] While German funders so far give recommendations, other funders have already issued more binding guidelines that include the demand for a DMP, such as the Schweizerische Nationalfonds (SNF). See Schweizerische Nationalfonds, Open Research Data, (n.d.). Accessed from: http://www.snf.ch/en/theSNSF/research-policies/open_research_data/Pages/default.aspx#FAIR%20Data%20Principles%20for%20Research%20Data%20Management, 25.03.2019.

[7] “Open research data (ORD) have the potential not only to deliver greater efficiencies in research, but to improve its rigour and reproducibility, to enhance its impact, and to increase public trust in its results.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 3. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[8] See recommendation 7 “Sicherung und Aufbewahrung von Primärdaten” of the DFG-Denkschrift zur Sicherung der guten wissenschaftlichen Praxis, in: Deutsche Forschungsgemeinschaft (DFG), Sicherung guter wissenschaftlicher Praxis: Empfehlungen der Kommission „Selbstkontrolle in der Wissenschaft“, Weinheim 2013, p. 21-22. Accessed from: http://doi.org/10.1002/9783527679188.oth1, 25.03.2019.

[9] https://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud.

[10] Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019.

[11] The term FAIR was launched in 2014. See https://www.force11.org/group/fairgroup/fairprinciples.

[12] Principles for Open Data in Science have been formulated in the Panton Principles, demanding that data should be placed in public domain. See Panton Principles, Principles for open data in science. Murray-Rust, Peter; Neylon, Cameron; Pollock, Rufus; Wilbanks, John; (19 Feb 2010). Retrieved 31.03.2019 from https://pantonprinciples.org/. This demand has been criticized as difficult to realize because of two main reasons 1) the legal system of some countries, including Germany, does not really allow complete renunciation of rights by the right holder (i.e. public domain) 2) it removes all obligations to quote, which remove an important incentive, see Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 126, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019.

[13] European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Research Council under Horizon 2020, Version 1.1., 21. April 2017, p. 6. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. Not all data needs to be published. It is necessary to select which data is relevant and interesting for scientific reuse, as archives have always done. For guidelines see for example Angus Whyte & Andrew Wilson, How to appraise and select research data for curation, Digital Curation Centre How-to Guides, Edinburgh 2010. Accessed from: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data, 25.03.2019.

[14] European Commission Expert Group on FAIR Data, Turning FAIR into reality: Final Report and Action Plan from the European Commission Expert Group on FAIR Data, Brussels 2018, p. 18, accessed from: https://ec.europa.eu/info/publications/turning-fair-reality_en, 25.03.2019. The FAIR principles provide guidance on “how to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows” Quote from: https://www.force11.org/group/fairgroup/fairprinciples, accessed: 25.03.2019.

[15] See for example Christof Schöch, Big? Smart? Clean? Messy? Data in the Humanities, in: Journal of Digital Humanities, 2 (2013): 3. Accessed from: http://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/, 25.03.2019; Miriam Posner, Humanities Data: A Necessary Contradiction | Miriam Posner’s Blog, 25.06.2015. Accessed from: http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/, 25.03.2019. Insightful discussions of the complexity of talking about data in the Humanities are also offered for example by Jennifer Edmond, Georgina Nugent-Folan, D2.1 Redefining what data is and the terms we use to speak of it, KPLEX (Knowledge Complexity) Deliverable D2.1, 2018. Accessed from: https://kplexproject.files.wordpress.com/2018/07/d2-1-redefining-what-data-is-and-the-terms-we-use-to-speak-of-it.pdf, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften. o-bib 5 (2018): 2, p. 142–162. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[16] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[17] See PARTHENOS, The FAIR principles and the EOSC concept in the research community of Digital Humanities, Language Studies and Cultural Heritage: An expeditionary survey, 2017, esp. p. 7-8, Accessed from: http://www.parthenos-project.eu/Download/PARTHENOS_FAIR_EOSC_survey.pdf, 25.03.2019.

[18] See Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 3. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019.

[19] See for example Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[20] The humanities are no exception here. See for example Benedikt Fecher, Cornelius Puschmann, Über die Grenzen der Offenheit in der Wissenschaft: Anspruch und Wirklichkeit bei der Bereitstellung und Nachnutzung von Forschungsdaten, in: Information – Wissenschaft & Praxis 66 (2015): 2-3, p. 146-150, p. 147. Accessed from: https://doi.org/10.1515/iwp-2015-0026, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften. o-bib 5 (2018): 2, p. 142–162, p. 148. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[21] See for example Christine L. Borgman, Big Data, Little Data, No Data: Scholarship in the networked World. Cambridge, Mass; London, 2015, p. 177-179, where she characterises humanities data as often being considered “club goods” (p. 177), meaning access is only granted to very specific individuals, such as local researchers. Using the example of the Dead Sea Scrolls, she describes this practice of local control (“hoarding”, p. 178), which stems from the fact: “Once scholars obtain access to materials, they may wish to mine them in private until they are ready to publish.” (p. 178).

[22] See for general observations about the (not) sharing of research data for example Carol Tenopir et al., Data Sharing by Scientists: Practices and Perceptions, in: PLOS ONE 6 (6), 29.06.2011, p. e21101, https://doi.org/10.1371/journal.pone.0021101,Veerle van den Eynden & Libby Bishop, Incentives and motivations for sharing research data, a researcher’s perspective. A Knowledge Exchange Report, 2014, accessed from: http://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf, 25.03.2019, Benedikt Fecher et al., What Drives Academic Data Sharing? PLOS ONE, 10(2015):2, p. e0118053. Accessed from: https://doi.org/10.1371/journal.pone.0118053, 25.03.2019, Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 134-143, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019, Ben Kaden, Warum Forschungsdaten nicht publiziert werden, in: LIBREAS.Dokumente, LIBREAS.Projektberichte, 13.03.2018, accessed from https://libreas.wordpress.com/2018/03/13/forschungsdatenpublikationen/, 25.03.2019.

[23] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[24] Concerning prevalent legal issues see for example Madeleine de Cock Buning et al., The legal status of research data in the Knowledge Exchange partner countries, 2011. Accessed from: http://repository.jisc.ac.uk/6280/, 25.03.2019; Bastian Drees, Text und Data Mining: Herausforderungen und Möglichkeiten für Bibliotheken. Perspektive Bibliothek, 5 (2016): 1, p. 49–73, esp. p. 59-61. Accessed from: http://dx.doi.org/10.11588/pb.2016.1.33691; Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 8-11. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019; Anne Lauber-Rönsberg, Philipp Krahn, Paul Baumann, Gutachten zu den rechtlichen Rahmenbedingungen des Forschungsdatenmanagements im Rahmen des DataJus-Projekts (Kurzfassung), 2018. Accessed from: https://tu-dresden.de/gsw/jura/igewem/jfbimd13/ressourcen/dateien/publikationen/DataJus_Kurzfassung_Gutachten_12-07-18.pdf?lang=de&set_language=de, 25.03.2019.

[25] “The time and effort required to make research data open and accessible in accordance with the FAIR principles (Findable, Accessible, Interoperable, Re-usable) can be considerable; and those researchers who are keen to adopt ORD practices may find themselves stymied by a lack of practical guidance and specialist support.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019. This report acknowledges that some materials have already been developed (p. 24), which are, in my view (UW), often too general, and calls for increased efforts in training and education (p. 24).

[26] See for example Rat für Informationsinfrastrukturen (RfII), Leistung aus Vielfalt: Empfehlungen zu Strukturen, Prozessen und Finanzierung des Forschungsdatenmanagements in Deutschland, 2016, p. 37-39, Accessed from http://www.rfii.de/?wpdmdl=1998, 25.03.2019, DHd AG Datenzentren, Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum:  Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten (Version 1.0). DHd AG Datenzentren, 2018, p. 24. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[27] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[28] “Living research applications play an increasingly important role in the humanities for digitally securing and presenting results. In contrast to book publication, however, the permanent preservation, maintenance, and provision of these living systems is a technical and organisational challenge. While it is comparatively easy to conserve pure research data in data archives for posterity, living systems are part of a digital ecosystem and must regularly adapt to it, e.g. in the form of updates.” (translated from the German) Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, p. 111. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. See also the website of the Project SustainLife at Cologne University: https://dch.phil-fak.uni-koeln.de/sustainlife.html, accessed 25.03.2019.

[29] Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20.

[30] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[31] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 237-238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[32] See Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[33] The DORA declaration recommends to give credit for more than only articles, for example also for data sets and software, see: https://sfdora.org/read/, accessed 25.03.2019.

[34] Original: “In der interdisziplinären Bündelung geisteswissenschaftlicher Datenrepositorien und der Entwicklung adäquater Forschungswerkzeuge und -dienste für verknüpfte Daten liegt eine große Chance für die geisteswissenschaftliche Forschung.“ Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12. 2018, Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[35] In recent years, increasing attention has been paid to Digital Humanities pedagogy and the development of specific Digital Humanities curricula. For Digital Humanities pedagogy see for example Brett D. Hirsch (ed.), Digital Humanities Pedagogy: Practices, Principles and Politics, 2012. Accessed from: http://www.openbookpublishers.com/reader/161, 25.03.2019, or Matthew K. Gold (ed.), Debates in the Digital Humanities. Minneapolis, 2012, Section V. Accessed from: http://dhdebates.gc.cuny.edu/debates/1. For curricula see for example Patrick Sahle, DH studieren! Auf dem Weg zu einem Kern- und Referenzcurriculum der Digital Humanities. Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2013. Accessed from http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2013-1.pdf, or IANUS, Statement zu minimalen IT-Kenntnissen für Studierende der Altertumswissenschaften, 2017. Accessed from https://www.ianus-fdz.de/projects/ausbildung_qualifizierung/wiki/Empfehlungen_zu_minimalen_IT-Kenntnissen, 25.03.2019. The need not only to be able to use tools and modeling systems, but also to be able “to intervene in this ecology by designing more expressive modeling systems, more effective tools, and a compelling pedagogy through which colleagues and new scholars can gain an expert purchase on these questions as well” has been underlined recently by Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20. See also ibid. p. 23. Data literacy, including data modelling literacy, is indispensable to exercise control “over our data during its entire life cycle”, ibid. p. 12.

[36] The term datafication seems to have been coined in the publication by Kenneth Neil Cukier & Viktor Mayer-Schoenberger, The Rise of Big Data: How It’s Changing the Way We Think About the World. Foreign Affairs (2013). Accessed from https://www.foreignaffairs.com/articles/2013-04-03/rise-big-data, 25.03.2019.

[37] For this aspect see Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, esp. p. 13-15, quotes p. 14 and p. 15.

[38] “The increasing digitality of the humanities makes it indispensable to build up data literacy, that is, a basic data competence among learners, teachers, and researchers.” (translated from the German) Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12. 2018, Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[39] See Jonathan Gray, Carolin Gerlitz & Liliana Bounegru, Data infrastructure literacy. Big Data & Society (2018), p. 1–13. Accessed from: https://doi.org/10.1177/2053951718786316, 25.03.2019.

[40] Translated from: AG Forschungsdaten der Schwerpunktinitiative “Digitale Information” der Allianz der deutschen Wissenschaftsorganisationen, Forschungsdatenmanagement: Eine Handreichung, 2018, p. 4. Accessed from: http://doi.org/10.2312/allianzoa.029, 25.03.2019

[41] Sometimes the term data curation seems to be used (wrongly) as a synonym.

[42] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[43] See Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 106. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019.

[44] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 244-245. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019. See also: “The point here is not that these costs are prohibitive or unjustified, but rather that good strategic planning involves balancing the costs and benefits, and focusing the effort in areas that offer a clear advantage.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, p. 8.

[45] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245 (accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019), who describes the biggest difference of the digitized research process in the humanities: researchers need to plan the research process in more detail at an earlier stage and describe their methods more explicitly in order to arrive at machine-readable data (processes).

[46] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246, accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[47] See for example Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 113. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. The authors report that Research Data Management is already established in the curriculum of the humanities faculty of Cologne University.

[48] “There is a need for new guidance and exemplars to ensure that data meets appropriate quality standards; for tools to standardise and automate data management, documentation and curation processes; and for an increased focus on improving research software, and on recruiting and retaining software engineers.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[49] About RDMO see esp. the detailed article Heike Neuroth et al., Aktives Forschungsdatenmanagement. ABI Technik, 38(2018):1, p. 55–64. Accessed from https://doi.org/10.1515/abitech-2018-0008, 25.03.2019. See also the RDMO project website: https://rdmorganiser.github.io/, accessed 25.03.2019.

[50] Several tools are available to help create a Research Data Management Plan (DMP), but none of them focuses on active data management.

[51] My home organisation, the University of Applied Sciences Potsdam (FHP), is currently developing such a tool, the Research Data Management Organiser (RDMO), together with the project partners AIP (Leibniz-Institut für Astrophysik Potsdam) and KIT (Karlsruhe Institute of Technology). The project is funded by the Deutsche Forschungsgemeinschaft (DFG).

[52] Research Data Management should not be a sole task for researchers, but they definitely have to be on board.

[53] “As data creators, academics have a different, more knowing relationship to their data: they create data that is going to be a persistent part of the research environment, and they act as both its creators, managers, and consumers. The stakes of the modeling decisions for research data are thus much higher, and to the extent that these decisions are mediated through tools, there is  significant value—even a burden of responsibility—in understanding that mediation. And within the academy, the stakes for digital humanists are highest of all, since their research concerns not only the knowing and critical use of data models, media, and tools, but also their critical creation.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 11-12.

[54] For the recommendation of local contact persons see for example, Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, esp. p. 106, 115.

[55] See Paul Ayris et al., LIBER Open Science Roadmap, 2018, p. 18-19. Accessed from:  http://doi.org/10.5281/zenodo.1303002, 25.03.2019.

[56] This point especially relates to the creation of corpora that are digitized and made accessible in meaningful ways for research purposes, e.g. HathiTrust Digital Library (https://www.hathitrust.org/) or the Deutsches Textarchiv (DTA) (http://www.deutschestextarchiv.de/). See for example the recommendations in Lisa Klaffki, Stefan Schmunk, Thomas Stäcker, Stand der Kulturgutdigitalisierung in Deutschland: Eine Analyse und Handlungsvorschläge des DARIAH–DE Stakeholdergremiums „Wissenschaftliche Sammlungen“, Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2018 (DARIAH-DE Working Papers, 26). Accessed from: http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2018-26.pdf, 04.04.2019.

[57] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246, Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 3, Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12. 2018, Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[58] See for example Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 5.

The text of this blog post is published under the license CC-BY 4.0.

Cite as: Ulrike Wuttke, “Here be dragons”: Open Access to Research Data in the Humanities, Blogpost, 09.04.2019, CC-BY 4.0. Link: https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/.

The magic of Lost and Found in Oude Kerk Amsterdam

On Friday, 5 July, I had the pleasure of giving a presentation on apocalyptic thinking and the afterlife in the 14th century during the XL edition of Lost & Found at the Oude Kerk, Amsterdam.

Lost & Found is a series of events organized by an amazing team of volunteers, featuring diverse short presentations of (future) projects, music, and other interesting formats. I think they do not often host an academic presentation, but because this evening took place in the Oude Kerk, which was built in the 14th century, and we were sitting, standing, and later dancing on the graves of such famous people as the Dutch seafaring hero Jacob van Heemskerck, it seemed more than appropriate to talk about heaven, hell, and Judgement Day from a medieval perspective. For me, too, giving my presentation at such a special place changed my perspective and gave me a better sense of the kind of impact a medieval sermon might have had on lay people.

The Oude Kerk is truly a very special church: it is the oldest church in Amsterdam, it is no longer used as a church but as a cultural venue (like many other churches in the Netherlands, due to the decreasing number of churchgoers), and it stands at the heart of the Red Light District (De Wallen). Only a few meters from the church there are bars and the famous windows where the prostitutes present what they have to offer. It made me really sense how intermingled religion and worldly things have always been in real life, perhaps even more so in a crowded medieval town than now.

The whole evening started with a performance by Jugedamos called the ‘Bible performance’. This performance set the right tone for an evening with very disparate presentations that were related to each other in a very special way. First, Lotte Geeven presented a plan for a project to drill the deepest hole in the world in order to record the sounds coming from the deep. Some believe that the disturbing sounds you hear from the depths of the earth are the sounds of the tortured souls in hell… The problem is that both drilling the hole and the recording equipment are very expensive, so if you feel a strong urge to sponsor this fascinating project (and maybe hear your predecessors screaming), don’t hesitate to contact Lotte.

Lost & Found Oude Kerk

My presentation was warmly received. I could not get rid of the feeling that Jan van Boendale, Lodewijk van Velthem, and Jan van Leeuwen would have been more than pleased that their teachings were still found relevant by a lay public almost 700 years later. I was amazed how many questions people had afterwards and that the topic that interested them most was the apocalyptic people, those wild people who will come and persecute the Christians before the End. Later I was very happy to discuss this motif’s occurrence in the Koran with Yassine El Idrissi, who gave a presentation on his planned documentary about the war in Syria that will be broadcast on Dutch TV.

Lost & Found Oude Kerk

Very special to me was the presentation of a short movie by Tejal Shah with the title ‘There is a spider between us’ and the Skype interview with the artist that followed. Her movie addresses the difficulties she experiences in talking about her sexual orientation with her parents and how she copes with her parents’ sexuality. She described in the interview how difficult it is to talk openly with your parents when you are actually living in two separate worlds, but love each other nevertheless.

Also very enchanting was the presentation of the magician Flip Hallema. I felt like a little child again, which I actually always do when animation movies, circus, and cotton candy are involved. Flip climbed on a chair, let sticks disappear, repaired and removed magic knots, and told us fascinating stories from his long career as a magician. Highly dramatic was the introduction of the last act, the Utrecht-based band Kids with Guns, because the microphone first had to be retrieved by a climber from under the church roof. Seeing somebody actually going up under the roof lets you really sense how small a human is in the universe of a church building. Later we were all dancing to Kids with Guns.

Click here to read the Dutch poem written by Ellen Deckwitz about this evening and to have a look at some more pictures.

Upcoming: DigHum13 Summer School 2013

I just submitted my first ever poster abstract for the DH Summer School that will take place in Leuven (Belgium) from 18 to 20 September 2013! Pretty exciting, I must say. I am really curious how the jury will like my idea about a modular digital edition of the Vierde Partie of the High German Spiegel Historiael. Of course I strongly hope they will accept it so that I will be able to receive feedback and practical advice during the poster presentation from the senior researchers who will be present. But even without a poster, I cannot wait for the end of September. The program is, so to speak, mouthwatering…

The Berlin manuscript in its original binding

Upcoming: 5 July 2013, 21:00, Presentation in de Oude Kerk, Amsterdam

I will give a short presentation in English on my research as the kick-off of Lost & Found XL in the venerable 14th-century Oude Kerk, Amsterdam. Let’s see how the Apocalypse fits with the arts!

Lost & Found XL

An extra large summer night of stray images and sounds in the oldest building of the city with a specialist in medieval literature on apocalyptic prophecy as literary device, Kids with Guns and more.

Date
Friday 5 July 2013, 8.30pm doors open, 9pm start

Price
– reservations
– 10 euro pre sale, 12 euro at the door

See: http://www.lost.nl/

Antichrist with text, Douce Apocalypse, p. 49

The Dragon and the Beast from the Sea

The Douce Apocalypse, Bodleian Library, (Oxford, Great Britain), Ms. Douce 180, p. 48

THATCamp Ghent 2013, 28 May 2013

This year UGent was proud to host the first THATCamp (The Humanities and Technology Camp) ever in Ghent, and I was there at this historic event. As with all first-time events, there were some minor problems. It was especially frustrating that the WiFi at Zebrastraat didn’t always work, which is quite annoying when you are officially not allowed to bring a PowerPoint presentation and want to show websites instead. Luckily, some of us, especially the newbies who had not really been aware of this rule, had brought a PowerPoint, so we could at least look at something. But enough about things not working, because the improvised atmosphere is actually what a THATCamp is all about. It is an unconference: spontaneous, non-hierarchical, and most of all fun!

And fun it was, although sometimes I was quite overwhelmed by all the information brought together by so many specialists and enthusiasts. That mix is also a great advantage: a THATCamp is very accessible, and everybody is welcome to contribute, to ask questions, and to brainstorm together, whether you are a technology geek, librarian, textual scholar, or just very curious.

I had proposed a session about TEI and its possibilities, especially for manuscript description, to share my experiences from the Berlin Training School (see my last post). My session was merged with the session proposed by Kathrien Deroo, who has already worked on some projects using TEI. We were joined by Thomas Crombez and Les Harrison, who also shared their experiences. Les later also had his own session together with Sally Chambers about scholarly editions in the digital age. Taking the two sessions together, the main question occupying the participants seemed to me to be: How does TEI improve editing, i.e. is it worth taking all the time to encode in TEI, and what is the added value? What are the advantages and disadvantages compared to other technologies? On a more theoretical level it became clear that with the digital edition, whether an edition with changing appearances because of an interactive interface or different digital editions on the web, we also need more theoretical reflection. What is the text of a work? Is there a stable text? How does the idea of crowdsourcing go together with aiming at a controllable, citeable scholarly edition? Is the text of a work in fact nothing more than the item seen by the reader? How can we create awareness in the minds of students that “the text” is not the Holy Grail, but that it comes in many variations and that its context has to be considered too? What should be the starting point for a text going digital: the facsimile, as it appears to be closest to the actual document? I got the impression that in the digital age there is still a gap between editorial theory and practice, but this did not surprise me very much, because in the past theory and practice were also sometimes quite different.
Above all, it again became very clear that you cannot just start encoding and let yourself be overwhelmed by the new possibilities. You have to be aware of and reflect on what you are doing and why, and on how it changes approaches to the textual material, or perhaps to the text itself.

For these sessions and the other sessions I refer to the website of the THATCamp, where some notes taken on Titanpad will appear. Just follow the links on the schedule site.