Tag Archives: research data management

Home (sweet) Home: My “OpenMethods” desk

Cite as: Ulrike Wuttke, Home (sweet) Home: My “OpenMethods” desk, Blogpost, 01.04.2020, CC BY 4.0. Link: https://ulrikewuttke.wordpress.com/2020/04/01/my-openmethods-desk/

This is a special blog post initiated by our brilliant OpenMethods Chief Editor Erzsébet Tóth-Czifra (@etothczifra), who asked all OpenMethods editors to write a brief comment about their (work) life during the current COVID-19 (Corona) crisis. So I made myself a cup of tea, put on my favourite singer-songwriter Leonard Cohen and thought about the impact of this event. It has probably become one of the most personal posts on this blog. If you want to read good stuff about online learning and collaborating, try for example searching Twitter for #twittercampus or #virtualcollaboration.

Picture: OpenMethods: Show your desk

While spring is breaking, the world is in lock-down. It is in the fierce grip of the dangerous COVID-19 virus, and so far the only way we can try to limit its impact is to stay away from others as far and as much as possible. For reasons unknown to me this strategy is called social distancing, while what we have to do is physical distancing. Social contacts are still possible, but reduced more or less to calls, video calls, and social media. After only a few days at home under lock-down here in Berlin, I deeply miss my family, friends, and colleagues, my choir, my fellow yogis, and going out just for fun.

While musing about things I miss, I do not forget that I am very privileged. I have a nice home, my living room sports a desk and even a real office chair, I have a phone, a computer and wifi to stay connected, and I can do my work from home. When I go out, only for food, I am grateful for all the people out there who keep things going. They definitely deserve much more than clapping. For the rest, I stay home and try to remind myself to do my yoga in my living room with my favourite YouTube yoga channels. And beside my couch I have “Invisible Women” by Caroline Criado Perez. I really hope to read more of it soon.

Leisure: Yoga and Scrabble

So here I am, in my “Home Office”. My employer, the University of Applied Sciences Potsdam (FHP), sent almost everyone to work from home as soon as possible as a safety measure, and my boss allowed me to stay home even earlier. Since last week, the university has been in “emergency mode” and is officially closed. But, luckily, that doesn’t mean I am out of work.

Even under these unusual circumstances, of course, my research goes on. I work for the FHP in the project RDMO, and as a distributed team we are used to online collaboration, so we are set up for this and nothing really changes. Actually, most of my collaborations are in virtual teams, as were many research projects I have contributed to before. But face-to-face meetings are important from time to time, to keep everyone glued together. Now, many of them have been and will be cancelled or replaced by virtual meetings.

What is harder is that next week the students were supposed to return to the university I teach at, and new ones to arrive. I was very much looking forward to the courses I am to teach this term. At the moment, the start of the term has been postponed until the 20th of April, and then the courses will start online. I am not sure if I will see my students in person this semester. This makes me sad and I wonder how this feels for the students.

I will do my best to teach my students online and have prepared a Moodle course for my master’s course “Research Data Management”, as so many of us already have or are preparing to do, often with a lot of support from our institutions. Where would we be without our eLearning specialists to guide us? Without all the teaching staff who now deep-dive into digital tools they had only vaguely heard of? Without the librarians and IT staff who work hard to give us access? Without the administrative staff who keep everything running?

So, on my (virtual) desk there is at the moment a lot of material about Research Data Management. As I prepare online courses and work in a locked-down city, the importance of unrestricted access is also becoming a more and more pressing issue. I have been advocating Open Science for a while; now this demand is more urgent and obvious for so many reasons. While I cannot solve problems like people not having access to the internet or suitable devices, one of the basic requirements for more equality in digital research and teaching is more Open Access to articles and books and more OER (Open Educational Resources), Open Data, digital skills, and communication tools we can trust (and pay for). I see progress and maybe even a fast-forward of the Open revolution coming, all hands are “on deck” now, but I fear that we will go back to business as usual once this crisis is over.

One of the great Open initiatives I am involved in as Deputy Chief Editor is the Digital Humanities metablog OpenMethods, a DARIAH-ERIC initiative. OpenMethods highlights curated Open Access content about Digital Humanities Methods and Tools in many languages. The curation is done by the OpenMethods Editorial Team, a very diverse group of Digital Humanities experts from around the globe. Mostly we meet online, either in a virtual meeting, or reacting to each other’s comments on the nominated posts. Last year, some of us had the chance to come together at DH Utrecht, where we promoted OpenMethods to the community. Some of us met there for the first time!

So, if you come across some great DH stuff, let us know. You can send a link, tweet to us or join us. Learn more about how to help OpenMethods grow by following this link. Spread the word about useful resources in your social networks, make your stuff Open, write a blogpost about what keeps you going, what makes you think, and new skills and perspectives you gained from this experience. I have some things coming to you too in the near future to spread the Open Revolution, like a short annotated list I dubbed “The medievalist essential guide to Open Science (Communication)”.

I truly sense the spirit that we are in this together. I am deeply grateful for my professional network and fellow tweeps (people on Twitter) who share tips, resources, and sometimes silliness. We even created a virtual space on Discord for German-speaking Digital Humanities: a place to “have a coffee” or an “evening beer”, hang out together and discuss eLearning or professional topics, but also to “gossip in the kitchen”! And I follow a virtual “Thursday night TEI evening class”, an initiative of great folks from the TEI who now accompany the TEI MOOC live. I joined the course especially to finally ask everything about XSLT that I haven’t dared to ask yet!

My kitchen company: I have wild ideas about how to animate these cuties for a new episode of the Akademie der Wissensschafe @AKWissensschafe

Did someone just say kitchen? As an extreme coffee addict and collector of DH mugs, I created with Torsten Roeder (@torstenroeder) a little Twitter project dedicated to DH mugs. It’s called DH in a mug and you can find it under @dh_mug #DHinaMug and also #HomeOfficeDH.

Definition of a DH mug: “a DH mug” is a mug with an imprint related to Digital Humanities, a project, an institution, conference, or other nerdy stuff, but any other drinking devices are fine too, if they have a DH imprint. (Don’t get me started though on defining DH!)

So, check out the virtual collection and tweet your #mugshots to @dh_mug: the mug you use now in your HomeOfficeDH or once you are reunited with your favourite mug at your office. Tell a little story about you and your mug, where is it from? What makes it special to you? Like the little “mug story” below:

That’s it for today. Take care, show care, and stay safe. It’s okay to be scared, it’s okay to be yourself. Be your best, whatever that may be now, and be there for others if they need you.

Maybe this is a chance to grow closer together virtually, but I am really looking forward to the real thing!

What are your thoughts? You are welcome to discuss with me on Twitter (@uwuttke) or leave a comment below.

People sorting papers with the phases of the research data lifecycle

What does it mean to teach humanities research data management in the digital age? #dhiha8

Cite as: Ulrike Wuttke, Was bedeutet es, geisteswissenschaftliches Forschungsdatenmanagement im digitalen Zeitalter zu lehren? #dhiha8, Blogpost, 01.07.2019 (CC-BY 4.0).

On 17 and 18 June 2019, the international conference “Teaching History in the Digital Age – International Perspectives” took place at the German Historical Institute Paris (Deutsches Historisches Institut Paris). It centred on the challenges that the increasing digitisation of the historical sciences poses for teaching. The conference had three parts: 1) two hands-on workshops, 2) talks, and 3) a barcamp. There was also a virtual “blog parade”, and participants tweeted diligently under the hashtag #dhiha8.

As I did not manage to contribute a post to the blog parade beforehand, I decided to use this blog post to share my personal impressions of the research data management workshop I led, “Is your Research Future Proof? Data Management Techniques & Tools for Digital Historians”.

The whole conference centred on the question of what it means to teach history in the digital age. Admittedly, I have never really taught history, apart from a few guest seminars. But as a digital humanist specialising in medieval literary studies, scholarly editing, Open Science, and research data management, I took the DHI conference as an occasion to gather a few thoughts on this topic, with which I would like to open my report.

Who preserves historical sources in the digital age, and how? An approach from an infrastructure perspective

The source of history is the past. Since the invention of writing, a large part of our knowledge about the past has rested on written sources. Sources are thus the research treasure of historical scholarship par excellence. Here literary history and history come very close to each other, even though they put different sources and methods at the centre. An important aspect in the teaching of history therefore seems to me to be the temporality of sources, in the sense of an understanding of how fragile these sources are.

What do I mean by temporality? Which historical sources we can study seems to depend on chance. Many historical sources have been irretrievably lost over the course of history. Yet it is not chance alone that decides what is handed down. In many cases we owe the survival of historical sources to targeted collecting and conservation measures by libraries, archives, and in individual cases private persons. By taking these sources into their holdings, that is, through the deliberate act of collecting and preserving, they have a large share in whether a source can become an object of historical research at all. Only then do historians choose whether and how these sources fit into their research design.

Because of this inherent fragile temporality, the sources of historical research are always fundamentally threatened by destruction, many gaps exist, and targeted measures are needed so that the sources remain available to research in the future. For analogue objects, this task was for a long time fulfilled mainly by archives and libraries. The increasing digitisation of research and teaching is changing this situation and calling long-established divisions of roles and paths of transmission into question. In addition, digital sources are more fragile than analogue sources. Put simply, while analogue sources are best “left alone” (little light, little movement, etc.), digital sources have to be “looked after” constantly and thus require much more care. In other words, digitally encoded information is more volatile and requires more intensive preservation measures (keywords: research data management, data curation, etc.) than analogue sources. The basic prerequisite here is of course that these sources (aka research data) are available at all, a question that makes strong demands on researchers individually (keywords: Open Science and FAIR data) and that then requires joint efforts by all actors involved to reach this goal. For further thoughts on this, I gladly refer to my blog post “Here be dragons: Open Access to Research Data in the Humanities” (2019).


In light of these considerations, an important aspect is raising awareness among (future) historians and teaching them concrete techniques in the field of research data and sustainable research data management. These aspects should play a role both during studies, even though curriculum development is still in its infancy (we talked about this topic in more detail during one of the barcamp sessions), and after graduation. Workshops are a proven means of teaching research data skills outside the curriculum, such as the workshop I led before the start of the conference, on which I would like to report in more detail below.

Workshop “Is your Research Future Proof? Data Management Techniques & Tools for Digital Historians”

Through several short input talks and playful exercises, the participants were sensitised to active research data management, e.g. with a data management planning tool such as RDMO (Research Data Management Organiser). The goal of active research data management is the production and sustainable provision of data according to the FAIR principles. The workshop covered points such as the basic principles of Open Science and the relevance of the FAIR principles for the digital humanities, as well as different data types and formats in the humanities, the research data lifecycle, and the topic areas of a data management plan (Data Collection, Data Documentation, Ethics and Legal Compliance, Storage and Backup, Selection and Preservation, Data Sharing, and Responsibility and Resources). Incidentally, I owe many of the exercises and content ideas to the train-the-trainer concept on research data management by FD-Mentor.

Lively discussion about the order of the research data lifecycle
The result of one of the groups: a research data lifecycle with WOW effect!

The exercise on the data lifecycle made clear that humanities scholars understand something different by preservation than RDM experts do: for the participants, preservation simply meant storage… This observation indicates that we need to become more aware of these “language differences”. They can also affect how humanities scholars understand the questions of a data management planning tool (or, in the worst case, do not understand them…).

The workshop also left time for critical discussion, e.g. about the fact that the effort required for good data management is not yet in any good proportion to the (missing) recognition. Here research institutions and funders are called upon to provide appropriate incentives. Why not reward sustainable research data management and the provision of FAIR research data better, e.g. as part of job advertisements or in the allocation of funding (“Hand in your CV and a data management plan”, “How have you contributed to Open Science and what are your plans in the future?”)? DORA could point the way in this respect.

The participants had a lively discussion about whether data management plans create the illusion that humanities research is a predictable process, a question they answered in the negative. I took the position that in almost all cases in which one wants to obtain funding, a research plan is required, and a dedicated DMP is increasingly part of it. The latter is not set in stone but should be adapted in the course of the project, especially under the banner of “active data management”, i.e. updated as the project progresses. The research data management plan is an excellent opportunity to look at the planned course of the project specifically from the digital side (more on this in an article by Marina Lemaire that is well worth reading).

The question was also raised of when one should actually write a data management plan, especially if one’s research is not particularly “digital”. Two thoughts on this came to me afterwards.

  1. Due to the increasing digitisation of all areas of society, not only of research, we currently face the almost paradoxical situation that we potentially have an unprecedented amount of data at our disposal, which on the one hand can quickly be lost again and on the other hand does not all need to be kept. Sensible selection has always been part of the collection management of libraries and archives. For research, increasing digitisation means a paradigm shift that transfers more responsibility to individual researchers, not least through the accompanying relevance of open principles (Open Science). It is a paradigm shift that should also, I would almost say above all, be reflected in teaching. The boundaries between the basic skills of personal archiving and scholarly research data management are fluid, and in the latter area in particular a great many additional support services exist (or should exist!).
  2. Students and researchers should know the basic concepts and principles of sustainable research data management under the Open Science paradigm. This empowers them to face the new challenges themselves and to pass this knowledge on. Perhaps it is even counterproductive to put the goal of “writing a data management plan (DMP)” too much at the centre of trainings, since this is about methods that should really become second nature? In the end, the DMP is a means to an end: thinking about all the important areas and recording the results in a structured way.

Workshop: Procedure and Feedback

Finally, a few thoughts on how the workshop went, taking into account the participants’ feedback.

It is the first time I am doing this and therefore a little scary, but I hope that other trainers might get something out of it for their own trainings.

My own assessment:

  • The workshop went very well and received good feedback
  • A relatively large amount of material and exercises was planned for 3 hours, which requires tight time management; because we overran the break, time was a bit short at the end
  • Participants learn most when they can do or discuss something themselves
  • There was relatively little time to try out RDMO, although that was not the immediate focus
  • Providing the slides in advance makes participation easier, even from “outside” (thanks for the feedback!)

Feedback from the participants (all researchers):

  • Participants’ overall impression: between 1 and 2 (1 is the best grade)
  • Greatest weakness of the workshop: leave more room for discussion, e.g. about participants’ specific research approaches; ask even more deliberately whether there are questions; more time for writing one’s own DMP; even more concrete examples, e.g. of how humanities scholars produce and publish data, and of relevant projects
  • Most significant strength of the workshop: clear structure; interesting selection of topics; good overview of Open Science; interesting tools, links, materials, etc.; pedagogy; presentation; interactive group work
  • Further comments: thanks for all the interesting tools; less on Open Access; more concrete work on the DMP; more time for discussion; more concrete examples
  • The majority of participants rated their own learning success and the practical applicability of the workshop content as “very good” to “good”, including those who already had a lot of experience

Lessons Learned: 

  • Managing expectations is important (What will the workshop cover? Who is the workshop aimed at?)
  • Accept that you cannot please everyone, but try harder to respond to participants’ interests (e.g. by polling topics beforehand)
  • Less material is more! But it is a difficult balancing act: because groups are often mixed in terms of prior knowledge and discipline, it is hard to assume that certain aspects are already known (e.g. in future, go into Open Science in general less and bring questions of humanities RDM even more to the fore)
  • Plan even more time for discussions and group activities, or a “time buffer”
  • Fundamental question: What are general aspects of humanities research data management, what are discipline-specific aspects, and what is specific Digital Humanities knowledge? To what extent do these areas overlap? (DINI UAG Schulungen und Lehre)
  • Bring cool incentives, e.g. stickers like the ones from @MelImming and @Protohedgehog 😉

Some Research Data Management resources I recommend: 

See also:
Jan-Luca Albrecht; Ina Serif, Tagungsbericht: Teaching History in the Digital Age – International Perspectives #dhiha8, 17.06.2019 – 18.06.2019 Paris, in: H-Soz-Kult, 26.08.2019, <www.hsozkult.de/conferencereport/id/tagungsberichte-8405>.

Fellowship of the Data

“Here be dragons”: Open Access to Research Data in the Humanities

Version of my talk, slightly modified for reading, for the conference “Innovative Library in Digital Era” (ILIDE) 2019, Jasná, Slovakia, 09.04.2019

Cite as: Ulrike Wuttke, “Here be dragons”: Open Access to Research Data in the Humanities, Blogpost, 09.04.2019, CC-BY 4.0. Link: https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/.

In this rather lengthy blog post (which is more like a pre-print of a future article to be), I discuss the paradigm shift to reusable, machine-readable data as one pillar of Open Science or Open Scholarship (the latter being a more inclusive term for the Arts and Humanities), for Humanities and Heritage researchers. I address key challenges and perspectives of Humanities Research Data Management with a focus on educational aspects and available tools, especially the Research Data Management Organizer (RDMO) tool (https://rdmorganiser.github.io/).
This blog post was awarded a travel bursary and a poster presentation at the DARIAH Annual Event 2019 in Warsaw in the Open Humanities Tools and Methods Blog Competition.

Resources:

Introduction

#1 It has been estimated by Stefan Winkler-Nees (from the Deutsche Forschungsgemeinschaft, DFG) in 2011 that 90% of all digital research data is lost.[1] We don’t know how much of this data belonged to the Humanities, and hopefully these numbers are better today, but we can assume that a lot of Humanities data (and other data) is still lost, because of missing infrastructures or because no one has taken care of the long-term availability of this data in time.[2]

Are Your Data FAIR?

But, even if data is not lost: Does the available data spark joy, to borrow a phrase from the ubiquitous Marie Kondo? Are these datasets accessible for research, well documented using standards, available in interoperable formats etc., in short: is it FAIR[3] data, too?

#2 The digital transformation of research has dramatically increased the creation, gathering, and use of research data in all disciplines. It has also led to the development of the new paradigm of Open Science, which promotes not only that published results should be open access, but also that the underlying data should be open (for example in the H2020 Open Research Data Pilot).[4] Together, these developments have resulted in a heightened demand for the publication of research data, promoted by national and international funding agencies[5] and research institutions, and often directly reflected in (funding) guidelines, which can be more or less binding.[6] Among the many reasons that underpin the demand for Open Research Data, efficiency (reuse), reproducibility (transparency), impact, and public trust are prominent.[7] Slowly, the responsible handling and FAIR publication of research data is becoming an integral part of good scientific practice.[8]

The vision goes beyond that: we are talking here about the grand vision of the European Open Science Cloud (EOSC)[9] and an even bigger vision for a global Open Science Ecosystem. To realise this vision, and for humanities data to play a considerable role in it, our “efforts should be guided by the twin aims of ensuring that data meets the FAIR principles, and that it is effectively preserved in trusted, certified repositories”[10]. Since their publication in 2016[11], the FAIR principles have gained great acceptance as guiding principles, not only because they take into account that not all data can be open[12], leading, for example, the European Research Council to formulate the principle “as open as possible, as closed as necessary”[13], but also because they put great emphasis on “the attributes data need to have to enable and enhance reuse, by humans and machines”, in a nutshell: metadata.[14]

Vision for Humanities Data

In the following, I will first outline key challenges related to Open Access to Research Data in the humanities and then discuss some perspectives to improve the current situation. I will focus on educational aspects and on tools for research data management as keystones for Open & FAIR Research Data. As the German situation is best known to me, examples are mainly drawn from a German context.

“Here be dragons”

#4 The phrase “Hic sunt dracones” (transl. “Here be Dragons”) is used on some old maps of the world to describe an area that was unknown to the cartographer. I found it quite appropriate to summarize the ambivalence of humanists towards data and all these “fancy” concepts discussed by “infrastructure people” like FAIR Data, the EOSC, or Research Data Management. First of all, it must be said that humanities researchers tend to be ambivalent about the concept of ‘data’[15] and that “[t]here are issues surrounding […] the acceptance of the ‘research data concept’”[16]. In short, they just don’t use the word “data”, but talk about “sources”, “research materials” etc., which means that the whole “data talk” doesn’t appeal to them. Additionally, an expeditionary survey conducted by PARTHENOS in 2017 among researchers in the domains of digital humanities, language studies, and cultural heritage showed that the FAIR Principles and the EOSC, concepts and recommendations thriving among “infrastructure folks”, are relatively little known in the research communities themselves.[17] Often, the publication of research data only comes as an afterthought (if at all).[18] However, at the end of a project, it is often too late to publish the data in a meaningful way because of the lack of documentation and the lack of resources to prepare the data properly for publishing.

#5 Let me discuss some reasons that currently seem to form barriers to establishing a culture of open sharing in the humanities. In general, “issues surrounding incentivisation”[19] can be observed. Given the strong competition and the traditional humanities reputation system based on traditional ‘long story scientific publication formats’ (monographs, book chapters, or articles as significant scientific publications) as opposed to ‘data’, there is low motivation to publish research data.[20] This lack of incentives goes hand in hand with researchers’ (perceived) fear of being scooped, i.e. that someone else will be the first to publish something based on their data, e.g. an exciting manuscript or object they have found.[21] A research tradition that has been based on (and has rewarded) secretiveness is not easy to change when nothing fundamental is offered as a reward in return. Other prejudices often brought up are: nobody will understand the data, nobody will need the data, someone will sell the data and, last but not least, a (perceived) lack of technical skills.[22]

“The inherent controversy in the meaning of “data” and the importance of personal interpretation on data for humanities researchers is not conducive to sharing.”[23]

#6 Even those who are willing to publish data face obstacles (and it must be duly acknowledged that there is a long and thriving tradition of collecting and publishing humanities corpora, continued in the digital realm, e.g. at the academies). Legal issues, or doubts about possible legal issues, are mentioned especially often in this context: the legal regulations concerning research data are complicated and not internationally aligned, and there are many actors involved in the production of humanities research data, not only humanists. As a result, humanities research is often based on data under copyright restriction (from cultural heritage institutions or other actors), which makes it difficult to publish them as Open Research Data (ORD).[24]

#7 Additionally, we are dealing with issues around the availability and sustainability of specialist support structures for humanities research data as well as the lack of practical guidance.[25] Humanities data centers and other data services for the humanities are often dependent on third-party (project-based) funding.[26] This leads to issues of trust on the side of the researchers, which may result in these services not being in high demand[27] (and may even result in an unwillingness to “Go Digital” at all). It also leads to the problem of sustaining “living systems”[28] (which need to be frequently updated, migrated, and curated). However, current efforts, especially from the Digital Humanities community, have also led to positive developments:

“For example, the emergence of linked open data over the past decade has been supported both by the establishment of effective standards for modeling and disseminating such data, and the growth of practices and social expectations supporting its creation. These developments have meant that expertly modeled data from specific domains can be accessed and combined flexibly, rather than remaining in isolation or striving for self-sufficiency.”[29]

Fellowship of the Data

#8 To sum up my observations so far: Humanities research data, in general, is rather heterogeneous, idiosyncratic, and complex[30], and humanists are ambivalent about the term “data”. Digital practices are already part of the research activities of many humanists, especially in the Digital Humanities, but they are not equally well developed. As a result, the potential of digital research data and methods is not fully exploited, because the digital research process is not carefully planned. In other words, many research data already exist in digital form, but they are not findable, quality controlled, and reusable.[31] All in all, the land of FAIR Research Data is still unknown territory for many humanists, or at least as scary as if dragons did indeed live there. In the next part, therefore, I will argue for increased efforts for awareness raising and skills building and for a “fellowship of the data”, a support system to facilitate the quest for FAIR data in the humanities.

Perspectives

#9 Naturally, I cannot offer immediate solutions, but I would like to point out some paths that need to be pursued with increased intensity in the near future to facilitate FAIR data in the humanities (and beyond). These paths can focus on different stakeholders in the FAIR ecosystem such as research institutions, funding bodies, or publishers, or individual researchers and research communities. In my opinion, special attention has to be paid to the researchers and research communities themselves, so that recommendations, policies, services, etc. are aligned with disciplinary practices and cultures and known within them.[32] If the researchers are not on our side in this quest, we are likely to lose the battle or at least experience a delay in realising the goals.

#10 The most urgent points, in my opinion, are the following. We need to work on incentives for the FAIR publication of research data, e.g. wider adoption of DORA (Declaration on Research Assessment).[33] We also need to invest in the development of beneficial environments for aggregation (think EOSC or the German NFDI):

“The interdisciplinary bundling of humanities data repositories and the development of adequate research tools and services for linked data represents a great opportunity for humanities research.”[34]

Another highly important task is educating the next (and this) generation of (digital) humanities researchers[35], but also “infrastructure people” in discipline-specific contexts, to deal with the datafication[36] of research and education practices. Humanists need to be aware of the limits of publication tools and of how these tools expose (or fail to expose) the underlying data model. We need to strive to offer more idiosyncratic options and take time for critical consideration of how early choices affect how data is published and can be used (curatorial perspective). It is our task to prevent research data from becoming trapped (in specific formats or hardware). In an ideal world we “design our data without tool dependencies”, that is, we follow a “tool-agnostic approach” rather than a “tool-dependent approach”.[37] The progressing digital transformation of humanities research, along with the increasing importance of digital research infrastructures, calls not only for a certain level of “data literacy”[38] but even for an expansion of this concept to a certain level of “data infrastructure literacy”, a term recently coined by Gray et al. (2018).[39]

Much discussed in the context I have just outlined is Research Data Management:

“Research Data Management describes the process to curate (or manage) research data along the research data lifecycle and includes various activities such as planning, producing, selection, analysis, archiving, and preparation for reuse. Because data are very heterogeneous, discipline and data specific solutions can be required.”[40]

Given this admittedly not very appealing definition, it may not come as a surprise that researchers in general still consider research data management[41] an extra, tedious, time-consuming task that diverts them from “real research”, and that humanists in particular consider it almost opposed to hermeneutic humanistic research practice.[42] However, acceptance by the researchers is the key success factor for establishing standards in Research Data Management.[43] Therefore, we need to show humanists the added value of Research Data Management for the planning process of digital projects. Lemaire (2018) has recently argued convincingly that RDM is a process already inherent in the research process itself (although at the moment rather implicit) and that it can be an instrument for reviewing the research concept.[44]

#12 The digital turn of humanities research and the requirement for FAIR research data, that is, sustainable and quality-controlled handling of research data, require thorough planning of the digital research process before it starts, a process that is guided and documented in a Research Data Management Plan.[45] Given all the advantages of Research Data Management planning, efforts should be increased regarding awareness raising and skills building,[46] already as part of university curricula,[47] and regarding tools for consistent data management planning (and handling)[48] with the end in mind, that is (if possible) the publication of FAIR research data.

RDMO[49]

#13 Given that the management of research data is increasingly regarded as a process of active support and care during the whole research process (and not only as the production of a mere static document), tools are needed that support active Research Data Management Planning[50], e.g. by providing up-to-date status information tailored to the different participants in the research process. I am currently part of a project that is developing such a tool, the Research Data Management Organiser (RDMO).[51]

#14 In a nutshell, with RDMO the research data management process can be organised as a collaborative effort encompassing its different stakeholders: besides researchers, these are especially infrastructure partners such as libraries or computing centres. One of the usage scenarios of RDMO is library staff using RDMO’s question catalogue to work out the data management strategy for projects together with researchers and other partners/experts.[52] RDMO can be adapted to the requirements of communities or organisations (e.g. institutional or discipline-specific guidelines) and has multilingual capabilities. At the moment we are working with a range of very different institutions and communities in Germany to further improve this tool.

Conclusion

#15 On the one hand, researchers need to be aware of the issues at hand and accept their responsibility (take the issues of digital research practices seriously!).[53] On the other hand, we need to work on the institutional availability and sustainability of research data (management) and support (e.g. via data centres and local experts)[54] and on clever and efficient connections between initiatives on different levels. We need to provide adequate RDM tools that help researchers prepare their data for publication from the very beginning. Libraries already have an important role in this ecosystem, and I dare say that they are capable of, and well placed for, taking a leading role in this field in the future if they invest in it: in heads and infrastructure(!).[55] In particular, I would like to point to their role in the creation of vast digital corpora of open research data, a task that cannot be left to researchers alone.[56] Last but not least, we also need to make research data management less scary: it’s not a scientific revolution and doesn’t mean that all skills learned so far in a typical humanities curriculum are to be thrown overboard. Quite the opposite is true.[57]

#16 There is a lot at stake for the humanities, maybe the very question of what we want the future of the humanities to be. When it comes to Open and FAIR research data in the humanities, I can only say it with Queen: “I want it all, and I want it now!”


“I want it all, I want it all, I want it all, and I want it now.” Queen (1989).


To create this broad culture of FAIR data sharing in the humanities we have to roll up our sleeves, team up, and distribute hats:

  1. embrace Open principles, and
  2. bridge the gap between the digital and the humanities and look at what we can learn from the Digital Humanities and other more data-savvy disciplines.[58]

What are your thoughts and suggestions on this topic? Do you agree? Do you have additional pointers to more discipline-specific information and insight about the handling of research data, data sharing, and the FAIR principles in the humanities? Please leave your thoughts below or contact me via Twitter or e-mail. I am looking forward to discussing this topic further with you!

Notes

[1] See Stefan Winkler-Nees, Vorwort, In: Büttner, Stephan; Hobohm, Hans-Christoph; Müller, Lars (ed.): Handbuch Forschungsdatenmanagement, Bad Honnef 2011, p. 5.

[2] See Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum, Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten, Version 1.0, DHd AG Datenzentren, 2018, p. 9. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[3] See Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. Accessed from https://doi.org/10.1038/sdata.2016.18, 25.03.2019. FORCE11 website: https://www.force11.org/group/fairgroup/fairprinciples.

[4] See European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Council under Horizon 2020, Version 1.1., 21. April 2017. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. See also from the UK document Concordat on Open Research Data, 2016. Accessed from https://www.ukri.org/files/legacy/documents/concordatonopenresearchdata-pdf/, 25.03.2019 or the recommendation from the Steuerungsgremium der Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen (2017), Den digitalen Wandel in der Wissenschaft gestalten: Die Schwerpunktinitiative „Digitale Information“ der Allianz der deutschen Wissenschaftsorganisationen, Leitbild 2018 – 2022, 2017, Accessed from: http://doi.org/10.2312/allianzoa.015.

[5] See for example EC Directorate-General for Research & Innovation, H2020 Programme: Guidelines on FAIR Data Management in Horizon 2020, Version 3.0, 26. July 2016, Accessed from: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf, 25.03.2019; Deutsche Forschungsgemeinschaft (DFG), Leitlinien zum Umgang mit Forschungsdaten, 30.09.2015. Accessed from: http://www.dfg.de/download/pdf/foerderung/antragstellung/forschungsdaten/richtlinien_forschungsdaten.pdf, 25.03.2019.

[6] While German funders so far only give recommendations, other funders have already issued more binding guidelines that include the demand for a DMP, such as the Schweizerische Nationalfonds (SNF). See Schweizerische Nationalfonds, Open Research Data, (n.d.). Accessed from: http://www.snf.ch/en/theSNSF/research-policies/open_research_data/Pages/default.aspx#FAIR%20Data%20Principles%20for%20Research%20Data%20Management, 25.03.2019.

[7] “Open research data (ORD) have the potential not only to deliver greater efficiencies in research, but to improve its rigour and reproducibility, to enhance its impact, and to increase public trust in its results.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 3. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[8] See recommendation 7 “Sicherung und Aufbewahrung von Primärdaten” of the DFG-Denkschrift zur Sicherung der guten wissenschaftlichen Praxis, in: Deutsche Forschungsgemeinschaft (DFG), Sicherung guter wissenschaftlicher Praxis: Empfehlungen der Kommission „Selbstkontrolle in der Wissenschaft“, Weinheim 2013, p. 21-22. Accessed from: http://doi.org/10.1002/9783527679188.oth1, 25.03.2019.

[9] https://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud.

[10] Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019.

[11] The term FAIR was launched in 2014. See https://www.force11.org/group/fairgroup/fairprinciples.

[12] Principles for Open Data in Science have been formulated in the Panton Principles, demanding that data should be placed in the public domain. See Peter Murray-Rust, Cameron Neylon, Rufus Pollock, John Wilbanks, Panton Principles: Principles for open data in science, 19.02.2010. Accessed from https://pantonprinciples.org/, 31.03.2019. This demand has been criticized as difficult to realize for two main reasons: 1) the legal system of some countries, including Germany, does not really allow complete renunciation of rights by the right holder (i.e. public domain); 2) it removes all obligations to cite, which removes an important incentive. See Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 126, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019.

[13] European Research Council (ERC), Guidelines on Implementation of Open Access to Scientific Publications and Research Data in projects supported by the European Council under Horizon 2020, Version 1.1., 21. April 2017, p. 6. Accessed from http://ec.europa.eu/research/participants/data/ref/h2020/other/hi/oa-pilot/h2020-hi-erc-oa-guide_en.pdf, 25.03.2019. Not all data needs to be published. It is necessary to select which data is relevant and interesting for scientific reuse, as archives have always done. For guidelines see for example Angus Whyte & Andrew Wilson, How to appraise and select research data for curation, Digital Curation Centre How-to Guides, Edinburgh 2010. Accessed from: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data, 25.03.2019.

[14] European Commission Expert Group on FAIR Data, Turning FAIR into reality: Final Report and Action Plan from the European Commission Expert Group on FAIR Data, Brussels 2018, p. 18, accessed from: https://ec.europa.eu/info/publications/turning-fair-reality_en, 25.03.2019. The FAIR principles provide guidance on “how to facilitate knowledge discovery by assisting humans and machines in their discovery of, access to, integration and analysis of, task-appropriate scientific data and their associated algorithms and workflows” Quote from: https://www.force11.org/group/fairgroup/fairprinciples, accessed: 25.03.2019.

[15] See for example Christof Schöch, Big? Smart? Clean? Messy? Data in the Humanities, in: Journal of Digital Humanities, 2 (2013): 3. Accessed from: http://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/, 25.03.2019; Miriam Posner, Humanities Data: A Necessary Contradiction | Miriam Posner’s Blog, 25.06.2015. Accessed from: http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/, 25.03.2019. Insightful discussions of the complexity of talking about data in the Humanities are also offered for example by Jennifer Edmond, Georgina Nugent-Folan, D2.1 Redefining what data is and the terms we use to speak of it, KPLEX (Knowledge Complexity) Deliverable D2.1, 2018. Accessed from: https://kplexproject.files.wordpress.com/2018/07/d2-1-redefining-what-data-is-and-the-terms-we-use-to-speak-of-it.pdf, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften, in: o-bib 5 (2018): 2, p. 142–162. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[16] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[17] See PARTHENOS, The FAIR principles and the EOSC concept in the research community of Digital Humanities, Language Studies and Cultural Heritage: An expeditionary survey, 2017, esp. p. 7-8, Accessed from: http://www.parthenos-project.eu/Download/PARTHENOS_FAIR_EOSC_survey.pdf, 25.03.2019.

[18] See Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 3. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019.

[19] See for example Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[20] The humanities are no exception here. See for example Benedikt Fecher, Cornelius Puschmann, Über die Grenzen der Offenheit in der Wissenschaft: Anspruch und Wirklichkeit bei der Bereitstellung und Nachnutzung von Forschungsdaten, in: Information – Wissenschaft & Praxis 66 (2015): 2-3, p. 146-150, p. 147. Accessed from: https://doi.org/10.1515/iwp-2015-0026, 25.03.2019; Fabian Cremer, Lisa Klaffki, Timo Steyer, Der Chimäre auf der Spur: Forschungsdaten in den Geisteswissenschaften, in: o-bib 5 (2018): 2, p. 142–162, p. 148. Accessed from: https://doi.org/10.5282/o-bib/2018H2S142-162, 25.03.2019.

[21] See for example Christine L. Borgman, Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge, Mass; London, 2015, p. 177-179, where she characterises humanities data as often being considered “club goods” (p. 177), meaning access is only granted to very specific individuals, such as local researchers. Using the example of the Dead Sea Scrolls, she describes this practice of local control (“hoarding”, p. 178), which stems from the fact that “Once scholars obtain access to materials, they may wish to mine them in private until they are ready to publish.” (p. 178).

[22] For general observations about the sharing (or not sharing) of research data see for example Carol Tenopir et al., Data Sharing by Scientists: Practices and Perceptions, in: PLOS ONE 6 (6), 29.06.2011, p. e21101, https://doi.org/10.1371/journal.pone.0021101; Veerle van den Eynden & Libby Bishop, Incentives and motivations for sharing research data, a researcher’s perspective. A Knowledge Exchange Report, 2014, accessed from: http://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf, 25.03.2019; Benedikt Fecher et al., What Drives Academic Data Sharing? PLOS ONE, 10(2015):2, p. e0118053. Accessed from: https://doi.org/10.1371/journal.pone.0118053, 25.03.2019; Ulrich Herb, Open Science in der Soziologie. Eine interdisziplinäre Bestandsaufnahme zur offenen Wissenschaft und eine Untersuchung ihrer Verbreitung in der Soziologie, Glückstadt 2015, p. 134-143, accessed from: https://doi.org/10.5281/zenodo.31234, 25.03.2019; Ben Kaden, Warum Forschungsdaten nicht publiziert werden, in: LIBREAS.Dokumente, LIBREAS.Projektberichte, 13.03.2018, accessed from https://libreas.wordpress.com/2018/03/13/forschungsdatenpublikationen/, 25.03.2019.

[23] Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[24] Concerning prevalent legal issues see for example Madeleine de Cock Buning et al., The legal status of research data in the Knowledge Exchange partner countries, 2011. Accessed from: http://repository.jisc.ac.uk/6280/, 25.03.2019; Bastian Drees, Text und Data Mining: Herausforderungen und Möglichkeiten für Bibliotheken. Perspektive Bibliothek, 5 (2016): 1, p. 49–73, esp. p. 59-61. Accessed from: http://dx.doi.org/10.11588/pb.2016.1.33691; Elke Brehm & Janna Neumann, Anforderungen an Open-Access-Publikationen von Forschungsdaten: Empfehlungen für einen offenen Umgang mit Forschungsdaten, in: o-bib 2018: 3, p. 1-16, p. 8-11. Accessed from: https://doi.org/10.5282/o-bib/2018H3S1-16, 25.03.2019; Anne Lauber-Rönsberg, Philipp Krahn, Paul Baumann, Gutachten zu den rechtlichen Rahmenbedingungen des Forschungsdatenmanagements im Rahmen des DataJus-Projekts (Kurzfassung), 2018. Accessed from: https://tu-dresden.de/gsw/jura/igewem/jfbimd13/ressourcen/dateien/publikationen/DataJus_Kurzfassung_Gutachten_12-07-18.pdf?lang=de&set_language=de, 25.03.2019.

[25] “The time and effort required to make research data open and accessible in accordance with the FAIR principles (Findable, Accessible, Interoperable, Re-usable) can be considerable; and those researchers who are keen to adopt ORD practices may find themselves stymied by a lack of practical guidance and specialist support.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 25.03.2019. This report acknowledges that some materials have already been developed (p. 24), which are, from this author’s (UW) perspective, often too general, and calls for increased efforts in training and education (p. 24).

[26] See for example Rat für Informationsinfrastrukturen (RfII), Leistung aus Vielfalt: Empfehlungen zu Strukturen, Prozessen und Finanzierung des Forschungsdatenmanagements in Deutschland, 2016, p. 37-39, Accessed from http://www.rfii.de/?wpdmdl=1998, 25.03.2019, DHd AG Datenzentren, Geisteswissenschaftliche Datenzentren im deutschsprachigen Raum:  Grundsatzpapier zur Sicherung der langfristigen Verfügbarkeit von Forschungsdaten (Version 1.0). DHd AG Datenzentren, 2018, p. 24. Accessed from http://doi.org/10.5281/zenodo.1134760, 25.03.2019.

[27] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 7. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[28] “Living research applications play an increasingly important role in the humanities for securing and presenting results digitally. In contrast to book publication, however, the permanent preservation, maintenance, and provision of these living systems is a technical and organisational challenge. While it is comparatively easy to preserve pure research data in data archives for posterity, living systems are part of a digital ecosystem and must regularly adapt to it, e.g. in the form of updates.” (translated from the German) Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, p. 111. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. See also the website of the Project SustainLife at Cologne University: https://dch.phil-fak.uni-koeln.de/sustainlife.html, accessed 25.03.2019.

[29] Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20.

[30] See Open Research Data Task Force, Case Studies: Annex to the final report of the Open Research Data Task Force, 2018, p. 6. Accessed from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775379/Case-studies-ORDTF-July-2018.pdf, 25.03.2019.

[31] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 237-238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[32] See Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 23. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[33] The DORA declaration recommends giving credit for more than just articles, for example also for data sets and software, see: https://sfdora.org/read/, accessed 25.03.2019.

[34] Original: “In der interdisziplinären Bündelung geisteswissenschaftlicher Datenrepositorien und der Entwicklung adäquater Forschungswerkzeuge und -dienste für verknüpfte Daten liegt eine große Chance für die geisteswissenschaftliche Forschung.“ Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12. 2018, Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[35] In recent years, increasing attention has been paid to Digital Humanities pedagogy and the development of specific Digital Humanities curricula. For Digital Humanities pedagogy see for example Brett D. Hirsch (ed.), Digital Humanities Pedagogy: Practices, Principles and Politics, 2012. Accessed from: http://www.openbookpublishers.com/reader/161, 25.03.2019, or Matthew K. Gold, Debates in the Digital Humanities. Minneapolis, 2012, Section V. Accessed from: http://dhdebates.gc.cuny.edu/debates/1. For curricula see for example Patrick Sahle, DH studieren! Auf dem Weg zu einem Kern- und Referenzcurriculum der Digital Humanities. Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2013. Accessed from http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2013-1.pdf, or IANUS, Statement zu minimalen IT-Kenntnissen für Studierende der Altertumswissenschaften, 2017. Accessed from https://www.ianus-fdz.de/projects/ausbildung_qualifizierung/wiki/Empfehlungen_zu_minimalen_IT-Kenntnissen, 25.03.2019. The need not only to be able to use tools and modeling systems, but also to be able “to intervene in this ecology by designing more expressive modeling systems, more effective tools, and a compelling pedagogy through which colleagues and new scholars can gain an expert purchase on these questions as well” has been underlined recently by Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 20. See also ibid. p. 23. Data literacy, including data modelling literacy, is indispensable to exercise control “over our data during its entire life cycle”, ibid. p. 12.

[36] The term datafication seems to have been coined in the publication by Kenneth Neil Cukier & Viktor Mayer-Schoenberger, The Rise of Big Data: How It’s Changing the Way We Think About the World. Foreign Affairs (2013). Accessed from https://www.foreignaffairs.com/articles/2013-04-03/rise-big-data, 25.03.2019.

[37] For this aspect see Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, esp. p. 13-15, quotes p. 14 and p. 15.

[38] “The increasing digitality in the humanities makes it indispensable to build up data literacy, that is, a basic data competence of learners, teachers, and researchers.” (translated from the German) Torsten Schrade, Im Datenozean, in: F.A.Z., 2.12. 2018, Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[39] See Jonathan Gray, Carolin Gerlitz & Liliana Bounegru, Data infrastructure literacy. Big Data & Society (2018), p. 1–13. Accessed from: https://doi.org/10.1177/2053951718786316, 25.03.2019.

[40] Translated from: AG Forschungsdaten der Schwerpunktinitiative “Digitale Information” der Allianz der deutschen Wissenschaftsorganisationen, Forschungsdatenmanagement: Eine Handreichung, 2018, p. 4. Accessed from: http://doi.org/10.2312/allianzoa.029, 25.03.2019

[41] Sometimes the term data curation seems to be used (wrongly) as a synonym.

[42] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 238. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[43] See Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 106. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019.

[44] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 244-245. Accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019. See also: “The point here is not that these costs are prohibitive or unjustified, but rather that good strategic planning involves balancing the costs and benefits, and focusing the effort in areas that offer a clear advantage.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, p. 8.

[45] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245 (accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019), who explains that the biggest difference in the digitized research process in the humanities is that researchers need to plan the research process in more detail at an earlier stage and describe their methods more explicitly in order to arrive at machine-readable data (processes).

[46] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246, accessed from: https://doi.org/10.5282/o-bib/2018H4S237-247, 25.03.2019.

[47] See for example Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, here p. 113. Accessed from https://doi.org/10.5282/o-bib/2018h3s104-117, 25.03.2019. The authors report that Research Data Management is already established in the curriculum of the humanities faculty of Cologne University.

[48] “There is a need for new guidance and exemplars to ensure that data meets appropriate quality standards; for tools to standardise and automate data management, documentation and curation processes; and for an increased focus on improving research software, and on recruiting and retaining software engineers.” Open Research Data Task Force, Realising the potential: Final Report of the Open Research Data Task Force. N.P., 2018, p. 8. Accessed from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/775006/Realising-the-potential-ORDTF-July-2018.pdf, 22.03.2019.

[49] About RDMO see esp. the detailed article Heike Neuroth et al., Aktives Forschungsdatenmanagement. ABI Technik, 38(2018):1, p. 55–64. Accessed from https://doi.org/10.1515/abitech-2018-0008, 25.03.2019. See also the RDMO project website: https://rdmorganiser.github.io/, accessed 25.03.2019.

[50] There are several tools available to help create a Research Data Management Plan (DMP), but none with a focus on active data management.

[51] My home organisation, the University of Applied Sciences Potsdam (FHP), is currently developing such a tool, the Research Data Management Organiser (RDMO), together with the project partners AIP (Leibniz-Institut für Astrophysik Potsdam) and KIT (Karlsruhe Institute of Technology). The project is funded by the Deutsche Forschungsgemeinschaft (DFG).

[52] Research Data Management should not be a sole task for researchers, but they definitely have to be on board.

[53] “As data creators, academics have a different, more knowing relationship to their data: they create data that is going to be a persistent part of the research environment, and they act as both its creators, managers, and consumers. The stakes of the modeling decisions for research data are thus much higher, and to the extent that these decisions are mediated through tools, there is  significant value—even a burden of responsibility—in understanding that mediation. And within the academy, the stakes for digital humanists are highest of all, since their research concerns not only the knowing and critical use of data models, media, and tools, but also their critical creation.” Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 11-12.

[54] For the recommendation of local contact persons, see for example Andreas Witt et al., Forschungsdatenmanagement in den Geisteswissenschaften an der Universität zu Köln, in: o-bib 5 (2018): 3, p. 104–117, esp. p. 106, 115.

[55] See Paul Ayris et al., LIBER Open Science Roadmap, 2018, p. 18-19. Accessed from: http://doi.org/10.5281/zenodo.1303002, 25.03.2019.

[56] This point relates especially to the creation of corpora that are digitized and made accessible in meaningful ways for research purposes, e.g. the HathiTrust Digital Library (https://www.hathitrust.org/) or the Deutsches Textarchiv (DTA) (http://www.deutschestextarchiv.de/). See for example the recommendations in Lisa Klaffki, Stefan Schmunk, Thomas Stäcker, Stand der Kulturgutdigitalisierung in Deutschland: Eine Analyse und Handlungsvorschläge des DARIAH-DE Stakeholdergremiums „Wissenschaftliche Sammlungen“, Göttingen: GOEDOC, Dokumenten- und Publikationsserver der Georg-August-Universität, 2018 (DARIAH-DE Working Papers, 26). Accessed from: http://webdoc.sub.gwdg.de/pub/mon/dariah-de/dwp-2018-26.pdf, 04.04.2019.

[57] See Marina Lemaire, Vereinbarkeit von Forschungsprozess und Datenmanagement in den Geisteswissenschaften: Forschungsdatenmanagement nüchtern betrachtet. o-bib 5 (2018):4, p. 237–247, here p. 245-246; Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 3; Torsten Schrade, Im Datenozean, in: F.A.Z., 02.12.2018. Accessed from: https://www.faz.net/-in2-9h3jj, 25.03.2019.

[58] See for example Julia Flanders & Fotis Jannidis, Data Modeling in a digital humanities context: An introduction. In: Julia Flanders & Fotis Jannidis (ed), The shape of data in the digital humanities: Modeling texts and text-based resources, London, New York, 2019, p. 3–25, here p. 5.

The text of this blog post is published under the license CC-BY 4.0.

Cite as: Ulrike Wuttke, “Here be dragons”: Open Access to Research Data in the Humanities, Blogpost, 09.04.2019, CC-BY 4.0. Link: https://ulrikewuttke.wordpress.com/2019/04/09/open-data-humanities/.