Abstract
The digital is still rather unusual in a literary archive. One reason for this may be that an author’s estate usually only reaches the archive after his death or at least when he is at an advanced age. In other words, the archive reflects our present with a time lag. But born-digital estates require fast action. Digital data is fragile. In this insight, we look at the challenges of born-digital literary estates.
The analog and born-digital archive
This insight briefly highlights the very basic aspects of the shift taking place regarding born-digital data in a literary archive, especially in the German Literature Archive Marbach (DLA), one of the largest literary archives in Europe. [Note 1: This insight is based on a conference talk held by the authors, ‘Studying Born-Digital’. Note 2: The DLA is also dedicated to the technically challenging medium of computer games.] When studying written artefacts in an archive, one deals increasingly with so-called “born-digital” data. [Note 3: This was part of the project ‘Archivierung, Erschließung und Erforschung von Born-digitals’ (2021–2023) of the research network Marbach, Weimar, Wolfenbüttel, https://www.mww-forschung.de/born-digitals (last access 08.05.2022).] Authors produce, edit, and save their texts, manuscripts, ideas etc. on computers and hand over these computers and data carriers, rather than boxes full of material of interest for literary studies, to the archive. At the DLA, so-called digital curators, who could also be called data librarians or data archaeologists, have become increasingly important since the mid-2000s, in the wake of these observed changes to the handling of written artefacts. In the case of digital archival objects, the ability to read and interpret the material, that is, the data, must first be established at great effort and expense. One cannot just open a box to see what material, that is, what data, has been acquired.
A manual inspection of the files is possible for a smaller digital estate, but rather impractical for some estates due to the size and nature of the files. In such cases, analysing the data before acquisition may be the preferred method; alternatively, the archive acquires the estate unseen and discovers its contents only afterwards. The DLA has traditionally archived a comprehensive array of media, with a special focus on printed matter and paper-based materials. Consequently, the institution’s staff have cultivated specialised expertise in managing and preserving these materials over decades.
Questions of impermanence
The focus here lies not expressly on digitised materials or the process of digitising material for preservation, but rather on born-digital data, including literature as data. While the archive’s engagement with digital media is still emerging, it is evident that the methodologies and practices inherent to analog materials cannot be seamlessly transposed onto their digital counterparts. This dichotomy between analog and digital archiving and data underscores several salient points. Firstly, the durability of printed materials affords archivists ample time for meticulous curation and preservation efforts, in contrast to the more fragile nature of digital artefacts, which necessitate prompt attention and archival protocols to mitigate the risks of decay and obsolescence. Indeed, the advent of digital archiving has introduced a sense of impermanence, particularly exemplified by the ephemeral nature of internet-based data, where websites can swiftly deteriorate and vanish from the virtual landscape. Moreover, the rise of digital media has caused an acceleration in archival practices, prompting a paradigm shift wherein archives no longer serve solely as repositories of historical knowledge but also function as caches for contemporary information. This transformation reflects broader societal trends towards the digitisation of information, underscoring the evolving role and relevance of archives in the digital age.
In essence, four key domains emerge when navigating a born-digital estate: authenticity, readability, materiality, and personal rights. In our inquiry, we reflected on how working with digital materials within an archive differs from working with analog ones and, more critically, on what this means for research.
Questions of authenticity
In the realm of digital literary estates, preserving the authenticity and integrity of the author’s work in its original form is paramount. This necessitates a meticulous approach to handling digital files to mitigate the risk of accidental alterations or corruption. When managing digital literary archives, it is imperative to maintain the files exactly as they were left by the author, without any traces of intermediaries or readers accessing the files. Each instance of opening a file has the potential to introduce modifications, thereby jeopardising the fidelity of the original content. This phenomenon is particularly pronounced in formats prone to automatic updates or metadata changes upon access. The challenge lies in developing strategies to safeguard digital files from unintended modifications while still allowing authorised individuals to access and study them. One approach involves employing read-only or non-destructive file access methods, ensuring that the content remains unaltered during examination. Additionally, implementing robust metadata management practices can help track any alterations or annotations made to the files, thereby maintaining a clear record of the archival process.
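One way to make non-destructive access concrete is a fixity check: a checksum is recorded for every file before anyone works with the estate and compared again afterwards. The following Python sketch is an illustration under assumptions, not a description of the DLA's workflow; the function names `sha256_of`, `build_manifest`, and `changed_files` are hypothetical. Files are opened in binary read-only mode, so the check itself does not modify their content, although some file systems may still update access timestamps.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in read-only binary mode."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, POSIX style) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List paths whose checksum differs, or that were added or removed."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A manifest built at acquisition can be compared with one built after a study session; an empty result from `changed_files` indicates that no file content was altered. Such a check would typically be run on a write-protected copy rather than on the original data carrier.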
Preserving the authenticity and integrity of digital literary estates requires a vigilant approach to file management and access control at every stage of processing. By prioritising strategies that minimise the risk of unintended modifications and corruption, custodians can uphold the integrity of the author’s work for future generations of scholars and readers. In this regard, archivists and researchers alike confront multifaceted questions about how to maintain provenance trails that allow modification histories to be reconstructed retrospectively and authorial contributions within digital files to be attributed. The intersection of authenticity concerns with metadata protocols further complicates matters, as metadata schemas must evolve to accommodate the intricacies of digital provenance tracking. Additionally, the formulation of backup and long-term preservation strategies assumes critical importance in safeguarding data integrity. In sum, the digital landscape precipitates a reevaluation of traditional notions of authenticity, prompting a concerted effort to devise innovative methodologies and best practices for preserving the reliability of digital archives.
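A provenance trail of the kind described here can be pictured as an append-only list of events, each stating what was done to which file, by whom, and what the file's checksum was afterwards. The sketch below is a minimal illustration; the class `ProvenanceEvent`, its field names, and the helper functions are assumptions made for this example and do not implement any particular metadata standard.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEvent:
    object_path: str  # file the event refers to
    action: str       # e.g. "ingest", "fixity-check", "format-migration"
    agent: str        # person or tool that performed the action
    checksum: str     # checksum of the file after the action
    timestamp: str    # ISO 8601 timestamp in UTC


def record_event(log, object_path, action, agent, checksum, when=None):
    """Append a new event to the log and return it."""
    when = when or datetime.now(timezone.utc)
    event = ProvenanceEvent(object_path, action, agent, checksum, when.isoformat())
    log.append(event)
    return event


def history_for(log, object_path):
    """Return all events for one file, in the order they were recorded."""
    return [event for event in log if event.object_path == object_path]


def was_modified(log, object_path):
    """True if the recorded checksums show that the file content changed."""
    return len({event.checksum for event in history_for(log, object_path)}) > 1


def to_json_lines(log):
    """Serialise the log as one JSON object per line for long-term storage."""
    return "\n".join(json.dumps(asdict(event), sort_keys=True) for event in log)
```

Because events are only ever appended, the history of a file can be read back in order, and a change of checksum between two events marks the step at which the content was altered.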
Question of readability
The question of readability in the context of digital files presents a distinct set of challenges, diverging markedly from the accessibility afforded by analog materials. Unlike their analog counterparts, digital files cannot simply be opened without special equipment (electricity, a computer, an interpreter, an output device, a matching file system etc.), and opening them carries the risk of inadvertent alteration or, in the worst-case scenario, irreversible damage. This inherent fragility underscores the need for conscientious handling and meticulous preservation strategies to ensure the long-term readability and accessibility of digital assets within archival contexts. Moreover, the widespread use of proprietary file formats and software dependencies further exacerbates concerns surrounding digital readability, as compatibility issues and obsolescence pose formidable barriers to the seamless access and interpretation of digital data. Consequently, archivists and researchers develop strategies for mitigating these challenges, embracing interoperable technologies and open formats to enhance the readability and longevity of digital archives. In doing so, they endeavour to safeguard the integrity of digital materials while fostering equitable access and dissemination of knowledge within scholarly communities.
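A first step towards readability is finding out which format a file actually has, since file names and extensions on old data carriers are not always reliable. Many formats begin with a characteristic byte sequence, and the sketch below checks a handful of well-known ones. It is a deliberately small illustration: the names `SIGNATURES`, `identify_format`, and `identify_file` are hypothetical, and dedicated format-identification tools cover far more formats and edge cases.

```python
from pathlib import Path

# Leading byte sequences ("magic numbers") of a few common document formats.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (e.g. DOCX, ODT, EPUB)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Word .doc)"),
    (b"{\\rtf", "Rich Text Format"),
]


def identify_format(header: bytes) -> str:
    """Guess a format from the first bytes of a file."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: Path) -> str:
    """Read only as many leading bytes as the longest signature needs."""
    longest = max(len(magic) for magic, _ in SIGNATURES)
    with path.open("rb") as handle:
        return identify_format(handle.read(longest))
```

Files reported as "unknown" are candidates for closer inspection, for instance plain-text files or formats of word processors that are no longer in use.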
Question of material
The question of materiality in archival contexts presents a nuanced examination of the unique characteristics and challenges associated with analogue and digital artefacts. In the analogue domain, archival acquisitions often comprise finalised iterations of literary or artistic works, such as completed letters, manuscripts, or publishing contracts. By contrast, the digital realm affords a granular insight into the developmental stages of a file, allowing for detailed scrutiny of its development history.
In digital archives, the nature of archival material diverges significantly from its analogue counterpart. Rather than curated collections of discrete artefacts, digital estates frequently encompass entire computing systems, presenting archivists with novel logistical and methodological considerations. Today, we confront the multifaceted implications of managing digital estates, contemplating not only the voluminous nature of digital acquisitions but also the attendant challenges of evaluating, preserving, and curating digital objects.
Central to our discussion is the distinction between hardware preservation and the management of digital objects themselves, with a concerted focus on the latter. Although the preservation of hardware remains important, we focus primarily on the preservation of digital data authored by individuals whose creative contributions give these data cultural and scientific significance. Pertinently, the notion of authenticity in digital archives underscores the relevance of accessing digital materials within their original contexts, an aspect we defer for further exploration.
Moreover, the same data is often spread across several backups, which complicates storage and resource allocation and creates logistical challenges for archivists. As digital archives have to deal with rapidly growing data volumes, the thoughtful management of storage resources becomes important and requires strategic planning to use storage efficiently and to accommodate expanding archival collections.
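How much of an estate consists of identical copies scattered across backups can be estimated by grouping files by checksum. The following sketch shows the idea under simple assumptions: the function names `file_digest` and `find_duplicates` are hypothetical, and only byte-identical files are detected, not different versions of the same text.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file, read in chunks in read-only mode."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[str]]:
    """Group byte-identical files under root; keep groups with several copies."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path.relative_to(root).as_posix())
    return {digest: paths for digest, paths in by_digest.items() if len(paths) > 1}
```

The result only reports where copies exist. Whether a duplicate may be set aside remains a curatorial decision, since the location of a copy within a backup can itself be part of the estate's context.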
Question of personal rights
In the case of recently deceased authors, it must always be borne in mind that the entire estate is rarely usable at once: it can often be researched as a whole only at a later point in time, or not at all. Born-digital estates of this kind are also referred to as dark archives, because rarely can everything be made easily accessible (online), even though it is digital. Not only are some files not freely accessible, but not everything in an estate is relevant for a literary archive or for literary scholars (e.g. system files, program files, temporary files). The question of personal rights in the context of archival practice also presents a complex intersection of ethical, legal, and methodological considerations, particularly within the digital domain. Digitality and born-digitals do not necessarily equate to increased accessibility.
Indeed, the confluence of personal rights and scholarly inquiry creates many challenges for researchers seeking to engage with the data in digital archives. Notably, legal constraints and ethical considerations necessitate a cautious approach to archival analysis, in which every analysis must be regarded as a snapshot rather than a comprehensive evaluation.
A consideration in navigating the complexities of personal rights within digital archives is the acquisition of entire computing systems, which often contain vast repositories of personal data, including complete email correspondences and other sensitive information. This presents archivists with a challenge: to balance the preservation imperatives of digital materials with the ethical obligations to safeguard individuals’ privacy and autonomy.
Conclusion
These considerations underscore the dynamic nature of the workflow development process, which must continually adapt to accommodate the unique characteristics of each new type of data. This is particularly salient in the case of emerging data sources such as social media data, which are gradually finding their way into the archival domain. The incorporation of social media data into archival collections entails a flexible and iterative approach to workflow design, wherein protocols and methodologies are continually refined to address the evolving challenges and opportunities presented by these novel data sources. Unlike traditional archival materials, social media data pose distinct challenges in terms of volume, velocity, and variety, necessitating innovative strategies for data procurement, processing, and preservation. Moreover, the integration of social media data into archival workflows requires interdisciplinary collaboration and engagement with diverse stakeholders, including data scientists, social media analysts, legal experts, and archivists. Completely new approaches must therefore be sought, and the digital realm demands that this happen at a significantly faster pace than archives are accustomed to.