Session 2

Thursday 14:00 - 15:30

High Tor 3

Chair: Katherine Rogers

Digital Text Analysis of Herman Melville’s Marginalia in Shakespeare

  • Christopher Ohge

Institute of English Studies, School of Advanced Study, University of London

In an erased annotation, Herman Melville commented on William Shakespeare’s character Parolles, the dissembling rogue in All’s Well That Ends Well, in a manner that forecasts his tenth book, The Confidence-Man. Invoking an unexceptionable mathematical formula to describe Parolles’ duplicitous nature as an ingrained quality of the human condition, Melville inscribed in the margin:

As 2 & 2 made 4 in Noah's time, as now,

so man [-?-]s ever. Here we have a

character very common in the Rail Road

Car of the [?most mighty] nineteenth century.

Quantities appear frequently in Moby-Dick and other writings by Melville, and are not scarce in his marginalia either. In light of his predisposition to unite the quantifiable and the profound, Melville’s marginalia in his 1837 copy of the Dramatic Works of William Shakespeare offer a strong case for digital text analysis among books that survive from his library. Thirty-one plays are marked in the 7-volume set, comprising 681 distinct passages with marginalia that can be attributed to Melville.

Many high-profile studies of distant reading to date have been aimed at broad swaths of literary output oriented by regions and decades. Yet a lifetime of an author’s reading is one of the more fascinating big data sets. Combining distant with close reading, I am working with a team at Melville's Marginalia Online to explore hitherto unknown or imperfectly understood evidence of the author’s engagement with Shakespeare; our analyses will soon be replicable for broader use at the site. I propose to present the range of our visualizations, from lexical variety graphs and sentiment analyses of Melville’s marked words to stylistic analyses comparing whole texts of Melville’s readings with his own work (all of which use the R programming language). We have also transformed OCR data in XML with XSLT to create tables of Melville's markings that can be sorted by word count (see the attached appendices for examples). These visualizations allow for a “re-mixing” of Melville’s reading, illustrating the promise of understanding literary influence with digital text analysis techniques. Overall, this approach to small sets of texts demonstrates the potential for using digital methods to oscillate between numerical results of text analysis and close readings informed and enhanced by those results.
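As a rough illustration of the kind of measures named above, the following minimal sketch computes a word count and a simple lexical variety score (type-token ratio) for a few passages and sorts them by word count. It is written in Python rather than the project's R, and it is not the project's actual code: the function names, the record layout, and the three sample quotations (familiar Shakespeare lines, not drawn from Melville's markings) are all assumptions made for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def word_count(text):
    """Number of word tokens in a passage."""
    return len(tokenize(text))

def type_token_ratio(text):
    """Distinct words divided by total words; 0.0 for an empty passage."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Hypothetical records: illustrative quotations, NOT data from the project.
passages = [
    {"play": "King Lear", "text": "Ripeness is all"},
    {"play": "Hamlet", "text": "The readiness is all"},
    {"play": "Macbeth", "text": "Tomorrow and tomorrow and tomorrow"},
]

# Attach both measures to each record, then sort by word count (largest first).
rows = sorted(
    (
        {
            "play": p["play"],
            "words": word_count(p["text"]),
            "ttr": type_token_ratio(p["text"]),
        }
        for p in passages
    ),
    key=lambda row: row["words"],
    reverse=True,
)

for row in rows:
    print(f'{row["play"]:<10} {row["words"]:>3} {row["ttr"]:.2f}')
```

A type-token ratio is only one of several possible lexical variety measures, and it is sensitive to passage length, so comparisons across passages of very different sizes would need a length-adjusted measure.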

Sample Visualizations

Appendix 1: Word counts of Melville’s markings in his 7-volume set of Shakespeare’s plays



Appendix 2: Table (partial) of Melville’s markings in Shakespeare, sortable by word count


Appendix 3: Lexical variety comparisons of marked passages in the Tragedies


Appendix 4: Sentiment word frequencies in Melville’s markings

Abundance and Access: Early Modern Letters in Contemporary and Digital Archives

  • Elizabeth Williamson

University of Exeter

Letters stand as one of the most extensive sources of information on daily life in the early modern period and the study of epistolary culture(s) is a vital and growing area in Renaissance studies. Access to such archives and collections is rapidly expanding – and changing – in the wake of mass digitization, online editions, OCR and federated search. In this paper I explore the extension of the narrative of archival history and epistolary provenance into the digital realm. Specifically, I compare the contextual afterlife of early modern letters in nascent state archives to their representation in the digital world, with particular emphasis on classification and metadata, surrogacy and access. Going beyond paralleled modern and early modern anxieties around information overload (the standard comparison of the print and digital revolutions) allows me to explore access, search, and retrieval; control, preservation, and loss, then and now. This is an under-studied area ripe for discussion. I argue that there is a ready parallel to be found between the burgeoning administrative and institutional drive to preservation found in the early modern period – what essentially amounts to the evolution of the state archive – and the informational anxieties of the internet age, where that largest of archives can offer everything and nothing, excess and restriction, results or dead ends. I explore tensions around archives facilitating both preservation and forgetting, which find their apotheosis in the endless loss and abandonment of digital data, and engage with digital methods of retrieval as strict gatekeepers (a roulette of keyword search, privations of metadata, and dreams of distant reading). Finally, I will introduce the concept of copia, fundamental to early modern humanism and classical pedagogy, as a way of exploring these twin pressures of abundance and lack, of meaningful quantity and meaningless repetition.

Recovering narratives: reading through the digital library

  • Kate Simpson

Edinburgh Napier University

Close to the very end of his life in April 1873, explorer David Livingstone mentions a little girl who has joined his expedition. He talks about how she has been able to keep up because she “walks wonderfully,” and how he, upon finding out she is part of the group, sends extra food to her as she has been “weakened greatly” by the starvation his party was enduring whilst they were mired in the Bangweulu wetlands of modern-day Zambia. This short entry does not make it into the final published narrative of Livingstone’s Last Journals (1874), but the entry allows insight into the lived reality of travel in nineteenth-century central Africa. The little girl merits just over a page in Livingstone’s diary, and we do not even know her name, let alone what subsequently happened to her when Livingstone died, yet this elusive mention strikes at the very heart of the potential of a digital museum and library like Livingstone Online. While revisionist and postcolonial scholarship has engaged in a substantial reappraisal of the European explorer in Africa, the digital library, as a technology of recovery, is today extending and expediting the process. The digital library enables the user to explore information contained in explorers’ original documents – written in situ, during their travels – which often reveal complexities that are lost in the official expeditionary narratives that they published on their return.

This paper will explore the possibilities of digital humanities to facilitate the identification of new narratives in historical data via a case study of the lost, or muted, voices that can be identified in the digitised and encoded documents held by Livingstone Online.