Digital Text Analysis of Herman Melville’s Marginalia in Shakespeare

In an erased annotation, Herman Melville commented on William Shakespeare’s character Parolles, the dissembling rogue in All’s Well That Ends Well, in a manner that forecasts his tenth book, The Confidence-Man. Invoking an unexceptionable mathematical formula to describe Parolles’ duplicitous nature as an ingrained quality of the human condition, Melville inscribed in the margin:

As 2 & 2 made 4 in Noah's time, as now,
so man [-?-]s ever. Here we have a
character very common in the Rail Road
Car of the [?most mighty] nineteenth century.

Quantities appear frequently in Moby-Dick and other writings by Melville, and are not scarce in his marginalia either. In light of his predisposition to unite the quantifiable and the profound, Melville’s marginalia in his 1837 copy of the Dramatic Works of William Shakespeare offer a strong case for digital text analysis among books that survive from his library. Thirty-one plays are marked in the 7-volume set, comprising 681 distinct passages with marginalia that can be attributed to Melville.

Many high-profile studies of distant reading to date have targeted broad swaths of literary output organized by region and decade. Yet a lifetime of an author’s reading is one of the more fascinating big data sets. Combining distant with close reading, I am working with a team at Melville’s Marginalia Online to explore hitherto unknown or imperfectly understood evidence of the author’s engagement with Shakespeare; our analyses will soon be replicable for broader use at the site. I propose to present the range of our visualizations, from lexical variety graphs and sentiment analyses of Melville’s marked words to stylistic analyses comparing whole texts of Melville’s readings with his own work (all of which use the R programming language). We have also transformed OCR data in XML with XSLT to create tables of Melville’s markings that can be sorted by word count (see the attached appendices for examples). These visualizations allow for a “re-mixing” of Melville’s reading, illustrating the promise of digital text analysis techniques for understanding literary influence. Overall, this approach to small sets of texts demonstrates the potential of digital methods to oscillate between the numerical results of text analysis and close readings informed and enhanced by those results.
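For readers unfamiliar with the measures named above, the following is a minimal sketch of two of them: a per-passage word count used to sort a table, and lexical variety computed as a simple type-token ratio (distinct words divided by total words). It is written in Python for illustration only; the project's own analyses use R, and nothing here reproduces its scripts. The function names, the tokenizing rule, and the three sample passages are hypothetical placeholders, not Melville's actual markings.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_count(text):
    """Number of word tokens in a passage."""
    return len(tokenize(text))


def type_token_ratio(text):
    """Lexical variety as distinct words divided by total words (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def sort_by_word_count(passages):
    """Return the passages ordered from longest to shortest by word count."""
    return sorted(passages, key=lambda p: word_count(p["text"]), reverse=True)


# Hypothetical placeholder passages; these are NOT transcriptions of marked text.
passages = [
    {"id": "sample-3", "text": "brevity"},
    {"id": "sample-1", "text": "the quick brown fox jumps over the lazy dog"},
    {"id": "sample-2", "text": "a rose is a rose is a rose"},
]

for p in sort_by_word_count(passages):
    print(p["id"], word_count(p["text"]), round(type_token_ratio(p["text"]), 2))
```

A raw type-token ratio falls as passages grow longer, so comparisons across plays of differing length would likely need equal-sized samples or another normalization; the sketch makes no such adjustment.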

Sample Visualizations

Appendix 1: Word counts of Melville’s markings in his 7-volume set of Shakespeare’s plays



Appendix 2: Table (partial) of Melville’s markings in Shakespeare, sortable by word count


Appendix 3: Lexical variety comparisons of marked passages in the Tragedies


Appendix 4: Sentiment word frequencies in Melville’s markings