Session 12 — Language Analysis
Friday 14:00 - 15:30
Chair: Michael Pidd
At the height of his career, Irish author Thomas Moore (1779-1852) was one of the most famous and respected poets, satirists and biographers of the English language. His most enduring work, the Irish Melodies, played a central role in the creation of a sense of Irish identity based on the remembrance and memory of the past. It is therefore deeply ironic that during the last few years of his life, Moore’s cognitive abilities entered a rapid decline, marked by forgetfulness and confusion. Moore himself remarked in 1847 that he was “sinking into a mere vegetable.”
Recent studies (Le et al. 2011; Hirst and Wei Feng 2012) have shown that Alzheimer’s disease causes specific alterations in the language of its sufferers. Linguistic analysis can measure these changes, and has shown that they can appear as much as ten or even twenty years before a formal diagnosis.
Moore’s literary productions during the 1840s, chiefly volumes three and four of his History of Ireland, have long been regarded as embodying his decline. The weakness of these volumes has generally been attributed to his lack of familiarity with history writing, with Joep Leerssen calling them “uninspired, pale digests of received knowledge.” However, in the same period, his previously meticulous and exhaustive diary entries also undergo a sharp decline in both length and complexity.
This paper will describe the results and methodology of an exploratory study to investigate whether the characteristics of the linguistic changes in Moore’s History of Ireland (1835-1846) and personal diary are consistent with those observed in other authors known to have suffered from dementia.
These are challenging texts to analyse digitally, because they include frequent quotations and abundant footnotes, use multiple languages, and exist in no corrected digital edition. The paper will discuss the challenges presented by this analysis: creating digital versions of the texts encoded in TEI XML, using textual analysis tools to examine their structural, linguistic and syntactic traits, and evaluating the resulting data in the light of three questions. How did Moore’s writing change over time toward the end of his career? Are those changes consistent with the onset of dementia? What do such changes mean for our understanding of Moore’s later writings?
Hirst, Graeme, and Vanessa Wei Feng. 2012. “Changes in Style in Authors with Alzheimer’s Disease.” English Studies 93 (3): 357–370. doi:10.1080/0013838X.2012.668789.
Le, Xuan, Ian Lancashire, Graeme Hirst, and Regina Jokel. 2011. “Longitudinal Detection of Dementia through Lexical and Syntactic Changes in Writing: A Case Study of Three British Novelists.” Literary and Linguistic Computing 26 (4): 435–461. doi:10.1093/llc/fqr013.
Leerssen, Joep. 1996. Remembrance and Imagination: Patterns in the Historical and Literary Representation of Ireland in the Nineteenth Century. Cork: Cork University Press in association with Field Day.
University of Sheffield
Analysing the natural language in historical sources presents several particular challenges, arising not only from the nature of the documents and the differing forms of language used but also from the varying quality of the digital versions of these documents. These challenges become more acute when attempting to extract meaning from large, disparate datasets. This paper will consider these challenges in relation to three Humanities Research Institute projects.
Connected Histories currently brings together twenty-two digital datasets related to early modern and nineteenth-century Britain with a single federated search that allows sophisticated searching of names, places and dates. Manuscripts Online, its sister site, enables users to search twenty online primary resources relating to written and early printed culture in Britain during the period 1000 to 1500. Digital Panopticon is an ongoing project that attempts to bring together existing and new genealogical, biometric and criminal justice datasets to explore the impact of different types of punishment, particularly transportation, on the lives of 66,000 people sentenced at the Old Bailey between 1780 and 1875.
This paper will consider the feedback loop inherent in natural language processing (NLP) approaches: how our failures not only improve our processing techniques but can also improve our datasets. It will discuss how NLP and associated techniques not only add an interpretative layer to our datasets but can also raise research questions about the assumptions that we make about historical sources.