This article is based on my experiences creating digital resources for humanities scholars over the past twenty years, leading to the recently launched website, Digital Panopticon: Tracing London convicts in Britain and Australia, 1780-1925. The Digital Panopticon is the culmination of a series of digital humanities projects undertaken at the Digital Humanities Institute (DHI) at Sheffield, with a variety of collaborators. We started with the Old Bailey Online, founded by Tim Hitchcock and me in 1999, which contains all the published reports (the Proceedings) of trials at the Old Bailey, London’s central criminal court, between 1674 and 1913. After completing that project in 2008, we moved on, working with a wide range of collaborators, to create (just to name the major projects) London Lives 1690-1800: Crime, Poverty and Social Policy in the Metropolis, Connected Histories: British History Sources, 1500-1900, Locating London’s Past, and the Digital Panopticon. In addition to including a common dataset (the Old Bailey Online), these projects share several features: they focus on the provision of historical data to a public as well as an academic audience; they reuse and link together a wide range of datasets, enabling innovative digital research; and they make the data as accessible and as searchable as possible. They thus represent a particular type of digital humanities project, which we might call the creation of a ‘public-facing research resource’. While this is a major form of output in the digital humanities, these are of course not necessarily ‘typical’ research projects—examples of a wide range of other types of project were presented at the Digital Humanities Congress.

As a kind of ‘autocritique’, this article reflects on what we can learn from these major projects about the possibilities and limitations of this type of digital humanities research. These projects secured (and spent) a lot of public money, they attracted large public audiences, as well as academic users, and they have contributed to, and shaped, academic scholarship, particularly in British social history. And yet the compromises they have been forced to make have had their costs and consequences. Publishing digital resources in this form arguably creates misleading certainties, distorts academic research, and elevates public expectations of what they should deliver above the needs of academic scholars. I will explore all of these points, and conclude with a modest suggestion of where we should go from here in terms of designing online research resources.

1. The Digital Panopticon

The Digital Panopticon was a flagship digital humanities project, receiving one of just three ‘large grants’ awarded by the UK Arts and Humanities Research Council, under their ‘Digital Transformations’ funding initiative. The project was a collaboration between the Universities of Liverpool, Oxford, Sheffield, Sussex, and Tasmania; technical work was carried out by the Digital Humanities Institute at Sheffield, primarily by Jamie McLaughlin.1 The purpose of the project was to assemble, on a single, searchable, publicly available platform, all available records about criminals convicted at the Old Bailey between 1780 and 1868, and determine the impact of punishment on the rest of their lives. Uniquely, this was a period of competing penal systems in Britain, with two major punishments operating side by side – transportation to Australia, and imprisonment in Britain – so we can ask: which was more effective at reforming convicts?

To answer this question, the Digital Panopticon brought together fifty datasets, primarily those containing information about judicial and penal processes. The foundational database was the Old Bailey Online; this project extracted trial reports for the 110,000 people convicted at the Old Bailey between 1780 and 1868, the population from which the convicts selected for transportation to Australia were drawn. This was the period in which detailed records first began to be kept of the personal circumstances of each convict,2 facilitating detailed analysis and accurate record linkage. The wide variety of additional databases included:

  • criminal registers (listing everyone committed to prison to await trial)
  • transportation registers and convict indents (listing all the convicts shipped to Australia)
  • records of the convicts who arrived in Australia (primarily those who were shipped to Tasmania, which had previously been digitised and linked by the Founders and Survivors project)
  • prison and hulks registers
  • prison licences (an early form of parole)

Also included were two sets of civil records: the nineteenth-century British censuses, and records of births, marriages and deaths from FreeBMD.

The four million records assembled pertain to some 250,000 individuals, and the project linked records together using automated record linkage, creating ‘life archives’ of all the records pertaining to individual convicts. The website, launched in September 2017, includes a comprehensive search facility which enables searching by a wide range of personal characteristics and life events, as well as a wide range of background and contextual information. Search results can be visualised as bar charts or life charts, and can be downloaded for further analysis.

The life archive of James Gardner, convicted in 1822 of highway robbery at the age of 17 and initially sentenced to death, provides a good example of the information assembled by the project. We can see that his sentence was respited, and he was pardoned on condition of being transported to Australia. His early years in Australia apparently passed without incident, and he was initially given a ticket of leave (a form of parole) in 1832. Various minor infractions, however, led to his freedom being revoked and then granted again several times over a period of twenty years (we sometimes think that transportation was an effective form of punishment, in terms of offering convicts a chance to build a new life, but it does not seem to have led to a reformed life for James, at least initially). At the age of 40 he applied for, and was granted, permission to marry, and shortly afterwards he was granted a conditional pardon, after which no further misbehaviour was recorded. (Marriage, as modern criminologists have found, seems to have led to his desistance from crime.) For all these records, you can click to see more details, or follow a link through to an external website where the full record is available (in these cases, from the Tasmanian Archives).

2. Strengths

First, and fundamentally, the Digital Panopticon and the projects which preceded it are excellent examples of the propensity for digital humanists to work collaboratively and interdisciplinarily.  Needless to say, these are skills which scholars in the Arts and Humanities are increasingly being urged to develop, given the growing importance of challenge-based research funding, particularly in the UK. The Digital Panopticon was a partnership between five universities in the UK and Australia, and involved historians, criminologists, and, of course, computer programmers. 

The Digital Panopticon also brought in partners from the private sector. This is something we first did with Connected Histories, a federated search facility which searches across 25 separately located British History databases, listed here. Some of those databases are commercial (as indicated by access by ‘subscription’ only on this list). These include British Newspapers 1600-1900, House of Commons Parliamentary Papers, and the John Johnson Collection of Printed Ephemera.  For these, we worked with the relevant commercial organisations (ProQuest and Gale) to obtain access to their digitised collections so that we could index them and include basic information from them in our freely accessible federated search function. We did not provide full access to these commercial sites, unless users or their institutions were subscribers, but we did drive traffic towards them—that was what the commercial providers got in return for allowing us to use their data. 

In the case of the Digital Panopticon, we faced the problem that the UK National Archives (TNA), which holds many key British criminal justice records (particularly those of imprisonment), had sold the rights to digital access to some of their major collections to the genealogical companies Findmypast and Ancestry, who of course charge users to access them. In our view this was a problematic decision by the TNA, since these are public records, but the TNA could not afford to digitise these sources without a commercial partner, so we were stuck with the fact that these records are only available digitally behind a pay wall. Therefore, we negotiated a similar agreement with these two companies to that which we had done for Connected Histories: Findmypast and Ancestry provided us with copies of their data, which we were able to use behind the scenes to identify and link to relevant records pertaining to convicts in the Digital Panopticon. The requirements of the agreement meant that we could only display very limited information from those records (often only name and date), while providing links to the full records held by these companies located behind their pay walls—users who click on the relevant links are welcomed with a screen which invites them to open up their wallets and subscribe to the service. This has prompted some complaints from users, who wonder what this publicly funded project is doing driving traffic to commercial sites, but in our view this was a bargain worth striking, if in return we obtained and can provide free access to some of the data these commercial organisations provide. If public funding for digitisation is going to continue to be very limited (and there is no reason to think that this will change), we have to learn to work with commercial partners like these, along the lines of the agreements we negotiated.

This is related to the second strength I want to point out: the Digital Panopticon is an excellent example of the reuse and repurposing of digital resources. As noted, this project is based on fifty datasets, but the project itself engaged in very little digitisation; the vast majority of evidence it provides access to comes from datasets created by other organisations, both academic and commercial. The Old Bailey Online itself has been reused in a wide range of digital humanities projects, both those which provide searchable online resources and projects which developed new research tools, as you can see here. The key to the utility of the Old Bailey Online (or any dataset) for reuse by other projects is the quality of the original resource.  For the Old Bailey Online, the text was digitised to a high level of accuracy (by double rekeying) and extensively manually marked up, and it represents a complete edition of the original source. The data can thus be reused without introducing error into any new project. Given the lack of resources to pay for new digitisation, clearly the reuse and linking together of existing resources is the way of the future, but those resources need to be reliable.

As an aside, in spite of this, I do not want to ignore the problems associated with the selectivity bias in the records which have been digitised, which is accentuated by patterns of reuse. Vast amounts of important records and text have not yet been digitised, and this can distort historical research. With respect to the history of crime in Britain, this means historians are increasingly focusing on London and the Old Bailey Online, at the expense of studying crime and judicial practice in the lower courts and in the rest of the country. More generally, of course, it is the records of Western, white history which have been digitised, inevitably privileging research on topics in those fields (by making it so much easier and productive), while creating vast deserts of subjects which have few, if any, digitised records, which discourages research.3 I do not have a solution to this problem, which is an issue for funding bodies and commercial partners, but it is a real problem.

Where there are concentrations of digital records, this linking together of existing resources can enable ground-breaking new research. As with the Old Bailey Online, it will take time for the full research potential of the Digital Panopticon to be realised (completed ten years ago, the Old Bailey Online is still generating new research, most of it on topics we had never imagined the resource could be used for). But to give an early example from the Digital Panopticon: by joining together the records of punishment sentences at the Old Bailey with hitherto separate records of actual punishments, we can see how post-trial judicial decisions affected the fate of convicts.

Figure 1: What actually happened to defendants sentenced to death?

As the example visualisation in Figure 1, from the website’s home page, of 763 Old Bailey convicts sentenced to death between 1810 and 1815 demonstrates, the majority of capital convicts were not actually executed.  Instead, most were transported, with small numbers of convicts punished in a number of other ways, as you can see on the right. And as the Sankey diagram clearly indicates, the type of offence the defendant was convicted of was a major factor in determining these penal outcomes—those executed were most likely to have been convicted of murder, burglary, highway robbery, coining and forgery (primarily of banknotes); offences involving violence, or subversion of the currency of the realm. These were the most serious crimes tried at the Old Bailey, and unsurprisingly it was those convicted of these offences who were executed.
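A diagram of this kind rests on a simple tabulation: once sentences have been linked to recorded outcomes, each band is a count of convicts moving from one category to another. The following is a minimal sketch in Python of that tabulation, not the project's own code. The records, the function names `sankey_flows` and `outcome_shares`, and the three-field record layout are all invented for illustration.

```python
from collections import Counter

# Hypothetical linked records: (offence, sentence, recorded outcome).
# Invented for illustration; not taken from the Digital Panopticon data.
RECORDS = [
    ("burglary", "death", "executed"),
    ("burglary", "death", "transported"),
    ("larceny", "death", "transported"),
    ("larceny", "death", "imprisoned"),
    ("forgery", "death", "executed"),
    ("larceny", "death", "transported"),
]


def sankey_flows(rows):
    """Count offence -> outcome pairs; each count is the width of one band."""
    return Counter((offence, outcome) for offence, _sentence, outcome in rows)


def outcome_shares(rows, offence):
    """Return the proportion of each outcome among convicts of one offence."""
    outcomes = Counter(o for off, _sentence, o in rows if off == offence)
    total = sum(outcomes.values())
    return {o: n / total for o, n in outcomes.items()} if total else {}


if __name__ == "__main__":
    for (offence, outcome), n in sorted(sankey_flows(RECORDS).items()):
        print(f"{offence} -> {outcome}: {n}")
```

The same counts, expressed as proportions per offence, are what a set of pie charts broken down by offence would display.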

Figure 2: Sentence outcomes broken down by offence category in pie charts

It is easier to see these patterns using the website’s pie charts function: Figure 2 shows the same data, broken down by offences, with those executed in apricot. Even for those convicted of the most serious offences, not everyone was executed; other factors intervened.

Figure 3: Gender by penal outcome

Further analysis identifies some of the other factors which shaped penal outcomes: the convicts’ gender and age, as well as whether or not jurors had recommended the convict to mercy. In Figure 3 we can see that, among capital convicts, women were far less likely to be executed (red) and more likely to be imprisoned (blue), than men. Since the records rarely contain explicit justifications from the authorities for why convicts received particular penal outcomes (and particularly so with respect to gender, where the cultural expectations were so embedded that no one felt the need to state them), studying the patterns of actual penal decision-making provides valuable evidence of the likely factors which shaped those outcomes. We are also able to examine what happened to convicts who were originally sentenced to transportation, as it is a less well known fact that between a quarter and a third of convicts ordered for transportation never actually made it to Australia, for a variety of reasons. Here, we are again asking questions about the role of offence and personal characteristics (gender, age, etc.) in determining convicts’ fates; in the case of transportation, occupation may have played a key role, given the convicts’ expected role in building the new colonies.

By joining up records in this way, we can write up the lives of individual convicts, so we can now tell the life stories of people who previously only appeared as names in disparate bodies of records; the website London Lives also facilitated this for eighteenth-century plebeian Londoners (see here for a list of biographies from the Digital Panopticon and here for a similar list for London Lives).  This is the Digital Panopticon biography of James Gardner (together with his two confederates, convicted of the same crime), whose life archive we saw earlier. This telling of the life stories of otherwise poor and marginalised people has given new life to the historiographical approach of ‘history from below’, which has been carried out by a number of historians and criminologists, and stands as a strong counterpart to the histories of the privileged which historians so often write.

Figure 4: Life chart of defendants convicted of murder at the Old Bailey

More generally, the Digital Panopticon and associated projects allow quantitative and qualitative methods to be brought together. Both the Digital Panopticon and London Lives allow numerous individual life stories to be written, but the Digital Panopticon also allows us to see how those individuals fit within broader patterns, as illustrated by the life charts function. Figure 4 summarises the life archives of all the defendants convicted of murder at the Old Bailey (blue = men, orange = women; the horizontal axis provides the years and the vertical axis lists the judicial and penal events the convicts experienced). By seeing where the dots concentrate in horizontal lines, we can see how many of these convicts were executed, transported, and imprisoned; by using the pie charts function, we can quantify this. On the live website, rolling your mouse over any line allows you to see individual convict narratives, in this case that of Amelia Ann Francis (in the lower right hand corner). The red line highlights her life, summarised in the box: she was convicted at the Old Bailey in 1893, imprisoned, and released on a prison licence the following year. Clicking on this line takes you to her full life archive, where we learn that she had been convicted of the reduced charge of manslaughter and sentenced to three years in prison, but she was released a little over a year early. This may seem like an unusual outcome in a murder case, but if we look at the life chart we can see several other cases like hers; she was one of a number of women in the late nineteenth century (orange lines; lower right hand corner) who were discharged from prison after short periods of incarceration following trials for murder. Here we have identified a topic which is worth investigating further.

Figure 5: Defendants’ voices and silences in the Old Bailey courtroom, 1781-1880

More ambitiously, by treating all the digitised text in the Old Bailey Online as data, we can take advantage of ‘digital affordances’ to, in the words of Tim Hitchcock, ‘fundamentally change the questions we ask, and the methods we use to answer them’.4 For example, we can incorporate linguistic analysis. Combining the Old Bailey Corpus, a database of direct speech in the Old Bailey Proceedings created by the linguist Magnus Huber, with the Old Bailey Online, Sharon Howard has explored the effect of defendant testimonies on their chances of acquittal (see her blog). Rather surprisingly, saying nothing at the trial was the best way to be found innocent, though saying a lot was also reasonably successful; saying just a few words (under 100) was likely to be fatal. She then went on to look at what defendants said, and identified some key things not to say when you are brought before the jury at the Old Bailey: do not say you have nothing to say, or simply plead for mercy, or claim you had just found the stolen object in question, or (again surprisingly) say that you committed the crime because you were in distress. These defences were all unlikely to do anything other than secure you a conviction, and quite possibly a harsher sentence. Much more can be done with this research, but it shows the merits of combining linguistic analysis with the tagged data of the Old Bailey Online.
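The kind of comparison described here can be pictured as grouping trials by how much the defendant said and then comparing verdicts across the groups. The following Python sketch illustrates that general approach only; it is not Sharon Howard's method or code. The trial data are invented, the function names are hypothetical, and the only threshold taken from the text is the 100-word boundary.

```python
from collections import defaultdict

# Hypothetical trials: (words spoken by the defendant, verdict).
# Invented for illustration; these are not figures from the Old Bailey data.
TRIALS = [
    (0, "not guilty"),
    (0, "guilty"),
    (40, "guilty"),
    (75, "guilty"),
    (350, "not guilty"),
    (500, "guilty"),
]


def speech_bin(word_count):
    """Place a defendant in one of three groups by amount of speech."""
    if word_count == 0:
        return "silent"
    if word_count < 100:
        return "under 100 words"
    return "100 words or more"


def acquittal_rates(trials):
    """Return the share of 'not guilty' verdicts within each speech group."""
    totals = defaultdict(int)
    acquitted = defaultdict(int)
    for words, verdict in trials:
        group = speech_bin(words)
        totals[group] += 1
        if verdict == "not guilty":
            acquitted[group] += 1
    return {group: acquitted[group] / totals[group] for group in totals}


if __name__ == "__main__":
    for group, rate in acquittal_rates(TRIALS).items():
        print(f"{group}: {rate:.0%} acquitted")
```

A real analysis would need to control for offence, date and the presence of counsel before reading anything into differences between the groups.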

Figure 6: Virtual reality Old Bailey courtroom

Space and sound can also be brought into the analysis. Tim Hitchcock and Ben Jackson at the Sussex Humanities Lab have created a three-dimensional virtual reality Old Bailey courtroom (based on the original construction plans), peopled it with courtroom actors (skeletons, since we do not know what they wore), and used automated speech reading facilities to allow the testimonies in individual trials to be ‘spoken’ (for more about this project, see here). While there are numerous methodological challenges here, this combination of spatial and aural analysis potentially allows us to think about how testimonies were understood in the courtroom—how a speaker would have been viewed and heard in court, and how this varied depending on whether you were a defendant in the dock (on the right, under the mirror), a witness in the witness stand on the left (in red, currently speaking), a lawyer standing in front of the judges (out of sight), or a judge sitting on high on the bench in the background. We can see from this how power relations in the courtroom were reinforced by its design and the use of space.

In sum, a composite collaborative resource like the Digital Panopticon has the potential to facilitate a wide range of innovative research. And of course, all the new methodologies developed in the course of these projects have strong potential for reuse by other projects.  For example, the record linkage algorithms and visualisation techniques developed for the Digital Panopticon have been identified for use in two new projects which plan to analyse large bodies of disparate sources about groups of individuals: the career trajectories of British Army officers, 1790-1830, and the lives of intellectuals interned on the Isle of Man during World War II.

Finally, to conclude my observations on the strengths of this type of digital humanities project, by creating a freely available resource which attracts public interest, the Digital Panopticon demonstrates the strong public impact these projects can have. Ten years after its launch, the Old Bailey Online continues to attract around 445,000 users a year. In the sixteen months following its launch, the Digital Panopticon website was consulted by 99,000 users, was used in university teaching and schools, and featured in a public exhibition at the London Metropolitan Archives. Google Analytics suggests that a large proportion of the users of the Digital Panopticon, in addition to academic researchers and their students, are family historians, as the majority of its users are in their core demographic group of women over 55, and a large number are from Australia, where interest in the lives of transported convicts remains high. Of course, it has been relatively easy for the Digital Panopticon, and the Old Bailey Online before it, to plug into the enormous public demand for online sources for researching family history; other academic projects often do not have such a straightforward opportunity for public impact. One can, of course, condescendingly dismiss genealogy as an insignificant pastime, but certainly it is not so for its practitioners. I see family history as a sort of gateway drug into the study of the broader history of crime and criminal justice; we have many examples of people who have researched their ancestors transported to Australia who were then prompted to find out more about this unique form of punishment and the crimes which led to convicts being transported, thereby increasing public understanding of the history of crime and criminal justice.

Not only does the public benefit from the availability of resources like this, but so does digital humanities as a field, since the Old Bailey Online and the Digital Panopticon, and many other publicly available websites, showcase the benefits of using digital methodologies in humanities research and its widespread potential for public ‘impact’, thereby justifying the investment by funders, who are currently obsessed, at least in the UK, with achieving public ‘impact’. The Arts and Humanities Research Council has widely touted the success of the Old Bailey project as an example, to the public and to government, of the benefits of funding Arts and Humanities research more generally.

3. Weaknesses

But this has led some to complain that projects like the Old Bailey Online and the Digital Panopticon are vulgar publicity machines, seeking to attract public attention and fame for the project leaders. Around a decade ago, after describing the then recently published Old Bailey Online using the backhanded compliment of ‘an impressive resource that was impressively funded as well’, implying that the money was not well spent, a distinguished historian attacked Tim Hitchcock and me as being:

‘among the academic impresarios of Britain's new enterprise culture […] They are there to make history trendy and relevant, and to pump up the volume […]  Like Tony Blair, Hitchcock and Shoemaker are masters of spin.’5

Should we have followed this line of thought, hidden our light under a bushel and not created the Old Bailey Online? The answer is obvious. There are, however, some more pertinent criticisms that can be made of the Digital Panopticon and similar projects. 

First, despite its name, the Panopticon is not all-seeing, and therefore the project’s title is misleading (and thus an example of ‘spin’). One could argue, therefore, that the image of the digital humanities it presents to the world is flawed, since it promises more than it can deliver. Despite our best efforts it does not take long for many users to identify the limitations of the Digital Panopticon (you may have already spotted some in the figures). This is not to undermine the considerable achievements of the project; the limitations I am about to discuss were largely inevitable, given the project’s arguably unrealistic ambitions.

Figure 7: Jeremy Bentham’s ‘panopticon’ (1787-91)6


We were inspired to call this project the Digital Panopticon by the work of Jeremy Bentham, the late eighteenth-century English Utilitarian philosopher who, in his ambition to use punishment to reform convicts (as opposed to seeking retribution, as with execution, or exile, as with transportation), designed what he called a ‘panopticon’, a new kind of prison where the inmates would be under constant surveillance from a central point. His theory was that, after being constantly watched, prisoners would internalise this surveillance and reform their characters and habits, preparing them for a return to society.

Figure 8: The Panopticon… doesn’t work!7 

But research by Zoe Alker and Nick Webb at the University of Liverpool indicates that this level of surveillance was impossible to achieve.  Alker and Webb created a three-dimensional virtual reality version of the panopticon based on the extensive notes Bentham compiled in developing his vision (the building was never built, though many nineteenth-century prisons adopted aspects of the design, notably the idea of a central observation point). What this model showed, incontrovertibly, was that total surveillance was not possible—there would always have been places to hide (as you can also clearly see in the earlier two-dimensional plan on the lower right).

Of course, we did not actually think this kind of total knowledge was possible. Given the nature of the databases included, the Digital Panopticon could never tell the full cradle-to-grave story of convicts. We learn little about their families and friends, their work and leisure activities, and their interests and passions. Instead, the records we have assembled define convict lives by their interactions with judicial institutions, and thus primarily as criminals.  This is a history of interactions with criminal justice; it is unlikely to have been how these individuals defined their own lives. We are telling a very partial story, very much from the point of view of the authorities. Unfortunately, it is very difficult to hear ‘the criminal voice’ in these records, though a new project on convict tattoos we are working on, analysing the physical descriptions of convict bodies found in the Digital Panopticon records, allows us to examine one aspect of convict agency, through the ways in which they marked their own bodies.

A second problem concerns the quality of the data in the Digital Panopticon, including the datasets which the project ingested. Not all records are as clean as those found in the Old Bailey Online (and of course even that website has some errors). Hardly a day goes by without someone emailing the project to tell us about an error of some sort: a name has been mis-transcribed in a record; or the birth or death date is wrong; or the convict sailed to Australia not on that ship but on a different one. For the most part, these are not the fault of the Digital Panopticon project, but are errors either in the original evidence or in the digitisation process which created the datasets we ingested, and we cannot correct them. And, of course, not all the records we have linked together should be linked together, or (more often) we have missed links that should be there. This is not surprising: records were linked together by automated processes (though through an iterative process which included some manual checking), and these linkages are not without error. As noted, the four million records in the Digital Panopticon pertain to some 250,000 distinct individuals, and we will never get all the links right. Most errors are minor, but you could argue that they undermine the credibility of the site, and perhaps of the digital humanities more generally, by suggesting that our data are not reliable. One thing we had hoped to do in this project (but we ran out of time and money) was to develop means of indicating to our users that the links are not always certain, and to indicate the types of errors that users should look out for. We had hoped to give users some indication of the probable certainty of the links provided on the site, and to encourage users to check anything that looks wrong. 

In a very simple way, we did achieve this with death records from FreeBMD (a project which is transcribing the indexes to the civil registers of births, marriages and deaths in England), since there were just too many people with the same names, ages and birthplaces in nineteenth-century England, and there is not enough additional information in the records to corroborate linkages. For example, take Amelia Acton, a recidivist thief who was convicted and imprisoned several times between the ages of 31 and 46. As her life archive indicates, she was released from the House of Detention for the final time at the age of 51, and that is the last record we have of her in the Digital Panopticon records, except that a woman with her name and age appears in an official death record 29 years later, at the age of 80 (these records can be found at the bottom of her life archive). Our display puts this record in blue and marks it as a ‘possible death record’, indicating that the name and year of birth match exactly (unfortunately we don’t have the date of her birth in the convict records). But as we have no other corroborating information, we cannot be certain this is a correct link, so we have marked it up in this way, and left it up to the user to decide. This is a blunt tool; with more time we could have done much more along these lines to indicate probable levels of certainty in linkages between different types of records. As the emails we receive suggest, not all the linkages displayed in red and beige are correct either.
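The logic behind a ‘possible’ link can be expressed very compactly. The sketch below, in Python, is an illustration of the idea rather than the project's actual linkage algorithm: a match on name and year of birth alone is labelled ‘possible’, and it is only upgraded when some further detail agrees. The record structure, the `extra_detail` field, the labels and the example person are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonRecord:
    name: str
    birth_year: int
    # Hypothetical corroborating field (for example a spouse's name).
    extra_detail: Optional[str] = None


def normalise(text):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def classify_link(convict, death):
    """Label a candidate link as 'linked', 'possible' or 'no link'."""
    if normalise(convict.name) != normalise(death.name):
        return "no link"
    if convict.birth_year != death.birth_year:
        return "no link"
    # Name and year agree; only a further matching detail makes this firm.
    if (
        convict.extra_detail
        and death.extra_detail
        and normalise(convict.extra_detail) == normalise(death.extra_detail)
    ):
        return "linked"
    return "possible"


if __name__ == "__main__":
    convict = PersonRecord("Jane Doe", 1820)  # invented example
    death = PersonRecord("jane  doe", 1820)
    print(classify_link(convict, death))
```

A fuller treatment would replace the three labels with a graded score, which is the kind of indication of probable certainty we had hoped to provide across all record types.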

We have always tried to make it clear to users that digital records are no more definitive than any other type of record, but it is hard to counter the illusion of accuracy conveyed by a computer screen, and not everyone has the skills, time or inclination to critique the online evidence they are using.

Ideally, of course, we would correct errors when they are discovered. We considered crowdsourcing (both for the original record linkage and for corrections) but decided that it was not the answer, given the size of our data and the fact that it could easily introduce new errors and would require oversight, but we did provide a reporting mechanism, ‘report an error’ (in the upper right hand corner of each life archive), which is why we receive so many messages. But the funding has run out for the project, and our ability to implement corrections is very limited. This is, of course, a well-known major problem with research funding for internet-based, grant-funded research resources—how to maintain a website once the funding has run out. A website is a living thing, always requiring changes, not only those imposed by the underlying infrastructure of software and the internet, but also in response to new information coming to light, but this is not recognised by research funding mechanisms.  And yet it is necessary to keep updating a site if it is to appear authoritative. This is a topic which Jamie McLaughlin addresses in this volume, so I will not say anything more, except to observe that it is easy for readers to understand that a book cannot include any references to material that became available after its date of publication, but I do not think users of the internet are that forgiving.

There are more important issues to consider. Despite our attempts to create substantial public impact, driven by external agendas as well as our own commitment to making history publicly accessible, the ultimate purpose of the Digital Panopticon is to facilitate innovative scholarly research. I have already mentioned some research facilitated by the project, but how well does the website itself make that possible? Several features on the website are designed to do this. These include the visualisations, which allow users to see overall patterns in the 250,000 lives documented. As can be seen in this article, and in our Visualisation Gallery, these include life charts, graphs, bar charts and pie charts, and Sankey diagrams.

Figure 9: Digital Panopticon search options

But, despite the wide range of search criteria available (Figure 9), most of which can be visualised, ultimately some research users are frustrated by the fact that not all forms of search are possible, as they are often unable to drill down to answer their specific research questions. They also cannot edit the data, add additional data, or conduct more sophisticated types of statistical analysis. Here we come to the limitations of delivering a research resource on a web-based, publicly available platform. The advantages of this form of delivery are clear: all users are consulting the same body of evidence, and their searches can be referenced (using the ‘cite this search’ function) and replicated, promoting the good research practices of open data (and open analysis). But however much we develop, at significant expense, search and visualisation techniques suitable for answering an increasingly wide variety of research questions, we will never be able to anticipate all the types of analysis researchers will want to conduct, and we will always end up with a resource which does not live up to every researcher’s expectations. In fact, we have found it impossible even to meet the expectations of everyone involved in the project, let alone the wider research community.

Figure 10: Examples of the download facility, with snapshots of search results (top, with options to visualise, download, or cite this search) from the website; and a downloaded TSV file imported into Excel offline (bottom)

Therefore, we added a download facility, which allows users to use the search functions to define the parameters of the data they wish to analyse, and then download the data as a TSV or JSON (JavaScript Object Notation) formatted file for analysis offline. To ensure the server is not overloaded, there is a 5,000-line limit for downloads (and for visualisations as well), but this is perhaps not too much of a problem for downloads, since users can download their data in parts and then reassemble it offline. Once downloaded, users can correct and add data, and manipulate it using their own data analysis programme, whether it is Excel or something more sophisticated. This is a very useful facility, but of course we have lost the advantage of the reproducibility of the results, unless the researcher then deposits their modified data somewhere. And we still have the potential problems of the limited range of search criteria which can be used to select the data, and the limited number of fields in the downloadable information. This information comes in 43 columns (the first 18 of which are displayed in the bottom part of Figure 10), providing information about various personal characteristics of the convict (gender, occupation, religion, height, complexion, eye colour, etc.), their crimes, and the trials and punishments they received. But these 43 columns still represent only a fraction of the scores of fields recorded in the 50 datasets in the Digital Panopticon, and one has to allow for multiple records (such as trials) for a single convict. Even with this facility, the DHI still receives requests from members of the project team for specific commissioned downloads, containing additional information tailored to their research needs; given resource constraints, it is not possible to do this for the wider research community. The bottom line is that it is simply not practical to deliver a totally flexible and comprehensive research resource like this online.
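
For readers who want to reassemble several partial downloads offline, the following is a minimal sketch in Python using only the standard library. It assumes that every part is a TSV file with the same header row; the column name `life_id` and the file names are hypothetical placeholders, not the actual names used in the Digital Panopticon export. Rows are de-duplicated only when they are identical in every column, so that several legitimate records for one convict (such as multiple trials) are kept.

```python
import csv
from collections import defaultdict


def merge_tsv_parts(paths):
    """Read several TSV downloads and drop rows that are exact duplicates.

    Assumes all files share the same header row, in the same column order.
    """
    seen = set()
    merged = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle, delimiter="\t"):
                fingerprint = tuple(row.items())
                if fingerprint not in seen:
                    seen.add(fingerprint)
                    merged.append(row)
    return merged


def group_by_person(rows, id_column="life_id"):
    """Group rows by a (hypothetical) identifier column, one list per person."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[id_column]].append(row)
    return dict(grouped)
```

A typical use would be `rows = merge_tsv_parts(["part1.tsv", "part2.tsv"])` followed by `people = group_by_person(rows)`. Depositing the merged file, along with the script that produced it, is one way to recover some of the reproducibility that is lost once analysis moves offline.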

4. The Way Forward

Figure 11: Web-based research resource

Looking to the future, perhaps what we need to create is a new type of research-focused online digital platform which could host multiple data sources, including those created by users from their own research or adapted from the data available on the platform; which both links records together automatically and allows users to do this themselves (whether manually or through specified algorithms); and which provides a comprehensive search facility to identify records for further analysis.  Users would be able to conduct a variety of types of computational analysis (linguistic, statistical, data mining) online, but also download data for analysis offline. They would then be able to post their results, and any modified or new databases, back onto the site. 

As you can see from the crude diagram (and title) in Figure 11, such a resource would involve creating different levels of user, and user engagement, with the site. There could be, for example, public users, who would be able to search and analyse data, suggest corrections, and enter into dialogue with other users (black and green lines); and academic researchers, who would in addition be able to upload their own datasets and link records together (red lines). Additional levels of user engagement are possible.
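
One way to picture these levels of user is as sets of permissions, with each higher level adding to the one below. The sketch below is purely illustrative: the permission names and the `can` helper are assumptions made for this example, and no such platform or code yet exists.

```python
from enum import Flag, auto


class Permission(Flag):
    SEARCH = auto()
    ANALYSE = auto()
    SUGGEST_CORRECTION = auto()
    DISCUSS = auto()
    UPLOAD_DATASET = auto()
    LINK_RECORDS = auto()


# Public users: search, analyse, suggest corrections, talk to other users.
PUBLIC_USER = (Permission.SEARCH | Permission.ANALYSE
               | Permission.SUGGEST_CORRECTION | Permission.DISCUSS)

# Academic researchers: everything a public user can do, plus uploads and linking.
ACADEMIC_RESEARCHER = (PUBLIC_USER | Permission.UPLOAD_DATASET
                       | Permission.LINK_RECORDS)


def can(role: Permission, action: Permission) -> bool:
    """Return True if the given role includes the given permission."""
    return action in role


print(can(PUBLIC_USER, Permission.UPLOAD_DATASET))          # False
print(can(ACADEMIC_RESEARCHER, Permission.UPLOAD_DATASET))  # True
```

Further levels of engagement could be added in the same way, by defining another combination of permissions.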

To make this manageable, such a platform would need a specific subject focus, where the range of data involved shares some common characteristics. As a historian of crime, I can imagine a platform for historical data on British criminal justice, where trial records, records of punishments (executions, transportation, imprisonment), and qualitative data such as depositions, petitions, and newspaper reports could be hosted, interlinked where appropriate, and made available for both online and offline analysis. Historians of crime and criminal justice are forever creating their own Access and Excel datasets, and this would allow them to upload those datasets, and combine them as appropriate with the other existing data. A side benefit would be that such a platform would encourage greater collaboration between scholars working on these topics. While such a site would be less publicly focused than the Digital Panopticon, and perhaps more difficult for inexperienced users to consult, it would still be publicly available to anyone who wished to invest the time and energy to familiarise themselves with how it works.

Such a resource would, of course, need to be actively maintained, to ensure its continuing functionality and prevent inappropriate use. It would thus need a funding mechanism that met the cost of this oversight over time, in contrast to the short-termism of existing funding arrangements. If all that was required was maintenance of the site (rather than the creation of new datasets or functionalities), however, the cost would be modest.

No doubt there are other subjects which would benefit from such a web-based resource.  Perhaps a body of texts written by a single author, or from a single genre, or a collection of images or objects—the possibilities are endless.

The way forward for internet-based research resources, therefore, is for projects like the Digital Panopticon to rebalance the competing demands of public impact/accessibility and the needs of research more towards the latter, without excluding anyone who wants to consult the material. That would require an appreciation on the part of the funders, notably the UK Research Councils, that in the rush to obtain ‘impact’ they are in danger of neglecting the needs of academic research. As someone who worries that he has fallen into that trap, I regard this as the primary lesson I have learned from the Digital Panopticon project.

