How language technology can assist legal scholarly research

Within legal AI, natural language processing (NLP) techniques provide valuable information, derived from texts and knowledge bases, for a wide variety of legal analyses. They provide a bridge between the linguistic surface structure and the underlying conceptual content. Identifying, extracting, and formalizing legal knowledge remains a highly knowledge- and labour-intensive task, creating a significant bottleneck between the semantic content of the source material, expressed in natural language, and the automatic, computer-based use of that content. Increasingly, NLP techniques are applied to assist knowledge acquisition from text.

In this paper we concentrate on a close-reading setting, in which legal scholars interpret texts in depth in order to address their research questions.

Integrating NLP techniques into legal interpretation workflows customised to scholars’ research needs entails a set of automated analysis tasks that assist legal scholars, and a feedback procedure between automated results and manual legal interpretation. This integration of manual and automatic analysis of textual material aims at maximizing the acquisition and exploration of the conceptual structure of legal domains. The NLP results are therefore not a one-step solution, but provide textually derived information and structured knowledge in an incremental and flexible way. This knowledge can then be used for further exploration and interpretation by experts and eventually, if required, be formally modelled as an ontology.

NLP methodologies use techniques such as linguistic analysis, named entity recognition, term extraction and relation extraction. The extracted information is associated with the source texts through text metadata in the form of annotations. For the creation and presentation of these annotations we make use of state-of-the-art tools such as GATE. Together with information from external resources such as term banks and ontologies, these annotations will result in an integrated knowledge structure that makes semantic content explicit and accessible for manual expert interpretation and evaluation. This knowledge base will be built semi-automatically through a collaborative effort that combines language technology with legal expertise for interpretation and modelling.
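
The idea of associating extracted information with the source text through annotations can be illustrated with a minimal sketch. It is not the GATE API and not the pipeline used in this work; the Annotation class, the annotate_terms function, the example sentence and the two-entry term bank are all hypothetical, chosen only to show how stand-off annotations (character offsets plus features) might link a term bank to a text without altering it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A stand-off annotation: it points into the text instead of modifying it."""
    start: int                      # character offset where the span begins
    end: int                        # character offset just past the span
    type: str                       # annotation type, e.g. "Term"
    features: dict = field(default_factory=dict)


def annotate_terms(text, term_bank):
    """Return one annotation for every occurrence of a term-bank entry in text.

    term_bank maps a surface term to a concept label (both hypothetical here).
    Matching is a plain case-insensitive whole-word search, far simpler than
    real term extraction.
    """
    annotations = []
    for term, concept in term_bank.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                Annotation(match.start(), match.end(), "Term", {"concept": concept})
            )
    # Sort by position so the annotations follow the reading order of the text.
    return sorted(annotations, key=lambda a: (a.start, a.end))


# Hypothetical example data, for illustration only.
text = "The data controller shall notify the supervisory authority."
term_bank = {
    "data controller": "Controller",
    "supervisory authority": "SupervisoryAuthority",
}

annotations = annotate_terms(text, term_bank)
for a in annotations:
    print(a.start, a.end, a.type, repr(text[a.start:a.end]), a.features)
```

Because the annotations only refer to offsets, the source text stays untouched, and an expert could add, correct or remove annotations as part of the feedback procedure described above.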

To illustrate this approach, we will present a case study on the integration of NLP tasks into a scholarly workflow.