5. Designing an ontology

There are already numerous data ontologies for different knowledge domains, and the Web Ontology Language (OWL) is a very sophisticated language designed specifically for creating ontologies. It is built on RDF and is commonly written in an XML syntax. The XML examples on page 4.3 do not use OWL, in order to keep them simple, and I am of the view that OWL is over-engineered for the requirements of most arts, humanities and social science researchers (there are those who like over-engineered solutions, of course!).

The ontology which has received most attention in the arts and humanities is the CIDOC Conceptual Reference Model (CIDOC CRM), which describes concepts that are relevant to cultural heritage and museum documentation.

Creating a new ontology from scratch, or even refining an existing model, is first and foremost a research process and an extension of the typology development that is common at the outset of social science research. This is different to the process of applying the ontology to data, which is often an editorial process in that we have to identify, judge and classify what we consider to be significant entities.

The process of ontology development is usually iterative, because most knowledge domains and most primary and secondary sources are too conceptually complex to be fully understood and described at the outset. Instead, we have to begin with a first version of the ontology and then elaborate and refine it as we apply it to our data.

Creating an ontology requires subject experts who understand their data and the data characteristics (entities, attributes and relationships) that are significant within their knowledge domain. Subject experts can begin drafting a data ontology using the handwritten approach, but the work also requires a data scientist or a research software engineer who is able to translate what the subject experts want into an appropriate technical schema.
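
As a rough illustration of what that translation step might look like, here is a minimal Python sketch of a first-draft ontology expressed as entities, attributes and relationships. Everything in it is an assumption made for the example: the class names (Entity, Relationship, Ontology), the entity names (AudienceMember, Film, Venue) and their attributes are hypothetical and are not taken from any real project schema. The one check it performs is to list entities that a relationship refers to but which nobody has defined yet, which is a common gap in an early draft.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A kind of thing the subject experts consider significant."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A named, directed link from one entity to another."""
    subject: str
    predicate: str
    object: str


@dataclass
class Ontology:
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_entity(self, name, attributes=()):
        self.entities[name] = Entity(name, list(attributes))

    def relate(self, subject, predicate, obj):
        self.relationships.append(Relationship(subject, predicate, obj))

    def undefined_entities(self):
        """Names used in relationships that were never declared as entities."""
        used = {r.subject for r in self.relationships}
        used |= {r.object for r in self.relationships}
        return sorted(used - set(self.entities))


# Hypothetical first draft for an audience study
draft = Ontology()
draft.add_entity("AudienceMember", ["age_group", "region"])
draft.add_entity("Film", ["title", "year"])
draft.relate("AudienceMember", "watched", "Film")
draft.relate("AudienceMember", "attended", "Venue")  # Venue is not defined yet

print(draft.undefined_entities())  # ['Venue']
```

A structure like this is not a substitute for a proper schema, but it gives subject experts and the engineer something concrete to review together before committing to a format.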

Technical schemas for ontologies can be created manually using a software program as simple as a text editor or database software, or by using software programs that are specifically designed for ontology engineering, such as Protégé.

Based on the experience of a project such as Beyond the Multiplex, an ontology can become too complex too quickly, and we need to balance its size and detail against the time it will take to implement the ontology using real data. There is no point developing a model that is too big to implement consistently across all your data, and there is no point developing a model that has entities which are not relevant to your knowledge domain or topic of research. The tendency to over-complicate an ontology usually requires a process of rationalisation at a later stage, which can be cumbersome if the model has already been applied widely to data. This is especially the case where entities that are actually synonyms of each other have crept into the model. For example, in an ontology concerned with audience behaviour, it is not helpful to have "sad" and "depressed" as distinct entities.
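
To show what rationalising synonyms can involve once a model has already been applied to data, here is a minimal Python sketch. It assumes the coded data is a simple list of (item, label) pairs and that the team has agreed a merge table by hand. The table contents, the item identifiers and the function names (canonical, rationalise) are all hypothetical. Deciding which entities really are synonyms remains an editorial judgement, and the code only applies that decision consistently.

```python
from collections import Counter

# Hypothetical merge table agreed by the team: redundant entity -> entity we keep
SYNONYMS = {"depressed": "sad", "joyful": "happy"}


def canonical(label, synonyms=SYNONYMS):
    """Return the entity to keep for a label, after normalising case and spaces."""
    label = label.strip().lower()
    return synonyms.get(label, label)


def rationalise(coded_data, synonyms=SYNONYMS):
    """Rewrite every (item, label) pair so that it uses the canonical entity."""
    return [(item, canonical(label, synonyms)) for item, label in coded_data]


# Hypothetical data that was coded before the synonyms were noticed
coded = [
    ("interview-01", "sad"),
    ("interview-02", "Depressed"),
    ("interview-03", "happy"),
    ("interview-04", "joyful"),
]

merged = rationalise(coded)
counts = Counter(label for _, label in merged)
print(counts)
```

With the merge table above, the four coded items collapse onto two entities. The more widely the original model has been applied, the more records a step like this has to touch, which is why it pays to catch synonyms early.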


URL for this page
https://www.dhi.ac.uk/blogs/ontology-guide/designing-an-ontology