1. Overview

A selection of responses from the critique of existing resources:

  • There is a lack of clarity about what content is available and how it can be searched.
  • Contextual information for dating is missing: e.g. How have dates been recorded? Is it according to the date on a pamphlet or the date they know it is?
  • Search is often not language-independent.
  • Scrolling back to the top to check search terms or parameters is an annoyance.
  • Having to develop ‘work-arounds’ by adapting search methods to the vagaries of certain resources.
  • ‘Swiss-army’ interfaces that attempt to provide apparatus for all conceivable users can become cluttered and difficult to use; simplicity on the surface is preferred, with the option to dig deeper and look ‘under the bonnet’ to set parameters at a finer level.
  • Frustration with attempting all kinds of variations on search and searching within items but nothing useful coming back.
  • Mistrust of tool if different results are returned for the same search.
  • Keywords do not bring up results that are known to be there.
  • Having to resort to using the browser’s search function to find specific words in the text.
  • Having to resort to the index to find search term.
  • Titling issues: same terms are searched over and over yet new results keep appearing.
  • Text partnerships are making more texts searchable but non-expert inputting means mistakes are made in deciphering letters etc.
  • Results are often returned in an incoherent way with no indication of how they have been ordered.
  • Certain resources are difficult to search in a nuanced way and cannot be scoped for literary analysis, for example: you need to know what you’re looking for.
  • The metadata attached to entries can be very inconsistent or too broad or too specific: inappropriate categorisation can hinder searching.
  • Many tools are not suited to linguistic research.
  • Fuzzy matching is not useful for linguistic research.
  • Attitudes to grammatical forms are often unclear.
  • ‘Grab and go’ – many researchers prefer to export data and work with it outside the search interface.

Four main themes emerged from the Design Groups: searching, personalising, citing and sharing. The key aspects within these themes are as follows:

1. Searching:

  • Context and help.
  • Browsability.
  • Layering/ordering.
  • Self-refining.
  • The “third” page.
  • Visualising a search network.

2. Personalising:

  • Account.
  • Saving searches and creating subsets.
  • Preferences.
  • Ownership.

3. Citing:

  • Reliability of results.
  • Validity.
  • Trusting tools.
  • Open access to data.

4. Sharing:

  • Research community.
  • Affiliation.
  • Feedback (reporting errors).
  • Future proofing.

These themes also reflect many of the findings from the survey and focus groups and present a background structure of search. In this section the specific aspects within each theme will be explored and illustrated using extracts from Design Group and meeting transcripts, including participant designs and prototype screenshots where appropriate.

2. Searching

2.1. Context and Help

It has been a recurring observation that researchers want and need to know where they are with a resource. Many digital resources do not locate the material they hold within the broader context of the research area and this can be a problem. Researchers want:

“Really detailed description all the way through of how the records are stored or how you’d find the resource”.

They also want to know extra details and information, such as date and place of publication, author, printer, version, and have these easily accessible.

This extends to looking at individual search results where it’s important to know:

“Where it is in the newsbook and whether this is… headline news or not whether it’s at the top of the title page or on to the second page.”

It is not something that is felt should be built into search but should at least be accessible information in order to evaluate for oneself.

There is also the contextual information that comes from seeing the original source material. This depends on the quality of the digitisation, which is unfortunately fairly poor in the case of the Thomason tracts; yet for most researchers access to the original digitised images is extremely important, as they would use a transcription only as a method of checking. Figure 1, below, shows how the group designed for the original and the transcription to sit side by side, with keyword search terms highlighted in the transcription.

Figure 1: Prototype screenshot: result page with keyword highlighting.


Throughout the project we have found that very few people read search help documents. Contextual help has therefore been incorporated (see the bottom of fig. 1 above), providing guidance where necessary, for example for navigating the digitised images.

2.2. Browsability

The discourse between newsbook issues, and a desire to link themes, meant that users were interested in more flexible ways of moving through the resource (fig. 2). They also wanted the resource to accommodate tangential search patterns that follow a particular train of thought while retaining the search paths already taken.

“There’s a discourse between issues that would be nice to just be able to pull one thing up then you can go back or forward.”

Figure 2: Options for browsing around a result page: previous/next issue; title/previous/next page.


2.3. Layering/Ordering

The group wanted different ways of layering and ordering: chronologically, alphabetically, chronologically by newsbook title, by relevance, and so on. They wanted to have a lot of control over the process and for records to include as much detail as possible.

“Having it come up chronologically and quite detailed chronologically so like 8th June 9th June rather than EEBO you have to continually redo it and if you click through on a heading it’ll just sort it alphabetically again and it’s also only sorted by years so 1641 1642 which can be quite misleading because sometimes they go by the date on the pamphlets and sometimes they go by the date that they know it is.”

Figure 3: Prototype screenshot: Sort by…

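The orderings the group asked for can be illustrated with a minimal sketch. It assumes each hit carries a full issue date (to the day, not only the year) and a newsbook title; the `Hit` record, the `SORT_KEYS` table and the sample records are hypothetical and are not taken from the prototype.

```python
from datetime import date
from typing import NamedTuple


class Hit(NamedTuple):
    title: str      # newsbook title
    issued: date    # full issue date, to the day
    snippet: str    # matched text


# Each named ordering maps to a sort key; all names here are illustrative.
SORT_KEYS = {
    # strictly by date, with title as a tie-breaker
    "chronological": lambda h: (h.issued, h.title.lower()),
    # alphabetically by title, then chronologically within each title
    "by_title": lambda h: (h.title.lower(), h.issued),
}


def sort_hits(hits, order="chronological"):
    """Return a new list of hits in the requested order."""
    return sorted(hits, key=SORT_KEYS[order])


# Made-up records for demonstration only.
hits = [
    Hit("Newsbook B", date(1642, 6, 9), ""),
    Hit("Newsbook A", date(1642, 6, 9), ""),
    Hit("Newsbook B", date(1642, 6, 8), ""),
]
for h in sort_hits(hits, "chronological"):
    print(h.issued.isoformat(), h.title)
```

Because the chosen ordering is just a name looked up in a table, it could be stored once and reused for every result list, which avoids the continual re-sorting the participant describes.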

2.4. Self-Refining

There was a definite sense of not having everything predefined and getting involved in the refining process from the beginning of a search:

“Flexibility to be able to say we don’t need that, we don’t need that, we do need this.”

The group came up with a show/discard tool (fig. 4), putting the user back in control of their search results and allowing them to decide relevance for themselves. This was expanded by the developers to include a list of options, which incorporated various other refining choices requested by the group, such as only showing hits from certain titles, years, months or the current issue (figs 5 and 6).

Figure 4: Group design detail: Show/discard tool.


Figure 5: Prototype screenshot: Show tool.


Figure 6: Prototype screenshot: Discard tool.

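One way to think about the show/discard tool is as a set of user-chosen refinements applied on top of an unchanged result list. The sketch below is only an illustration under that assumption: the `Refinement` class, its field names and the sample hits are hypothetical and do not describe how the prototype was built.

```python
from dataclasses import dataclass, field


@dataclass
class Refinement:
    """User-chosen refinements; an empty 'show' set means no restriction."""
    show_titles: set = field(default_factory=set)
    show_years: set = field(default_factory=set)
    discarded: set = field(default_factory=set)   # ids of hits the user rejected

    def discard(self, hit_id):
        self.discarded.add(hit_id)

    def restore(self, hit_id):
        self.discarded.discard(hit_id)

    def apply(self, hits):
        """Return the hits still visible; nothing is removed from 'hits'."""
        visible = []
        for hit in hits:
            if hit["id"] in self.discarded:
                continue
            if self.show_titles and hit["title"] not in self.show_titles:
                continue
            if self.show_years and hit["year"] not in self.show_years:
                continue
            visible.append(hit)
        return visible


# Made-up records for demonstration only.
hits = [
    {"id": 1, "title": "Newsbook A", "year": 1642},
    {"id": 2, "title": "Newsbook B", "year": 1642},
    {"id": 3, "title": "Newsbook A", "year": 1643},
]
refine = Refinement(show_titles={"Newsbook A"})
refine.discard(3)
print([h["id"] for h in refine.apply(hits)])   # [1]
```

Keeping the full result list intact and filtering only the view means a discarded hit can always be restored, which fits the participants' concern about not missing potentially significant results.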

2.5. The “Third” Page

One of the aims of the project is to locate search more firmly in actual research practice and as such the boundaries of search were pushed as far as possible, arriving at the idea of a “third” page, which might dig deeper into notions of voice and authority and refining those. The conclusion drawn from this discussion was that:

“To a certain extent that third page is in your head… if we go further especially into the issues like voice, I don’t know how you would represent that in search”.

2.6. Visualising a Search Network

The group were thinking about search in broad terms, locating it within their own research practice, and thus linking search to how they do their research and how they get meaning from it. This entails visualising a network of search findings and looking for links that might add to their understanding of a particular set of search results. One participant offered a diagram of how she might visualise a search during the Interaction Analysis Workshop.

Figure 7: Conceptual research network 1.


3. Personalising

3.1. Account

The ability to personalise a site was given a lot of weight throughout discussions and the idea of some form of account system was raised time and again. The group wanted to be able to save searches and results into their own categories, and label, annotate and comment on results (fig. 8). At the same time they were conscious that:

“If it’s doing something new it has to work within your workflow or it has to work within your routines”.

Figure 8: Group design detail: Account.


This could explain why what they came up with had much in common with a web browser favourites tool, which allows you to create folders and subfolders in which to save search lists and specific webpages.

This manifested itself in the prototype as a personal ‘workspace’ (figure 9 below).

Figure 9: Prototype workspace.


3.2. Saving Searches and Creating Subsets

There was some resistance from the developers at first, owing to previous experience of implementing account systems for digital resources that then had minimal uptake. As an initial compromise they provided a sessional bookmarking function, which the group were happy with, although they continued to press for an account system. From here, the group wanted to be able to save pages of results and individual results into their own categories and subsets so they could compare and link themes. They also wanted to be able to comment on and label results and asked for a non-linear search history that would trace the more complex search pathways, to show that, “it’s not just that you searched for this, but you searched for this, then this”.
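A non-linear search history of the kind requested could be modelled as a tree, in which each search remembers the search it was made from. The following is a minimal sketch under that assumption; `SearchStep`, `branch` and `path` are hypothetical names and the queries are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SearchStep:
    query: str
    parent: Optional["SearchStep"] = None
    children: List["SearchStep"] = field(default_factory=list)

    def branch(self, query):
        """Record a follow-up search made from this step."""
        child = SearchStep(query, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Queries from the first search down to this one, in order."""
        steps, node = [], self
        while node is not None:
            steps.append(node.query)
            node = node.parent
        return list(reversed(steps))


# Invented queries: one main thread and one tangent from the first search.
root = SearchStep("parliament")
army = root.branch("army")
arrears = army.branch("arrears")
king = root.branch("king")

print(" > ".join(arrears.path()))   # parliament > army > arrears
print(" > ".join(king.path()))      # parliament > king
```

Unlike a flat list of past queries, the tree keeps tangents separate from the main thread, so a user can see not only what was searched for but the route by which each search was reached.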

In the third Design Group, prompted by scepticism on the part of the development team, we made a point of giving participants time to test these particular elements and to give feedback on whether and how they would use these functions and, if not, why not: whether because of the way in which the functions had been developed, or because participants realised through using them that they were not actually necessary. The group immediately took to the idea of having their own workspace, and while they had improvements to suggest, they left us in no doubt that they would use it within their workflow and find it extremely useful for their own purposes as well as for other activities, such as teaching.

3.3. Preferences

The group also wanted to be able to set their preferences and for those to be maintained as part of a user profile on their account:

“Something where you set it and then for the entirety of the session it did that so like the kind of need to continually reset how you want to view results could get really irritating”.

This was raised in reference to ways of ordering results (chronologically, alphabetically, by relevance or otherwise) and to any other aspect of the site for which options were subsequently offered to configure it to a user’s personal specification.

3.4. Ownership

Ownership will be elaborated on more fully in the following sections. Briefly, however, a personal account would allow a user some ownership over search processes. The participants were interested in the idea of an option to keep things private or share search findings selectively with the wider research community or with students for teaching purposes.

“I’m kind of thinking it would be very useful for teaching, to show an example search… and they can then show their searches so we can discuss them in class.”

Figure 10: Option to make annotations public.


An account is not only useful for mapping, managing and manipulating your results; it is also practical in terms of methodology, accountability and citation, which will be covered in the following sections. In fact, the account became an important aspect of the resource design that could no longer be put off.

4. Citing

4.1. Reliability of Results

A resource can very quickly lose the trust of its users and from that point it is difficult to win it back. The group complained about identical searches on sites producing different sets of search results and not providing any or enough information about provenance, edition, date or whatever the case may be.

The way the design has evolved has led to the onus being on the researcher to work with the results:

“What I prefer about the work that’s already been done here is it gives me everything and then I get to choose which bits I keep and get rid of”.

The participants explained that some sites which attempted to define search aspects too narrowly made them afraid of missing potentially significant results. They would much rather do this themselves and feel satisfied that they have captured everything.

4.2. Validity

The account system begins to address some of the questions surrounding overall resource validity because users can be responsible for the work they are doing and illustrate their methodology effectively. One participant commented:

“Can you own up to the fact that you looked and this was the result you got from all of your searching, and just to be able to have that comfort of saying, actually I can”.

In addition, it was concluded that to enhance claims of validity, any shared comments and notes should be accompanied by institutional affiliation.

4.3. Trusting Tools

When the observations from the Design Groups were initially reviewed, the developers were surprised that the participants were asking for things they felt were already provided by the browser itself, such as bookmarks and favourites. We concluded that while they might be aware of these, they perhaps do not completely trust them, hence a desire to internalise these features.

Participants were adamant that they needed a way of retracing their steps within the resource because, for example, they were not entirely sure what the browser back button would do. Seen from this perspective, the developers readily acknowledged why users might be nervous: on some sites the back button does indeed return a user to the homepage or cause them to leave the site completely. The developers commented that:

“Everyone’s used to tiptoeing around their computer expecting everything to break, which goes for every aspect of using a computer”.

Trust also comes from being a part of the user community sharing the resource, where it does not remain static but feedback is evaluated and acted upon.

“The trust comes from ‘now I’m a logged in user, I can save stuff’”.

Participants returned again and again to the idea of simplicity, in that a lot of sites that try to provide a ‘Swiss army’ interface lose users’ trust by trying to define everything for them and becoming too complex and unmanageable. When discussing ‘look and feel’, it was noted how content they were to leave the resource as it was, basic and clutter-free, with all the essentials and no unnecessary frills.

4.4. Open Access to Data

With the huge push towards open access to research data, researchers will become increasingly responsible for their data curation rather than just scholarly output in terms of publications. The group talked a lot about persistence of URLs and project version control to ensure that citations of search threads they might draw from the resource would remain valid.

There are also arguments to be made for citing the digital resources themselves rather than searching out references for the corresponding analogue material. Doing so promotes their prestige, since some researchers feel that online resources do not have the same validity as printed ones, and it acknowledges that the users may have mediated the data, which should be recognised in some way.

5. Sharing

5.1. Research Community

From the outset, the recurrence of the notion of engaging and connecting a research community has been striking. Most scholars we have spoken to seem willing to share certain aspects of their research and how they have used digital resources, whether this is with colleagues or students. For example, having a “feedback loop built in so future researchers would say people looking for these certain texts might try these terms”.

When speaking about individual search methodologies, almost everyone said that, in searching for literature and resources, making contact with colleagues in the field would often be one of their first ports of call.

“I just try and arrange a meeting with the people I know in the same way as dissertation students of MA or PhD actually come to me because I may be able to make links they wouldn’t think of by searching they would find standard texts but I could say do you know there’s another body of literature here, completely different wording so you’d never find it by search but they’re talking about related things and it could be relevant because of that.”

The willingness of academics to feed back the insider knowledge they have gathered over the course of their careers was something we did not initially expect to find. Throughout the Design Groups the participants have continued to place a high value on this type of ‘search’ and have made a concerted effort to build this research community aspect into their resource design with the functionality to share comments, corrections and notes with other users.

5.2. Affiliation

As previously mentioned, the addition of a method of commenting and note-taking must involve a process of logging in to join the community of users and thus identifying oneself as belonging to an academic institution.

“If you have access to this because you belong to a research institute or university that has access that allows you to tag…”

For contributions of this kind to have validity with other users, contributors must be able to cite their sources, and those sources must be traceable and reputable.

5.3. Feedback (Reporting Errors)

The accuracy of the rekeying done on the vast amount of newsbooks data for this project was promised to be very high (99.95%). However, given the huge quantity of data, there will inevitably be errors and gaps. The users felt that, for any resource, it would be useful to be able to correct errors or flag them up for future users as you come across them, and again these users would need to prove their expertise in order for their contribution to be valid. They suggested the possibility of giving certain registered users advanced privileges to make direct changes to the transcription, or as an intermediate step, for users to flag up corrections in public annotations by marking them with “REVIEW” or similar so they would be easy to find.

5.4. Future Proofing

It was generally agreed that digital resources should be evolving with the wider research community and augmentable by those accessing them.

“You can never totally future-proof something like this, you can make it open and hope that it is as future-proof as possible.”

This also links back to concerns about reliability and trust in a resource. The participants felt that if you could see that a site was implementing changes suggested by logged in users and was growing and developing to fit the community, then they could put more trust in the results they were achieving and access this added value.

6. Disciplinary Differences

Some distinct disciplinary differences have been noted throughout the design work, and in the Design Groups the participants have thought about how to incorporate their different requirements and ways of working.

For History researchers, precise dating is very important, or at least some contextual information about how dates are handled in the resource.

“In the 1640s the calendar year began on 25th March and ran until 24th March so say a date like say I want 1st February 1641 according to our Gregorian calendar, that would have been 1st February 1640 in the calendar of the time.”
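The year-start convention described in this quotation can be made concrete with a small sketch. It assumes dates are stored with the year reckoned from 1 January and derives the year number as it would have been reckoned at the time; the names `NewsbookDate`, `old_style_year` and `dual_year_label` are hypothetical and not part of the prototype. It deals only with when the year number changes, not with the ten-day difference between the Julian and Gregorian calendars in this period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsbookDate:
    year: int   # year reckoned from 1 January (modern convention)
    month: int
    day: int


def old_style_year(d: NewsbookDate) -> int:
    """Year number under the convention that the year begins on 25 March."""
    if (d.month, d.day) < (3, 25):
        return d.year - 1
    return d.year


def dual_year_label(d: NewsbookDate) -> str:
    """Label showing both reckonings where they differ, e.g. '1640/41'."""
    os_year = old_style_year(d)
    if os_year == d.year:
        return str(d.year)
    return f"{os_year}/{d.year % 100:02d}"


d = NewsbookDate(1641, 2, 1)
print(old_style_year(d))    # 1640
print(dual_year_label(d))   # 1640/41
```

Displaying a dual label of this kind alongside each record is one possible way of giving users the contextual information about dating that they asked for, without the resource having to choose one reckoning on their behalf.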

For English Literature researchers the ability to compare and link sources was central; our participant commented on her extensive use or ‘abuse’ of tabbed browsing and was interested in integrating something similar into the newsbooks resource. She also appreciated having associated links to records that she trusted:

“Records come from the ESTC [English Short-Title Catalogue] so they tend to be fairly accurate”.

For Journalism researchers an idea of variances between editions was significant:

“Here’s the text, here is the image that comes from and here are other versions of it because it’s maybe more of a modern issue not just other versions like if it’s drawing on multiple libraries here or prints of it but editions matter so the first edition of a story doesn’t include a quote that the second edition does”.

The linguistics approach stood out as being significantly different from other disciplines and is very often ignored by resources whose search options clearly reflect certain research interests (particularly historical) and are tailored to those. The differences would undoubtedly be more obvious for applied linguistics, which is in fact completely separate and whose methods are therefore not relevant to this project; however, the participant in our Design Groups is a historical sociolinguist and as such her search strategies are quite similar to those used in history. She did flag up certain annoyances with historically-driven resources and to reflect this we have attempted to incorporate features to accommodate linguistic research. These have been confirmed by other participants from different disciplinary backgrounds as useful additions for their own work. Particular aspects included: saving search results (to protect yourself from site upgrades), removing or hiding certain results, and a ‘jump to page’ option for search results.