1. Survey background
A landscape survey created in Survey Monkey was launched on 7th March 2012. It was initially distributed by email to departmental administrators at fifty universities UK-wide, to be forwarded to PhD students and academic staff. The universities were: University of Sheffield, Sheffield Hallam University, University of Cambridge, University of Oxford, Goldsmiths, King’s College London, London School of Economics, UCL, University of Leeds, University of York, Durham University, Lancaster University, University of Liverpool, University of Manchester, University of Birmingham, Newcastle University, University of Sussex, University of Surrey, University of Central Lancashire, Birkbeck University of London, University of Edinburgh, University of Glasgow, University of Aberdeen, University of Stirling, University of Northampton, St Mary’s University, University of Hertfordshire, University of Hull, University of Lincoln, University of Exeter, Bristol University, Royal Holloway, Queen’s University Belfast, Cardiff University, University of East Anglia, University of Kent, Aberystwyth University, University of Southampton, University of Nottingham, University of Leicester, University of Reading, University of St Andrews, University of Dundee, University of Bedfordshire, University of Bath, University of Chester, University of Derby, University of Ulster, University of Warwick, Keele University.
It was subsequently also publicised through the project blog (http://psdnewsbooks.wordpress.com) and the HRI and AHRC Twitter feeds. It appeared on the JISC webpages as a news item and was also advertised through Facebook, LinkedIn and via representatives of the Higher Education Academy. To boost the response rate towards the end of the collection period, the survey link was sent personally to academic staff at a smaller selection of universities, primarily those of the White Rose University Consortium and London universities, as we hoped to identify focus group and Design Group participants from these locations. The survey was closed on 8th May 2012 when the target of 500 responses had been reached.
A total of 509 responses to the survey were received. From this point on, the percentage of respondents is calculated on those who answered each particular question rather than the total number of responses received, as the structure of the survey made it necessary to allow respondents to skip questions. The analysis is undertaken per page of the survey and each question acts as a subtitle. Diagrams are provided by the Survey Monkey ‘Analyze Results’ function. The full survey can be found in Appendix 1.
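The per-question basis for percentages described above can be illustrated with a minimal sketch. The function name and the counts of 500 and 200 are hypothetical and chosen only for illustration; only the total of 509 responses comes from the report.

```python
def percent_of_answered(option_count: int, answered_count: int) -> float:
    """Share of respondents choosing an option, as a percentage of those who answered."""
    if answered_count <= 0:
        raise ValueError("no respondents answered this question")
    return round(100 * option_count / answered_count, 1)


# Hypothetical counts for illustration only (not taken from the survey data).
total_responses = 509      # total responses received, as stated in the report
answered_question = 500    # hypothetical: respondents who did not skip the question
chose_option = 200         # hypothetical: respondents who selected one option

# Basis used in this report: only those who answered the question.
per_question = percent_of_answered(chose_option, answered_question)
# Alternative basis (not used here): all responses received.
per_total = percent_of_answered(chose_option, total_responses)
print(per_question, per_total)
```

Under these assumed counts the two bases differ by less than one percentage point, but the gap widens for questions that many respondents skipped.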
2. Survey Questions, Diagrams and Initial Analysis
Question 1: What is your current role? How many years have you been undertaking research?
38.7% of respondents were PhD students, who were among the first to respond; 8.1% were post-docs and research assistants/associates; 53.2% were academic staff, with emphasis on lecturers and professors. This may correlate with 40.1% stating that they have been undertaking research for 0-4 years (early career researchers and PhD students) and 29.1% stating 14+ years.
Question 2: What is your research discipline?
History provided the greatest number of responses (40.9%), probably owing to History departments being present in the majority of universities surveyed and having a large number of staff per department. Greater interest in the project’s test dataset may also have prompted a higher response rate. English Literature (24.2%) and Politics (22.8%) also achieved a high response rate, due, it must be assumed, to the targeted universities having vibrant research cultures in these disciplines. ‘English Language’ (7.7%) and ‘Journalism’ (4.5%) were less well represented, perhaps owing to the fact that the survey did not lend itself well to answering from the perspective of English Language research, whilst Journalism is not currently as widely taught or researched as other disciplines.
Question 3: What type of research methods training have you had?
This was a multiple response question. 77.8% of respondents had undertaken PhD research training. An average of 26% had had each of the other types of training listed, with library skills being the most common at 41%. In the ‘other’ category, palaeography, codicology, and language training were frequently cited, which arguably fall under subject specific or advanced methods training; while many respondents stated they were essentially self-taught i.e. had developed research methods through the actual practice of conducting research (learn by doing, trial and error).
Question 4: For research purposes, do you use text-based or web-based sources?
93.7% of respondents answered that they use both types of source, which was an expected majority. 3.6% and 2.7% limited themselves to text-based or web-based sources respectively. Those declaring themselves primarily text-based were mostly PhD students of History, which is interesting as one might expect later career researchers to be more text-focused due to presumably having a broader knowledge base of available sources and literature. We must bear in mind, however, that the largest sampled groups were PhD students and those from the discipline of History, so this result could be somewhat skewed. Those working online spanned the various career stages but mostly came from the disciplines of Politics and then English Language, suggesting that the resources they require can be easily found and used online.
Question 5: What sources do you use in your text-based search strategy?
This was a multiple response question. Digital indexes and catalogues (75%) seemed to take preference over their counterpart printed sources for use in text-based search strategy. This could be to do with access or increased digitisation of research tools in physical libraries and archives. Chapters (80.4%) and journals (92.8%) are also commonly used. Browsing shelving systems (66.3%) seems also to be a regular practice; researchers could be looking for similar titles/authors/items that are known but were not identified through a catalogue or index. In the ‘other’ category, common themes were ‘word of mouth’, ‘footnotes and bibliographies’ and ‘personal databases and digital libraries’.
Question 6: What search components do you employ in a text-based search?
This was a multiple response question. Keywords (95.2%), names (92.8%) and titles (85.5%) were the most commonly employed search components in text-based search. Less emphasis was placed on dates (62.5%) and synonyms (24.1%), which are perhaps used more to expand searches than for initial steps. ISBNs (10.2%) are used infrequently, perhaps because they are not as easily called to mind as names or titles; it is more likely that ISBN information is passed between colleagues, which was only mentioned by 3 respondents in the previous question. Responses picked out of the ‘other’ category include: place, place of publication, language, acronyms, publishing date, cited by.
Question 7: Please briefly describe the stages of your text-based search process (access to archive, primary scoping etc)
It seems that researchers tend to have an initial, quite general, browse over a lot of material before defining some keywords to look at in detail and filtering what they discover through that process. Many commented that they are rarely working from scratch so know fairly quickly where to go to find what they need, while others mentioned browsing shelves to get an idea of what is around or make serendipitous discoveries, or using their own collections of books and journals.
Nowadays, for reasons of time and cost, many scholars do their initial research online to find out whether a trip to a library or archive is worthwhile for them before they commit to a short and intensive research trip, particularly if travel to another country is necessary. In this way, text and web approaches overlap somewhat.
Question 8: How do you judge when a text-based search is complete?
This was a multiple response question. The consensus seemed to be that a search can never be judged to be ‘complete’; however, the main reason for stopping was revealed to be practical constraints (such as time pressures and travel restrictions). This reinforces the notion that the search interface used to present the newsbooks must be intuitive and efficient to be of use to individuals who have many demands on their time. Quality control was significant, with 40.5% of respondents saying that verification through cross referencing completed their search, which links to having sourced and collected all valid data (40.8%).
Question 9: How do you collect and manage data from a text-based search?
This was a multiple response question. 77.5% of respondents stated that they used electronic file stores such as Word, Excel and Access to collect and manage data collected from text-based searches, which likely also encompasses indexing and bibliography creation. 52% still keep hardcopies for filing; this could be because some find it easier to read and make notes on paper, for security or simply out of habit. One particularly interesting response in the ‘other’ category stated ‘digital photography’, which was subsequently mentioned in focus group sessions as a way of recording resources in libraries and archives as an alternative to photocopying.
Question 10: Please give example(s) of archives/data sources you frequently use
Respondents gave many specific examples, which fell into the following categories: local and national library and archive catalogues, museum and gallery collections, special collections, subject specific collections, bibliographies, university library archives, institutional library and archive catalogues, rare book catalogues, journals, newspapers, official records, abstracts, among others.
Question 11: What sources do you use in your web-based search strategy?
This was a multiple response question. Search engines proved to be employed by the vast majority of respondents (91.2%), perhaps due to the generality of the search they offer, which is useful for primary scoping if working from a broad research question. Specialist databases, such as JSTOR and ProQuest, were used by 89.2% of respondents, perhaps because researchers already know that they can find what they are looking for there. This is also possibly something to do with validity, in that the collections found on these sites are seen as more trustworthy and verifiable. Alongside these, digital collections and official online records were also frequently employed (78.9% and 64% respectively). This shows that the digitisation of traditionally paper/text-based sources has become important for researchers, perhaps in terms of ease of access to such documentation online. There must be limitations on these resources in terms of keeping up to date with the most current events (this perhaps applies more to researchers in the fields of politics and journalism); a great deal depends on the speed of the digitisation process. For older sources this is not a problem unless they have only been partially digitised. What level of digitisation is important? Is it enough just to be able to view facsimiles online? This will vary from discipline to discipline and is something that was discussed further in the focus group setting. Interestingly, meta-search engines and information gateways were rarely used (6.1% and 7.6% respectively). This could be because respondents do not know they exist, what they do or how they could be useful. ‘Other’ responses included Google Scholar and Web of Knowledge, the US-based Open Library and Internet Archive, which are online library collections, and also online newspaper and journal databases.
Question 12: What search components do you employ in a web-based search?
This was a multiple response question. As predicted, keyword search was the most commonly used (98%) followed closely by names (90.4%) and dates (67.5%). These represent a more basic level of use in terms of search engine capability. Synonyms were less frequently used (29.5%) although this could be a problem of classification and differentiation between ‘keyword’ and ‘synonym’. On the other hand, a researcher might have a set of keywords in mind specific to the field of research, for which using synonyms after an initial search has failed would likely produce irrelevant results. Higher levels of search to both refine and broaden results were less widely used. Boolean logic was employed by 53.8% of respondents and would seem to be used to gain more specific results than simply inputting keywords, names and dates. Broadening strategies such as truncation and wild cards were used least frequently (28.7% and 22.2% respectively). In the ‘other’ category some interesting additions were made. Full text phrases, author, acronyms, titles, place of publication, publisher and cited by, were given as examples; some of these being similar to examples given of text-based search components. “” and + or – functions in Google were also mentioned. This led to focus group discussions about awareness of how web search works and the functionality that exists behind the scenes that there is no instruction for. Do researchers learn to use it effectively in their research process or get along adequately without such knowledge?
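To make the distinction between the refining and broadening strategies named above concrete, the following is a minimal sketch of Boolean logic, truncation and wild cards applied to three invented one-line documents. The documents, the helper names `tokenize` and `matches`, and the use of `*` and `?` as the truncation and wild card symbols are all assumptions for illustration; individual databases and search engines use their own syntax.

```python
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lower-case words with surrounding punctuation removed."""
    return [word.strip('.,;:!?"()').lower() for word in text.split()]


def matches(term, words):
    """True if any word matches the term; * stands for any run of characters, ? for one."""
    return any(fnmatchcase(word, term) for word in words)


# Invented sample documents, for illustration only.
docs = {
    "a": "The digitisation of newsbooks by the organisation",
    "b": "Digital newspapers and the organization of archives",
    "c": "Printed pamphlets in the civil war",
}
words = {key: tokenize(text) for key, text in docs.items()}

# Boolean logic narrows results: newsbooks AND NOT pamphlets.
boolean_hits = [k for k, w in words.items()
                if matches("newsbooks", w) and not matches("pamphlets", w)]

# Truncation broadens results: digit* covers "digitisation" and "digital".
truncation_hits = [k for k, w in words.items() if matches("digit*", w)]

# A wild card covers spelling variants: organi?ation covers both -s- and -z- forms.
wildcard_hits = [k for k, w in words.items() if matches("organi?ation", w)]

print(boolean_hits, truncation_hits, wildcard_hits)
```

In this sketch the Boolean query returns fewer documents than a plain keyword search would, while truncation and the wild card each return documents that an exact keyword would have missed.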
Question 13: How important are these aspects of navigation when searching on the web?
As expected, three of the four aspects of navigation given as examples were judged by the majority of respondents to be either important or very important.
Question 14: Please briefly describe the stages of your web-based search process (enter keywords into search engine, primary scoping etc)
Web-based approaches were very similar to text-based ones as might be expected, though the process seems to be condensed by the accessibility of resources online. Generally, a search would begin by entering keyword terms into a search engine, Google or otherwise, or into a subject specific resource for a general overview of what is available on a topic. The results of this would be ‘sifted’ to find reputable sites and references and the keywords honed and refined accordingly or more search parameters set on the basis of what has been found. This would be followed by printing, downloading or making notes on relevant results. Some said they would access material online if possible, whilst others still prefer to locate in hardcopy if possible and convenient. Comments about reputability and trust in resources found online are important and something we will return to in the focus groups and design groups.
Of course, as previously mentioned, web and text-based approaches are not mutually exclusive for the scholar of today and so it is difficult to separate them here. Admittedly, it is easier to go more quickly from broad scoping to specific keyword searches if working entirely online, as long as all the desired resources are accessible. That said, many respondents commented that web research is simply a means to an end; a way of finding out where relevant hardcopy materials are located. The ‘go-to’ method for many respondents was a Google search, which we found initially surprising; however, in terms of primary scoping and seeing what is out there in the topic area it proves to be very effective.
Question 15: If web resources offer instructions for using their search function, what do you normally do?
40.5% of respondents would refer to instructions as needed. 37.2% would briefly scan and 20.2% would ignore and continue as normal. Only 2.1% said they would read in detail. This seems to point to a desire for accessible but non-invasive instructional material, more signposting than in depth explanation.
Question 16: Do you use advanced search if it is available?
87.5% of respondents answered ‘sometimes’ or ‘often’, 10.5% responded ‘always’ and 2% ‘never’.
Question 17: In conducting a search, how important is it that the interface is easy to use?
As expected, 88.3% of respondents stated that a user-friendly interface is important. 11.7% were indifferent; perhaps these respondents are more comfortable navigating their way around preferred digital resources and therefore are not troubled by interface design.
Question 18: How important are the following aspects of search interface?
Each of the four aspects was judged on average to be either very important or important. It would seem that, while ‘look and feel’ is significant, the search function should blend into the site and function smoothly i.e. it should be transparent. Search functions become ‘visible’ when they do not work properly, when they are slow or when they do not produce the expected results. Intuitive design is perhaps a given but what is intuitive for one person might not be for another. Having appropriate search fields was awarded the greatest importance (63.8%). This could be because certain integral fields are often missing from search interfaces; there are too many to take the time to fill in or there are irrelevant fields. Display of return results is something that appears to have some scope and significant room for alteration and development. A technical solution could be arrived at by asking researchers how they would ideally like results to be displayed.
Question 19: How do you judge when a web-based search is complete?
This was a multiple response question. Again, several respondents commented that a search is never truly “complete”. Quality control was almost as significant as for text-based searches. One respondent commented “I often cross-reference web results with textual sources – if I can find them/am aware of them existing”. Practical constraints figured highly again (67.5%) but slightly less so than for text-based search, perhaps because the internet is accessible from a desk and is easier to dip in and out of when time allows. However, one respondent commented that “it is a myth that the latter [on-line searching] is ‘easier’ or a labour-saving device – you still have to read it all at the end of the day”. ‘No new data found’ was given slightly higher importance in web-based search (57.1%), perhaps because researchers spend more time sifting results. One respondent commented “when I begin to come up with the same results despite changing search strategies”. It would seem that more ‘searches’, or perhaps rather ‘permutations of search’, are conducted online than would be at a physical library or archive. Time is still a factor for online research; an intuitive and efficient interface is therefore key.
Question 20: How do you collect and manage data from a web-based search?
The data showed a definite preference for personal storage, with 63% of respondents using a desktop filestore and 58.1% printing off and hardcopy filing. One might speculate that hardcopy storage is still deemed by many to be more secure, but what then are the implications for version control and backup? Online filestores are employed by 19.9% of respondents while 28.9% make use of online bibliographic referencing tools which link directly to searches. A large proportion of respondents use favourites to save sites (57.8%) for ease of re-access or even save search result pages (44.9%), presumably if a particular search has produced a number of usable hits. Recurring responses in the ‘other’ category included Zotero, a web browser-based research assistant which collects research in a searchable interface by automatically full-text indexing the content. Another common practice is to keep a PDF library or personal research notes (Word) or database (Excel, Access). Interestingly, several respondents commented that they still made notes on paper and used post-its.
Question 21: Please give example(s) of web resources you frequently use
Once again, respondents gave a variety of specific examples, ranging across search engines, online applications, research tools and digitised resources. These included: Google applications i.e. Google Scholar and Google Books; referencing software such as Zotero and EndNote; digital collections; online library and archive catalogues, both local and national; online museum and gallery collections; online newspaper archives; OED or similar; JSTOR and other digital journal collections; online bibliographies; subject specific digital resources; online periodicals; university library digital archives; institutional library and archive digital catalogues; digitised official records.
Early English Books Online (EEBO) and Eighteenth Century Collections Online (ECCO) were also mentioned with significant frequency.
Question 22: When conducting a search, approximately what percentage of time do you spend doing text-based and/or web-based research?
The average percentage of time spent on text-based search was 41.9% while web-based search was on average 57.9%. Filtering this result by discipline: History (text-based 47.7%, web-based 51.1%); English Literature (text-based 45.5%, web-based 54.7%); English Language (text-based 35.7%, web-based 63.3%); Politics (text-based 31%, web-based 68.6%); Journalism (text-based 37.1%, web-based 62.9%). In all cases, text-based search was used on average less than web-based search. Given the amount of digitised archival material now available online, it is not a surprise that such material is probably now accessed in this way.
Question 23: What are the advantages of combining both methods?
The greatest perceived advantage of combining text and web-based search was the ability to access a broader range of sources for validity (86.3%). Other comments revealed a number of recurring themes. Many researchers stated that they worked from web to text for a variety of reasons:
“Online sources are often incomplete/unreliable. I’ve learned to check printed sources to fill the gaps.”
“I use web-based research partly to establish focus for archival research and particularly to follow up archival findings and attempt to locate further information about points where archival research has thrown up new lines of interest or left questions unanswered.”
“Web search gives (often) immediate access to text-based sources to feed back into the text-based search.”
“Online archival catalogues invaluable in identifying targets for primary sources, and advance preparation prior to visiting archives – saves huge amounts of time in archives, and thus expenses.”
This feeds into the theme of access, which was commented on both positively and negatively:
“While I prefer the experience of reading hardcopy the accessibility of web-based sources is invaluable.”
“Inspiration and enlightenment – cross fertilisation.”
“Not everything I need is online.”
“Up to date research is often not immediately digitised.”
The limitations and advantages of keyword searching exposed some common issues:
“Its [web-based searching] shortfall is that it is often not ‘smart’: a keyword search returns many irrelevant results but often fails to return results which would be relevant but are not associated with that keyword in the catalogue.”
“Speed. In perhaps a minute, I can search all 110 texts (30,000 pdf pages) for key words and phrases. Thus, for example, if I wish to find comment about a song lyric, I can quickly find who of the ‘authorities’ in my library has written on it – and indeed if such people have not. This ability is rather empowering from a research standpoint.”
“Web-based allows you to search the content rather than just the title. So you can discover books, and sources that superficially may not seem relevant. i.e. misleading title, or poor choice of key-words to classify it.”
There is also a strong sense of serendipity related to text-based searching, which was observed to be sometimes restricted by keyword searching and the absence of the physical manifestation of a book or other artefact:
“Web-based searches can be quicker but by browsing the shelves means you can pick up diamonds in the rough, that is, old classics or forgotten books that are actually extremely valuable and insightful.”
“While web-based work is important, browsing shelves allows exposure to topics one would not normally see/engage with. Specificity of web searches can be a problem in this regard.”
“You can often get more from a physical object as layout is informative, and you find useful things you would not have thought to look for.”
However, web-based searching is also seen to contain elements of serendipity by one respondent:
“It offers sources you never thought of before. Web searches are serendipitous.”
The ability to quickly assess and scope a field before continuing with a particular line of enquiry is also mentioned:
“The web is often quicker at letting you know how easy or otherwise it’s going to be to find something out, and how widespread or popular certain knowledge is.”
Question 24: Please rate advantages of text-based search (1=highly advantageous, 5=not advantageous)
Aspects related to construction and categorisation were highest rated. Trust in a reputable archive was rated as highly advantageous (53.4%). This is still obviously very important to researchers, as are peer reviewed sources (45.8%). This might be something they are less comfortable assessing for online resources. Flexibility was seen as less important for text-based search (27.8%) than for web-based (36.7%) which is probably to do with the openness of the web and having so much information on hand that you can browse or discard with greater ease, while text-based search, you could say, requires a more systematic approach. The other three aspects seemed to relate to shared information within a research community and the potential for this information to be developed in an online resource. Familiarity and knowledge of resources was judged to be important by respondents. The support of information professionals was rated in the middle, but this could vary with career level as PhD students might be more reliant on expert knowledge for locating sources than academics who have been working in their field for many years and know their resources inside out. Building from this, in focus groups and design groups we explored whether a generic approach to search with specific content i.e. background knowledge, inserted into the source for other users would be seen as an advantage. This could include tagging of keywords previously unidentified or note-making against the source-texts, although the validity of these would need to be closely monitored with some form of (self-)governance or tracing. If such issues could be resolved then a living site may potentially develop which could lend it greater sustainability beyond the life of the project.
Question 25: Please rate disadvantages of text-based search (1=highly disadvantageous, 5=not a disadvantage)
The greatest disadvantage, which has been highlighted on several occasions, is difficulty of access in terms of travel, expense and time. 47.5% of respondents rated this as highly disadvantageous.
The second-highest-rated access issue concerned collecting and collating search results, presumably because this would have to be done by photocopying and note-taking on paper or a laptop etc., which is time consuming and cumbersome. In addition, many works are reference only, which puts a restriction on time spent interrogating sources. If travelling to an archive or library, the research will have to be well planned to get the most out of a visit. Managing records and indexing post-visit is a related issue, as it may involve transferring information from one medium to another i.e. paper to electronic, although this rated in the middle of the scale. How an archive is organised was judged not to be a great disadvantage in structuring a search, perhaps because multiple visits would mean you learn the particular idiosyncrasies of the site, which could be a reassuring factor as you become familiar with the resources available to your field of study. In focus group sessions we discussed whether it would be considered an advantage to take a more flexible approach and have the ability to order things in ways to suit your own research i.e. the structure of the resource is not prescribed.
Question 26: Please rate advantages of web-based search (1=highly advantageous, 5=not advantageous)
The majority of the aspects of web-based search given were rated as highly advantageous, particularly themes of access, such as the availability of large amounts of information, and time saving factors. The response to being able to share data in a research team was more widely rated but averaged in the middle.
This could be something to do with the lack of a data sharing tradition in humanities subjects overall and seems to conflict with the conclusions drawn from the ratings of advantages of text-based search which could be applied to developing digital resources, such as building research community elements into the interface. We explored this disparity in focus group discussions.
Question 27: Please rate disadvantages of web-based search (1=highly disadvantageous, 5=not a disadvantage)
None of the given aspects were rated as highly disadvantageous which may be as a result of the flexibility of online searching. There are questions arising from issues of quality assurance and relevance, such as how do you decide on quality? And what exactly does relevance mean? A lack of recognition of search terms was also brought to light in previous comments about flawed keyword allocation. Aspects of ordering were rated in the middle, perhaps because you learn how a preferred search works and come to expect what and how results will be displayed. Managing and maintaining search focus and frustration with lack of results were not seen as great disadvantages of web-search potentially because it is by its very nature a more open way of searching and there is more time to refine and modify search terms. In fact, you might find something unexpected by following a tangent thrown up by a particular search, returning to the theme of serendipitous findings.
Question 28: It is important to have a good understanding of search methodology?
47.5% of respondents strongly agreed that it was important to have a good understanding of text-based search methodologies, an almost equal proportion as for web-based searching (47.8%).
Question 29: Do you think it would help you to be kept informed about current research practice and developments?
73.8% of respondents thought it would help to be kept informed of the above.
Question 30: What do you think would help improve your search practice?
Comments about search practice fell into the four broad categories of policy, awareness, specific practice, and time, although there is some overlap between them.
Policy observations brought up aspects of infrastructure and funding, training, resources, best practice and open access, among others:
“UK government investing in academic facilities for a digital era.”
“Regular university provided additional training on new online developments e.g. new relevant data collections online, developments of google search processes etc.”
“The expansion of keyed text in EEBO, and of the search facilities associated with keyed text, would be highly advantageous.”
“Well-written and easy-to-follow basic guidelines pertaining to respective archives or online search media” and “more linked resources and context in online collections” and “better indexing of sources by projects” and “online tutorials by respected scholars/institutions, code of practice, clearer idea of do’s and don’t’s.”
“De-commercializing research databases (i.e., Project Muse, J-STOR) and increase free, open-access to research outputs. Integration of online resources (databases, catalogues) with note-taking and curatorial webtools (e.g. Evernote).”
There is certainly a feeling that, while many researchers are generally satisfied with their processes and what they achieve through them, training is often lacking and many would appreciate some guidance on keeping up-to-date and improving their own practice.
Awareness observations showed a common thread of lack of knowledge about what resources are out there, old and new, and how to navigate and use them effectively:
“It’s not easy to keep up with what’s available in any medium. As MLA’s annual index is no longer available in print, for instance, it is impossible to search all of an issues Medieval or Renaissance entries on, say, poetry, or prose fiction, because it can’t be searched by era or dates on line. New websites mostly show up through word of mouth, e.g. I’ve told at least a dozen friends about the Anglo-American Legal Heritage website with 6 million images of PRO documents on line free – not one had heard of it.”
“If it was easier to keep up with the changing nature of search terminology and if databases were publicised more – I have found about many excellent ones only through attending seminars and workshops, etc.”
“Better understanding of how web search engines actually work, and why they don’t always work the way that your brain might in sifting information…”
“How to find suitable sources (keywords, relevant words, etc.)”
“Greater knowledge of how other researchers – both within my field and in others – conduct their research.”
There was a strong emerging sense that the research community could unite more to share knowledge about new and existing resources and best practice, perhaps using social media. It also seems that researchers would like a better understanding of how online resources work behind the scenes and how archives are structured, and would like to be more up to date with new research and with how best to locate sources both online and offline.
Specific practice observations reveal how researchers take what they know and their current way of doing things to plan, strategise and operationalise search and how they think this could be improved:
“Planning search strategies more fully in advance.”
“Being more systematic and less improvisatory – but that would turn me into something I am not.”
“Working with different online tools – practice.”
“I think the ability to be able to take the metadata from online databases and structure it in different ways. To be able to make sub-sets within a search. So that a sub-set could be compared to a larger set of search results clearly.”
“Patience. In the humanities in particular, this is not an exact science, and efforts to improve search functionality almost all need to come on the back end, since most users ‘know’ what they are trying to find in electronic databases before they go looking for it, perhaps a mistake in and of itself.”
“Talking to other researchers to find out how they do their research” and “Reading more research by scholars searching in similar areas.”
“Better strategies for key word searches, resources for beginning searches, knowledge of databases and data sources beyond typical web-based journals or publications (government records, statistics, etc…)”
“Being able to record, sort, comment and save searches and sources, transforms the experience.”
Many of the observations above deal with aspects of personal development, such as being more organised and systematic in planning, as well as in conducting search and processing results. These feed into the next set of observations about the practicalities of conducting research.
Practical constraints, such as time restrictions, were mentioned frequently:
“More time and funding to support research.”
“As with just about everything, more time to devote to searching and dealing with the results would be the key improvement.”
Overall, it appears that resource design should allow for a certain amount of creative freedom but also retain systematic processes. As time is a common factor which restricts search, the interface should work quickly and effectively and allow researchers to mould it in a variety of ways to suit their needs and shape the outcomes of each search.
98 respondents across disciplines and career levels left their contact information, signalling their willingness to be contacted about further involvement in the project. A selection of these people, chosen by role, discipline and location to ensure a range of opinion and experience, were contacted about taking part in focus group sessions to discuss themes and specific points of interest arising from the survey.
3. Survey Findings and Key Themes
The intention of this landscape survey was to address our initial research questions and identify general trends by mapping existing search methodologies within the relevant research communities, highlighting benefits and deficiencies and finding out what academics value in terms of search. It also helped to identify people who might be interested in further involvement with the project.
The target response level was achieved with just over 500 responses. The main sampling criteria were also fulfilled, with representatives of the five identified disciplines at a range of career levels. Approximately half of the respondents were PhD students, post-doctoral researchers and research staff, and half were academic staff. The disciplines were less balanced, however, as previously mentioned; this could be due to History, English Literature and Politics departments being more widespread and having a larger number of staff per department across the participating universities. The following presents the key findings and themes arising from the survey, bringing in disciplinary differences where relevant and comparing responses from PhD students and professors, in order to show the widest variations in career level research practice.
Emergent Findings 1: Search Strategies
There are similarities between on and offline search strategies in terms of how knowledge is sought out and put together. For example, keywords and names were used most frequently as search components for both on and offline searches. Dates were more central to researchers in History and Journalism than other disciplines.
Digital catalogues and indexes were prioritised over printed resources in offline search strategies, indicating a crossover between on and offline search where library and archival information is now available online.
Digital collections and specialist databases were most frequently used in online search strategies. This must be seen in the context of many resources that used to be available only in physical archives and libraries now being accessible and browsable online; such resources are therefore trusted and verifiable. The same applies to digitised journals, because the researcher is essentially accessing the same resource in a different way.
Search engines are also widely used, which reflects how researchers are able to take full advantage of the openness and flexibility of the web.
Emergent Findings 2: Time
This factor seemed to influence search in terms of time available to conduct searches in the first place, to read instructions for using search functions, to learn how to use search tools most effectively, and to deal with results. It was also the most important factor for most disciplines for judging when a search is ‘complete’ although many respondents commented that no search is ever truly complete.
English Language respondents were distinct in that they prioritised having sourced and collected all valid data over practical constraints such as time, for both on and offline searching. This is perhaps to do with the nature of the data and the types of analysis they are conducting on it, i.e. more statistical or quantitative. They were also less concerned about cross-referencing.
The average percentage of time spent on offline and online searching was assessed. History and English Literature respondents spent only fractionally more time searching online than offline, whereas English Language, Politics and Journalism respondents spent significantly more time on web searches. This is probably due to the types, availability and accessibility of online resources for these disciplines. It may also explain why they perceived more advantages of web searching than History and English Literature respondents did, especially time-saving aspects, ease of access and the availability of large amounts of data from respected sources. Certain disadvantages of offline searching were also given more weight by these three disciplines, particularly collecting and collating search results and managing records post-visit; again, this is probably due to the types of resources and analysis involved.
Key Themes 1: Context, Quality and Validity
Professors expressed more concern about certain aspects of navigation in web searching, particularly the transparency of site structure and meaningfulness of menus. It could be that PhD students are generally more used to the presentation of web resources and therefore more comfortable navigating them. In terms of search interface, professors seemed to value intuitive design and ‘look and feel’ of the search function more highly than PhD students. When asked for general reactions to web resource search function instructions, most respondents stated that they would scan or refer to them as needed while only 2% said they would read them in detail. This seems to point to a need for accessible but non-invasive instructional material, more akin to signposting as discussed in the LAIRAH study (Warwick, C., Terras, M., Huntington, P. and Pappa, N. (2007) ‘If you Build It Will They Come? The LAIRAH Study: Quantifying the Use of Online Resources in the Arts and Humanities through Statistical Analysis of User Log Data’, Literary and Linguistic Computing, vol. 23, no. 1, pp. 85-102).
Overall, specialist databases and online journals such as ProQuest and JStor were used by 89% of respondents. This may be because these are trusted and verifiable resources. Quality and validity issues proved to be more important to professors. Knowledge of research area and resources, in addition to consistency and quality assurance, were also more highly valued by later career researchers. PhD students seemed less concerned about these aspects, perhaps because they generally have less background knowledge to bring to bear on their research. Most respondents expressed interest in knowing more about how digital resources work behind the scenes, perhaps with the motive of understanding how the resource has been structured, which could in turn help to validate any search results they might obtain for their own peace of mind.
Key Themes 2: Online Versus Offline Practices
We compared online and offline practice to find out which aspects of each people value and, equally, which they dislike. Many respondents stated that they worked online to offline, for example conducting a web search to identify resources and then verifying the results through archival or library work, for reasons such as reliability. One respondent commented that “online sources are often incomplete or unreliable. I’ve learned to check printed sources to fill the gaps”.
The ability to quickly assess and scope a field online before continuing with a line of enquiry was also mentioned: “The web is often quicker at letting you know how easy or otherwise it’s going to be to find something out and how widespread or popular certain knowledge is”. Many people are working increasingly online, for reasons including remoteness of resources which are now accessible digitally, saving time and expense. It would also seem that more ‘searches’ or ‘permutations of search’ are conducted online than at a library or archive due to ease of access. However, online access was commented upon both positively and negatively: the access to resources it provides is invaluable, but up-to-date research is not always immediately digitised.
The limitations and advantages of keyword searching brought up some common issues, with respondents highlighting the convenience of being able to search thousands of pages of text for keywords and phrases, while bearing in mind that human error in inputting tags or the different ways in which people might categorise things may mean certain things cannot be found.
Browsing and serendipity were more strongly associated with offline research and given a lot of weight. 66% of respondents stated that browsing shelving systems was a regular search practice because it means exposure to topics you would not normally engage with, simply by virtue of them being placed around specific items of interest. Formulating specific web searches, so as not to be drowned in data, was thought to significantly reduce serendipitous findings, although this was not unanimously agreed upon.
A user-friendly interface was less important to History and English Literature respondents; they seem to be more prepared to make do if they really need to use a resource, which ties in with the findings of the LAIRAH project. Several focus group participants went further into explaining this practice of finding ‘work-arounds’ for challenging resources. In addition, the ability to conduct full text search on resources appears to be increasingly needed and expected by all disciplines.
Key Themes 3: Awareness
When asked what would improve search practice, a number of respondents replied that a better knowledge of what tools and resources are available and how to access, navigate and use them effectively would be helpful, particularly those of specific relevance to the humanities which are useful rather than gimmicky. There was also a strong sense that the research community could unite more to share knowledge about new and existing resources, research practices and methodologies within and across disciplines.
In terms of dissemination for offline sources, respondents commented that ‘word of mouth’ was very important, as were footnotes and bibliographies. The same was said of online resources for discovering the available new research.
One respondent pointed out that, “new websites mostly show up through word of mouth e.g. I’ve told at least a dozen friends about the Anglo-American Legal Heritage website with 6 million images of PRO documents online free – not one had heard of it.” It must be assumed that this is not an isolated case.
The landscape survey was immensely important for providing a snapshot of research practices across disciplines. The topics and themes arising, as well as the questions raised, were incorporated into a focus group topic guide to enable us to build on and consolidate those findings and gain a more in-depth understanding of specific and individual research practices and methodologies. The next section explains how the focus groups were conducted and analysed, in addition to presenting the main findings.