Digital Humanities: Collections

Support for researchers using the Library's collections with Digital Humanities techniques.

Library collections with Digital Humanities capabilities

The following collections have been selected as well suited to Digital Humanities research. Supporting information is provided for each, such as the different routes to the same resource and the pros and cons of each route.

The collections are organised by publisher, in boxes below. The publishers include Adam Matthew Digital, Gale Cengage and ProQuest. See also the Special Collections webpages, list of Historical English Corpora from the Department of Linguistics and English Language at The University of Manchester, and Researcher services: Using our collections and sources on the Library website.

 

Gale Cengage

Many Gale Cengage databases are available, both via direct access and via the Gale Primary Sources platform. All entry routes provide full-text searching, but the Gale Primary Sources platform offers more advanced features such as cross-searching, term frequency and term clustering searches, and OCR (Optical Character Recognition) text download.

Adam Matthew Digital

Adam Matthew Digital provides several text and image-rich collections covering the last 500 years. A data mining agreement is provided up-front. The collections include Apartheid South Africa, Global Commodities, African American Communities, Victorian Popular Culture and Mass Observation.

ProQuest

The ProQuest collections arguably of most interest to Digital Humanities are Historical Newspapers and Early English Books Online.

Feature: Early modern books

One of the most widely used collections for Digital Humanities research is Early English Books Online (EEBO). It is available via several platforms and providers, including ProQuest, EEBO-TCP, Jisc Historical Texts and the Oxford Text Archive.

The platforms offer the same source material but with different levels of text encoding or search features. For example, Jisc Historical Texts allows you to run fuzzy searches, search variant spellings and variant forms, create histograms, make subsets, and cross-search with eighteenth- and nineteenth-century collections.
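To give a feel for what searching variant spellings involves, here is a minimal sketch using Python's standard difflib module. It is an illustration only and does not represent how Jisc Historical Texts or any other platform implements its search. The word list, the find_variants helper and the 0.7 similarity cutoff are all assumptions made for this example.

```python
from difflib import get_close_matches

def find_variants(query, vocabulary, cutoff=0.7):
    """Return words from vocabulary whose spelling is close to query.

    The cutoff (0 to 1) is an assumed threshold: lower values return
    more candidates but also more false positives.
    """
    lowered = sorted({word.lower() for word in vocabulary})
    return get_close_matches(query.lower(), lowered, n=10, cutoff=cutoff)

# Hypothetical vocabulary mixing modern and early modern spellings.
vocabulary = ["loue", "love", "louing", "glove", "kingdome", "kingdom", "vertue"]

print(find_variants("love", vocabulary))
print(find_variants("kingdom", vocabulary))
```

Purely character-based matching also returns unrelated words that happen to look similar (here "glove" matches "love"), which is one reason curated variant lists and encoded texts remain valuable.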

Early English Books Online migrated from the legacy Chadwyck-Healey platform to ProQuest in summer 2018. It now has better search and visualisation options, including historical mapping.

JSTOR and Portico

[Update: Constellate will be 'sunsetted' on 30 June 2025]

Constellate is the text analytics service from the not-for-profit ITHAKA - the same people who brought you JSTOR and Portico. It is a platform for teaching, learning and performing text analysis using archival repositories of scholarly and primary source content. You can query, search and download full-text articles or upload your own to use with their tools. Constellate works using Python and includes sample Jupyter Notebooks which you can modify and extend.

The University of Manchester has access to the free, public version of Constellate (excluding the Lab).

ITHAKA launched the Text Analysis Pedagogy (TAP) Institute to help instructors and librarians learn and teach text analysis. From 10 July to 11 August 2023, Constellate—in partnership with the Academic Data Science Alliance and the Association of College and Research Libraries—is offering free events and classes for anyone interested in teaching text analysis. Courses are progressive, so you will benefit from taking a single class or the entire series, no matter your skill level.

Register for summer 2023 TAP events now

 

A series of webinars ran in summer 2022, and the recordings can be watched now.

You may also take in their four-session “Introduction to Python” course by working in the Constellate Lab alongside a recording of the class.

Data for Research (DfR) was a separate interface to access journal and pamphlet content on JSTOR ready for analysis and data mining. Searching DfR enabled researchers to find useful patterns, associations and unforeseen relationships in the body of research available in the journal and pamphlet archives on JSTOR. You could search OCR, metadata and key terms to download N-grams and word counts for up to 1,000 documents at a time, in XML or CSV format.

Data for Research has been replaced by Constellate.
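The kind of word-count and N-gram output described above for Data for Research can be approximated on your own texts with a few lines of Python. The sketch below is a hedged illustration, not the format or method used by DfR or Constellate: the sample sentence, the function names and the two-column CSV layout are all assumptions.

```python
import csv
import io
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def ngram_counts(tokens, n):
    """Count runs of n consecutive tokens (n=1 gives plain word counts).

    Sentence boundaries are ignored, so an N-gram may span two sentences.
    """
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

def to_csv(counts):
    """Serialise the counts as CSV text, most frequent term first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["term", "count"])
    for term, count in counts.most_common():
        writer.writerow([term, count])
    return buffer.getvalue()

# Invented sample text, for illustration only.
sample = "The cotton trade grew. The cotton mills grew with the trade."
tokens = tokenize(sample)
unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
print(to_csv(bigrams))
```

In this sample, "the" occurs three times and the bigram "the cotton" occurs twice. For a real corpus, the same functions could be applied to each downloaded document in turn.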

Financial and business databases

Company annual reports are increasingly used for data mining. They are available from the (quoted) company's corporate website, from bodies such as Companies House (chargeable), or from databases such as PI Navigator, Bloomberg or Refinitiv Eikon.

Another popular resource is the U.S. Securities and Exchange Commission search tool EDGAR.
