
Digital Humanities: Text-Mining

An introduction to digital humanities


Text mining is the process of discovering knowledge (information or patterns) in unstructured or semi-structured text data by applying various transformations to the text.
Researchers perform text mining tasks such as:

  • Textual analytics
  • Entity extraction
  • Topic modelling
  • Natural language processing
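
As a small illustration of what a "transformation of a text" can look like, the sketch below turns a raw string into word counts, one of the simplest forms of textual analytics. It uses only the Python standard library; the function name `word_frequencies`, the sample sentence, and the simple regular-expression tokenizer are assumptions made for this example, not part of any tool listed in this guide.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase word tokens in text.

    Tokenization here is a deliberately naive assumption: any run of
    letters or apostrophes counts as one word.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


if __name__ == "__main__":
    # Hypothetical sample text, used only for demonstration.
    sample = "The cat sat on the mat. The dog sat too."
    for word, count in word_frequencies(sample, top_n=2):
        print(word, count)
```

A real project would normally replace the naive tokenizer with a dedicated natural language processing library and remove common stop words such as "the" before counting.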

Text analysis

Basic text analysis 

Advanced text analysis (From Underwood, T. (2012). Where to start with text mining.)

Textual Datasets

Datasets are the starting point for any text mining project. Some textual data platforms and application programming interfaces (APIs) are openly accessible to the public, while others are available only to the HKU community.

Government and institution data

Open data for research

Other subscription-based APIs

Digital Scholarship Librarian

Terry Chung
5/F, Main Library, University of Hong Kong, Pokfulam Road,
Hong Kong
(852) 2859-7002