
Digital Humanities: Text-Mining

An introduction to digital humanities


Text mining is the process of discovering knowledge (information or patterns) in unstructured or semi-structured text data by applying various transformations to the text.
Researchers carry out text mining tasks such as:

  • Textual analytics
  • Entity extraction
  • Topic modelling
  • Natural language processing
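
As a minimal illustration of how a transformation can turn unstructured text into countable patterns, the sketch below tokenizes a short passage and counts word frequencies using only the Python standard library. The sample text, the stopword list, and the function names are assumptions made for this example; they do not come from any particular tool or dataset referenced in this guide.

```python
import re
from collections import Counter

# Hypothetical sample text; any plain-text document could be used instead.
SAMPLE = "Text mining turns text into data. Mining text reveals patterns in text."

# A tiny illustrative stopword list, not a standard one.
STOPWORDS = frozenset({"in", "into", "the", "a"})


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(text, stopwords=frozenset()):
    """Count how often each token occurs, skipping any stopwords."""
    return Counter(token for token in tokenize(text) if token not in stopwords)


freqs = word_frequencies(SAMPLE, STOPWORDS)

# Show the three most frequent remaining words with their counts.
print(freqs.most_common(3))
```

Word counts like these are only a starting point. Tasks such as entity extraction and topic modelling generally rely on dedicated libraries rather than hand-written counting.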

Text analysis

Basic text analysis 

Advanced text analysis (From Underwood, T. (2012). Where to start with text mining.)

Textual Datasets

Datasets are the starting point for a text mining project. Some textual data platforms and application programming interfaces (APIs) are openly accessible to the public, while others are available to the HKU community.

Government and institution data

Open data for research

Other subscription-based APIs

Digital Scholarship Librarian

Terry Chung
5/F, Main Library, University of Hong Kong, Pokfulam Road,
Hong Kong
(852) 2859-7002