Research Data Management

A practical guide to best practices in research data and code management

What is Research Data? 


Under the HKU Policy on Research Data and Records Management, research data is defined as: 

The recorded information (regardless of the form or the media in which they may exist) necessary to support or validate a research project’s observations, findings or outputs

Types of Research Data 


Data are the raw materials that analysis transforms into information. Research data is a very broad term, and it can take many forms. For instance, it can be:

  • Primary / raw or secondary / processed 

  • Qualitative or quantitative 

  • Digital or non-digital 

  • Textual, numerical, or consisting of images or audio-visual resources 

  • Observational, experimental, simulation, or derived

 

By Data Sources

 

Primary / Raw

Data that are directly collected or created by researchers, including but not limited to:  

  • Responses to interviews, questionnaires, and surveys  

  • Data acquired from recorded measurements, including remote sensing data  

  • Data acquired from physical samples and specimens that form the basis of many studies

  • Data generated from models and simulations 

Secondary / Processed

Data used by someone other than the person who collected or generated them. This often includes data that have been processed from their raw state to be more readily usable by others.

Source: NASA. (2024). Open Science 101. https://science.nasa.gov/open-science/os101/

 

By Data Types

 

Quantitative

Quantitative data are data represented numerically, including anything that can be counted, measured, or given a numerical value.  

Quantitative data can be classified in different ways, including categorical data that contain categories or groups (like countries), discrete data that can be counted in whole numbers (like the number of students in a class), and continuous data that can take any value within a range (like height or temperature).

Quantitative data are typically analyzed with statistics. 
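As a minimal sketch of how these three classes might be summarized differently, the Python example below uses a small set of hypothetical records (the field names and values are invented for illustration and do not come from any real dataset). It assumes that counts suit categorical data, a median suits discrete data, and a mean suits continuous data. Other summaries may be more appropriate depending on the study.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample records, invented for illustration only.
records = [
    {"country": "Japan", "class_size": 30, "height_cm": 162.5},
    {"country": "Kenya", "class_size": 25, "height_cm": 170.2},
    {"country": "Japan", "class_size": 28, "height_cm": 158.9},
]

# Categorical: count how often each category occurs.
country_counts = Counter(r["country"] for r in records)

# Discrete: whole-number counts, summarized here with the median.
median_class_size = median(r["class_size"] for r in records)

# Continuous: any value within a range, summarized here with the mean.
mean_height = mean(r["height_cm"] for r in records)

print(country_counts)
print(median_class_size)
print(round(mean_height, 1))
```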

Qualitative

Qualitative data are data representing information and concepts that are not represented by numbers.  

They are often gathered from interviews and focus groups, personal diaries and lab notebooks, maps, photographs, and other printed materials or observations. 

Source:  

Network of the National Library of Medicine. (2024). Data Glossary, Quantitative Data. https://www.nnlm.gov/guides/data-glossary/quantitative-data 

Network of the National Library of Medicine. (2024). Data Glossary, Qualitative Data. https://www.nnlm.gov/guides/data-glossary/qualitative-data

 

By Data Formats

 

The table below lists several examples:

Textual data
  • Text documents 

  • Questionnaires, transcripts of interviews 

  • Methodologies and Workflows 

  • Protocols 

  • Text corpora 

  • Field notebooks, diaries, focus group notes 

  • Codebooks, usage instructions, any documentation describing the data

Numerical data
  • Spreadsheets 

  • Geospatial data 

Multi-media
  • Photographs 

  • Films 

  • Audio and video recordings 

  • Images 

Code
  • Source code

  • Algorithms 

  • Scripts 

  • Models 

 

By Collection Methods

 

Observational

Captured through observation of a behaviour or activity - in real time and typically irreplaceable.

Experimental

Captured from lab equipment or generated in controlled environments - often reproducible but this can be expensive or time-consuming.

Simulation

Generated from test models where model and metadata (about the model, code, computing environment, input conditions) are more important than the output data generated.

Derived

Resulting from processing or combining existing data points, often from different data sources.
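
To make the idea of derived data concrete, here is a minimal Python sketch that combines data points from two separate sources into one new dataset. The station IDs, field names, and values are all hypothetical, and keeping only the stations present in both sources is one assumed design choice among several.

```python
# Hypothetical data from two separate sources, keyed by station ID.
temperatures_c = {"A": 21.5, "B": 18.0, "C": 25.2}
station_elevation_m = {"A": 120, "B": 950, "D": 40}

# Derived dataset: keep only stations found in both sources
# and combine their fields into a single record per station.
shared_stations = sorted(temperatures_c.keys() & station_elevation_m.keys())
derived = {
    station: {
        "temperature_c": temperatures_c[station],
        "elevation_m": station_elevation_m[station],
    }
    for station in shared_stations
}

print(derived)
```

Stations "C" and "D" are dropped because each appears in only one source. A real project would record that decision in its documentation so others can reproduce the derived data.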

Source:

DMPTool. (2024). Data management general guidance. https://dmptool.org/general_guidance

University of York. (2024). Research data management: a practical guide. https://subjectguides.york.ac.uk/rdm/data