Under the HKU Policy on Research Data and Records Management, research data is defined as:
The recorded information (regardless of the form or the media in which they may exist) necessary to support or validate a research project’s observations, findings or outputs.
Data are the raw material that analysis transforms into information. Research data is a very broad term, and it can take many forms. For instance, it can be:
Primary / raw or secondary / processed
Qualitative or quantitative
Digital or non-digital
Textual, numerical, or consisting of images or audio-visual resources
Observational, experimental, simulation, or derived
Primary / Raw | Data that are directly collected or created by researchers. |
---|---|
Secondary / Processed | Data that are used by someone other than the person who collected or generated them. Often, this includes data that have been processed from their raw state to be more readily usable by others. |
Source: NASA. (2024). Open Science 101. https://science.nasa.gov/open-science/os101/
Quantitative | Quantitative data are data represented numerically, including anything that can be counted, measured, or given a numerical value. They can be classified in different ways, including categorical data that contain categories or groups (like countries), discrete data that can be counted in whole numbers (like the number of students in a class), and continuous data that can take any value within a range (like height or temperature). Quantitative data are typically analyzed with statistics. |
---|---|
Qualitative | Qualitative data are data representing information and concepts that are not represented by numbers. They are often gathered from interviews and focus groups, personal diaries and lab notebooks, maps, photographs, and other printed materials or observations. |
Source:
Network of the National Library of Medicine. (2024). Data Glossary, Quantitative Data. https://www.nnlm.gov/guides/data-glossary/quantitative-data
Network of the National Library of Medicine. (2024). Data Glossary, Qualitative Data. https://www.nnlm.gov/guides/data-glossary/qualitative-data
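As a rough illustration of the three quantitative classes described above, the following Python sketch labels the fields of a single record. The helper `classify_value`, the field names, and the values are all hypothetical, and using the Python type as a proxy is an assumption made for this example only: in real datasets a count may be stored as a float, or a category as an integer code, so the label should come from the data documentation rather than from the storage type.

```python
from typing import Any


def classify_value(value: Any) -> str:
    """Hypothetical helper: label one value as categorical, discrete, or continuous."""
    # bool is a subclass of int in Python, so check for categories first.
    if isinstance(value, (str, bool)):
        return "categorical"
    # Whole-number counts, such as the number of students in a class.
    if isinstance(value, int):
        return "discrete"
    # Measurements that can fall anywhere in a range, such as height.
    if isinstance(value, float):
        return "continuous"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


# Illustrative record; the field names and values are made up for this sketch.
record = {"country": "Japan", "students_in_class": 32, "height_cm": 171.5}
labels = {field: classify_value(value) for field, value in record.items()}
print(labels)
```

The type-based rule is deliberately simple; it is meant only to make the distinction between the classes concrete.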
The table below lists several examples:
Textual data | |
---|---|
Numerical data | |
Multi-media | |
Codes | |
Observational | Captured through observation of a behaviour or activity - in real time and typically irreplaceable. |
---|---|
Experimental | Captured from lab equipment or generated in controlled environments - often reproducible, although reproducing the data can be expensive or time-consuming. |
Simulation | Generated from test models, where the model and its metadata (about the model, code, computing environment, input conditions) are more important than the output data generated. |
Derived | Resulting from processing or combining existing data points, often from different data sources. |
Source:
DMPTool. (2024). Data management general guidance. https://dmptool.org/general_guidance
University of York. (2024). Research data management: a practical guide. https://subjectguides.york.ac.uk/rdm/data
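To make the idea of derived data more concrete, here is a minimal Python sketch that combines two raw sources into one derived record per site. The site names, the measurements, and the function `derive_site_summary` are assumptions invented for illustration; they do not come from the sources cited above.

```python
from statistics import mean

# Hypothetical raw observations from two separate sources, keyed by site.
temperature_c = {"site_a": [21.5, 22.0, 22.5], "site_b": [18.0, 19.0]}
rainfall_mm = {"site_a": [0.0, 2.0, 4.0], "site_b": [10.0, 12.0]}


def derive_site_summary(temps, rain):
    """Combine two raw sources into one derived record per site."""
    summary = {}
    # Only sites present in both sources can be combined.
    for site in sorted(temps.keys() & rain.keys()):
        summary[site] = {
            "mean_temperature_c": mean(temps[site]),
            "total_rainfall_mm": sum(rain[site]),
        }
    return summary


site_summary = derive_site_summary(temperature_c, rainfall_mm)
print(site_summary)
```

The summary values exist only because the raw inputs were processed and merged, which is why keeping the raw data and the processing steps alongside a derived dataset helps others validate it.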