Research Data Management

A practical guide to best practices for managing research data and code

Data documentation


Documentation is a critical component of research data management, enabling users without prior knowledge of the project to understand how the research was conducted and what the data represents. It should always accompany data to ensure your data remains discoverable, understandable, and reusable by other researchers without requiring assistance from the original data creators. 

“A crucial part of ensuring that research data can be shared and reused by a wide range of researchers for a variety of purposes is by taking care that those data are accessible, understandable and (re)usable.” (UK Data Service, 2024) 

Proper documentation is the recommended way to achieve this. Clear, organized and detailed descriptions and annotations enable other users to contextualize your data, leading to more informed, effective and correct reuse. Documentation therefore also makes your data easier to discover.

 

Researchers can document their data files at two levels:

 

Study-level documentation 

Study-level documentation for a data collection or dataset should offer comprehensive information on the research context and design, the methods used for data collection, who collected the data and where and when they were collected, how the data were processed, summaries of the findings derived from the data, and information on how to access the data.

More details and examples are available in the UK Data Service guide.

 

Data-level documentation 

Data-level documentation consists of detailed descriptions and annotations that accompany individual data files and variables within a dataset. Examples include descriptions of variables (e.g. name, data type and coding scheme), value labels, codes for missing data and units of measurement. This type of documentation is crucial in helping others, including those who were not involved in the original study, to understand the data.

More details and examples are available in the UK Data Service guide.  
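As a minimal sketch of what data-level documentation can look like in machine-readable form, the Python example below records a label, data type, unit, value labels and missing-data code for two variables, then uses that record to decode a raw row. The variable names, codes and labels are hypothetical and chosen only for illustration; they do not come from any particular study or standard.

```python
# Hypothetical variable-level documentation for a small survey dataset.
# All names, codes, and labels below are illustrative assumptions.
VARIABLES = {
    "age": {
        "label": "Age of respondent",
        "type": "integer",
        "unit": "years",
        "missing_codes": [-99],
    },
    "employment": {
        "label": "Employment status",
        "type": "categorical",
        "unit": None,
        "value_labels": {1: "Employed", 2: "Unemployed", 3: "Student"},
        "missing_codes": [-99],
    },
}


def decode(variable, raw_value):
    """Translate a raw stored value into its documented, readable form."""
    spec = VARIABLES[variable]
    if raw_value in spec.get("missing_codes", []):
        return None  # the value is a documented missing-data code
    labels = spec.get("value_labels")
    if labels is not None:
        return labels[raw_value]  # map the numeric code to its value label
    return raw_value  # plain measurements are returned unchanged


record = {"age": 34, "employment": 2}
decoded = {name: decode(name, value) for name, value in record.items()}
print(decoded)
```

Keeping the coding scheme next to the data in this way means a reader who was not involved in the study does not have to guess what a code such as 2 or -99 stands for.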

Recording metadata


Whenever possible, consult the community metadata standards in your field before you begin collecting research data, so that you can record metadata efficiently while the data are still in active use during the research process.

Depending on the nature of your research, documentation can be recorded and maintained in a variety of forms, such as a lab notebook, a README file or a data dictionary.

README file


README files are a common way to document the contents and structure of a folder and/or a dataset. A README is often saved as a plain text file (.txt) in a project-related folder. Possible uses include:

  • documenting changes to files or file names
  • explaining directory structures and file naming conventions
  • specifying usage instructions to accompany files/data deposited in a repository

 

You can find more details and examples of README files in the Harvard University Research Data Management Guide.
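The uses listed above can be sketched as a small script that drafts a plain-text README for a project folder. This is only one possible layout, not a prescribed template: the function name build_readme, the section headings and the sample file names are assumptions made for this illustration, and the script works in a temporary folder so that it does not touch real project files.

```python
import tempfile
from datetime import date
from pathlib import Path


def build_readme(project_dir, title, naming_convention, usage_notes):
    """Return README text describing the files found under project_dir."""
    root = Path(project_dir)
    lines = [
        title,
        "=" * len(title),
        f"Last updated: {date.today().isoformat()}",
        "",
        "DIRECTORY STRUCTURE",
    ]
    for path in sorted(root.rglob("*")):
        if path.name == "README.txt":
            continue  # do not list the README inside itself
        depth = len(path.relative_to(root).parts) - 1
        suffix = "/" if path.is_dir() else ""
        lines.append("  " * depth + path.name + suffix)
    lines += [
        "", "FILE NAMING CONVENTION", naming_convention,
        "", "USAGE INSTRUCTIONS", usage_notes,
        "", "CHANGE LOG", "(record changes to files or file names here)",
    ]
    return "\n".join(lines) + "\n"


# Demonstration in a temporary folder with hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "data").mkdir()
    (project / "data" / "2024-01-15_survey_raw.csv").write_text("id,age\n1,34\n")
    (project / "scripts").mkdir()
    (project / "scripts" / "clean_survey.py").write_text("# cleaning steps\n")

    readme_text = build_readme(
        project,
        title="Example survey project",
        naming_convention="Data files: YYYY-MM-DD_description_stage.csv",
        usage_notes="Run scripts/clean_survey.py on the raw CSV before analysis.",
    )
    (project / "README.txt").write_text(readme_text, encoding="utf-8")
    print(readme_text)
```

A generated draft like this still needs to be edited by hand; the change log in particular is only useful if it is updated whenever files are renamed or restructured.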

Data Dictionary


A data dictionary is a document that outlines the structure, content, and variable definitions for a dataset or collection of data. Its purpose is to explain what all the variable names and values in your data files really mean. It may include variable names, human-readable variable names, measurement units, allowed values, and a definition of each variable. You can find more details on creating a data dictionary in the Open Science Framework’s How to Make a Data Dictionary guide and the Harvard University Guide on Data Dictionary.
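To make this concrete, the sketch below stores a small data dictionary as rows with one column per descriptive field, writes it out as CSV text, and checks a data file's header for variables that have not been documented. The column names (including allowed_values), the two example variables and the helper function names are hypothetical and chosen for illustration; adapt them to whatever your field or repository expects.

```python
import csv
import io

# Column names are an assumption loosely based on the fields described above.
FIELDS = ["variable_name", "readable_name", "measurement_unit",
          "allowed_values", "definition"]

# Hypothetical entries for a water-sampling dataset.
DATA_DICTIONARY = [
    {
        "variable_name": "temp_c",
        "readable_name": "Water temperature",
        "measurement_unit": "degrees Celsius",
        "allowed_values": "0 to 40",
        "definition": "Temperature of the water sample at collection",
    },
    {
        "variable_name": "site",
        "readable_name": "Sampling site",
        "measurement_unit": "",
        "allowed_values": "A; B; C",
        "definition": "Code of the site where the sample was taken",
    },
]


def dictionary_to_csv(entries):
    """Serialize data dictionary entries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_header, entries):
    """Return the data columns that have no entry in the data dictionary."""
    documented = {entry["variable_name"] for entry in entries}
    return [name for name in data_header if name not in documented]


csv_text = dictionary_to_csv(DATA_DICTIONARY)
print(csv_text)

# Header of a hypothetical data file; "ph" has not been documented yet.
missing = undocumented_columns(["site", "temp_c", "ph"], DATA_DICTIONARY)
print(missing)
```

Saving the dictionary as a plain CSV file alongside the data keeps it readable without special software, and a check like undocumented_columns can be rerun whenever new variables are added.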

Electronic Lab Notebook


An Electronic Lab Notebook (ELN) is a software tool that digitally replicates the paper lab notebooks traditionally used in the sciences to record experimental results (NNLM, 2024). You can document protocols, observations, notes, and other data on any electronic device that has an ELN installed. More details can be found in the Harvard University Guide on Electronic Lab Notebooks and the Imperial College London’s Guide on Electronic Laboratory Notebooks (ELN).

More Resources:

Network of the National Library of Medicine. (2024). Data Glossary, Electronic Laboratory Notebook (ELN). https://www.nnlm.gov/guides/data-glossary/electronic-laboratory-notebook-eln