
Research Data Management

A practical guide to best practices for managing research data and code

Sharing data FAIRly


With the growth of open science in recent decades, the global research community is increasingly sharing or publishing research data and making it available to others. Many research funders and academic journals also impose data policies that require data accessibility.

Widely adopted by academic communities, the FAIR data principles, first introduced in 2016, are a set of guidelines that help researchers make better use of their research data and reach a broader audience with it. Under these principles, shared research data should be findable, accessible, interoperable, and reusable.

When planning to share your research data, consider the following to maximize its reusability:

  • Share your data and code in open, trusted repositories

  • Get a persistent identifier (e.g. DOI) for your data and use it in the associated publication so that others can cite it

  • Document your data, code, workflows, and the software required to open the data in separate file(s), and share them alongside your data

  • Select appropriate data formats and tools for higher interoperability

  • Use an open license for your shared data and code

Read our guide on Open Data for more details on the benefits of data sharing, the FAIR data principles, and selecting repositories for sharing.

Licensing Data


Research data is intellectual property that may be owned by the researchers, the supporting institution, or the funder. HKU researchers may refer to the university’s policies on Intellectual Property Rights and on Research Data and Records Management for the relevant regulations.

When sharing your research data, using an open license is one of the recommended best practices to increase the reusability of your data. An open license specifies what can and cannot be done with an original work regardless of its form. It grants permissions and states restrictions. According to the definition by Opendefinition.org, an open license is one which grants permission to access, re-use and redistribute a work with few or no restrictions. 

Understanding the licensing terms of a dataset before reusing it is crucial to prevent copyright infringement and other intellectual property concerns.  

The most common open licenses used for academic work and datasets are the Creative Commons (CC) licenses. See the Creative Commons licenses section below for the terms of each CC license, and our guide page for more information.

For open licenses frequently used for open-source software or code, read our guide on Open-Source Software and Codes.

Creative Commons licenses

There are six Creative Commons license options, plus the CC0 public domain dedication. A Creative Commons license on a copyrighted work answers the question: What can a user do with this work?

Elements | Copy and distribute the material | Attribute the creator | Distribute, remix, adapt, and build upon the material | Share the modified material under identical terms | Commercial use
--- | --- | --- | --- | --- | ---
CC BY | Allowed | Required | Allowed | Not required | Allowed
CC BY-SA | Allowed | Required | Allowed | Required | Allowed
CC BY-NC | Allowed | Required | Allowed | Not required | Prohibited
CC BY-NC-SA | Allowed | Required | Allowed | Required | Prohibited
CC BY-ND | Allowed | Required | Prohibited | Not applicable (modification prohibited) | Allowed
CC BY-NC-ND | Allowed | Required | Prohibited | Not applicable (modification prohibited) | Prohibited
CC0 | Allowed | Not required | Allowed | Not required | Allowed
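As an illustration, the license table above can be encoded as a small lookup structure for checking a planned reuse. This is a minimal sketch under stated assumptions: the names `LicenseTerms`, `CC_TERMS`, and `reuse_obligations` are hypothetical, the values only mirror the table (not the full legal text of the licenses), and the result is not legal advice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LicenseTerms:
    """One row of the table above (hypothetical structure for illustration)."""
    attribution_required: bool
    adaptation_allowed: bool
    share_alike_required: bool
    commercial_use_allowed: bool


# Copying and distributing the unmodified material is allowed in every row,
# so that column is not stored.
CC_TERMS = {
    "CC BY": LicenseTerms(True, True, False, True),
    "CC BY-SA": LicenseTerms(True, True, True, True),
    "CC BY-NC": LicenseTerms(True, True, False, False),
    "CC BY-NC-SA": LicenseTerms(True, True, True, False),
    "CC BY-ND": LicenseTerms(True, False, False, True),
    "CC BY-NC-ND": LicenseTerms(True, False, False, False),
    "CC0": LicenseTerms(False, True, False, True),
}


def reuse_obligations(license_name: str, *, adapt: bool, commercial: bool) -> List[str]:
    """Return the obligations for a planned reuse, or raise if the table prohibits it."""
    terms = CC_TERMS[license_name]
    if adapt and not terms.adaptation_allowed:
        raise PermissionError(f"{license_name} prohibits modification")
    if commercial and not terms.commercial_use_allowed:
        raise PermissionError(f"{license_name} prohibits commercial use")

    obligations = []
    if terms.attribution_required:
        obligations.append("attribute the creator")
    if adapt and terms.share_alike_required:
        obligations.append("share the modified material under identical terms")
    return obligations
```

For example, `reuse_obligations("CC BY-SA", adapt=True, commercial=False)` lists both attribution and share-alike, while requesting a commercial reuse of a CC BY-NC dataset raises `PermissionError`.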

 

Note:

Different publishers may have different license to publish (LTP) agreements.

As an author, when you choose a license, you should read through the license terms and consider which license suits you best.

For example, would a CC BY-NC-ND license suit you if you want to grant only the journal publisher (and not other users) the right to sell or rent your article?

 

Learn more and license chooser: 

Disclaimer: The information and materials provided on the website are for general informational purposes only and do not constitute legal advice.

Citing Data


If you use third-party data in your research or publications, you must cite the dataset to give credit to the original author or producer and to help other researchers locate the materials.

Acknowledgements and citations contribute towards fostering a culture of sharing data without fear of ideas or recognition being stolen. Data citations also aid in the transparency of how data is being used. By citing data, original authors and new researchers can easily track how the data are being used to answer different questions. 

 

In general, a citation for a dataset includes the following components:

  • Authors and their affiliated institutions/organizations 

  • Title 

  • Version 

  • DOI (or URL if a unique identifier is not available) 

  • Creation date 

  • Additional fields may also be specified or required by individual repository/journal 

 

The Australian Research Data Commons (ARDC) provides standard data and software citation templates and examples:

 

Standard data citation 

Template 
Creator (Publication Year): Title. Publisher. (resourceTypeGeneral). Identifier 

Example 
Hanigan, Ivan (2012): Monthly drought data for Australia 1890-2008 using the Hutchinson Drought Index. The Australian National University Australian Data Archive. (Dataset) http://doi.org/10.4225/13/50BBFD7E6727A

 

Standard software citation 

Template 
Creator (Publication Year): Title. Version No. Publisher. [resourceTypeGeneral]. Identifier. 

Example 
Xu, C., & Christoffersen, B. (2017). The Functionally-Assembled Terrestrial Ecosystem Simulator Version 1. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States). [Software]. https://doi.org/10.11578/dc.20171025.1962
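The two templates above can be filled in programmatically when citations are generated from repository metadata. The following is a minimal sketch, not an official ARDC tool: the function names and parameters are hypothetical, and `version` is assumed to be a free-form string such as "Version 1". The output follows the punctuation of the templates, which differs slightly from the published examples.

```python
def format_data_citation(creator, year, title, publisher, resource_type, identifier):
    """Creator (Publication Year): Title. Publisher. (resourceTypeGeneral). Identifier"""
    return f"{creator} ({year}): {title}. {publisher}. ({resource_type}). {identifier}"


def format_software_citation(creator, year, title, version, publisher,
                             resource_type, identifier):
    """Creator (Publication Year): Title. Version No. Publisher. [resourceTypeGeneral]. Identifier."""
    return (
        f"{creator} ({year}): {title}. {version}. {publisher}. "
        f"[{resource_type}]. {identifier}."
    )


# Usage with the fields of the dataset example above.
data_citation = format_data_citation(
    creator="Hanigan, Ivan",
    year=2012,
    title="Monthly drought data for Australia 1890-2008 using the Hutchinson Drought Index",
    publisher="The Australian National University Australian Data Archive",
    resource_type="Dataset",
    identifier="http://doi.org/10.4225/13/50BBFD7E6727A",
)
print(data_citation)
```

Keeping the fields separate until the final formatting step makes it easy to add the extra fields that an individual repository or journal may require.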

 

More resources: 

Digital Curation Centre, How to Cite Datasets and Link to Publications 

European Union, Data citation: A guide to best practice 

International Association for Social Science Information Services & Technology (IASSIST), Quick guide to data citation 

Open Science 101, Module 3: Open Data – Using Open Data