HKU DataHub: The Guide

RPg Students Submission Guide

Submission Guide

Please allow sufficient time for data review and curation by the Libraries after your dataset submission. Once your data have been submitted, the Libraries will contact you within 5 working days for confirmation or further amendment(s).

 

Step 1:

Before you deposit and submit your dataset, please review the following:

  1. Depositor’s Agreement

  2. What to Deposit

  3. When to Redact and Anonymize

  4. Open Access

  5. What Can I Upload

 

Step 2:

Review the “Restricted Access Procedures for RPg students” page to decide whether any restricted access options should be applied to your data files. Please consult your supervisor on the suitability of your chosen access control option before submission.

If you consider it necessary to deposit a metadata record only with the University* (this requires a valid justification and your supervisor's approval), please forward your justification and your supervisor's approval to researchdata@hku.hk for arrangements.

 

*Please read points 4, 5, 6, 7 and 9 of the HKU Policy on Research Data and Records Management.

 

Step 3:

  • Prepare a README file for submission by using the README file template.

  • Organize your data files into one main folder. Inside the main folder, you may create sub-folders to classify your files. Alternatively, you may also classify your data files into multiple sub-folders. Please refer to this record for an example.

  • Separate your README file from the main folder(s) as shown below:

*Recommended Practice*

If your dataset folder is larger than 5GB, we recommend that you either 1) upload it via the FTP Uploader*, or 2) divide it into multiple folders of at most 5GB each for a smoother upload via the web browser interface.

If the main folder is smaller than 5GB:

If all the data files are larger than 5GB in total, separate them into multiple folders:

 

*Notes on using FTP Uploader:

**If you have a very large volume of data files, we recommend breaking your dataset down into multiple zipped folders of a smaller size each.

**Only zipped folders (.zip) can be uploaded via the FTP Uploader. Unzipped folders will fail to transfer to DataHub.
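
As an illustration of the notes above, the following Python sketch groups the files of one main folder into batches and writes each batch to its own .zip archive. It is a minimal sketch and not a DataHub tool: the function names `plan_batches` and `zip_batches`, the archive naming pattern, and the batch limit are all assumptions. In particular, this guide does not state a size limit for the FTP Uploader; the 5GB figure (taken here as 5 × 1024³ bytes) is borrowed from the web browser recommendation above. The limit is checked against the uncompressed file sizes.

```python
from pathlib import Path
import zipfile

# Assumption: "5GB" is read as 5 * 1024**3 bytes; adjust if needed.
MAX_BYTES = 5 * 1024**3


def plan_batches(sizes, limit=MAX_BYTES):
    """Group (item, size) pairs, in order, into batches whose total size
    stays within `limit`. An item larger than `limit` gets its own batch."""
    batches, current, total = [], [], 0
    for item, size in sizes:
        # Start a new batch when adding this item would exceed the limit.
        if current and total + size > limit:
            batches.append(current)
            current, total = [], 0
        current.append(item)
        total += size
    if current:
        batches.append(current)
    return batches


def zip_batches(root, out_dir, limit=MAX_BYTES):
    """Write the files under `root` into numbered .zip archives in `out_dir`.
    Paths inside each archive keep the main folder name as their prefix."""
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    batches = plan_batches([(p, p.stat().st_size) for p in files], limit)
    archives = []
    for number, batch in enumerate(batches, start=1):
        # Hypothetical naming pattern: <main folder>_part01.zip, _part02.zip, ...
        target = out_dir / f"{root.name}_part{number:02d}.zip"
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in batch:
                archive.write(path, path.relative_to(root.parent))
        archives.append(target)
    return archives
```

For example, calling `zip_batches("my_dataset", "upload")` on a hypothetical folder `my_dataset` would write `upload/my_dataset_part01.zip` and so on. Keep the README file outside these archives, since it is uploaded separately from the main folder(s).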

 

Special Condition: Dataset Published on External Repository

If you have already deposited your dataset (raw data or processed data) in an external repository, such as a subject-specific repository in your field of study (e.g. SRA, ENA), you may upload a README file and provide the URL(s) or DOI link(s) of the deposited record without uploading the same data files onto DataHub.

Please include the link(s) under the field “Related Materials” in your dataset submission on DataHub. Please refer to the uploading data page (Point 10) for how to add the entries under the "Related Materials" field.

 

Step 4:

To submit the dataset for examination, please access DataHub. Log in with your HKU Portal ID.

******************************************************************************************

If you encounter redirection issues after logging in, especially if your HKU Portal login credentials have been saved in the browser, try the following alternatives for logging in to DataHub:

  1. Clear the cache, browsing history, and cookies of the browser you are currently using, then re-open the browser and try to log in to DataHub again.
  2. Try to log in to DataHub in Chrome (Incognito) / Firefox (Private) mode.
  3. Try to log in to DataHub with a web browser other than the one you are using, such as Chrome, Firefox, Edge, or Safari.

*******************************************************************************************

After the HKUL Authentication, on the DataHub interface, follow the steps below:

  • Under “My data”, click on “Create a new item record”.

  • Upload your README file and the data folder(s) by dragging them into the record, or by clicking on “Browse files” to select files or “Browse for folders” to select folders.

  • Separate the README file from the main folder(s) under the same item record as shown below:

 

Note 1: Click on "Manage files" to view the uploading status of your files. A "tick" icon refers to "uploading completed", and an orange file icon refers to "uploading successful but file cannot be scanned". 

   

Note 2: If your data file / folder fails to upload, the record will be highlighted in red with the message "Something went wrong" shown beside the file. Please remove the record and upload the file / folder again.

 

  • Fill out the metadata fields below:

Field name(s): Requirement

Title: Assign a title following this recommended format: Supporting data for “title of your thesis”

Descriptions:
  1. Explain the scope of your data, e.g. what your research project / studies are about, sample size, target participants, etc.
  2. Describe in full sentences what your data files contain. Be as descriptive as possible, but avoid including any sensitive or confidential content.

Keywords: Enter subject-specific keywords that describe your data. Multiple keywords can be entered.

Related Materials: Enter the DOI(s) and article/dataset title(s) if you have published any journal articles or datasets as an author in an academic journal or an external repository. Leave this field blank if you do not have any.

License: Select the license you wish to apply to this dataset record. The default license is CC BY-NC if no other license is chosen.

You may refer to the uploading data via DataHub Interface guidelines for a detailed guide with illustrations. 

Please apply an embargo period and restricted access if necessary. By default, the dataset will be set as Open Access under the Creative Commons (CC BY-NC) license. Please refer back to Step 2 if you are in doubt.

 

Step 5:

When you are depositing your data, “Reserve a DOI” and “Share with private link” are both mandatory. Please refer to the related guides on separate pages:

Reserve a DOI: Click on the tab "Reserving DOI" on the "Publishing Data" page.

Share with private link: Click on the tab "Share Files" on the "Sharing Data" page.

 

Before submitting your data, make sure you have successfully reserved a DOI and generated a private link for your item record. They are necessary for you to proceed with the remaining dataset submission process.

 

Step 6:

Press the “Submit for review” button to submit your dataset.

 

Step 7:

Your submitted data will be sent to the data curator(s) in the Libraries for review. You will receive an email confirmation from the Libraries once the submitted data have been properly curated. If any amendments are required, the Libraries will contact you within 5 working days to discuss them with you directly. The review process may take longer if the dataset submission does not meet the required standards, so please allow sufficient time for data curation by the Libraries. Do not proceed to the next step until you have received a reply from the Libraries.

 

Step 8:

Fill in and submit the Dataset Submission Form only after you have received confirmation from the Libraries via email. The link to the form will be sent to you by the Libraries via email. For a step-by-step guide on how to fill in the form, click on the tab "Dataset Submission Form" on this page. After successful submission, you may print out a hard copy or save an electronic copy of the completed form to submit to your department or faculty as proof of submission.

 

Step 9:

Once you have submitted the above online form, your primary supervisor will receive an email notification requesting his/her review of your submitted dataset. If your supervisor has comments, the Libraries will contact you directly for amendments. If no comments are received, or once the final amendments are completed, your dataset record will be released on DataHub and the dataset submission process is complete.

Handbook for PhD and MPhil Programmes 

 

As documented in the Handbook for PhD and MPhil Programmes issued by the Graduate School, regulations on data management plans (DMP) and dataset submission are applicable to RPg students enrolling from September 2017 onwards. 

Appendices 20 & 21, Procedures for the Degree of Master of Philosophy (MPhil) and Doctor of Philosophy (PhD), specify:

 

For probation and confirmation of candidature: 

  1. A candidate who registers in September 2017 and thereafter has to submit a Data Management Plan (DMP) before the expiry of his/her probationary period if data is to be collected or generated as part of the research.   

For submission of thesis for examination: 

  1. A candidate who registers in September 2017 and thereafter shall also submit a dataset of his/her research, where applicable, at the time he/she submits the thesis for examination. 

 

The HKU Policy on Research Data and Records Management 

 

The University’s Policy on the Management of Research Data and Records was approved by the Senate at its meeting on May 5, 2015. Revisions were made and approved by the Senate on September 3, 2024.

The policy states that, in the absence of any arrangement to the contrary, the University owns the research data of projects conducted at the University. It also sets the minimum retention period for research data and records at five years after data collection or the most recent publication related to the data, whichever is later.
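
The "whichever is later" rule above can be worked through with a short Python sketch. It is an illustration only, under stated assumptions: the function names are hypothetical, and "five years after" is read as five calendar years counted from the later of the two dates. The policy text itself governs how the period is actually counted.

```python
from datetime import date

RETENTION_YEARS = 5  # minimum retention period stated in the policy


def add_years(day, years):
    """Return the same calendar day `years` later.
    Assumption: 29 February maps to 28 February in a non-leap year."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, day=28)


def earliest_destruction_date(collected, last_publication=None):
    """Earliest date on which the minimum retention period has elapsed:
    five years after data collection or after the most recent related
    publication, whichever is later."""
    if last_publication is None:
        reference = collected
    else:
        reference = max(collected, last_publication)
    return add_years(reference, RETENTION_YEARS)
```

For example, for data collected on 1 March 2020 with a related publication on 15 June 2022, `earliest_destruction_date(date(2020, 3, 1), date(2022, 6, 15))` returns `date(2027, 6, 15)`, because the publication date is the later of the two.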

The policy asks HKU researchers, i.e. anyone who conducts research at the University, to be responsible for  

  • ensuring that research data and records owned by the University are accessible to the University; 

  • developing clear procedures for the collection, storage, use, re-use, access and retention or destruction of the research data and records associated with their research, and documenting these procedures in a research data management plan (DMP).

HKU Libraries provides relevant research data services to enable this process.

 

Dataset Submission for RPg Students: What to deposit 

 

Under the HKU policy, “research data” is defined as the recorded information (regardless of the form or the media in which they may exist) necessary to support or validate a research project’s observations, findings or outputs.

 

1. Your data files 

In other words, research data are all materials and information necessary for a third party to arrive at the same research results claimed, including but not limited to raw data, processed data, protocols, scripts, questionnaires, and code used for data analysis or visualization. All of these must be preserved.

Below are several examples: 

  • Interviews: transcripts, questionnaires, interviewer guidelines

  • Field and/or lab work: research notebook in digital format

  • Multimedia files: images, video clips, audio files

  • Software: code, metadata of your software, any instructions necessary to obtain the software

  • Source code: any code that was written to collect, analyze, or visualize your data

 

2. Documentation describing your data files 

As part of the data deposit, RPg students should indicate in a separate document which data files are raw data (i.e. data that indicate the original data collection process, such as questionnaires) and which are processed data (i.e. data ready for analysis in publications). Documentation such as a README file, data dictionary, and codebooks should be well prepared and submitted alongside your research data.

The documentation, e.g. a README file, aims to give your supervisor and the data curators an overview of your data. It describes how you organize your data, how the files are named, the relationships between files, the details of your data, methodological information, etc. A clear and tidy file structure allows viewers to understand your findings better and locate the resources they need more easily, which in turn helps avoid a prolonged data deposit processing time for your research outputs.

 

Your README file should consist of four main components:

  1. General Information

  2. Data and File Overview

  3. Data Description for Each File

  4. Methodological Information (if applicable)
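
As a rough illustration of the four components listed above, the following Python sketch generates an empty README outline with one numbered heading per component. It is not the official README file template and does not replace it: the function name `readme_skeleton`, the plain-text layout, and the "[to be completed]" placeholder are assumptions made for this example.

```python
# The four main components named in this guide.
SECTIONS = [
    "General Information",
    "Data and File Overview",
    "Data Description for Each File",
    "Methodological Information (if applicable)",
]


def readme_skeleton(title, sections=SECTIONS):
    """Return a plain-text README outline: an underlined title followed by
    one numbered heading and a placeholder line per section."""
    lines = [title, "=" * len(title), ""]
    for number, heading in enumerate(sections, start=1):
        lines += [f"{number}. {heading}", "", "[to be completed]", ""]
    return "\n".join(lines)
```

The returned text can be written to a file such as `README.txt` (a hypothetical file name) and then filled in section by section, following the requirements in the official template.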

For specific requirements under each section, please refer to the README file template below:

 

Sensitive data 

For sensitive data that may contain personal identifiers, or confidential or restricted data as per the HKU Policy on Research Ethics, depositors and/or PIs are strongly advised to submit an anonymized copy for long-term retention. Such data must also be submitted under restricted access unless approval from the relevant Institutional Review Board (IRB) or ethics committee, e.g. the Human Research Ethics Committee (HREC), has been obtained.

Dataset Submission Form

Reminder: Do NOT submit this form until you have received email confirmation from the Libraries.

1. Go to the Online Dataset Submission Form at <https://lib.hku.hk/dataset-submission/>

2. Log in with your HKU Portal ID.

3. Your details will be automatically filled after you have logged in.

4. Enabling dataset destruction is optional. If necessary, tick the “Enable Dataset Destruction” box and enter the date of destruction. Please strictly follow the University's requirement on the minimum retention period when making the request.

5. Enter the DOI that has been reserved for your dataset and press the “Resolve” button. The details of your dataset will appear automatically. Double-check that this is the correct record to submit.

6. Click on “Submit”. An email notification will be automatically sent to your primary supervisor for his/her review.

7. If no further comments are received from your supervisor, print out and submit a hard copy of the Dataset Submission Form to your department or faculty for record.

Restricted Access Procedures for RPg Students

On some occasions you may wish to upload your data with access control conditions, especially for sensitive data that contain personal identifiers. The steps below guide you through setting up access restrictions on your files.

If your data contain sensitive, confidential or restricted data per the HKU Policy on Research Ethics, you are required to handle those data using one of the two methods below:

  1. Upload the data under restricted access

  2. Make and upload a version that anonymizes the data, for public access (with the approval of relevant IRBs or ethics committees)

We recommend that you consult your supervisor on the suitability of your chosen access control option before submission.

 

To upload your data files under restricted access, which allows only you and your supervisor to access those files, please follow the procedure below:

Step 1: Click on “Add embargo and restricted access” on the item record editing page.

 

Step 2: Leave the option “Nobody” as your currently selected option.

 

Step 3: Click on “On files only” from the dropdown menu under Embargo type.

 

Step 4: Decide how long access to your data files will be restricted.

  • For fully confidential items, select “Permanent Embargo” under Embargo period.
  • To apply an embargo for a fixed period of time, select the appropriate length of time or select a specific date.

Note: For RPg student research dataset submissions, a dataset under this setting will be accessible only to the student and his/her supervisor(s). The examiner(s) of the thesis may also request access to the data for the thesis examination process (MPH15 & PHD15 of the Procedures). When the student and supervisor(s) leave the University, the Dean of the faculty will be granted access permission.

Video - Dataset Submission Procedures for RPg Students

RDM Procedures for RPG Students: Dataset Submission