
DataHub: RPg Students

RPg Students

Submission Guide

Step 1:

Before you deposit and submit your dataset, please review the following:

  1. Depositor’s Agreement

  2. What to Deposit

  3. When to Redact and Anonymize

  4. Open Access

  5. What Can I Upload

 

Step 2:

Review the “Restricted Access Procedures for RPg students” page to decide whether any restricted access options should be applied to your data files. Please consult your supervisor on the suitability of your chosen access control option before submission.

If it is necessary to upload a metadata record only, skip step 3 and proceed to step 4 directly.

 

Step 3:

  • Prepare a README file for submission by using the README file template.

  • Organize your data files into one main folder. Inside the main folder, you may create sub-folders to classify your files. Alternatively, you may classify your data files into a few main folders*. Please refer to this record for an example.

  • Separate your README file from the main folder(s) as shown below:

  • Compress the main folder(s) into a .zip file. Please avoid the .rar format, as .rar files cannot be previewed on DataHub.

 

*Note: If your dataset main folder is larger than 5GB, we recommend dividing it into multiple .zip files of at most 5GB each for a smoother upload.
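
The compression and 5GB-splitting steps above can be scripted. The following is a minimal sketch, not an official DataHub tool: the function names `plan_archives` and `zip_main_folder` and the `_partN` file naming are assumptions made for illustration. It groups files by their uncompressed size, so the size of each resulting archive is only approximately bounded by the limit.

```python
import zipfile
from pathlib import Path

MAX_BYTES = 5 * 1024**3  # 5GB per archive, following the note above


def plan_archives(files, max_bytes=MAX_BYTES):
    """Group (path, size) pairs into batches whose total size stays within max_bytes.

    A single file larger than max_bytes ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for path, size in files:
        # Start a new batch when adding this file would exceed the limit.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def zip_main_folder(main_folder, out_dir, max_bytes=MAX_BYTES):
    """Write main_folder into one or more .zip files and return their paths."""
    main_folder = Path(main_folder)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in main_folder.rglob("*") if p.is_file())
    batches = plan_archives([(p, p.stat().st_size) for p in files], max_bytes)
    archives = []
    for i, batch in enumerate(batches, start=1):
        # Hypothetical naming: add a part suffix only when splitting is needed.
        suffix = f"_part{i}" if len(batches) > 1 else ""
        archive = out_dir / f"{main_folder.name}{suffix}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for p in batch:
                # Keep the main folder name as the top-level directory in the archive.
                zf.write(p, p.relative_to(main_folder.parent).as_posix())
        archives.append(archive)
    return archives
```

For example, `zip_main_folder("thesis_data", "upload")` would write the archive(s) into an `upload` directory. Keep the README file outside the main folder so that it is uploaded separately, as described in Step 3.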

 

Special Condition: Dataset Published on External Repository

If you have already deposited your dataset (raw data or processed data) in an external repository, such as a subject-specific repository in your field of study (e.g. SRA, ENA), you may upload a README file and provide the URL(s) or DOI link(s) of the deposited record without uploading the same data files onto DataHub.

Please include the link(s) under the metadata field “Related Datasets” in your dataset submission on DataHub:

 

Step 4:

To submit the dataset for examination, please access DataHub. Log in with your HKU Portal ID.

  • Under “My data”, click on “Create a new item record”*

  • Upload your README file and the zipped main folder by dragging them into the record or selecting them by clicking on “browse”.

  • Separate the README file from the main folder(s) under the same item record as shown below:

  • Give your dataset record a title in this recommended format: Supporting data for “title of your thesis”.

  • In “Descriptions”, briefly describe your data files and what they are about, using full sentences. Be as descriptive as possible, but avoid including any sensitive or confidential content.

  • Fill out the metadata fields by following the uploading via DataHub Interface guidelines. Note that “Resource Title” and “Resource DOI” are optional; leave them blank if you do not have any external resources (e.g. articles or datasets) to link to the record.

  • Apply an embargo period and restricted access if necessary. By default, the dataset will be set as Open Access under the Creative Commons (CC-BY-NC) license.
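
The recommended title format in the list above can be produced consistently with a small helper. This is only an illustrative sketch; the function name `dataset_title` is hypothetical and not part of DataHub.

```python
def dataset_title(thesis_title):
    """Return a record title in the recommended 'Supporting data for “...”' format."""
    # Remove surrounding whitespace and any quotation marks already present,
    # so the title is not quoted twice.
    cleaned = thesis_title.strip().strip('"“”')
    return f"Supporting data for “{cleaned}”"
```

Calling `dataset_title("My Thesis Title")` gives a string that can be pasted into the title field of the item record.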

 

*Note: If the total size of your dataset zip files is too large, you may consider uploading them with FTP Uploader. Please refer to the guidelines for uploading large files and making bulk uploads via FTP Uploader.

 

Step 5:

When depositing your data, you must reserve a DOI and generate a private link. Please refer to the related guides on separate pages:

Reserve a DOI: Click on the tab "Reserving DOI" on the "Publishing Data" page.

Generate Private Link: Click on the tab "Share Files" on the "Sharing Data" page.

Before submitting your data, make sure you have successfully reserved a DOI and generated a private link for your item record. They are necessary for you to proceed in your dataset submission process.

 

Step 6:

Tick the “Publish” box and follow the publishing data guidelines in order to submit your dataset.

 

Step 7:

Your submitted data will be sent to the data curator(s) in the Libraries for review. You will receive an email confirmation from the Libraries when the submitted data have been properly curated. If any amendments are required, the Libraries will contact you directly to discuss them. Do not proceed until you have received a reply from the Libraries.

 

Step 8:

Only after you have received the confirmation email from the Libraries should you fill in and submit the Dataset Submission Form. The Libraries will send you the link to the form by email. Please refer to the guidelines under the tab "Dataset Submission Form" on this page for a step-by-step guide on how to fill in the form. After successful submission, you may print out a hardcopy or save an electronic copy of the completed form for submission to your department or faculty as proof of submission.

 

Step 9:

Upon submission of the above online form, your primary supervisor will receive an email notification requesting his/her review of your submitted dataset. If comments are received from your supervisor, the Libraries will contact you directly for amendments. If no comments are received, or once the final amendments are complete, your dataset record will be released on DataHub and the dataset submission process is complete.

Beginning with the September 2017 intake, all HKU Research Postgraduate (RPg) students are responsible for

  1. using a data management plan (DMP), where applicable, to describe the use of data in preparation for, or in the generation of, their theses, and
  2. depositing, where applicable, a dataset in the HKU DataHub.

"RPg" includes the degrees of MPhil, PhD, and SJD.

The 2017-18 Graduate School Handbook describes these new regulations. Sections XX and XXI of the Handbook give the Procedures for MPhil and PhD, respectively. In these Procedures, the relevant paragraphs on data are:

  • MPH5 & PHD5 Probation and Confirmation of Candidature – for description of a data management plan (DMP)
  • MPH7 & PHD7 Period of Study – for describing when in the period of study a dataset, where applicable, is to be submitted
  • MPH14 & PHD14 Submission of Thesis for Examination – for description of dataset submission
  • MPH15 & PHD15 Thesis Examination – for consideration of DMP Entry results and dataset if applicable, and if desired by the examiners

The 2015 Policy on Research Data and Records Management asks that all researchers, including RPg students, describe properly and ethically in a Data Management Plan, at the beginning of their project, how they will collect, organize, store, and finally deposit a dataset (where applicable) at the end of their project. The HKU Libraries provide Research Data Services to enable this process.

To support research integrity and a high-quality data curation process for the research outputs of our Research Postgraduate (RPg) students, additional procedures apply when RPg students submit their datasets. The specific requirements are as follows:

 

1. WHAT TO DEPOSIT

The emphasis of the HKU RDM initiative is on "research integrity". Research results claimed in publications must be reproducible. Replication datasets must be preserved to enable this later reproducibility. All data, scripts, questionnaires, codebooks etc. necessary for a third party to arrive at the same research results claimed must be preserved.

As part of the data deposit, please indicate which datafiles are raw data (i.e. data that indicate the original data collection process such as questionnaires) and which are processed data (i.e. data ready for analysis in publications) – both are needed eventually, but raw data files are essential for any completion report.

Raw data may contain personal identifiers and must therefore be stored under "Restricted Access". If the data contain sensitive, confidential or restricted data per the HKU Policy on Research Ethics, the researcher may additionally choose to prepare an anonymized version for open access, with the approval of the relevant IRBs or ethics committees.

If the data include personal data, they should be kept confidential:

  • Personal data from clinical research (i.e. Institutional Review Board (IRB) approved)
    • Provide the approval code, consent forms, and ethical application form when available. Please state the risk of re-identification from the different datafiles and how the risk has been minimised for any dataset intended for sharing.
  • Personal data from non-clinical research (i.e. Human Research Ethics Committee (HREC) approved)
    • Provide the approval code, consent forms, and ethical application form. Please state the risk of re-identification from the different datafiles and how the risk has been minimised for any dataset intended for sharing.

If data includes interviews,

  • Interview transcripts
  • Blank questionnaire & interviewer guidelines

If field research data,

  • provide a copy of the field research notebook in digital format, preferably machine readable.

If lab research data,

  • provide a copy of working papers and/or lab research notebooks in digital format, preferably machine readable.

If simulated data,

  • how was it generated? Please either explain or provide a link.

If other types of data, such as image or video data, or creative or design data,

  • please explain what type of data it is and how it was collected or generated.

If software is needed to read or analyze any of the datafiles,

  • please provide full details of the software name, the version needed, and any instructions necessary to obtain the software. If you have written your own script for analyzing the data, please also include this script in the final deposit.

When you upload your files onto DataHub, you are also required to prepare and submit a README file alongside your dataset.

The README file gives your supervisor and the data curators an overview of your data: how the data are organized, how the files are named, the relationships between files, details of the data, methodological information, and so on. A clear and tidy file structure helps viewers understand your findings and locate the resources they need more easily, which in turn avoids prolonged processing time for your data deposit.

Your README file should consist of four main components:

  1. General Information

  2. Data and File Overview

  3. Data Description for Each File

  4. Methodological Information (if applicable)

For specific requirements under each section, please refer to the README file template below:
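
Independently of the official template, a plain-text skeleton with the four components listed above can be generated as a starting point. This is a minimal sketch under stated assumptions: the function name `readme_skeleton`, the "README for:" header line, and the placeholder text are illustrative only, and the official README file template remains the authoritative source for what each section must contain.

```python
# The four main components named in this guide.
README_SECTIONS = [
    "General Information",
    "Data and File Overview",
    "Data Description for Each File",
    "Methodological Information (if applicable)",
]


def readme_skeleton(dataset_title):
    """Return a plain-text README outline with one numbered heading per component."""
    lines = [f"README for: {dataset_title}", ""]
    for number, heading in enumerate(README_SECTIONS, start=1):
        # The placeholder marks where the depositor fills in the details.
        lines += [f"{number}. {heading}", "", "[to be completed]", ""]
    return "\n".join(lines)
```

The returned text can be written to a file named, for example, `README.txt` and then completed by hand according to the official template.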

 

2. Data Curation

When all necessary files have been submitted successfully on DataHub, the University of Hong Kong Libraries will start the data curation process with a series of checks and digital preservation activities. If the descriptions or information in the README file are found to be ambiguous, the Libraries may contact the submitter for more information to ensure that the datasets have been duly submitted.

 

3. Supervisor Review

Your submitted dataset and README file will be sent to your primary supervisor for review. This is to ensure the appropriate dataset is duly submitted. 

 

Please refer to the step-by-step guide on the submission procedures by moving to the next tab.

Dataset Submission Form

Reminder: Do NOT submit this form until you have received email confirmation from the Libraries.

1. Go to the Online Dataset Submission Form at <https://lib.hku.hk/dataset-submission/>

2. Log in with your HKU Portal ID.

3. Your details will be filled in automatically after you have logged in.

4. Enabling dataset destruction is optional. If necessary, tick the “Enable Dataset Destruction” box and enter the date of destruction. When making the request, please strictly follow the University requirement on the minimum retention period.

5. Enter the DOI that has been reserved for your dataset and click the “Resolve” button. The details of your dataset will appear automatically. Double-check that it is the correct record that you are going to submit.


6. Click on “Submit”. An email notification will be automatically sent to your primary supervisor for his/her review.

7. If no further comments are received from your supervisor, print out and submit a hardcopy of the Dataset Submission Form to your department or faculty for record.

Restricted Access Procedures for RPg Students

There are occasions when you may wish to upload your data with access control conditions, especially for sensitive data that contain personal identifiers. The steps below will guide you through setting up access restrictions on your files.

If your data contain sensitive, confidential or restricted data per the HKU Policy on Research Ethics, you are required to handle those data using one of the two methods below:

  1. Upload the data under restricted access

  2. Make and upload a version that anonymizes the data, for public access (with the approval of relevant IRBs or ethics committees)

We recommend that you consult your supervisor on the suitability of your chosen access control option before submission. If it is necessary to upload a metadata record only, please refer to the “Metadata Records Only” page under Publishing Data for the procedures.

To upload your data files under restricted access, which allows only you and your supervisor to access those files, please follow the procedures below:

Step 1: Click on “Apply embargo & restricted access” on the item record.

 

Step 2: Leave the option “Nobody” as your currently selected option.

 

Step 3: Click on “On files only” from the dropdown menu under Embargo type.

 

Step 4: Decide how long access to your data files will be restricted.

  • For fully confidential items, select “Permanent Embargo” under Embargo period.
  • For a fixed period of time, select the appropriate length of time or select a specific date.

Note: For RPg student research dataset submissions, a dataset under this setting will be accessible only to the student and his/her supervisor(s). The examiner(s) of the thesis may also request access to the data for the thesis examination process (MPH15 & PHD15 of the Procedures). When the student and supervisor(s) leave the university, the Dean of the faculty will be granted access permission.

Dataset Submission Procedures for RPg Students (Part 1)

Dataset Submission Procedures for RPg Students (Part 2)

RDM Procedures for RPG Students: Dataset Submission