LibGuides: HKU DataHub: The Guide: Uploading Data

Ways to upload your data

There are several ways to upload your data, depending on the size of the files:

Through the HKU DataHub Interface, where you can drag and drop files of up to 5GB (default limit). If you need to upload a single file that is larger than 5GB, please use the FTP Uploader.
Using the FTP Uploader or the Figshare API, especially when working with large files or bulk uploads.
Through the GitHub integration, if you want to archive a copy of the code deposited in your GitHub account.

Uploading Data

HKU DataHub Interface

1. After you have signed in, click on the “My data” page and press the +Create a new item button from the top left of the page to start the uploading process.

2. Click on “Browse files” or “Browse for folders” in the uploading box to select the file(s), or simply drag and drop the file(s) into the box. Then fill in the necessary information related to your uploaded data. Please note that fields marked with a * icon are mandatory.

3. Enter the title for your item under "Item title".

4. Select the appropriate item type for the uploaded file(s). Click on "Change item type"; in the box that pops up, select the item type from the list. The selected type will turn green. Then click on "Apply changes".

5. In the “Authors” field, more than one author, or co-authors, can be added. You may drag the green boxes to rearrange the order, or click on the cross button next to an author name to delete it. If the author cannot be found in the dropdown menu, click on “Add author details” to add the author manually.

6. Categories shall be selected from the dropdown menu. You are required to choose the primary category by clicking on the "Next" arrow, and select one or more sub-categories from the list by ticking the box(es).

7. Enter the "keywords" that specifically describe your data. Move your cursor to the input box and type the word. Press "Enter" after each word to save it.

8. Enter a "description" that introduces your data files / dataset / research project in detail.

9. Add the funding information under “Funding”. Multiple organizations can be added by clicking on “Add another”.

10. If you would like to link your dataset record with a related article (e.g. a peer-reviewed publication) or any other related materials (e.g. a project website or online resources), you can enter identifiers such as a DOI or URL under the section “Related Materials”.

First, click on "Manage materials".

Enter the identifier, such as a DOI or a URL, under "identifier", and enter a title for this material, e.g. the title of the linked journal article.

Select the identifier type from the dropdown box, and then select the relation type between the linked material(s) and the dataset. Click on "Add material" to add the related material. Multiple materials can be entered. You may continue to add the next material on the form. Click on "Done" when it is completed.

The entered materials will be listed at the bottom of the dataset record page. If you ticked the box "Show in linkout area", a box highlighting the link will appear on the right side of your dataset item record page.

You may refer to an example of a dataset record linked with external resources.

11. Select the appropriate "license" you wish to apply to this dataset record. The default license will be CC BY-NC if no other license type is chosen.

12. Click on “Save changes”. You may wish to preview how your dataset record will look after publication by clicking on "Preview item".

13. At this stage, your data have been uploaded but are not yet publicly available. To make them permanently available, follow the steps in the section Publishing an item under Publishing Data.

*************************************************************************************************************************

Before publication, you are advised to review the following:

Restricted Access can be applied to your dataset record before publication. Please read the Restricted Access section for details.
Reserve a DOI for your dataset. You may use the reserved DOI for citation before the dataset is published.

FTP Uploader

The FTP Uploader allows you to easily and securely upload files to your account directly from your computer by using a secure FTP connection. To use this method, you need to install an FTP client such as Filezilla (any FTP client will work).

When you have successfully downloaded and installed Filezilla on your device, please refer to the step-by-step guidelines below on how to establish an FTP connection between Filezilla and HKU DataHub, as well as how to upload files via Filezilla.

**If you have a very large volume of data files, you are recommended to split your dataset into multiple zipped folders of smaller size.

**Only zipped folders (.zip) can be uploaded via the FTP Uploader. Unzipped folders will fail to transfer to DataHub.
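
The splitting recommended above can also be scripted. The following is a minimal sketch in Python, not an official HKU DataHub tool: the function names, the part001.zip naming pattern and the size limit are illustrative assumptions. It groups files by their uncompressed size, so each resulting .zip will usually be somewhat smaller than the limit you pass in.

```python
import zipfile
from pathlib import Path


def plan_batches(files, max_bytes):
    """Group (path, size) pairs into batches whose total size stays within max_bytes.

    A single file larger than max_bytes is placed in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for path, size in files:
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def zip_in_parts(source_dir, out_dir, max_bytes):
    """Write part001.zip, part002.zip, ... into out_dir and return their paths."""
    source_dir, out_dir = Path(source_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    batches = plan_batches([(p, p.stat().st_size) for p in files], max_bytes)
    archives = []
    for index, batch in enumerate(batches, start=1):
        archive = out_dir / f"part{index:03d}.zip"  # hypothetical naming pattern
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for path in batch:
                # Store paths relative to the dataset root so the folder layout survives unzipping.
                zf.write(path, arcname=path.relative_to(source_dir).as_posix())
        archives.append(archive)
    return archives
```

For example, zip_in_parts("my_dataset", "my_dataset_zipped", 2 * 1024**3) would produce archives built from roughly 2GB of input files each. Keep the output folder outside the source folder so that a second run does not pick up the archives from the first.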

Step 1

Retrieve FTP Username and Password on HKU DataHub

Go to HKU DataHub and login with your HKU credentials. Click on “Applications” from the drop-down menu at the top right corner.

Mark down the “Username” for your account and press on “Generate Password” to retrieve the password for later use.

Step 2

Establish Connection Settings

Open Filezilla and select File > Site Manager…

Follow the instructions step by step in the below image:

First, create a "New site" and rename it to "HKU DataHub" or any name that you will recognize. On the right-hand side of the panel (i.e. items 3-4), under the "General" tab, enter the settings below:

Host: ftps.figshare.com

Port: 21

Protocol: FTP - File Transfer Protocol

Logon Type: Ask for Password

User: The username you copied from HKU DataHub in Step 1

Now, switch to the "Transfer Settings" tab and select Passive under "Transfer mode". When everything is set, press the Connect button.

A pop-up window will ask you to enter the Password that you copied from HKU DataHub in Step 1.

Press OK to proceed.
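
If you prefer a script to a graphical FTP client, the same connection settings can be sketched with Python's standard ftplib module. This is only a sketch under assumptions: the host, port and passive mode come from the settings above, but the use of explicit FTP over TLS (FTP_TLS and prot_p) is an assumption suggested by the host name and is not confirmed by this guide. The connect function and its ftp_factory parameter are illustrative names.

```python
from ftplib import FTP_TLS

HOST = "ftps.figshare.com"  # from the Site Manager settings above
PORT = 21                   # from the Site Manager settings above


def connect(username, password, ftp_factory=FTP_TLS):
    """Open a passive-mode session and return it.

    ftp_factory exists only so the flow can be exercised offline with a stand-in class.
    """
    ftp = ftp_factory()
    ftp.connect(HOST, PORT)
    # FTP_TLS.login secures the control connection before sending the credentials.
    ftp.login(username, password)
    # Assumption: the server also accepts an encrypted data connection.
    ftp.prot_p()
    # Matches "Passive" under Transfer Settings.
    ftp.set_pasv(True)
    return ftp
```

In practice you would call connect(username, getpass.getpass()) with the username and generated password from Step 1, so that the password is typed at a prompt rather than stored in the script.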

Step 3

Start uploading your files

Please read the reminder notes (under the section “Before you begin”) from Figshare Help.

**Create ONE folder under the ‘data’ folder only. Please do not create any sub-folders under this newly created folder.

**If you have a very large volume of data files, you are recommended to split your dataset into multiple zipped folders of smaller size.

**Only zipped folders (.zip) can be uploaded via the FTP Uploader. Unzipped folders will fail to transfer to HKU DataHub.

Tip: You are recommended to upload a test text file to check the uploading flow before you upload your actual data files, especially if they are very large. You can delete the test file from your data record afterwards.
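
The reminders above can be checked in code before anything is sent. The sketch below is an illustration only: remote_target and upload are hypothetical helper names, the ftp argument is assumed to be an already connected ftplib session, and the remote layout data/<folder>/<file> is an assumption based on the reminder about creating one folder under the ‘data’ folder. The check follows the .zip reminder, so it rejects any other file type.

```python
from ftplib import error_perm
from pathlib import Path, PurePosixPath


def remote_target(folder, local_file):
    """Return the remote path data/<folder>/<name>, or raise ValueError if a reminder is violated."""
    if not folder or "/" in folder or "\\" in folder:
        raise ValueError("use exactly one folder directly under 'data', with no sub-folders")
    name = Path(local_file).name
    if not name.lower().endswith(".zip"):
        raise ValueError(f"{name} is not a .zip archive")
    return str(PurePosixPath("data") / folder / name)


def upload(ftp, folder, local_file):
    """Upload one archive through an already connected ftplib session."""
    target = remote_target(folder, local_file)
    try:
        ftp.mkd(f"data/{folder}")  # create the single item folder
    except error_perm:
        pass  # assumption: the server refused because the folder already exists
    with open(local_file, "rb") as handle:
        ftp.storbinary(f"STOR {target}", handle)
    return target
```

Trying the flow first with a very small archive keeps a failed attempt cheap, in the same spirit as the tip above.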

Step 4

Double check your uploading status and files uploaded

Return to HKU DataHub. If you have successfully uploaded any files to the folder directory in Filezilla, an item record with the same name as the directory you created will appear automatically on your “My data” page.

Whenever a file or a folder has been successfully uploaded via the FTP uploader, it will appear under this item record.

**Note: The item record does not appear if you haven’t uploaded any files under the new folder directory on Filezilla.

Step 5

Amend the metadata of this item record

Once you have uploaded all data files, enter the metadata in the item record as required, including dataset title, author(s), dataset description, item type, category, etc. (Refer to Step 3 in the upload via DataHub Interface guideline.) Your item record will be ready for submission and publication once you have completed the metadata fields.

For Research Postgraduate students, please return to Step 4 in the Submission Guide to complete the metadata requirements and proceed to the remaining submission steps.

GitHub

To connect Figshare with your GitHub account, go to the Applications section, which is located in the dropdown menu at the top right corner next to your name. Next, select Connect as shown below:

After signing in to your GitHub account, you will be able to authorize Figshare for integration. To start uploading data that have already been deposited on GitHub, go to the "My Data" tab and click on the GitHub icon as shown in the figure below.

Then, you can start importing data files in your list of public repositories from GitHub:

If you switch the auto-sync setting to ON, Figshare will automatically import every new release of each of your imported repositories. This happens only if your data record on HKU DataHub is public. Each new release generates a new version of your dataset record.

For more details, please refer to the Figshare article on how to connect Figshare with your GitHub account.

Figshare API

The Figshare API allows you to push data to Figshare or pull data out. You can also use it to create collections from public content or to build applications on top of its functionality.

Documentation on how to use Figshare's API can be found at https://docs.figshare.com/.
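
As a rough illustration of what pushing and pulling through the API looks like, the sketch below builds (but does not send) two requests with Python's standard library. It assumes the public Figshare v2 endpoint at https://api.figshare.com/v2, a personal token passed in the Authorization header, and the /account/articles resource; please confirm these against the documentation linked above before relying on them. The helper names are illustrative.

```python
import json
from urllib.request import Request

API_BASE = "https://api.figshare.com/v2"  # assumption: public Figshare API endpoint


def api_request(token, method, path, payload=None):
    """Build an authenticated request; send it with urllib.request.urlopen."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = Request(f"{API_BASE}{path}", data=data, method=method)
    request.add_header("Authorization", f"token {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def list_my_items(token):
    """Pull: list the items in your own account."""
    return api_request(token, "GET", "/account/articles")


def create_item(token, title):
    """Push: create a new private item that has only a title."""
    return api_request(token, "POST", "/account/articles", {"title": title})
```

Passing one of these requests to urllib.request.urlopen would perform the call. Building the request separately from sending it makes it easy to inspect the URL, method and headers first.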

Uploading Data with Conditions

Restricted Access

There are occasions when you may wish to upload your data with access control conditions, especially for sensitive data that contain personal identifiers. The steps below will guide you through setting up access restrictions on your files.

If your data contain sensitive, confidential or restricted information per the HKU Policy on Research Ethics, you are required to handle them using one of the two methods below:

Upload the data under restricted access
Make and upload a version that anonymizes the data, for public access (with the approval of relevant IRBs or ethics committees)

Please refer to the "Restricted Access" section for a detailed step-by-step guide.

Linked Files

If your data is already retained in an external repository, you may wish to create an item record with the link directing others to where your data is stored.

Instead of browsing or dragging any files, click the “Link to external files” button in the data uploading box. Copy and paste the link into the text box.
Note: This option is available only if you have not uploaded any files to the item.

If you would like to edit the link, you need to remove it completely by clicking the cross symbol (Remove external link) and then add a new one.

The link will be shown on top of the published metadata page as shown below. You can refer to an example of a linked file item.

Metadata Record Only

If your data are forbidden to be uploaded onto another repository owing to copyright issues, or the data are considered to be too sensitive to be uploaded (with valid justification), you may wish to create a metadata record only.

Without any file(s) being uploaded, select the “Set as metadata record” option in the data uploading box. Fill in the reason(s) for creating a metadata record only. Provide the access information if data are preserved in an external repository.

Once the item is published, the metadata record page will be publicly available with no file to be shown. You can refer to an example of a metadata record.

Reserving a DOI

When you publish your research on DataHub, a DataCite DOI is automatically allocated, which enables your data to be cited using different citation methods.

However, if your data are not ready to be published yet, you may also reserve a DOI during the uploading process. It will only be active and citable when the item is published.

Click on “Reserve DOI” at the right side of the item details page.

Click on "Reserve". A DOI will be generated immediately and shown in a pop-up box.

Press on the "Copy" button to copy the reserved DOI for future use, then close it.

The DOI information will be available on the right side of your screen. You may "disable DOI" only if the item is private and not yet published. Please note that the activated DOI cannot be disabled once the item has been published.
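
For those working through the Figshare API rather than the interface, reserving a DOI can be sketched as below. This is an assumption-laden illustration: it assumes the v2 endpoint https://api.figshare.com/v2, a personal token, and a reserve_doi resource under /account/articles/<item id>, none of which are described in this guide. The helper names and the DOI in the usage note are hypothetical. The second helper only formats a DOI as a doi.org link; as noted above, a reserved DOI becomes active only when the item is published.

```python
from urllib.request import Request

API_BASE = "https://api.figshare.com/v2"  # assumption: public Figshare API endpoint


def reserve_doi_request(token, item_id):
    """Build (but do not send) a request that reserves a DOI for a private item."""
    url = f"{API_BASE}/account/articles/{int(item_id)}/reserve_doi"
    request = Request(url, method="POST")
    request.add_header("Authorization", f"token {token}")
    return request


def doi_link(doi):
    """Format a DOI as a resolver link after a basic sanity check of its shape."""
    doi = doi.strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"{doi!r} does not look like a DOI")
    return f"https://doi.org/{doi}"
```

For example, doi_link("10.1234/example.1") gives https://doi.org/10.1234/example.1, which is the form usually used in citations.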