DataHub: Dataset

Uploading Data

There are several ways to upload the data, depending on the size of the files:

  1. Through My Data, where you can drag and drop files of up to 5GB (the default limit). If you need to upload a single file larger than 5GB, please contact us at <researchdata@hku.hk>.
  2. Using the Desktop Uploader, the FTP Uploader, or the Figshare API, especially when working with large files or bulk uploads.
  3. Using Connect with GitHub, if you want to submit code from your GitHub account.
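
Before dragging files into My Data, it can help to check which of them exceed the default limit. The following is a minimal sketch rather than an official tool: the function names are hypothetical, and it assumes that "5GB" means 5 × 10^9 bytes, which may differ from how the platform counts.

```python
import os

# Assumption: "5GB" is read as 5 * 10**9 bytes; the platform may count differently.
DEFAULT_LIMIT_BYTES = 5 * 10**9


def exceeds_default_limit(size_bytes: int, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if a file of this size is larger than the drag-and-drop limit."""
    return size_bytes > limit_bytes


def files_over_limit(paths, limit_bytes: int = DEFAULT_LIMIT_BYTES):
    """Return the paths whose size on disk is larger than the limit."""
    return [p for p in paths if exceeds_default_limit(os.path.getsize(p), limit_bytes)]
```

Any file reported by `files_over_limit` is a candidate for one of the other upload methods or for an enquiry to the contact address above.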

My Data

1. After you have signed in, go to the “My data” page and click the “+ Create a new item” button at the top left of the page to start the upload process.

 

2. Click “browse” at the top to select the file(s), or simply drag and drop the file(s) onto the page.
Then fill in the necessary information about your uploaded data. Please note that fields marked with a green dot are mandatory.

 

3. More than one author can be added in the “Authors” field. Drag the green boxes to rearrange the order, or click the cross button next to an author's name to remove that author.
If an author cannot be found in the dropdown menu, click “Add author details” to add the author manually. An email address or ORCID will be required.

4. Select categories from the dropdown menu. You are required to choose a primary category and select one or more sub-categories from the list by ticking the box(es).

 

5. Select the appropriate item type for the uploaded file(s).

 

6. Enter keywords that specifically describe your data and write a detailed description.

 

7. Add funding information under “Funding”. Multiple organizations can be added by clicking “Add another grant”. Fill in the reference URL or DOI link if applicable, and then select the license you wish to apply. If no other license is chosen, the default license is CC BY-NC.

 

8. Click on “Save changes”.

 

9. At this stage, your data have been uploaded but are not yet publicly available. To make them permanently available, follow the steps in the section “Publishing an item” under Publishing Data.

Desktop Uploader

The Desktop Uploader is an app that sits on your desktop and lets you drag in your research outputs. You can track their progress as they are uploaded securely to the cloud.

Download the Uploader by clicking here. Once it is installed, log in with your Figshare credentials.

For more information on uploading large datasets and carrying out bulk uploads with the Desktop Uploader, please refer to: https://knowledge.figshare.com/articles/item/upload-large-datasets-and-bulk-upload-using-the-ftp-uploader-desktop-uploader-or-api

FTP Uploader

The FTP Uploader allows you to easily and securely upload files to your account directly from your computer over a secure FTP connection. To use this method, you need to install an FTP client such as FileZilla (any FTP client will work).

You will need the following details to connect to the Figshare FTP server and upload data to your account.

Host: ftps.figshare.com

Stage host: ftps.figsh.com

Transfer mode: This should be set to passive. This is the default mode for some FTP clients, but please check that your client is using it.

Username: You can find the username on the Applications page of your account, which is accessible via the user menu.

Password: Depending on the authentication method for your account, you will either use the same password that you use to log in to Figshare or, if your account uses SSO, generate a separate one.
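
For readers who prefer a script to a graphical FTP client, the connection details above could be used with Python's standard `ftplib` module roughly as follows. This is a sketch under assumptions, not the official procedure: the function names are hypothetical, it assumes the server accepts explicit FTP over TLS on the default port, and it uploads into whatever directory the server opens after login. Please check the linked Figshare guide for the exact encryption setting and folder layout.

```python
import os
from ftplib import FTP_TLS

PRODUCTION_HOST = "ftps.figshare.com"
STAGE_HOST = "ftps.figsh.com"


def connection_settings(username: str, password: str, stage: bool = False) -> dict:
    """Collect the connection details listed above into one dictionary."""
    return {
        "host": STAGE_HOST if stage else PRODUCTION_HOST,
        "user": username,
        "passwd": password,
        "passive": True,  # the transfer mode must be passive
    }


def upload_file(settings: dict, local_path: str) -> None:
    """Upload one file. Assumption: explicit FTP over TLS on the default port."""
    with FTP_TLS(settings["host"]) as ftps:
        ftps.login(settings["user"], settings["passwd"])
        ftps.prot_p()  # encrypt the data channel as well as the control channel
        ftps.set_pasv(settings["passive"])
        with open(local_path, "rb") as handle:
            ftps.storbinary("STOR " + os.path.basename(local_path), handle)
```

A call such as `upload_file(connection_settings("my-ftp-username", "my-password"), "results.csv")` would then attempt the transfer; the username, password and file name here are placeholders.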

To learn more about how to use FTP, please refer to: https://knowledge.figshare.com/articles/item/how-to-use-ftp

GitHub

To connect Figshare with your GitHub account, go to the Applications section, which is located in the dropdown menu at the top right corner next to your name, and select Connect.

 

After signing in to your GitHub account and authorizing Figshare, click away from the Configure GitHub Integration overlay to close it, go to My Data and click on the GitHub icon.

 

Then, you can start importing repositories from your list of public GitHub repositories.

 

If you set auto-sync to ON, Figshare will automatically update the item for every new release of each imported repository. This only occurs if your Figshare item is public. Each new release generates a new version of your Figshare item.

 

For details on how to connect Figshare with your GitHub account, please see:

https://knowledge.figshare.com/articles/item/how-to-connect-figshare-with-your-github-account

Figshare API

The Figshare API allows you to push data to Figshare or pull data out. It can also be used to create collections out of public content or to build applications on top of Figshare's functionality.

Documentation on how to use Figshare's API can be found at https://docs.figshare.com/.
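
As an illustration of pushing data through the API, the sketch below prepares (but does not send) a request that would create a new private item. It assumes the v2 base URL `https://api.figshare.com/v2`, the `POST /account/articles` endpoint, and token-based authorization as described in the Figshare API documentation. The function name and the token value are hypothetical, so please verify the details against the documentation before relying on them.

```python
import json
import urllib.request

# Assumption: base URL of the Figshare API v2, per docs.figshare.com.
API_BASE = "https://api.figshare.com/v2"


def build_create_item_request(token: str, title: str) -> urllib.request.Request:
    """Prepare a request that would create a new private item with the given title."""
    body = json.dumps({"title": title}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/account/articles",
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",  # personal token from your account
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Uploading the files themselves takes further API calls that are not shown here.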

Reserving a DOI

When you publish your research on DataHub, a DataCite DOI is automatically allocated, which enables your data to be cited using different citation methods.
However, if your data are not yet ready to be published, you may reserve a DOI during the upload process. The DOI will only become active and citable when the item is published.

 

Select “Reserve Digital Object Identifier” at the bottom of the item details page.

 

A DOI will then be generated immediately.

Uploading Data with Conditions

There are occasions when you may wish to upload your data with conditions, especially for sensitive data containing personal identifiers. The steps below will guide you through uploading confidential files, embargoed files and linked files, and creating metadata-record-only items.

Confidential Files

If your data contain sensitive, confidential or restricted data per the HKU Policy on Research Ethics, you are required to handle those data by one of the following two methods:

  1. Upload the data as “confidential”
  2. Create and upload an anonymized version of the data for public access (with the approval of the relevant IRBs or ethics committees)

If you choose to upload the data as “confidential” on DataHub, simply follow the two steps below:

1. During the upload process, click “Make file(s) confidential” at the bottom of the item details page.

 

2. Enter the reason(s) for confidentiality or contact information for access requests. This information will be publicly visible once the item is published.

 

Once the item has been published as “confidential”, the files cannot be previewed or downloaded. You can also refer to an example of a confidential item.

For the RPg student RDM workflow, only you and your supervisor are able to view the files.

Embargoed Files

If your data are temporarily not permitted to be publicly available, you may apply an embargo period to your files or to the whole item. Applying an embargo to the whole item means the metadata record will neither be submitted for review nor be publicly available until the embargo period is over.
Click the “Apply embargo” button.


Choose whether the embargo applies to the file(s) only or to the whole item. Set the embargo period either by selecting its length or by entering an exact expiry date. You may also include the reason(s), which will be publicly visible if the embargo is at the file level.

 

When the item is published with a file-level embargo, a countdown is shown at the top of the item's metadata page, and the files are temporarily restricted from previewing and downloading. The files will become openly accessible once the embargo period has ended. You can refer to an example of an embargoed item.

Linked Files

If your data are already held in an external repository, you may wish to create an item record with a link directing others to where your data are stored.

Instead of browsing for or dragging in any files, click the “Link file” button at the top right of the page.
Note: This option is available only if you have not uploaded any files to the item.

 

Copy and paste the link into the box.

 

If you would like to edit the link, you must remove it completely by clicking the cross symbol (remove link) and then add a new one.

 

The link will be shown at the top of the published metadata page. You can refer to an example of a linked file item.

Metadata Record Only

If your data cannot be uploaded to another repository owing to copyright issues, or the data are considered too sensitive to be uploaded (with valid justification), you may wish to create a metadata record only.
Without uploading any file(s), tick the “Metadata record only” box at the top left of the page.

 

Fill in the reason(s) for creating a metadata record only. Provide access information if the data are preserved in an external repository.

 

Once the item is published, the metadata record page will be publicly available with no files shown. You can refer to an example of a metadata record.

Managing Dataset

Edit in Batch

You can always manage your uploaded files on the “My data” page. All of your uploaded files are shown there, and you can edit in batch, delete or restore individual files, or organize files under the same category.

1. Tick the boxes of the files that you would like to batch edit. Click “Actions” and select “Edit in batch”.

 

2. Choose the field(s) that you would like to edit. Three options are available: prepend, append, and replace. Changes will apply to all the items you have selected.

 

3. Click on “Save all changes” at the bottom right corner.

 

4. Press the “Confirm & save all” button. (Click “Back to editing” at the top left corner if you would like to continue editing.)

 

5. Click on “Done”.

Delete & Restore

To delete individual file(s) within an item, or to restore them:
1. Select the item and click “Manage” at the top right corner.

 

2. Now you can manage the individual files uploaded under the same item.
            (i) Drag and drop to rearrange the order of the file(s)
            (ii) Click on the arrow to download the file(s)
            (iii) Click on the cross to delete the file(s)

 

3. If you would like to restore files that were previously deleted, return to the item details page and click “Deleted files”.

 

4. You will find a list of deleted files. Simply click the loop arrow to restore the file(s). Please note that a deleted file can only be restored within 30 days after it is deleted. The number of days remaining for restoration is shown next to the file’s name.

 

5. Successfully restored file(s) will be highlighted in yellow.
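
The 30-day restoration window mentioned in step 4 can be illustrated with a short date calculation. This is only a sketch: the function name is hypothetical, and exactly how DataHub counts the first and last day is an assumption, so the figure shown next to the file name on the site remains the authoritative one.

```python
from datetime import date, timedelta

RESTORE_WINDOW_DAYS = 30  # from step 4: files can be restored within 30 days


def days_left_to_restore(deleted_on: date, today: date) -> int:
    """Return how many days remain to restore a file, never less than zero."""
    deadline = deleted_on + timedelta(days=RESTORE_WINDOW_DAYS)
    return max((deadline - today).days, 0)
```

For example, under these assumptions a file deleted on 1 January would have 20 days left on 11 January.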

Organize with Keywords

While creating folders is not currently available on DataHub, you may use the keyword feature to group file(s) under the same category and organize your files.
1. Go to the “My data” page.

2. Select the files that you would like to put under the same category (or same folder) by ticking the boxes, click on “Actions” and “Edit in batch”.
Note: Please refer to the Edit in Batch section for a step-by-step guide on batch editing.

3. Add the same keyword to those files you have selected.

4. Search for the keyword to filter the items tagged with it.

You can always edit or replace the tags in batch if you do not want these internal file management tags to be published when the items are made publicly available.
 
5. You may also add those items into a Project and do the sorting within the Project. Please refer to the Projects & Collaboration page for guidelines on starting a Project.

Cite Dataset

Citing a dataset in your published research serves the same purpose as citing journal articles or other types of publications: it gives credit to the producers or providers of the dataset and allows other researchers to track the sources and reuse the data, which supports the reproducibility of your findings.

Include the following when citing a dataset:

  • Author(s)
  • Title
  • Year of publication
  • Publisher and/or distributor
  • Access location information (e.g. URL, DOI, or other persistent identifier)
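
To show how these elements fit together, the sketch below assembles them into a single citation string. The function name and the layout are hypothetical and do not follow any particular citation style, so the “Cite” button on the published item remains the better source for a properly styled citation.

```python
def format_dataset_citation(authors, title, year, publisher, identifier):
    """Join the listed citation elements into one string (illustrative layout only)."""
    author_text = "; ".join(authors)
    return f"{author_text} ({year}). {title}. {publisher}. {identifier}"


# Placeholder values for illustration only; none of these refer to a real dataset.
example = format_dataset_citation(
    authors=["Chan, T. M.", "Lee, K."],
    title="Example survey data",
    year=2024,
    publisher="Example Publisher",
    identifier="https://doi.org/10.1234/example",
)
```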

Unfortunately, there is no agreed unified format for data citation, and standards may vary across disciplines. On DataHub, however, you can choose the citation style you would like to use from a dropdown menu. The steps below show you how to get the citation:


1. Go to the published item page and click on the “Cite” button.

 

2. Select the citation style you wish to use. DataCite has been set as default.

 

You may also export the citation to RefWorks, BibTeX, Endnote, DataCite, NLM, DC and RefMan. These options can be found at the bottom right of the page.