RDM Data Documentation & Storage

It is important to document and describe your datasets thoroughly for any future users. When collecting data, also think about which file formats you will store it in and what security measures you will use.

Which file formats are preferred?

File formats can go out of style, making it difficult for anyone to retrieve the data in the future (think of laserdiscs!). Plan for future readability as much as possible. In most cases, this means using the simplest file format that is not proprietary, i.e., not tied to a particular piece of software. Open file formats usually fall into this category. In general, the simpler the file format, the more likely it is to remain readable in the future.
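As a rough illustration of choosing a simple, open format, the sketch below writes tabular data as CSV using only Python's standard library. The function name, column names, and sample values are hypothetical and are not drawn from any real dataset.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text, an open plain-text format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample records, for illustration only.
sample_rows = [
    {"site": "A", "temperature_c": 21.5},
    {"site": "B", "temperature_c": 19.0},
]
csv_text = rows_to_csv(sample_rows, ["site", "temperature_c"])

# Reading the text back needs nothing beyond the standard library.
restored = list(csv.DictReader(io.StringIO(csv_text)))
```

Because the result is plain text, it can be opened in any text editor or spreadsheet program, which is the property that matters for long-term readability.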

What kind of description is useful?

Consider adding a plain-text README file to your dataset that describes the dataset itself. You can include definitions of special terms, descriptions of folders and file formats, citations, the steps taken to clean the data, and more.
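One possible way to keep README files consistent across datasets is to generate them from a small template. The sketch below is only an illustration: the section names follow the items listed above, while the function name and all sample values are hypothetical.

```python
def build_readme(title, description, folders, file_formats, cleaning_steps):
    """Assemble a plain-text README from the pieces described above."""
    lines = [title, "=" * len(title), "", "Description", description, ""]

    lines.append("Folders")
    lines.extend(f"- {name}: {purpose}" for name, purpose in folders.items())
    lines.append("")

    lines.append("File formats")
    lines.extend(f"- {fmt}" for fmt in file_formats)
    lines.append("")

    lines.append("Data cleaning")
    lines.extend(f"{i}. {step}" for i, step in enumerate(cleaning_steps, start=1))

    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
readme_text = build_readme(
    title="Example Survey Dataset",
    description="Responses collected for a hypothetical survey.",
    folders={"raw": "unmodified exports", "clean": "data after cleaning"},
    file_formats=["CSV (UTF-8)", "plain text"],
    cleaning_steps=["Removed duplicate rows", "Standardized date formats"],
)
```

Saving `readme_text` as `README.txt` alongside the data keeps the description in the same simple, open format recommended for the data itself.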

In addition to a README file, you will need to add metadata.

How do I add metadata?

Metadata is the information that describes your dataset. Properly describing your data helps keep it accessible and usable over time.

Common metadata elements include:

- Title - title of the dataset

- Creator - author/researcher

- Subject - using standard language from your discipline

- Description - including the how and why

- Abstract, Type, Date, License, etc.

Be as descriptive as possible in your metadata, and provide keywords that give context for how your data may be useful. Even if your data will not be publicly available, good metadata is important for your own use during the research process and for any collaborators you may have.
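The elements above can be recorded in a small machine-readable file kept next to the data. The following is a minimal sketch, assuming a JSON file is acceptable for your project. The field names mirror the list above, the choice of which elements count as required is an assumption, and every value is a hypothetical placeholder. A repository will have its own fields, so check those before settling on a structure.

```python
import json

# Assumption: these are treated as the minimum elements for this sketch.
REQUIRED_ELEMENTS = ["title", "creator", "subject", "description", "date", "license"]


def missing_elements(record):
    """Return the required metadata elements that are absent or empty."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


# Hypothetical metadata record, for illustration only.
record = {
    "title": "Example Survey Dataset",
    "creator": "A. Researcher",
    "subject": ["survey methodology"],
    "description": "How and why the data were collected.",
    "date": "2024-01-15",
    "license": "CC BY 4.0",
    "keywords": ["example", "survey"],
}

gaps = missing_elements(record)
metadata_json = json.dumps(record, indent=2, ensure_ascii=False)
```

An empty `gaps` list means every element treated as required here has a value. `metadata_json` can then be saved as a plain-text file with the dataset.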

Consider how you will organize and label your data with metadata as part of your Data Management Plan. If you plan on depositing your data in a repository like WinnSpace, look at the metadata fields the repository uses. Doing this early in the research process will make it easier to deposit your data.

Data Storage and Security

When storing data, always consider how losing it would affect the study and anyone involved in it. Plan ways to minimize the effects of data loss or destruction.

To prevent the accidental destruction of data, we recommend the 3-2-1 backup strategy:

  • 3 total copies of your data on
  • 2 different devices
  • with at least 1 copy offsite
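The 3-2-1 rule can be expressed as a simple check over an inventory of your copies. The sketch below is illustrative only: the `BackupCopy` structure, the location names, and the sample data are hypothetical. The additional requirement that all copies share the same SHA-256 checksum is an assumption that goes beyond the rule itself. It is included to show one way of confirming that the copies really are identical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str   # human-readable label for where the copy lives
    device: str     # the device or service holding the copy
    offsite: bool   # True if the copy is stored away from the primary site
    sha256: str     # checksum of the stored data


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def satisfies_3_2_1(copies):
    """Check for 3 copies, 2 devices, 1 offsite, and matching checksums."""
    devices = {copy.device for copy in copies}
    identical = len({copy.sha256 for copy in copies}) == 1
    return (
        len(copies) >= 3
        and len(devices) >= 2
        and any(copy.offsite for copy in copies)
        and identical
    )


# Hypothetical inventory, for illustration only.
digest = sha256_of(b"example dataset contents")
copies = [
    BackupCopy("office laptop", "laptop", False, digest),
    BackupCopy("external drive", "usb-drive", False, digest),
    BackupCopy("institutional storage", "network-share", True, digest),
]
rule_met = satisfies_3_2_1(copies)
```

In practice the checksums would be computed from the files in each location rather than from an in-memory value. A mismatch would indicate that one of the copies has changed or become corrupted.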

The University of Winnipeg has Data Protection Classifications and Requirements that you should follow in order to safeguard your data.

Your network space is available from anywhere with WebFiles. Please contact TSC to discuss additional storage needs and costs.

Safe and Secure Collaboration

Sharing sensitive data on commercial platforms (including free solutions such as Dropbox) can raise security concerns.

Institutional/Internal Options: 

  • NextCloud: UWinnipeg offers 100GB of collaborative storage space through NextCloud, stored locally (not backed up) and available to all UWinnipeg faculty and staff. Access NextCloud here using your UWinnipeg credentials.
  • Microsoft 365 (OneDrive/Teams/SharePoint): Microsoft 365 is available to all UWinnipeg faculty and staff. Each faculty and staff member has been allocated 1TB of storage on Microsoft OneDrive. Data can be shared internally and externally via SharePoint. More information is available here.

External Options: 

For larger storage needs and computational cloud resources, Compute Canada has two preferred solutions. To access these services and NextCloud, you will first need to sign up for a Compute Canada account (approval may take up to 2 business days, and the Principal Investigator should submit the application), and then sign up for the solution better suited to your needs:

  1. Rapid Access Service for modest amounts of storage and cloud resources. Many research groups can meet their needs by using the Rapid Access Service only. This option allows up to 10TB of storage.
  2. Resource Allocation Competitions for those who require CPU or GPU allocations or larger amounts of storage. This option allows up to 1TB of storage.

