Data Repository

Should I upload identifiable data into data repositories?


Individually identifiable data should not be stored in online data repositories, to prevent leakage of identifiable personal data. This applies regardless of whether the deposit is made in a "locked" or open-access format, and is in line with the policies of most data repositories, including DR-NTU(Data).

The keys (or linkages) that identify human subjects should not be uploaded; they should be kept encrypted and stored separately from the rest of the primary raw data.
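As an illustration of keeping the linkage key separate from the data, the following is a minimal Python sketch, not a prescribed procedure. The function name split_identifiers, the field names (name, national_id, age_group, score) and the sample records are all hypothetical. The sketch only separates the two parts; encrypting the linkage key and storing it away from the dataset must still be done with a suitable tool, and removing direct identifiers does not by itself guarantee that the remaining fields cannot be combined to re-identify someone.

```python
import secrets


def split_identifiers(records, id_fields):
    """Split records into de-identified rows and a separate linkage key.

    records   : list of dicts, one per participant (hypothetical layout)
    id_fields : set of field names treated as direct identifiers
    """
    deidentified = []
    linkage_key = {}
    for record in records:
        # Use a random study ID so it cannot be derived from personal details.
        study_id = secrets.token_hex(8)
        while study_id in linkage_key:
            study_id = secrets.token_hex(8)

        # The linkage key holds only the identifying fields.
        linkage_key[study_id] = {
            field: record[field] for field in id_fields if field in record
        }

        # The shareable row holds everything else, plus the study ID.
        row = {k: v for k, v in record.items() if k not in id_fields}
        row["study_id"] = study_id
        deidentified.append(row)
    return deidentified, linkage_key


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Participant A", "national_id": "EXAMPLE-0001",
     "age_group": "30-39", "score": 12},
    {"name": "Participant B", "national_id": "EXAMPLE-0002",
     "age_group": "40-49", "score": 17},
]

rows, key = split_identifiers(records, id_fields={"name", "national_id"})
# rows: candidate for deposit, after checking for indirect identifiers.
# key : never uploaded; encrypt it and store it apart from the raw data.
```

In this sketch only rows would ever be considered for deposit. The key dictionary plays the role of the linkage file described above and stays offline, encrypted, and apart from the primary raw data.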

Please refer to Section 6.6 of DR-NTU(Data)'s Terms of Use:
User Uploads must be void of all identifiable information, such that re-identification of any subjects from the amalgamation of the information available from all of the materials (across datasets and dataverses) uploaded under any one author and/or User should not be possible. Specifically, User Uploads cannot contain social security numbers; credit card numbers; medical record numbers; health plan numbers; other account numbers of individuals; or biometric identifiers (fingerprints, retina, voice, print, DNA, etc.). 

The only exceptions for when identifiable information is allowed are when:
1. the information has been previously released to the public;
2. the information describes public figures, where the data relates to their public roles or other non-sensitive subjects; or
3. all identified subjects have given explicit informed consent allowing the public release of the information in the dataset.