Repository features

Data Curation

Princeton Data Commons is a fully mediated data repository. Depositors work with curators (data management experts) at the Princeton Research Data Service to ensure that all data submissions adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) Principles, in a process called ‘data curation.’ Data curation is a free, required service for anyone wishing to publish data or code in Princeton Data Commons.

For more on data curation, please see Data Curation.

Access Options

Depositors can make their data immediately available to the public or restrict access through an embargo for up to one year.

Persistent Identifiers

All datasets in Princeton Data Commons receive a unique digital object identifier (DOI) in the form of a persistent link. This persistent identifier ensures that links to the data record do not break, and it can be used in a dataset citation that covers all objects in the data submission.
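As a rough illustration of how a DOI becomes a persistent link, the sketch below normalizes a DOI string into its `https://doi.org/` resolver form. The function name `doi_to_url` is hypothetical and is not part of any Princeton Data Commons tooling; the example DOI is the one cited in the footnote on this page.

```python
def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI string.

    Hypothetical helper for illustration only; accepts a bare DOI,
    a "doi:" prefixed DOI, or an existing doi.org link.
    """
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI name begins with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


print(doi_to_url("doi:10.5479/10088/113528"))
# prints https://doi.org/10.5479/10088/113528
```

Citing the resolver link rather than a repository page address is what keeps the citation stable if the record's landing page moves.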

Discovery and Access

Princeton Data Commons is Princeton University’s data repository for long-term archiving and open, citable dissemination of digital research data and code produced by the Princeton Research Community. Anyone can search for and download data housed in this repository, either immediately or by request, unless the data are under embargo, withdrawn, or migrated. For more information, please see the Acceptance and Retention Policy.

Metadata Schema

Our metadata schema provides the essential information needed to make assets discoverable and the context needed for reuse. We use the required properties and selected optional properties from DataCite 4.4 for essential description, and have added other schema properties to better describe research data. Depositors can contact us to suggest additional schema properties that would better describe their submission.

ORCID iD Support

We are beginning to offer support for ORCID iDs. You may add an ORCID iD to your profile or apply one to a creator or contributor in the submission form. We are developing additional features supporting ORCID iDs – stay tuned!
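Depositors who want to catch typos before entering an ORCID iD can check its final character, which is a checksum computed with the ISO 7064 MOD 11-2 algorithm that ORCID documents publicly. The sketch below is a minimal version of that check; the function names are hypothetical, and it makes no claim about how the submission form itself validates iDs.

```python
import re


def orcid_check_digit(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a correct check character."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))  # prints True
```

A passing checksum only shows that the iD is well formed; it does not confirm that the iD belongs to the named person.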

Preservation and Long-Term Storage

Princeton Data Commons (PDC) provides long-term storage of files and associated metadata with preservation services such as data migration policies, secure backups, and file-level integrity checks. For more information on retention, migration, and digital storage for PDC, please see the Acceptance and Retention Policy.
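To give a sense of what a file-level integrity check involves, the sketch below computes a checksum for a file and compares it with a previously recorded value. It assumes SHA-256 purely for illustration; this page does not state which algorithm or tooling PDC uses, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_hex: str) -> bool:
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of_file(path) == expected_hex.lower()
```

A depositor could run the same kind of check locally, recording a checksum before upload and comparing it against the downloaded copy.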

Large Data

Submissions over 100 MB must use Globus during the curation process and will also be available for download from the repository through Globus. Globus is a free, not-for-profit service developed and operated by the University of Chicago. This research cyberinfrastructure allows users to download, share, and transfer files securely from anywhere via a web browser.

Meets NSTC Desirable Characteristics

Princeton Data Commons meets many of the Desirable Characteristics of Data Repositories for Federally Funded Research as outlined by the U.S. National Science and Technology Council.*

Organizational Infrastructure

Free and Easy Access: Princeton Data Commons provides broad, equitable, and maximally open access to datasets and their metadata free of charge in a timely manner after submission.
Clear Use Guidance: Princeton Data Commons requires documentation describing terms of dataset access and use to accompany all submitted datasets. For more information, please see Getting Started as a Princeton Data Commons Describe Contributor.
Risk Management: Princeton Data Commons does not accept sensitive data. For technical risk management, please see “Long-term Technical Sustainability” below.
Retention Policy: The Data Acceptance and Retention Policy and Guidance for the Princeton Data Commons Data Repository Service documents the university’s policies for data retention.
Long-term Organizational Sustainability: The Data Acceptance and Retention Policy and Guidance for the Princeton Data Commons Data Repository Service contains plans for long-term management of data, including maintaining integrity, authenticity, and availability of datasets.

Digital Object Management

Unique Persistent Identifiers: Princeton Data Commons assigns a digital object identifier (DOI) to all data submissions to support data discovery, reporting (e.g., of research progress), and research assessment (e.g., identifying the outputs of Federally funded research). The unique DOI points to a persistent location that remains accessible even if the dataset is de-accessioned or no longer available.
Metadata: Princeton Data Commons requires datasets to be accompanied by metadata to enable discovery, reuse, and citation of datasets, using schemas that are appropriate to, and ideally widely used across, the communities that the repository serves. For more information, please see Getting Started as a Princeton Data Commons Describe Contributor.
Curation and Quality Assurance: Princeton Research Data Service provides expert curation and quality assurance to improve the accuracy and integrity of datasets and metadata housed in Princeton Data Commons.
Broad and Measured Reuse: Princeton Data Commons requires datasets to be accompanied by metadata that describe terms of reuse and provides the ability to measure attribution, citation, and reuse of data through assignment of adequate and openly accessible metadata and unique digital object identifiers (DOIs).
Common Format: Princeton Data Commons allows datasets and metadata to be accessed, downloaded, or exported from the repository in widely used formats.
Provenance: Princeton Data Commons has mechanisms in place to record the origin, chain of custody, and any other modifications to submitted datasets and metadata.

Technology

Authentication: Princeton University Library Information Technology (IT) supports the authentication of Princeton University-affiliated data submitters. The repository has technical capabilities that facilitate associating submitter IDs with those assigned to their deposited digital objects, such as datasets.
Long-term Technical Sustainability: Princeton University Library Information Technology (IT) has a plan for long-term management of data, building on a stable technical infrastructure and funding plans.
Security and Integrity: Princeton University Library Information Technology (IT) has documented measures in place to meet well-established cybersecurity criteria for preventing unauthorized access to, modification of, or release of data, with levels of security that are appropriate to the sensitivity of data (e.g., the NIST Cybersecurity Framework: https://www.nist.gov/cyberframework).

*The National Science and Technology Council, Desirable Characteristics of Data Repositories for Federally Funded Research, 2022, DOI: https://doi.org/10.5479/10088/113528