Data Acceptance and Retention Policy and Guidance for the Princeton Data Commons Data Repository Service v6.

Summary

Princeton Data Commons (PDC) is Princeton University’s data repository for long-term archiving and open, citable dissemination of digital research data and code produced by the Princeton Research Community. The following policies are designed to ensure PDC’s long-term sustainability while meeting researcher needs, ensuring compliance with funder and journal requirements, and following recommended practices for FAIR data management and stewardship. Our policies fall into four broad categories:

  • Data Acceptance defines what digital research products may be added to the repository, how large they can be, who is eligible to submit, and what supporting documentation is required.
  • Data Embargo and Peer Review describes our process and expectations for limited access and preliminary submissions for data requiring an embargo or peer review.
  • Data Retention defines our standard costs and retention tiers as well as our three levels of access: open access, mediated access, and withdrawn/migrated.
  • Data Retraction describes the conditions under which research data may be removed from the repository.

Please read on for the full acceptance and retention policy. Questions regarding this policy can be directed to prds@princeton.edu.

Data Acceptance

  • Princeton Data Commons (PDC) accepts datasets composed of any type and combination of research data and code, and any additional content necessary for effective re-use
  • Datasets must have a readme that contains basic reuse information
  • Depositors must agree to a distribution agreement that gives Princeton University, through Princeton University Library (PUL), permission to make data openly available and make copies as necessary or appropriate
  • Depositors must affirm that they have the authority to submit the data for open access and, to the best of their knowledge, that the dataset contains no data that cannot or should not be made publicly available (e.g., because publication of the data may violate the law or infringe the rights of another individual or entity)
  • Depositors must provide a small set of required metadata, and are encouraged to provide extended metadata
  • Datasets of any size may be accepted, but datasets over 1 TB may be subject to additional conditions (see the Standard Cost and Retention Tiers table below)
  • Any member of the Princeton Research Community, including faculty, students, and staff, may submit a dataset
  • Datasets must have at least one (co-)creator who is a current Princeton faculty member, student, or staff member (exceptional cases may be made with a Princeton sponsor)
  • Datasets may undergo review by data curators before they are accepted and published; curators may request changes to datasets or metadata before acceptance
  • Once published, datasets are considered final and may only be changed via versioning; minor metadata updates may be made without versioning

Data Embargo & Peer Review

  • Upon request, a dataset may be accepted into PDC and given a DOI under an embargo that prevents access to the dataset for a limited period of time. Embargo requests must include a release date that is within one year of dataset acceptance.
  • Upon request, a dataset may be preliminarily accepted into PDC and issued a draft DOI for the purposes of peer review. In this case, changes to the data and metadata may be made in response to peer review before the dataset is fully accepted and a DOI is finalized.
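The one-year limit on embargo release dates can be illustrated with a short check. This is only a sketch and not part of any PDC system: the function names are hypothetical, and it assumes that the release date may fall on the one-year anniversary itself, that it may not precede acceptance, and that an acceptance date of 29 February maps to 28 February of the following year. The policy does not spell out these edge cases.

```python
from datetime import date


def latest_embargo_release(accepted: date) -> date:
    """Return the last release date allowed under a one-year embargo limit.

    Hypothetical helper: the anniversary is assumed to be inclusive.
    """
    try:
        return accepted.replace(year=accepted.year + 1)
    except ValueError:
        # Acceptance on 29 February: the next year has no such day, so
        # fall back to 28 February (an assumption, not stated in the policy).
        return accepted.replace(year=accepted.year + 1, day=28)


def embargo_release_is_valid(accepted: date, release: date) -> bool:
    """Check that the release date lies within one year of acceptance."""
    return accepted <= release <= latest_embargo_release(accepted)
```

For example, under these assumptions a dataset accepted on 1 February 2024 could request a release date up to and including 1 February 2025.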

Data Retention

Datasets in PDC have one of three statuses:

  • Open Access: Datasets are in a directly downloadable state
  • Mediated Access: Datasets are in a long-term, low-use storage state and are available upon request
  • Withdrawn/Migrated: Datasets are not stored or available

PDC endeavors to retain and make available datasets based on the table below. Datasets with Open Access or Mediated Access status will be equally discoverable in PDC, with the same metadata information visible regardless of status. In cases where datasets are Withdrawn/Migrated as a result of this data retention policy or otherwise, a reasonable effort will be made to contact the depositor of the dataset in advance of the change of status, and PDC will endeavor to retain the corresponding metadata (including the readme) and make it publicly available via a deletion marker (i.e., a Tombstone record).

Curatorial review for retention will be at the discretion of Princeton University and typically led by Princeton Research Data Service (PRDS) staff in consultation with appropriate external experts. Such review may evaluate factors such as funder requirements, sustainability of storage, cost and feasibility (e.g., of dataset re-creation), availability of other copies elsewhere, and assessment of current and/or active use by the research community.

Standard Cost and Retention Tiers

Size | < 1 TB | 1 - 10 TB | 10+ TB
Cost to Researcher* | None | $2000 per TB | Contact PRDS
Open Access Status Minimum | 10 years | 5 years | Contact PRDS
Status Review Schedule | Review at 10 years, followed by 5-year cycle until mediated access | Review at 5 years, followed by 5-year cycle until mediated access | Contact PRDS

*One-time charge at time of acceptance. Charges and review schedules are subject to change.
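As a rough illustration of how the tiers above might translate into an estimate, the sketch below computes a one-time charge from a dataset's size. It rests on several assumptions the policy does not state: that a dataset of exactly 10 TB falls in the middle tier, that the per-TB rate applies to the full dataset size, and that fractional terabytes are charged proportionally. The function and constant names are hypothetical, and actual charges should be confirmed with PRDS.

```python
from typing import Optional

# Rate taken from the tier table; charges are subject to change.
COST_PER_TB_USD = 2000


def estimated_cost_usd(size_tb: float) -> Optional[float]:
    """Estimate the one-time charge for a dataset of the given size in TB.

    Returns None for the largest tier, where the policy says to contact PRDS.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb < 1:
        return 0.0  # under 1 TB: no cost to the researcher
    if size_tb <= 10:
        # Assumption: the rate applies to the whole dataset, pro rata.
        return COST_PER_TB_USD * size_tb
    return None  # over 10 TB: contact PRDS


def first_review_years(size_tb: float) -> Optional[int]:
    """Return the years until the first status review, or None if unspecified."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb < 1:
        return 10
    if size_tb <= 10:
        return 5
    return None
```

Under these assumptions a 3 TB dataset would be estimated at $6000, with a first status review at 5 years.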

Data Retraction / Takedown

Princeton University, through PDC, reserves the right to remove data, including an entire dataset, from an openly accessible state and/or from PDC entirely, at any time as deemed necessary by the University, such as under the following conditions:

  • Legal: A valid take-down notice requests it, or content that is not legally sharable or that violates the rights of another individual or entity is discovered in the dataset
  • Security: Any part of the dataset is determined to be a security risk to the University or others, including a security risk to the University’s infrastructure
  • Research misconduct: The dataset and/or publications based on the dataset are found, upon review, to fall under “research misconduct” pursuant to the University’s policies
  • Ethical considerations: Any part of the dataset is determined to raise ethical concerns, including but not limited to content that is harmful to individuals or communities, or violations of privacy or data sovereignty

In the case where a dataset is removed from the repository as a result of this Data Retraction/Takedown policy, the corresponding metadata (excluding the readme) may be retained and remain publicly available in a Tombstone record unless otherwise disallowed by the University/PUL. The dataset itself may be retained privately or deleted, at the University’s discretion, depending on the reason for the takedown.

In the case where the depositor or another contributor to the dataset requests that their dataset be Withdrawn from the repository, PUL will undertake a review of the request and the dataset to determine the best course of action, working with the requestor and other campus offices as appropriate. Datasets will not be Withdrawn because the depositor or a creator has moved to another institution. Subject to applicable law and contractual requirements (e.g., with the funder of the research/dataset), the depositor or a creator may provide a copy of the dataset to another repository as allowed by the distribution agreement.

Data Migration

PDC reserves the right to migrate a dataset to another repository or platform as appropriate. In cases where datasets are migrated as a result of this data retention policy, reasonable efforts will be made to contact the depositor of the dataset in advance of the change of status, and corresponding metadata (including the readme) may be retained and remain publicly available via a Tombstone record.

Policy Review and Changes

This policy, as well as data storage options and prices, may be reviewed and revised as deemed necessary or appropriate by the University. Such reviews may include an annual assessment by PRDS with input from appropriate external experts. This policy, including retention policies and costs, may be updated at the University’s discretion, for example to ensure long-term sustainability for data retention and stewardship.

Effective 1 February 2024

Distribution Agreement

"In furtherance of its non-profit educational mission, Princeton University makes certain research data available to the public through its Princeton Data Commons (PDC). In order for PDC to reproduce, translate, and/or distribute your submission for public access, your agreement to the following terms is necessary. Please take a moment to read the terms of this agreement. By clicking through this agreement, you (the author(s) or creator(s) or copyright owner(s)) hereby grant to Princeton University the non-exclusive, worldwide, perpetual, irrevocable, royalty-free right to reproduce, translate (as defined below), and/or distribute your submission (including the abstract) worldwide in electronic format. This license is in addition to any rights in the submission that Princeton University may hold based on other contracts, licenses, University policies, or applicable law.

You represent that the submission is your original work, and/or that you have the right to grant the rights contained in this license. You represent that your submission does not, to the best of your knowledge, infringe upon anyone's copyright and does not violate any laws or breach any contracts (for example, research agreements that supported the project) or other licenses. If the submission contains material for which you do not hold copyright, you represent that you have obtained the unrestricted permission of the copyright owner to grant Princeton University the rights required by this license, and that such third-party owned material is clearly identified and acknowledged within the text or content of the submission.

Because Princeton University owns the copyright in any work that any member of the University staff creates in the course of his or her assigned duties at the University ("University Work Product"), University Work Products may not be deposited in Princeton Data Commons without first securing the written consent of the University supervisor who is responsible for the creation and dissemination of the University Work Product. Such consent must include acknowledgement by the supervisor that the University Work Product can, and will, be made available to the general public. Regarding copyright ownership of faculty work products, including software, please see the University’s Copyright Policy.

You also represent, to the best of your knowledge, that public distribution of your submission will not violate the personal privacy rights or other proprietary rights of any group or individual. You agree that Princeton University may translate the submission to any medium or format as deemed reasonably necessary by the University, e.g., for the purpose of preservation.

You also agree that Princeton University may keep more than one copy of this submission for purposes of security, back-up, and preservation.

If the submission is based upon work that has been sponsored or supported by an agency, company, or other organization other than Princeton University, you represent that you have fulfilled any right of review or other obligations required by such contract or agreement.

You acknowledge that you have read the “Data Acceptance and Retention Policy and Guidance for the Princeton Data Commons Data Repository Service,” which is incorporated by reference into this agreement, and that you agree to the policy’s terms and conditions. The policy addresses topics including data acceptance, embargo, retention, cost, retraction/takedown, and migration. In addition, Princeton University reserves the right to make unavailable and/or remove from PDC any submission or content which it deems to be in violation of this agreement."

Effective 1 February 2024