Managing Research Data: Plan Components
Data Management Plan Components
While exact content requirements will vary between funding agencies and specific solicitations, many of the core components of a data management and sharing plan remain the same. Those components (along with some recommendations) are outlined below, based on a template from NIH.
The Data
Data To Be Collected
In your data management and sharing plan, you will describe what kinds of data will be collected or generated in your proposed project. Consider the types of data you will produce as well as the sources of those data. It is helpful to mention file formats as well as estimated volumes of data produced, particularly if you anticipate elevated data storage needs.
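As a rough illustration of how an estimated data volume might be worked out, the sketch below multiplies expected file counts by average file sizes. The data types, file counts, sizes, project length, and number of copies are all hypothetical placeholders, not figures from NIH or any funder; substitute your own project's values.

```python
from dataclasses import dataclass


@dataclass
class DataType:
    """One kind of data the project expects to produce (illustrative values only)."""
    name: str
    file_format: str
    files_per_year: int
    avg_file_size_mb: float


def estimate_total_gb(data_types, years, copies=1):
    """Return estimated storage in decimal gigabytes (1 GB = 1000 MB).

    `copies` counts how many full copies are kept (e.g., working copy plus backup).
    """
    mb_per_year = sum(d.files_per_year * d.avg_file_size_mb for d in data_types)
    return mb_per_year * years * copies / 1000


# Hypothetical project: every number below is an assumption for illustration.
plan = [
    DataType("microscopy images", ".tiff", files_per_year=2000, avg_file_size_mb=50.0),
    DataType("survey responses", ".csv", files_per_year=12, avg_file_size_mb=5.0),
]

total = estimate_total_gb(plan, years=3, copies=2)
print(f"Estimated storage need: {total:.1f} GB")
```

An estimate like this does not need to be precise; its purpose is to show reviewers whether the project will fit within ordinary storage or will need additional resources.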
Metadata and Other Data Documentation
In addition to the data from your project, you should describe what other information is necessary to fully understand and interpret those data. Will you be using tools or applications that automatically produce metadata (such as EXIF metadata recorded with many digital photographs)? How will you document the protocols and other methodologies used in your project? When you conduct data cleaning and analysis, will you maintain records of the steps you took and why?
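One possible way to keep a record of data cleaning steps and the reasons behind them is a simple, machine-readable log stored alongside the data. The sketch below is a minimal example using only the Python standard library; the field names, the example entries, and the script name `clean_survey.py` are hypothetical and not part of any NIH requirement.

```python
import json
from datetime import datetime, timezone


def log_step(log, action, reason, script=None):
    """Append one cleaning or analysis step to the log and return the new entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,    # what was done to the data
        "reason": reason,    # why it was done
        "script": script,    # optional: the code file that performed the step
    }
    log.append(entry)
    return entry


def to_json_lines(log):
    """Serialize the log as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(entry) for entry in log)


# Hypothetical entries for illustration only.
log = []
log_step(
    log,
    action="Dropped rows with missing participant ID",
    reason="Rows cannot be linked to a consent record",
    script="clean_survey.py",
)
log_step(
    log,
    action="Converted temperatures from Fahrenheit to Celsius",
    reason="Match the units used in the codebook",
)

text = to_json_lines(log)
print(text)
```

A plain-text README or lab notebook can serve the same purpose; the point is that each step and its rationale are written down at the time they happen.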
Related Tools, Software, and Code
This component is where you specify the kinds of tools, software, and/or code that will be used during your proposed project. This can include specific programs needed to visualize and work with particular types of data as well as more general statements about programs that can be used to access the data (e.g., common image viewing software, common spreadsheet-based software such as Microsoft Excel).
Standards
Some disciplines or data types may follow particular standards for metadata, file formats, and even the contents of the data themselves. Researchers may also mention any standard file formats they will use (e.g., .TIFF files for microscopy images, .PDF for page layouts of digital musical scores).
Some examples of standards include:
- NIH Common Data Elements: Common Data Elements are often used for recording data in clinical trials and other medical research.
- Cataloging Cultural Objects: Used to create metadata for cultural artifacts like art and architecture, as well as their visual representations.
- FGDC's Geospatial Metadata Guidelines: The Federal Geographic Data Committee (FGDC) provides these guidelines for geospatial metadata.
Data Preservation, Access, and Associated Timelines
Preservation
For all research data, metadata, and other files (such as coding scripts) that will be shared as part of the proposed project, mention which data repositories have been selected for archiving those files. It is perfectly acceptable to list multiple repositories, as your research files may have more than one appropriate destination. For example, Zenodo accepts a wide range of file formats and is a popular destination for software and code because of its partnership with GitHub, while repositories such as ICPSR and Databrary focus on sharing data from disciplines like the behavioral, educational, and social sciences.
Access
Researchers should briefly describe ways in which the chosen repositories promote access to the shared data. Repositories can enhance access by assigning persistent, unique identifiers (e.g., DOIs, accession numbers) to data deposits and by making data open access; researchers can enhance access by using those identifiers whenever they reference the data.
Timelines
The plan should mention not only when data will be made available, but also for how long. In general, U.S. federal funding agencies expect research data to be shared "as soon as possible" or "within a reasonable time" and they may set limits such as "no later than the time of an associated publication, or the end of performance period, whichever comes first" (NIH Policy for Data Management and Sharing). Reputable data repositories should guarantee a minimum duration of data access and preservation. For example, Zenodo promises long-term preservation for the lifetime of the repository, which it defines as "the next 20 years at least."
Access, Distribution, or Reuse Considerations
There may be certain factors (e.g., legal, ethical, or technical) that limit researchers' ability to share all data openly and publicly. Some examples include embargoes related to patents coming from the research, national security concerns, or sensitive data that risk re-identification of human research participants. Federal funding agencies are typically willing to accommodate these limitations as long as they are disclosed in the data management and sharing plan. Some data repositories also provide restricted access to data and authenticate prospective users.
Oversight of Data Management and Sharing
This final section clarifies which individuals are explicitly responsible for oversight of the data management and sharing plan. The Principal Investigator(s) and Co-Principal Investigator(s), as well as other key project members (e.g., data managers), are commonly mentioned as responsible parties. It is important to note that the data management and sharing plan is functionally a contract between the principal investigator, the institution, and the funding agency, so you should expect to be held to the standards set by your plan. Compliance with this plan may be evaluated annually, and you may be expected to report on your data management and sharing activities as part of your annual progress report.
The major components above are based on the contents of a sample NIH data management and sharing plan (PDF) for non-human basic research.
If you have specific questions about your data management and sharing plan or would like feedback on the plan's contents, contact Dr. Dani Kirsch, Research Data Services Librarian.