Managing Research Data: Metadata

Good Metadata Practices

Research data cannot be shared effectively without proper documentation. Metadata is data about data - structured information that describes the content of a dataset and makes it easier to find and use. Metadata can be embedded within the data file itself or stored separately, and it can accompany data in any file format.

While discipline-specific metadata formats are often more structured and controlled, when creating more generic metadata - such as README files - consider the following practices to maximize its usefulness:

  • Provide sufficient detail so that context and content of data are clear to future users
  • Clearly state licenses/restrictions placed on the data (such as Creative Commons licenses)
  • Report bibliographic information about the dataset, including citations to relevant publications
  • Summarize key methodological information
    • Sampling methods (e.g., geography, dates, protocols)
    • Software (including versions, where applicable)
    • Processing or transformation of files and/or data
  • Describe file formats (e.g., CSV, TXT, TIFF), contents, and hierarchies
  • Follow FAIR principles when creating metadata (more detail below)
  • Ask a colleague to review your metadata and data files and suggest improvements or point out concerns
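
As a rough illustration of the checklist above, the following Python sketch prints a plain-text README skeleton. The section headings, the prompts, and the `build_readme` function name are assumptions made for illustration, not an official template; adapt them to your discipline and repository requirements.

```python
# Hypothetical section headings and prompts, loosely following the checklist above.
README_SECTIONS = [
    ("General information", [
        "Dataset title",
        "Author and contact information",
        "Related publications (citations)",
    ]),
    ("Licenses and restrictions", [
        "License (e.g., a Creative Commons license)",
        "Restrictions on access or use",
    ]),
    ("Methodological information", [
        "Sampling methods (geography, dates, protocols)",
        "Software and versions",
        "Processing or transformation steps",
    ]),
    ("File overview", [
        "File formats (e.g., CSV, TXT, TIFF)",
        "File contents",
        "Folder hierarchy",
    ]),
]


def build_readme(title, sections=README_SECTIONS):
    """Return a plain-text README skeleton with one prompt per line."""
    lines = [title, "=" * len(title), ""]
    for heading, prompts in sections:
        lines.append(heading)
        lines.append("-" * len(heading))
        lines.extend(f"{prompt}: <fill in>" for prompt in prompts)
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_readme("README for example_dataset"))
```

Saving the output as a plain-text file alongside the data keeps the documentation readable without special software.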

This README file template may be a useful starting point:

FAIR Data Principles

The FAIR principles assert that published research data should meet the following four criteria:

Findable

  • Metadata are descriptive
  • (Meta)data have unique, persistent identifiers (e.g., DOIs)
  • Metadata exist in a searchable resource (e.g., data repository)

Accessible

  • Data can be accessed, which may require authentication and authorization
    • Protected and restricted data can be made accessible in this manner
  • Metadata remain accessible even if data are no longer available

Note: this is a narrow use of the term "accessible"; it does not address efforts to make data equally and equivalently accessible to people with disabilities.

Interoperable

  • (Meta)data are machine-readable, meaning they can be read and processed by computers without human intervention
  • Standard vocabulary and formatting are used, where possible
  • Open file formats are used
  • Data are properly cited and linked to related (meta)data
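
To make "machine-readable" more concrete, the sketch below stores metadata as JSON (an open, text-based format) in a sidecar file next to the data file, then reads it back without human intervention. The field names, the `.metadata.json` suffix, and the placeholder values are assumptions for illustration and do not follow any particular community standard.

```python
import json
import tempfile
from pathlib import Path


def write_sidecar(data_path, metadata):
    """Write metadata as JSON next to the data file and return the sidecar path."""
    data_path = Path(data_path)
    sidecar = data_path.with_name(data_path.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8"
    )
    return sidecar


def read_sidecar(sidecar_path):
    """Load a metadata sidecar back into a dictionary."""
    return json.loads(Path(sidecar_path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Hypothetical record; every value here is a placeholder.
    record = {
        "title": "Example survey counts",
        "identifier": "<persistent identifier, e.g., a DOI>",
        "license": "<license name>",
        "file_format": "CSV",
        "related_publications": [],
    }
    with tempfile.TemporaryDirectory() as tmp:
        data_file = Path(tmp) / "survey.csv"
        data_file.write_text("site,count\nA,3\n", encoding="utf-8")
        sidecar = write_sidecar(data_file, record)
        print(sidecar.name)
        print(read_sidecar(sidecar)["file_format"])
```

Where a discipline-specific standard exists, its element names and controlled vocabulary should replace the ad hoc keys used here.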

Reusable

  • Metadata are sufficiently detailed to provide necessary context
  • (Meta)data meet relevant community standards
  • Provenance and citation/acknowledgment of data authors are clear
  • Clear data usage terms and appropriate licensing are provided
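
One lightweight way to act on these points is to check a metadata record for blank reuse-related fields before depositing it. In this sketch the required field names and the `missing_reuse_fields` helper are hypothetical; a real repository or community standard will define its own required elements.

```python
from typing import List

# Hypothetical field names chosen to mirror the Reusable criteria above.
REQUIRED_FOR_REUSE = ("description", "license", "creators", "provenance")


def is_blank(value):
    """Treat None, empty containers, and whitespace-only strings as blank."""
    if isinstance(value, str):
        return not value.strip()
    return not value


def missing_reuse_fields(metadata: dict) -> List[str]:
    """Return the required fields that are absent or blank, in a fixed order."""
    return [name for name in REQUIRED_FOR_REUSE if is_blank(metadata.get(name))]


if __name__ == "__main__":
    draft = {
        "description": "Counts of observations per site.",
        "license": " ",   # whitespace only, so it counts as missing
        "creators": [],   # empty list, so it counts as missing
    }
    print(missing_reuse_fields(draft))
```

A check like this only confirms that fields are filled in; whether the descriptions give enough context still needs a human reviewer.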

Discipline-Specific Metadata

Some research disciplines and types of data have their own metadata standards. You can search the following metadata standards catalog to see if there is a metadata standard appropriate for your research or data type:

The following resources will provide information and guidance on a variety of these metadata standards:

Content adapted from Cornell University guides on metadata and describing data and writing README style metadata.