Data Storage, Access, and Preservation: File Organization

File Naming and Version Control

File organization is an important component of the Data Management Plan (DMP). Start by establishing a naming policy for the directory structure, folder names, and file names so that all files are consistently and logically named and similar files are grouped together for management. A policy also helps ensure that the files make sense to other stakeholders, not just the researchers.

Best Practices:

1. Top-level directory folders should include the project title, a unique identifier, and the date. The substructure should follow a documented naming convention; for example, one folder for each run of an experiment, each version of a dataset, and/or each person in the group.

2. Keep file names simple but unique. Avoid duplicate file names, which make it easy to confuse one file with another.

3. Assign file names that are consistent, and retain the same order for each part of the name. 

4. Descriptive information in a file name could include the name of the experiment or project, names of researchers, type of data, dates, conditions, location, version number, etc. This information makes it easier to identify a particular file, especially in a large dataset.

File Naming Convention Rules:

1. Reserve the 3-letter file extension for application-specific codes, and write it in lowercase; for example, formats like .wrl, .mov, .tif

2. Record dates as part of the file name using the ISO 8601 format, for example YYYY-MM-DD or YYYYMMDD. Placing the date at the beginning of the file name makes it easier to maintain chronological order. Always keep the date elements in the same order (e.g., YYYYMMDD, not MMDDYYYY).

3. The number of characters in the name should be no more than 32.

4. When using sequential numbering, use leading zeros so that multi-digit numbers sort correctly (e.g., 01 through 10, not 1 through 10).

5. No special characters: & , * % # ; ( ) ! @ $ ^ ~ ' { } [ ] ? < > -

6. Use only one period, placed before the file extension (e.g., name_paper.doc NOT name.paper.doc).

Example pattern: Project_instrument_location_YYYYMMDD[hh][mm][ss][_extra].ext
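The rules above can be checked automatically. The following is a minimal Python sketch, not an official tool: the function names build_file_name and check_file_name are hypothetical, the 32-character limit is assumed to count the whole name including the extension, and digits are assumed to be acceptable in an extension (e.g., mp4). Because the hyphen appears in the list of special characters, the sketch uses the YYYYMMDD date form.

```python
import re
from datetime import datetime

# Characters listed in rule 5 (the hyphen is part of that list).
FORBIDDEN = set("&,*%#;()!@$^~'{}[]?<>-")
MAX_LENGTH = 32  # rule 3; assumed to include the extension


def build_file_name(project, instrument, location, when, ext, extra=""):
    """Assemble Project_instrument_location_YYYYMMDD[_extra].ext."""
    parts = [project, instrument, location, when.strftime("%Y%m%d")]
    if extra:
        parts.append(extra)
    return "_".join(parts) + "." + ext.lower()


def check_file_name(name):
    """Return a list of rule violations (an empty list means none found)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("longer than 32 characters")
    if name.count(".") != 1:
        problems.append("must contain exactly one period")
    else:
        ext = name.split(".")[1]
        # Assumption: three lowercase letters or digits count as valid.
        if not re.fullmatch(r"[a-z0-9]{3}", ext):
            problems.append("extension should be three lowercase characters")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("special characters: " + " ".join(bad))
    return problems


name = build_file_name("Soil", "xrf", "plot3", datetime(2014, 3, 7), "CSV")
print(name, check_file_name(name))
print("name.paper.doc", check_file_name("name.paper.doc"))
```

A checker like this is most useful when run over a whole folder before data are deposited or shared, so that problems are caught while files can still be renamed easily.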

File Renaming:

1. For files that are already named, use a file renaming application such as ReNamer (Mac/Windows/Unix).
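For those who prefer a script to a renaming application, batch renaming can be sketched in a few lines of Python. This is only an illustration and is not related to ReNamer: the function names plan_renames and apply_renames are hypothetical, and the cleanup rule (replace every run of characters other than letters, digits, and underscores with a single underscore, and lowercase the extension) is one assumed reading of the naming rules. The sketch first builds a list of proposed changes so they can be reviewed before anything is renamed.

```python
import re
from pathlib import Path


def plan_renames(folder):
    """Propose (old, new) path pairs for a folder; nothing is renamed here."""
    plan = []
    for old in sorted(Path(folder).iterdir()):
        if not old.is_file():
            continue
        # Replace runs of spaces, periods, and special characters with "_".
        stem = re.sub(r"[^A-Za-z0-9_]+", "_", old.stem).strip("_")
        if not stem:
            continue  # nothing usable left; rename this one by hand
        new = old.with_name(stem + old.suffix.lower())
        if new != old:
            plan.append((old, new))
    return plan


def apply_renames(plan):
    """Carry out the planned renames, refusing to overwrite other files."""
    for old, new in plan:
        if new.exists() and not new.samefile(old):
            raise FileExistsError(f"{new} already exists; not overwriting")
        old.rename(new)


# Review the proposed changes for the current folder before applying them.
for old, new in plan_renames("."):
    print(f"{old.name} -> {new.name}")
```

Keeping the planning step separate from the renaming step makes a dry run possible. It is still wise to work on a copy of the data the first time.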

Version Control

1. It is important to keep track of versions of files. If you track versions manually, follow the naming strategy for directory structure and file naming conventions, and use a sequential numbering system such as v01, v02. Avoid confusing labels such as "revision", "revision2", "final", etc. Alternatively, you can use version control software such as Mercurial or TortoiseSVN (a client for Subversion, SVN), which can track revisions to files and help you roll back to a previous version of a file.
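Manual version numbering is easy to get wrong when several files already exist. The following Python sketch shows one way to work out the next version label with leading zeros. The function name next_version_name and the base_vNN.ext pattern are assumptions made for illustration, not a required convention.

```python
import re
from pathlib import Path

# Matches stems such as "survey_v02" (hypothetical base_vNN pattern).
VERSION = re.compile(r"^(?P<base>.+)_v(?P<num>\d+)$")


def next_version_name(existing, base, ext, width=2):
    """Return the next zero-padded versioned name for base, given existing names."""
    highest = 0
    for name in existing:
        path = Path(name)
        match = VERSION.match(path.stem)
        if match and match.group("base") == base and path.suffix == "." + ext:
            highest = max(highest, int(match.group("num")))
    # Leading zeros keep v02 sorted before v10 in a directory listing.
    return f"{base}_v{highest + 1:0{width}d}.{ext}"


files = ["survey_v01.csv", "survey_v02.csv", "notes_v07.csv"]
print(next_version_name(files, "survey", "csv"))
```

With two-digit padding, names sort correctly up to v99. A project that expects more versions could pass a larger width from the start.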

Sources:
The following sites and materials were consulted in the development of this web page:

University of Oregon Library Research Data Management --Brian Westra

CalTech Library Research Data Management --Gail Clement