This is the sixth article on the changes to the NIH Data Management and Sharing policies that will go into effect for NIH grant applications starting January 2023. For more information, see our previous articles on data management.  

Broadly speaking, file management pertains to the organization, access, storage and retrieval of documents, folders and information. When creating a research plan, it is important to spend time reflecting on your file management system to avoid future complications such as losing a necessary document or being locked out of a folder. A proper file management system will ensure that all documents are correctly labeled, stored and available to be viewed by project team members. 

The Lamar Soutter Library states that file management consists of “Structuring the hierarchical organization of file folders in a logical and clear way; Planning for the syntax and vocabulary of individual file names; [and] Using agreed-upon conventions consistently” (Lamar Soutter Library, 2022). A file management system clarifies the steps for preserving research data in a manner that is accessible to all research team members. 

You should create your file management system before engaging in your research. Spend as much time as necessary establishing and documenting a routine, folder hierarchy, file naming conventions and other information. Here are some points to consider when creating your file management system:

  1. Think about the goal and purpose of file management for your research: Having a goal in mind will allow you to determine the best way to manage your files and ensure that you are in compliance with the new NIH Data Management and Sharing policies. A goal will also allow you to eliminate any confusion surrounding file management or the software or hardware you will use to maintain your research files. 
  2. Seek input from all research team members: If multiple people are involved in the research project, seek their input on how they manage their personal files. Learn which tools and resources they are accustomed to. Form a consensus on which software and file management structure the team will use. 
  3. “Develop a nested folder structure that makes the most sense for your project and team’s retrieval needs” (Lamar Soutter Library, 2022): There are multiple ways to organize a nested folder structure. If the research project will occur over a long period of time, consider creating a “base” folder with the project name and adding additional folders based on the year, month or quarter for the project. You may also consider nesting folders based on the type of information. For example, if you conduct a survey, separate the responses into different folders based on whether the survey occurred in person, over the phone, via email or through another form of communication. Think about your research plan and what types of information you will or may gather, then create corresponding folders. Be sure to create folders that may be of use in the future so you can immediately organize that information without disrupting your current file management system. 
  4. Once a system has been established, make sure all team members have access: No matter how you decide to store your files, take time to ensure all project members can access the folders and documents that they need. If there are passwords attached to folders or documents, store those passwords in a secure location that members can locate. 
  5. Create a reference file management document: Once you and your team have established a file management system, create a separate document that notes the systems and software you plan to use, folder passwords, file naming conventions and other relevant information. Store this document in a location that is accessible and use it as a reference over the course of the research project. If changes are made to the file management system, be sure to update this document to reflect the new changes. 
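
As a rough illustration of point 3, the Python sketch below builds a nested folder structure with a “base” project folder, one folder per year and one folder per type of information. The function name, the project name and the folder labels are hypothetical examples, not part of any NIH or GW requirement. Adapt them to whatever structure your team agrees on.

```python
import tempfile
from pathlib import Path


def create_project_folders(base_dir, project_name, years, data_types):
    """Create <project_name>/<year>/<data_type> folders under base_dir.

    Returns the list of folders in the order they were created.
    """
    root = Path(base_dir) / project_name
    created = []
    for year in years:
        for data_type in data_types:
            folder = root / str(year) / data_type
            # parents=True creates the project and year folders as needed;
            # exist_ok=True means rerunning the script will not raise an error.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is left behind on disk.
    with tempfile.TemporaryDirectory() as tmp:
        folders = create_project_folders(
            tmp, "Survey_Study", [2023, 2024], ["in_person", "phone", "email"]
        )
        for folder in folders:
            print(folder.relative_to(tmp))
```

Because the script can be rerun safely, it can also be used to add folders for a new year or a new type of information later without disturbing the existing structure.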

Resources are available to guide you through the file management development and maintenance process. Read our previous NIH Data Management and Sharing policies articles to gain valuable information on the new policy and how to comply with it. Our File Naming Conventions article offers insight on how to name your files in a clear and consistent manner and our File Storage and Backup Best Practices article discusses storage tools, the importance of saving your files in multiple locations and data security. 

GW IT has a breakdown on the different document management services available to GW faculty and staff. This webpage explains the difference between regulated, restricted and public data and has a guide to help you determine which service best meets your research needs. 

Lastly, if you’re unsure where to begin with creating a file management system, MIT Libraries’ Data Management Guide provides a worksheet that walks you through the process of creating a file management hierarchy. Follow the steps from the very beginning or pick sections from the worksheet to help you develop a file management system for your research project. 

File management may feel like a daunting task. By reflecting on the goals of your management system and developing a plan before collecting data, you will avoid losing research data or navigating unorganized folders and files. The staff at Himmelfarb Library are here to help you understand file management or any topic related to the new NIH policy. Be sure to read our previous articles or browse our NIH Data Management & Sharing Plan (DMSP) Research Guide. Continue to follow our Himmelfarb Library News site for future data management articles! 

References:

MIT Libraries. (n.d.). Organize Your Files. Data Management. https://libraries.mit.edu/data-management/store/organize/

Lamar Soutter Library. (November 22, 2022). File Management. Research Data Management Resources. https://libraryguides.umassmed.edu/research_data_management_resources/file_management

Microsoft 365 Team. (June 15, 2021). 11 ideas for how to organize digital files. Microsoft: Business Insights and Ideas. https://www.microsoft.com/en-us/microsoft-365/business-insights-ideas/resources/11-ideas-for-how-to-organize-digital-files

This is the third article in a series on the changes to the NIH Data Management and Sharing policies that will come into effect for NIH grant applications starting January 2023. For more information, see our first article for a general overview of data management resources and our second article for best practices on writing a data management plan.

With the new 2023 NIH Data Management and Sharing Policy scheduled to take effect in January 2023, file naming conventions are an important piece of the data management puzzle. This new policy encourages project teams to agree on file naming conventions for objects and files and follow file naming convention best practices. This post will explore current best practices for file naming conventions.

Why Use Standardized File Naming Conventions?

Creating standardized file naming conventions is an important part of the research process. Standard file names keep your research organized and ensure that files can be easily located and identified by everyone in the research group. They also help future users find and understand the data after the project has ended. Standardized, descriptive file names streamline the workflow by letting users identify the contents of a file without having to open it (University of Michigan Library, 2022).

The best time to develop a file naming convention is before you start your research project. Having a file naming convention in place before you start the project will prevent your project from having a backlog of unorganized files, which can lead to misplaced or lost data (Longwood Research Data Management, 2022). Your research group should decide at the outset of your project what naming conventions will be used. Once a file naming convention has been agreed upon by the research group, it must be consistently followed by all members of the group. If the naming convention isn’t followed, data could become difficult to find, making it unusable. 

What Should Be Included in File Names?

File names should be descriptive enough to capture relevant information about the file, so try to build two or three salient characteristics of the project and dataset into each file name (University of Michigan Library, 2022). Think about the types of files you’ll be working with and the types of information each file will contain when developing your file naming convention. For example, what groups of files will your naming convention cover? Are different naming conventions needed for different sets of files? Does your group, department, or discipline already have file naming conventions in place which could be used? 

It’s also a great idea to think about the metadata you’d like to include in each file name. Consider what information should be included to allow users to easily and quickly locate or search for a needed file. Since computers arrange files by name, character by character, it’s a good idea to put the most important information at the beginning of the file name. If finding information by date is a priority, start each file name with a date (see the Standardizing Dates section below for more information on using dates in file names). If the type of data is the most important piece of information, start each file name with the type of information instead. 

Consider including the following pieces of information in your naming convention structure:

  • Unique identifiers (such as a grant number)
  • Project, study, or experiment name or acronym
  • Location information (such as spatial coordinates)
  • Researcher initials
  • Date or date range (in a standardized format)
  • Experimental conditions (such as instrument, temperature, etc.)
  • Version number (see the Use Version Control! section below)
  • Type of data (image, dataset, samples, etc.)
  • File type, or file extension
  • Lab name or location
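
To make this concrete, here is a minimal Python sketch that assembles a file name from a few of the elements listed above. The function name, the choice and order of elements (date first, then project acronym, type of data and version) and the underscore separator are assumptions made for illustration. Your group should pick the elements and order that suit its own retrieval needs.

```python
from datetime import date


def build_file_name(collected_on, project, data_type, version, extension):
    """Join metadata elements into one file name, separated by underscores."""
    parts = [
        collected_on.strftime("%Y%m%d"),  # date first so files sort chronologically
        project,                          # project acronym
        data_type,                        # type of data
        f"v{version:03d}",                # version number with leading zeros
    ]
    # Accept the extension with or without a leading period.
    return "_".join(parts) + "." + extension.lstrip(".")


if __name__ == "__main__":
    print(build_file_name(date(2022, 10, 21), "DMS", "survey", 1, "csv"))
    # prints 20221021_DMS_survey_v001.csv
```

Generating names from a single function like this is one way to help every team member apply the agreed-upon convention consistently.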

What Should be Avoided in File Names?

While many file naming best practices revolve around what should be included in a file name, there are also best practices related to what should not be included in file names. Here are the top three things to avoid in your file naming conventions:

  • Spaces: While separating metadata elements is a common practice, avoid using spaces to separate each element. Consider using dashes or underscores instead of spaces. For example, instead of using File Name.xxx, consider using File-Name.xxx or File_Name.xxx instead. You could also consider not separating metadata at all, and using Camel Case to eliminate spaces: FileName.xxx
  • Special Characters: Avoid using special characters such as @ # $ % & * in file names. Limit file names to alphanumeric characters.
  • Long File Names: In general, file names should be kept to 30 characters or fewer. Shorter file names will make it easier for users to identify the contents of the file. Longer file names may not be readable by software programs.
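
The three rules above are simple enough to check automatically. The sketch below is one possible checker, not an official tool. It assumes that dashes, underscores and the period before the file extension are acceptable alongside letters and digits, and that the 30-character guideline counts the whole name including the extension. The function name and messages are hypothetical.

```python
import re

MAX_LENGTH = 30  # guideline from the list above; assumed to include the extension

# Anything other than letters, digits, period, underscore, dash or space.
# Spaces are excluded here because they are reported separately.
SPECIAL_CHARACTERS = re.compile(r"[^A-Za-z0-9._\- ]")


def check_file_name(name):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    if SPECIAL_CHARACTERS.search(name):
        problems.append("contains special characters")
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    return problems


if __name__ == "__main__":
    for candidate in ["File_Name.xxx", "File Name.xxx", "data@lab#1.csv"]:
        print(candidate, "->", check_file_name(candidate) or "OK")
```

A check like this could be run over a project folder from time to time to catch names that have drifted from the convention.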

Standardizing Dates

When including dates in file names, following the International Organization for Standardization (ISO) date standard, ISO 8601, is generally considered to be the best practice. Dates should be formatted starting with the four-digit year, followed by the two-digit month and the two-digit day: 

  • YYYYMMDD (ex: 20221021)
  • YYYY-MM-DD (ex: 2022-10-21)
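
If you generate file names with a script, both formats can be produced from a date object. The following Python sketch uses only the standard library. The helper name and the sample file names are made up for illustration.

```python
from datetime import date


def date_stamps(d):
    """Return the date as (YYYYMMDD, YYYY-MM-DD) strings."""
    return d.strftime("%Y%m%d"), d.isoformat()


if __name__ == "__main__":
    compact, dashed = date_stamps(date(2022, 10, 21))
    print(compact)  # 20221021
    print(dashed)   # 2022-10-21

    # Because the year comes first, sorting by name also sorts by date.
    names = ["2022-10-21_survey.csv", "2021-12-01_survey.csv", "2022-02-13_survey.csv"]
    print(sorted(names))
```

The main benefit of the year-month-day order is visible in the last step: an ordinary alphabetical sort puts the files in chronological order.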

Use Version Control!

Many research projects involve creating and maintaining multiple versions of the same file. If this is the case for your research project, be sure to use versioning to indicate the most current version of files. Using file versioning not only helps you keep track of which file is the most recent update, but it also provides you with the ability to revert data to an earlier version without starting from scratch or having to regenerate data (Cornell University, 2022). 

Some tools, such as electronic lab notebooks or Box, allow you to assign version numbers, but you can also create version control by building versioning into your file naming convention. You can track versions by adding version information to the end of a file name. Here’s an example: 

  • File_Name_v001.xxx
  • File_Name_v002.xxx
  • File_Name_v003.xxx

You can also use the date to indicate the version:

  • File_Name_20220213.xxx
  • File_Name_20220321.xxx
  • File_Name_20220601.xxx

Avoid using ambiguous labels, such as “revision” or “final”, in your file names. It’s also a good idea to keep an untouched copy of your original raw data so that you can always return to it. 
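
For teams that name versioned files by script, here is a small Python sketch that works out the next version name from the names already in use. It assumes the File_Name_v001.xxx pattern shown above. The function name and the regular expression are illustrative, not part of any standard tool.

```python
import re

# Matches names such as File_Name_v002.xxx and captures stem, number and extension.
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d+)(?P<ext>\.[A-Za-z0-9]+)$")


def next_version_name(existing_names, stem, extension):
    """Return the next versioned file name for the given stem and extension."""
    highest = 0
    for name in existing_names:
        match = VERSION_PATTERN.match(name)
        if match and match.group("stem") == stem and match.group("ext") == extension:
            highest = max(highest, int(match.group("num")))
    # Three digits with leading zeros, so versions sort in order.
    return f"{stem}_v{highest + 1:03d}{extension}"


if __name__ == "__main__":
    existing = ["File_Name_v001.xxx", "File_Name_v002.xxx", "Other_File_v007.xxx"]
    print(next_version_name(existing, "File_Name", ".xxx"))
    # prints File_Name_v003.xxx
```

Saving each change under a new version name, rather than overwriting the previous file, is what makes it possible to revert to an earlier version later.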

Standardized Numbers - Use a Leading 0!

If sequential numbering is part of your file naming structure, use leading zeros. For example, instead of using 1, 2, 3, use 001, 002, 003. This will ensure that your files will be sorted in an easily findable manner. This applies to version control numbering as well.
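
The effect of leading zeros is easy to see with a plain character-by-character sort, which is what Python's built-in sorted() does. The file names below are invented for the demonstration. Note that some file browsers apply their own "natural" sorting, so results on your system may differ for unpadded names.

```python
def numbered_name(stem, number, extension, width=3):
    """Return a file name with a zero-padded sequence number."""
    return f"{stem}_{number:0{width}d}{extension}"


if __name__ == "__main__":
    unpadded = [f"scan_{n}.tif" for n in (1, 2, 10)]
    padded = [numbered_name("scan", n, ".tif") for n in (1, 2, 10)]

    print(sorted(unpadded))  # ['scan_1.tif', 'scan_10.tif', 'scan_2.tif']
    print(sorted(padded))    # ['scan_001.tif', 'scan_002.tif', 'scan_010.tif']
```

Without padding, file 10 lands between files 1 and 2. With padding, the files appear in numerical order.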

Directory Structure Naming Conventions

File naming conventions don’t just apply to your files. Use the same best practices to structure your directory folders as well. Directory folders should provide key information about the file contents stored within each folder. Be sure to include the project title, unique identifiers, and the date. It might be helpful to create a brief description of the content stored in major folders and to provide an overview of the directory structure in your documentation. The level of detail included should be enough to help someone understand the contents and organization of the files. 

Here’s an example (Cornell University, 2022):

  • Top Folder: Study_name
    • Subfolder1: Study_name_Datasets
      • Study_name_2019-2020.csv
      • Study_name_2021-2022.csv
    • Subfolder2: Study_name_Semantic_analysis
      • Study_name_semantic_analysis.R
      • Study_name_semantic_analysis_output.csv
    • Readme File: Study_name_readme.txt

Document Naming Conventions

Be sure to document each file naming convention in a top-level readme file. This file should include instructions for navigating the structure so that others involved in the research project, and others who might use this data once the project is complete, can follow the naming conventions used. This file can be a README.txt file and should be kept with your files.

File naming conventions are an essential part of any research project! Be sure to take the time to create a file naming convention that will help keep your files organized, easily findable, and usable by your research team and any others who may look at your data once your project is finished. Stay tuned for future posts on best practices related to other data management topics!

References:

Cornell University. (June 2022). File management. Research Data Management Service Group: Comprehensive Data Management Planning & Services. https://data.research.cornell.edu/content/file-management

Longwood Research Data Management: Harvard Medical School. (2022). File Naming Conventions. Data Management, Harvard Medical School. https://datamanagement.hms.harvard.edu/collect/file-naming-conventions

University of Michigan Library. (September 14, 2022). File-naming conventions. LabArchives: Best practices for research data management. https://guides.lib.umich.edu/c.php?g=739306&p=5286418