Nine Steps to Sharing Your Environmental Data Set

Submitted by ORNL DAAC Staff on

The ORNL DAAC offers guidance to assist researchers in data management and data sharing. These nine best practices can be performed at any time during the preparation of a data set, but researchers should plan for them before measurements are taken and implement them during measurements.

1. Define the contents of your data files
In order for others to use your data, they must fully understand the contents of the data set, including the parameter names, units of measure, formats, and definitions of coded values.

  • Parameter Names: The parameters reported in the data set need to have names that describe the contents and are standardized across files, data sets, and the project.
  • Units: State the units of every reported parameter explicitly, both in the data file and in the documentation. At a minimum, define the units in the documentation so that others understand what is reported.
  • Formats: Within each data set, choose a format for each parameter, explain the format in the documentation, and use that format throughout the data set. Consistent formats are particularly important for dates, times, and spatial coordinates.
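
As a minimal sketch of the consistent-format advice for dates, the snippet below rewrites dates recorded in mixed styles into one format. The choice of ISO 8601 (YYYY-MM-DD) as the target, the list of accepted input styles, and the function name `to_iso_date` are assumptions made for illustration; the guidance above only asks that one format be chosen, documented, and used throughout.

```python
from datetime import datetime

# Hypothetical input styles that might appear in hand-entered records
# (an assumption for this sketch; adjust to what your data actually contains).
KNOWN_INPUT_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")


def to_iso_date(text):
    """Return the date in `text` as YYYY-MM-DD, or raise ValueError."""
    for fmt in KNOWN_INPUT_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known style
    raise ValueError(f"Unrecognized date format: {text!r}")


if __name__ == "__main__":
    for raw in ["2010-07-04", "07/04/2010", "4 Jul 2010"]:
        print(raw, "->", to_iso_date(raw))
```

Raising an error on unrecognized input, rather than guessing, keeps ambiguous entries visible so they can be resolved by hand and noted in the documentation.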

2. Assign descriptive data set titles
A descriptive title should briefly describe your data set. It helps other researchers find the data set in a search and recognize it as pertinent and useful for their own work.

3. Assign descriptive file names
File names should contain only numbers, letters, dashes, and underscores, and no blank spaces. Ideally, names should be descriptive and contain a combination of the project acronym, study title, location, investigator, year(s) of study, data type, version number, and file type.
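
The naming rule can be checked automatically. The sketch below assumes that a file name consists of the permitted characters followed by a single dot and an extension; the helper names `is_safe_file_name` and `build_file_name`, and the example name parts, are hypothetical.

```python
import re

# Letters, digits, dashes, and underscores, followed by one extension.
# Allowing exactly one dot-separated extension is an assumption of this
# sketch; the guidance only lists the characters permitted in the name.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def is_safe_file_name(name):
    """True if `name` uses only the permitted characters and has no spaces."""
    return SAFE_NAME.fullmatch(name) is not None


def build_file_name(parts, extension):
    """Join descriptive parts (project, site, years, ...) with underscores."""
    name = "_".join(parts) + "." + extension
    if not is_safe_file_name(name):
        raise ValueError(f"File name contains disallowed characters: {name!r}")
    return name


if __name__ == "__main__":
    # Hypothetical project acronym, location, data type, years, and version.
    print(build_file_name(["PROJ", "site-A", "soil-temp", "2001-2003", "v2"], "csv"))
    print(is_safe_file_name("soil temp (final).csv"))
```

The first call prints `PROJ_site-A_soil-temp_2001-2003_v2.csv`; the second prints `False` because the name contains spaces and parentheses.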

4. Use consistent data organization
Choose one organizational style and keep it throughout the data set. In the "wide" style, each row is a complete record and each parameter has its own column. In the "long" style, each row holds a single measurement, with one column naming the parameter and another holding its value. Be sure to provide a definition for all coded values or abbreviations.
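
To make the two layouts concrete, here is a small sketch that converts a wide table (one column per parameter) into a long table (one row per measurement, with `parameter` and `value` columns). The sample data, the column names, and the function name `wide_to_long` are invented for illustration.

```python
import csv
import io

# Invented sample data in wide layout: one column per parameter.
WIDE_CSV = """site,date,air_temp_c,precip_mm
A1,2010-07-04,21.5,0.0
A1,2010-07-05,19.0,3.2
"""


def wide_to_long(wide_text, id_columns):
    """Return one dict per measurement, keyed by id columns, parameter, value."""
    reader = csv.DictReader(io.StringIO(wide_text))
    long_rows = []
    for record in reader:
        for name, value in record.items():
            if name in id_columns:
                continue  # identifiers are repeated on every long row
            row = {column: record[column] for column in id_columns}
            row["parameter"] = name
            row["value"] = value
            long_rows.append(row)
    return long_rows


if __name__ == "__main__":
    for row in wide_to_long(WIDE_CSV, ["site", "date"]):
        print(row)
```

Whichever layout is used, the point of the guidance is to pick one and apply it to every file in the data set.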

5. Use stable file formats
Text-based comma-separated values (CSV) files are ideal for tabular data, and GeoTIFF or netCDF files are suitable for raster data. Avoid proprietary formats that may not be readable in the future.

6. Preserve information
Save the raw data file, with no transformations or analyses, as "read-only". Use a scripting language to process the data, and store the script and its output in files separate from the raw data.
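
One way to follow this practice is sketched below: the raw file is marked read-only, and a script writes derived values to a separate output file. The column names (`site`, `temp_f`, `temp_c`), the Fahrenheit-to-Celsius conversion, and the function names are assumptions chosen only to give the script something to do.

```python
import csv
import os
import stat


def make_read_only(path):
    """Remove write permission so the raw file is not edited by accident."""
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def process(raw_path, out_path):
    """Write a derived file; the raw file is only ever opened for reading."""
    with open(raw_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=["site", "temp_c"])
        writer.writeheader()
        for row in reader:
            # Hypothetical transformation: convert Fahrenheit to Celsius.
            temp_c = (float(row["temp_f"]) - 32.0) * 5.0 / 9.0
            writer.writerow({"site": row["site"], "temp_c": f"{temp_c:.2f}"})
```

A typical use would call `make_read_only` once, right after the raw file is saved, and then rerun `process` whenever the processing steps change. Because the transformation lives in the script, the output can always be regenerated from the untouched raw file.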

7. Protect your data
Ensure that file transfers are done without error by comparing checksums before and after transfers. Create and test back-up copies often to prevent the disaster of lost data.
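
The checksum comparison can be scripted with Python's standard `hashlib` module. This is a minimal sketch: the use of SHA-256 and the function names `file_checksum` and `transfer_is_intact` are choices made here, not something the guidance prescribes.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, copy_path):
    """True if the source file and its copy have identical checksums."""
    return file_checksum(source_path) == file_checksum(copy_path)
```

In practice, the checksum of the source file would be recorded before the transfer and compared with the checksum computed on the receiving system afterward. The same check is a simple way to test that a backup copy is still readable and unchanged.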

8. Provide documentation
Consider what a future investigator needs to know in order to understand and use your data.

9. Perform basic quality assurance
Check that there are no missing values for key parameters. Scan and/or plot data to check for impossible and anomalous values. Perform and review statistical summaries.
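
These checks are easy to automate for a single parameter. The sketch below flags missing values, flags values outside a plausible range, and summarizes the rest. The missing-value codes, the valid range passed by the caller, and the function name `quality_report` are assumptions; each data set should use its own documented codes and limits.

```python
import statistics

# Assumed missing-value codes; use whatever your documentation defines.
MISSING_CODES = {"", "NA", "-9999"}


def quality_report(rows, parameter, valid_min, valid_max):
    """Report missing rows, out-of-range rows, and a summary of valid values."""
    missing, out_of_range, values = [], [], []
    for index, row in enumerate(rows):
        text = (row.get(parameter) or "").strip()
        if text in MISSING_CODES:
            missing.append(index)
            continue
        value = float(text)
        if valid_min <= value <= valid_max:
            values.append(value)
        else:
            out_of_range.append(index)
    summary = None
    if values:
        summary = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
        }
    return {"missing": missing, "out_of_range": out_of_range, "summary": summary}


if __name__ == "__main__":
    # Invented records; row 1 is missing and row 2 is implausible.
    sample = [{"temp_c": "21.5"}, {"temp_c": ""}, {"temp_c": "999"}, {"temp_c": "18.5"}]
    print(quality_report(sample, "temp_c", valid_min=-60.0, valid_max=60.0))
```

Flagged rows are reported by index rather than silently dropped, so each one can be reviewed and either corrected or explained in the documentation.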

For more information see our “Data Management for Data Providers” at http://daac.ornl.gov/PI/pi_info.shtml.
