Li Ka Shing Library

Research Data Management: Data Organization & Documentation

Guide on research data management, with resources and tools for data planning, data organization, data documentation, data sharing, data security, data analysis and visualization

Data Organization

Updated: Jun 4, 2013

Data Sets from Different Research Projects

To organize data sets generated or processed from different research projects, put them in separate folders with names reflecting the research projects. It is good practice to keep a README file under each folder logging any changes.
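The folder-per-project layout with a README change log can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the README.txt file name and the log line format are hypothetical choices, not something the guide prescribes.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def create_project_folder(root: Path, project_name: str) -> Path:
    """Create one folder per research project, with a README inside it."""
    folder = root / project_name
    folder.mkdir(parents=True, exist_ok=True)
    readme = folder / "README.txt"  # hypothetical file name
    if not readme.exists():
        readme.write_text(
            f"Project: {project_name}\n\nChange log\n", encoding="utf-8"
        )
    return folder


def log_change(folder: Path, note: str, when: Optional[date] = None) -> None:
    """Append one dated line to the README change log of a project folder."""
    when = when or date.today()
    with (folder / "README.txt").open("a", encoding="utf-8") as handle:
        handle.write(f"{when.isoformat()}  {note}\n")
```

Calling create_project_folder(Path("data"), "SurveyA") and then log_change(folder, "Added wave 2 responses") after each change keeps the log next to the data it describes.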

Naming Conventions

  • Use brief and meaningful file names.
  • Keep names short. Windows traditionally limits the full path, including the file name, to 260 characters.
  • Recommended name style: ProjectName_DatasetName_Version_YearMonthDay_Editor.XXX (.XXX stands for the file extension, for example ".xls" for Excel files).
  • Use the segment _version_ to indicate the version or type of data set. For example _v1.0.1_ or _regression_.
  • Add or delete components in recommended name style to suit your needs. Make sure the name contains sufficient information to identify the data set and differentiate it from other files, especially when there are a large number of data files.

(Source: Organizing project data – files and folders by Florian Hollender)
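As a rough sketch, the recommended name style can be built and parsed programmatically so that every file follows the same pattern. The function names are hypothetical, and writing YearMonthDay as eight digits (for example 20130604) is an assumption about the intended date format.

```python
from datetime import date


def build_file_name(project, dataset, version, day, editor, extension):
    """Join the name components with underscores in the recommended order."""
    # Assumption: YearMonthDay is written as eight digits, e.g. 20130604.
    parts = [project, dataset, version, day.strftime("%Y%m%d"), editor]
    for part in parts:
        # An underscore inside a component would make the name ambiguous.
        if not part or "_" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return "_".join(parts) + "." + extension.lstrip(".")


def parse_file_name(name):
    """Split a file name in the recommended style back into its components."""
    stem, _, extension = name.rpartition(".")
    project, dataset, version, day, editor = stem.split("_")
    return {
        "project": project,
        "dataset": dataset,
        "version": version,
        "date": day,
        "editor": editor,
        "extension": extension,
    }
```

For example, build_file_name("Survey", "Income", "v1.0.1", date(2013, 6, 4), "LKS", ".xls") gives "Survey_Income_v1.0.1_20130604_LKS.xls". If you add or drop components to suit your needs, both functions have to be adjusted together.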

Version Control

When the size of the data set is not large, e.g. several megabytes, using Git for version control is a good choice. Git is an open source version control system, widely adopted by programmers. It is also suitable for version control of any documents including data set files. If you are totally new to Git, start here. It is innovative and yet easy to use.
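A minimal sketch of recording a data file in Git from a Python script is given below. It uses only the standard git init, git add and git commit commands. The helper names are hypothetical, and it assumes Git is installed and that a user name and email have already been configured.

```python
import subprocess


def init_commands():
    """Command that turns the current folder into a Git repository."""
    return [["git", "init"]]


def snapshot_commands(data_file, message):
    """Commands that stage one data file and commit it with a message."""
    return [
        ["git", "add", data_file],
        ["git", "commit", "-m", message],
    ]


def run_commands(commands, repo_dir):
    """Run each command inside repo_dir and stop at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=repo_dir, check=True)
```

Running run_commands(init_commands(), "SurveyA") once, and then run_commands(snapshot_commands("income.csv", "Clean missing values"), "SurveyA") after each change, produces a history of the data set. The folder, file name and message here are placeholders.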

Some Git service providers are:

  • GitHub offers free accounts with unlimited public repositories, which means others can search and browse your documents.
  • Bitbucket is another great option that offers free accounts with unlimited private repositories.
  • Assembla also provides free accounts with the option of a private repository where total document size cannot exceed 1GB.

Data Documentation

Research data sets will change as the research project progresses. To manage research data efficiently and to ease the re-use of research data sets, keep a well-maintained record of changes in your data sets. To document data sets, you need to choose a metadata standard and record all changes.

Metadata Guidelines

Metadata covers different components and includes:

  • Description of data
    • Data collection methods
    • Context of data collection
    • Algorithms used
    • Description of steps taken to clean and manipulate raw data
    • Software and systems used for analysis
  • Format information
  • Creator and contributor information
  • Rights information

Choose a metadata standard:

  1. Adopt the required metadata standard of the data repository service used
  2. Follow the common practices in your discipline, e.g. for social and behavioral sciences use DDI metadata standard (examples)
  3. Adopt Dublin Core (examples)
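To illustrate what a Dublin Core record can look like, the sketch below writes a few of the components listed above (description, format, creator, contributor, rights) as XML with Python's standard library. The namespace is the Dublin Core element set; the function name, the wrapping "metadata" element and all field values are placeholders, and a repository may require a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core elements to an XML string."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name!r}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
record = dublin_core_record({
    "title": "Example survey data set",
    "creator": "Example researcher",
    "description": "Collection methods, context and cleaning steps go here.",
    "format": "text/csv",
    "rights": "Example license statement",
})
```

Each key becomes one dc: element, so the same dictionary can be kept alongside the data set and extended as the project evolves.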

Use metadata tools:

  1. DDI tools endorsed by DDI Alliance.
  2. DeXtris, a generic tool to explore XML statistical metadata. It supports Statistical Data and Metadata Exchange (SDMX) and the Data Documentation Initiative (DDI), including the draft release of DDI 3.0.
  3. Nesstar Publisher, an editor for preparing metadata and data for publishing in an online catalog. The metadata produced is compliant with the DDI 2.n and Dublin Core XML metadata standards.
  4. SDA, a set of programs for the documentation and web-based analysis of survey data, developed and maintained by the Computer-assisted Survey Methods Program (CSM) at the University of California, Berkeley. SDA programs can produce DDI-format metadata from SDA datasets and from other metadata formats.

(Source: Data Management Planning from UC Merced Library; Research Data Management from University of Oregon.)
