SMU Libraries

Research Data Management: Publishing Your Data

Guide on research data management, with resources and tools for data planning, data organization, data documentation, data sharing, data security, data analysis and visualization

Data Paper

A data paper is a scholarly publication that describes a particular dataset or group of datasets, published as a peer-reviewed article in a scholarly journal.

Unlike a conventional research article, the main goal of a data paper is to describe the dataset(s), focusing on collection method, distinguishing features, access and potential reuse rather than on data processing and analysis. Data papers are usually assigned a unique identifier (e.g. a DOI), so they can be cited just like conventional journal articles.

Different publishers may have different requirements, so read their submission guidelines carefully beforehand. As a general guideline, here is what usually happens before a data paper is published:

  1. Deposit the complete dataset in an online repository, which may provide a DOI (or a similar persistent identifier) for your data. The publisher may provide a list of recommended repositories.
  2. Draft the paper using the templates and format indicated by the publisher. This is where you describe your dataset in a clear, detailed and systematic manner.
  3. Submit the data paper for peer review.
  4. Make revisions based on feedback from the reviewer(s). If everything goes smoothly, your data paper will be published.
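
Before drafting, it can help to check that your notes cover everything a data paper is expected to describe. The sketch below is only an illustration, not a publisher's template: the `DatasetDescription` class, its field names and the DOI value are hypothetical placeholders. It lists the sections that are still empty and builds a resolvable link from a DOI using the doi.org resolver.

```python
from dataclasses import dataclass, fields

# Public DOI resolver; a bare DOI appended to this prefix becomes a link.
DOI_RESOLVER = "https://doi.org/"


@dataclass
class DatasetDescription:
    """Hypothetical checklist of what a data paper describes."""
    title: str
    repository_doi: str           # identifier issued by the repository (step 1)
    collection_method: str
    distinguishing_features: str
    access: str
    potential_reuse: str


def missing_sections(desc):
    """Return the names of the fields that are still empty."""
    return [f.name for f in fields(desc) if not getattr(desc, f.name).strip()]


def doi_url(doi):
    """Turn a bare DOI into a link via the doi.org resolver."""
    return DOI_RESOLVER + doi.strip()


draft = DatasetDescription(
    title="Example survey dataset",
    repository_doi="10.1234/example-doi",  # placeholder, not a real DOI
    collection_method="Online questionnaire, two waves",
    distinguishing_features="",
    access="Open, via the repository landing page",
    potential_reuse="",
)

print(missing_sections(draft))  # ['distinguishing_features', 'potential_reuse']
print(doi_url(draft.repository_doi))  # https://doi.org/10.1234/example-doi
```

The actual sections and their order depend on the publisher's template, so treat the field list as a starting point to adapt.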

Refer to the section below for a list of journals where you can publish a data paper. 

Make your data accessible online

You may want to make your data freely accessible online for different reasons. Maybe the funder or publisher requires authors to make data publicly available, or you may want more people to find your research or cite your paper. 

You can deposit your dataset with the SMU Research Data Repository (RDR). The benefits include:

  • Storage space of 100GB per user, securely hosted on Amazon S3 with regular backups
  • Long-term data retention for at least 10 years, which helps you meet the requirements of the SMU Research Data Management Policy
  • A collaboration tool to share data with SMU or external researchers
  • Research impact: your data will be citable with a DOI and discoverable in search engines, including Google Dataset Search, and sharing the underlying data may bring more citations to your publications

Visit the SMU RDR User Guide for instructions. Email the librarian if you have questions or need help.

There are numerous free and fee-based data repositories online. Some are subject-based, while others are multi-disciplinary. A registry of online research data repositories is a good place to start looking.

Listed below are several large data repositories that you can explore:

Repository Name | Subject | Cost/Access Control | About
DataVerse | Multi-disciplinary | Free for individual researchers | Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others' work more easily. Researchers, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility.
FigShare | Multi-disciplinary | Unlimited storage if you make your data publicly available | figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner. SMU is an institutional subscriber of FigShare. Visit SMU Research Data Repository (RDR) to use the service.
ICPSR | Social Sciences | Fee-based if you want to make your data open to the public; otherwise only ICPSR members can access the data | An international consortium of more than 700 academic institutions and research organizations, ICPSR maintains a data archive of more than 500,000 files of research in the social sciences. It hosts 16 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields.
Databrary | Psychology; Developmental Science | Free | Databrary is a video data library for developmental science. Share videos, audio files, and related metadata. Discover more, faster.
CodePlex | Computer Science | Free | CodePlex is Microsoft's free open source project hosting site. You can create projects to share with the world, collaborate with others on their projects, and download open source software.
GitHub | Computer Science | Free for public and open source projects; fee-based for unlimited private repositories | GitHub is a web-based repository hosting service. It offers all of the distributed revision control and source code management (SCM) functionality of Git as well as adding its own features. GitHub provides access control and several collaboration features such as bug tracking, feature requests, task management, and wikis for every project.
Launchpad | Computer Science | Free | Launchpad is a software collaboration platform that provides functionality such as code hosting, bug tracking, code reviews, etc.
SourceForge | Computer Science | Free | SourceForge allows the user to find, create and publish open source software for free.


List of data journals

Journal | Publisher | Overview | Subject Areas
Scientific Data | Nature Publishing Group | Scientific Data is a new open-access, online-only publication for descriptions of scientifically valuable datasets. Scientific Data exists to help you publish, discover and reuse research data. | multidisciplinary; natural sciences; social sciences; business and industry
International Journal of Robotics Research | SAGE Publications | International Journal of Robotics Research (IJRR) was the first scholarly publication on robotics research; it continues to supply scientists and students in robot and related fields - artificial intelligence, applied mathematics, computer science, electrical and mechanical engineering - with timely, multidisciplinary material on topics from sensors and sensory interpretations to kinematics in motion planning. IJRR also publishes peer reviewed data papers and multimedia extensions alongside articles. | artificial intelligence, applied mathematics, computer science, electrical and mechanical engineering
Applied Informatics | SpringerOpen | Applied Informatics covers the theory and application of informatics in various scientific, technological, engineering and social fields. Aiming to inspire new multidisciplinary research, the journal acts as an integrative venue that collects high-quality original research papers and reviews on various aspects of applied informatics, with the foundations of informatics (information theory, statistical modeling, machine learning, etc) as the driving core and the interactions between essential realms as the promoting focuses; particularly important are the interactions between (a) life sciences (bioinformatics, medical informatics, bioengineering, etc); and (b) intelligence sciences (neural and cognitive informatics, multimedia, pattern recognition, etc), and (c) community sciences (social networks, affective computing, big data analytics, etc). | applied informatics
SpringerPlus | SpringerOpen | SpringerPlus accepts manuscripts from all disciplines of Science. We accept manuscripts describing original research as well as case descriptions and methods, and we expressly encourage submission of data reports and large datasets. | all disciplines of science, technology, engineering, medicine and humanities & social sciences
Journal of Open Psychology Data | Ubiquity Press | The Journal of Open Psychology Data (JOPD) features peer reviewed data papers describing psychology datasets with high reuse potential. Data papers may describe data from unpublished work, including replication research, or from papers published previously in a traditional journal. We are working with a number of specialist and institutional data repositories to ensure that the associated data are professionally archived, preserved, and openly available. Equally importantly, the data and the papers are citable, and reuse is tracked. | psychology
Research Data Journal for the Humanities and Social Sciences | Brill | Research Data Journal for the Humanities and Social Sciences (RDJ) is a peer reviewed e-only open access journal, which is designed to comprehensively document and publish deposited data sets and to facilitate their online exploration. In this way it wants to contribute to transparency of research, accelerate dissemination and foster reuse. The journal concentrates on the Humanities and Social Sciences, covering history, archaeology, language and literature in particular. The publication languages are English and Dutch. The RDJ contains data papers: scholarly publications of medium length (with a maximum of 2500 words) containing a non-technical description of a data set and putting the data in a research context. A data paper gets a persistent identifier and provides publication credits to the author, who is usually (but not necessarily) also the data depositor. Research Data Journal for the Humanities and Social Sciences is published in collaboration with Data Archiving and Networked Services (DANS). | humanities and social sciences