Research Data Management: Publishing Your Data

Guide on research data management, with resources and tools for data planning, data organization, data documentation, data sharing, data security, data analysis and visualization

Data Paper

A data paper is a scholarly publication describing a particular dataset or a group of datasets, published in the form of a peer-reviewed article in a scholarly journal.

Unlike a conventional research article, the main goal of a data paper is to describe the dataset(s), focusing on collection method, distinguishing features, access and potential reuse rather than on data processing and analysis. Data papers are usually assigned a unique identifier (e.g. a DOI), so they can be cited just like conventional journal articles.
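To illustrate how a DOI makes a data paper or dataset citable, here is a minimal Python sketch that turns a DOI into a link through the public doi.org resolver. The function name and the example DOI are hypothetical, and the pattern check is only a loose sanity test, not a full validation of DOI syntax.

```python
import re

# Loose sanity check (an assumption, not the official DOI grammar):
# "10.", a numeric registrant code, a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste along with a DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI string."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


if __name__ == "__main__":
    # Hypothetical DOI, used for illustration only.
    print(doi_to_url("doi:10.1234/example-dataset"))
```

Including the resulting link in a reference list lets readers reach the dataset's landing page even if the repository later changes its own web addresses.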

Different publishers may have different requirements, so you should read their submission guidelines carefully beforehand. As a general guideline, here is what usually happens before a data paper is published:

  1. You will be asked to deposit the complete dataset in an online repository, which may provide a DOI (or a similar persistent identifier) for your data. The publisher may provide a list of recommended repositories.
  2. Draft the paper using the template and format specified by the publisher. This is where you describe your dataset in a clear, detailed and systematic manner.
  3. Submit the data paper for peer review. 
  4. Make revisions based on feedback from the reviewer(s). If everything goes smoothly, your data paper will be published!

Refer to the section below for a list of journals where you can publish a data paper. 


Make your data accessible online

You may want to make your data freely accessible online for various reasons. It might be a publisher's requirement for publishing a data paper; you may wish to increase your research visibility and impact or to attract potential collaborators; or you may simply want to contribute to the public good.

You can deposit your dataset with InK, the institutional repository. The benefits include:

  • You will receive assistance from our friendly librarian throughout the entire process
  • The university institutional repository is non-profit, and guarantees long-term archival and accessibility of your dataset
  • You will receive a monthly usage/download report on your publications

Alternatively, if you choose to deposit your dataset with another online repository (see the list below), such as a subject repository, we can index the record in InK so that it is easier to discover. Just contact the research data librarian if you need any help!

There are numerous free or fee-based data repositories online. Some are subject-based while others are multi-disciplinary. A good place to start is a registry of online research data repositories.

Listed below are several major data repositories that you can explore:

| Repository Name | Subject | Cost/Access Control | About |
|---|---|---|---|
| DataVerse | Multi-disciplinary | Free for individual researchers | Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others' work more easily. Researchers, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. |
| FigShare | Multi-disciplinary | Unlimited storage if you make your data publicly available | figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner. |
| ICPSR | Social Sciences | Fee-based if you want to make your data open to the public; otherwise only ICPSR members can access the data | An international consortium of more than 700 academic institutions and research organizations, ICPSR maintains a data archive of more than 500,000 files of research in the social sciences. It hosts 16 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields. |
| Databrary | Psychology; Developmental Science | Free | Databrary is a video data library for developmental science. Share videos, audio files, and related metadata. Discover more, faster. |
| CodePlex | Computer Science | Free | CodePlex is Microsoft's free open source project hosting site. You can create projects to share with the world, collaborate with others on their projects, and download open source software. |
| GitHub | Computer Science | Free for public and open source projects; fee-based for unlimited private repositories | GitHub is a web-based repository hosting service. It offers all of the distributed revision control and source code management (SCM) functionality of Git as well as adding its own features. GitHub provides access control and several collaboration features such as bug tracking, feature requests, task management, and wikis for every project. |
| Launchpad | Computer Science | Free | Launchpad is a software collaboration platform that provides functionality such as code hosting, bug tracking, code reviews, etc. |
| SourceForge | Computer Science | Free | SourceForge allows the user to find, create and publish open source software for free. |



List of data journals

| Journal | Publisher | Overview | Subject Areas |
|---|---|---|---|
| Scientific Data | Nature Publishing Group | Scientific Data is a new open-access, online-only publication for descriptions of scientifically valuable datasets. Scientific Data exists to help you publish, discover and reuse research data. | multidisciplinary; natural sciences; social sciences; business and industry |
| International Journal of Robotics Research | SAGE Publications | International Journal of Robotics Research (IJRR) was the first scholarly publication on robotics research; it continues to supply scientists and students in robot and related fields - artificial intelligence, applied mathematics, computer science, electrical and mechanical engineering - with timely, multidisciplinary material on topics from sensors and sensory interpretations to kinematics in motion planning. IJRR also publishes peer reviewed data papers and multimedia extensions alongside articles. | artificial intelligence, applied mathematics, computer science, electrical and mechanical engineering |
| Applied Informatics | SpringerOpen | Applied Informatics covers the theory and application of informatics in various scientific, technological, engineering and social fields. Aiming to inspire new multidisciplinary research, the journal acts as an integrative venue that collects high-quality original research papers and reviews on various aspects of applied informatics, with the foundations of informatics (information theory, statistical modeling, machine learning, etc) as the driving core and the interactions between essential realms as the promoting focuses; particularly important are the interactions between (a) life sciences (bioinformatics, medical informatics, bioengineering, etc); and (b) intelligence sciences (neural and cognitive informatics, multimedia, pattern recognition, etc), and (c) community sciences (social networks, affective computing, big data analytics, etc). | applied informatics |
| SpringerPlus | SpringerOpen | SpringerPlus accepts manuscripts from all disciplines of Science. We accept manuscripts describing original research as well as case descriptions and methods, and we expressly encourage submission of data reports and large datasets. | all disciplines of science, technology, engineering, medicine and humanities & social sciences |
| Journal of Open Psychology Data | Ubiquity Press | The Journal of Open Psychology Data (JOPD) features peer reviewed data papers describing psychology datasets with high reuse potential. Data papers may describe data from unpublished work, including replication research, or from papers published previously in a traditional journal. We are working with a number of specialist and institutional data repositories to ensure that the associated data are professionally archived, preserved, and openly available. Equally importantly, the data and the papers are citable, and reuse is tracked. | psychology |
| Research Data Journal for the Humanities and Social Sciences | Brill | Research Data Journal for the Humanities and Social Sciences (RDJ) is a peer reviewed e-only open access journal, which is designed to comprehensively document and publish deposited data sets and to facilitate their online exploration. In this way it wants to contribute to transparency of research, accelerate dissemination and foster reuse. The journal concentrates on the Humanities and Social Sciences, covering history, archaeology, language and literature in particular. The publication languages are English and Dutch. The RDJ contains data papers: scholarly publications of medium length (with a maximum of 2500 words) containing a non-technical description of a data set and putting the data in a research context. A data paper gets a persistent identifier and provides publication credits to the author, who is usually (but not necessarily) also the data depositor. Research Data Journal for the Humanities and Social Sciences is published in collaboration with Data Archiving and Networked Services (DANS). | humanities and social sciences |
