Many journals now require or encourage authors to make data and other materials publicly available.
Why should you do this?
To preserve your scientific contributions
To allow others to build on your work, find new uses for your data and include it in meta-analyses
To allow interested readers to replicate the results and findings of a study
To enable verification of results: readers can identify statistical or methodological errors
To potentially increase citations, and to help ensure robust dissemination and appropriate credit to authors
What types of data should be available?
Primary data
Datasets
Code
Details of software required
Analysis code, such as R scripts
Other digital material
If the standard in the field is to share data that has been processed, this can be submitted instead of raw data. Otherwise, raw data should be made available. Exceptions should be discussed with the editorial office before submission.
When preparing your data, you should ask the following questions:
Have you included everything that might be helpful to researchers (including materials, code used to run your analysis, accession IDs for sequence data)?
Is it prepared in a useful, easy-to-understand format?
Have you taken into account any sensitive data (e.g. human subject data)?
Is it clear how readers of your paper can access your dataset?
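One way to check the first two questions before depositing is to generate a simple inventory of the files you plan to share. The following is a minimal sketch in Python, not a journal or repository requirement: the function names and the MANIFEST.json file name are hypothetical choices made for illustration. It lists every file in a data folder with its size and a SHA-256 checksum, so that you and your readers can confirm that nothing is missing or altered.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(data_dir):
    """Return one record (relative path, size in bytes, SHA-256) per file under data_dir."""
    root = Path(data_dir)
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            records.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            })
    return records


def write_manifest(data_dir, manifest_name="MANIFEST.json"):
    """Write the inventory into data_dir as JSON; the file name is an illustrative assumption."""
    root = Path(data_dir)
    # Skip any manifest left over from an earlier run so it does not list itself.
    records = [r for r in build_manifest(root) if r["path"] != manifest_name]
    (root / manifest_name).write_text(json.dumps(records, indent=2))
    return records
```

Calling write_manifest("my_dataset") on your own folder (the folder name here is a placeholder) produces a file you can review against the checklist above and deposit alongside the data. Files are read whole into memory, so very large files may call for chunked hashing instead.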
Many repositories are available for authors to deposit data, depending on the type of data and the subject area. For example:
Sequence data should be uploaded to the NCBI Sequence Read Archive or GenBank
Source code and R scripts should be made available in a software-focused repository such as GitHub, and then archived in Zenodo
Earth and environmental science data can be deposited, with georeferencing, in subject-specific repositories such as PANGAEA
Where a data-specific repository is not available, we recommend generalist repositories such as Dryad or Figshare, which can handle a wide variety of data.