SPARC Data Submission Overview

Steps for submitting a dataset, getting the data curated, releasing the data under embargo, and eventually making the data publicly available.
Updated at: 01/07/2022

This documentation applies only to investigators who are funded through the NIH SPARC effort. Each investigator has user credentials for the SPARC Consortium group on the DAT-Core website. All SPARC-related datasets should be submitted to this account, even if the investigator has a separate, unaffiliated, private Pennsieve account. Dataset submission is required within 30 days of completing a project milestone (according to the SPARC Material Sharing Policy).
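As a planning aid, the 30-day submission window can be worked out with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes the window is counted in calendar days starting from the milestone completion date, which the policy text quoted here does not spell out.

```python
from datetime import date, timedelta

# 30 days is the window stated above; counting in calendar days is an assumption.
SUBMISSION_WINDOW_DAYS = 30


def submission_deadline(milestone_completed: date) -> date:
    """Return the last day of the submission window for a completed milestone."""
    return milestone_completed + timedelta(days=SUBMISSION_WINDOW_DAYS)


def days_remaining(milestone_completed: date, today: date) -> int:
    """Return the days left until the deadline (negative once it has passed)."""
    return (submission_deadline(milestone_completed) - today).days
```

For example, `submission_deadline(date(2022, 1, 7))` gives 6 February 2022 under these assumptions.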

Stages of Data Submission

Data Submission Diagram

1. Submit

2. Curate

3. Share

  • PI shares with SPARC Consortium by releasing to embargo

4. Publish

Using Dataset Status flags in Pennsieve

The progress of a dataset through this process is communicated in Pennsieve using status flags. When a step is completed, the investigator (blue) or curator (tan) changes the status flag, and the dataset passes to the next step. Depending on the data type, curation is performed by one of three teams: UCSD, MBF, or ABI.

Curation Overview

Steps for Submitting a Dataset

Detailed instructions for how to submit a dataset are provided in the following sections.


  1. Creating a dataset
  2. Work with Data Curation Team to map metadata to standards
  3. Sharing datasets with the SPARC Consortium as Embargoed dataset
  4. Publishing datasets

sparc workflow

1. Creating a dataset

Submitting a dataset involves the following steps:

  1. Create a dataset on the DAT-Core (SPARC Consortium account on Pennsieve) by clicking the New Dataset button in the top-right corner of the web application and providing a Dataset Name and a short Dataset Subtitle describing the dataset. Then click Create Dataset.

SPARC New Dataset

Naming guidelines: The “dataset name” is equivalent to the title of your dataset publication. It is the public field that promotes your team's dataset on the SPARC Portal and helps other researchers in the SPARC community decide whether they want to learn more about your research. Please make sure that your dataset title is informative and different from your other dataset titles. Keep either the URL or the complete title in your records, because the title is the only field that is searchable in Pennsieve.

The dataset subtitle is visible, but not searchable, on the Pennsieve platform, so use two or three sentences to further define your dataset and differentiate it from other datasets. This field will become the short description immediately under the title of your dataset once it is published.

You have now created a private dataset that only you, as the dataset owner, can see.

SPARC Dataset Status Location

In the top left corner of the dataset page there will be a status list with the 12 status options that each SPARC Dataset will go through during the submission and curation process. The status of a dataset can be changed by anyone with edit permissions, and will be used by both teams to communicate the dataset’s progress through the data submission and curation process. Each status indicates which team is responsible for the step, until the dataset is published at step 12.

SPARC Dataset Status Complete List

The dataset status is automatically set to Curation Status 1 for each new dataset.

  2. Change the ownership of the dataset to the PI of the lab. This is a SPARC requirement and ensures that the PI of the lab is the only person who can publish the dataset. To do this, click the Permissions tab in the left sidebar and add the PI to your dataset as a manager. Then click the Manager label next to their name and select Make Owner. You will no longer be the owner of the dataset, but you will still have Manager permissions.

Dataset Permissions

  3. Grant permissions to your award team (contact the DAT-Core if you need help adding people to your award team) and to the SPARC Data Curation Team. Select your award team from the dropdown menu and add it with the appropriate permissions, then add the SPARC Data Curation Team with Manager permissions.

Dat-Core Permissions 2

You have now allowed your award team and the curation team to see and edit the dataset.

  4. Upload files to the dataset according to the SPARC guidelines. The SPARC Dataset Structure (version 2.0.0) may be downloaded as a zip file or you may create it on your own. For help with working with the SPARC Dataset Structure, which is based on the BIDS specification, contact More information on how to upload files can be found here.

    Set dataset status to Curation Status 2

    Dat-Core Files

  5. Complete the metadata templates that are included in the downloaded SPARC Dataset Structure zip file. Make sure you are always using the most recent template version. Experimental metadata is specified by the SPARC Data Standards Committee based on the Minimal Information for a Neuroscience Dataset (MINDS) specification and is captured in the following files: 1) submission.xlsx, 2) dataset_description.xlsx, 3) subjects.xlsx, and 4) samples.xlsx. An annotated list of these fields can be found here.

  6. Upload the protocol(s) used to generate the SPARC dataset to After making your account, make sure to join the SPARC group. The group can be found through the search bar at the top of the webpage (also here). Upload the protocol within the SPARC group (this option is free to investigators). More specific instructions can be found here. Make sure to include a link to your protocol(s) within the dataset_description file. So that the curation team can access the protocol for annotation, the submitter needs to ensure that 1) the protocol is added to the SPARC group and 2) the URL to the protocol is included in the dataset_description.xlsx file.

  7. Once you have completed your data uploads, please select step 3. Ready for Curation (Investigator) to submit your dataset to the curation queue. Please note that the curation team will not look at your dataset until you change the status to Ready for Curation.

    Set Dataset status to Curation Status 3

  8. Wait for the Curation Team to process your dataset. You can monitor the different stages of the process on the Pennsieve platform, as seen in the box below.
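Before changing the status to Ready for Curation, it can help to confirm locally that the four metadata files named above are present. The following is a minimal sketch, not an official validator: the function name is hypothetical, and it assumes the metadata files sit at the top level of your local copy of the dataset folder.

```python
from pathlib import Path

# File names as listed in the metadata step above.
REQUIRED_METADATA_FILES = (
    "submission.xlsx",
    "dataset_description.xlsx",
    "subjects.xlsx",
    "samples.xlsx",
)


def missing_metadata_files(dataset_root):
    """Return the required metadata file names not found directly under dataset_root.

    Assumption: the files live at the top level of the dataset folder.
    """
    root = Path(dataset_root)
    return [name for name in REQUIRED_METADATA_FILES if not (root / name).is_file()]
```

An empty list means all four files were found. The check looks only at file names and does not inspect whether the templates are filled in or up to date.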

2. Work with Data Curation Team to map metadata to standards

Below are the steps and statuses listed for the curation cycle:

Curation Status 4

Once you mark the dataset as ready for curation, our team will switch the status to curation in progress and start curating your dataset: checking the integrity of the data, validating values, working with image segmentation, and creating maps. During this phase, you can monitor where your dataset is in the curation queue by looking at the status bar. The Curation Team will create a tracked ticket and will reach out to SPARC investigators to provide curation review results and to help address any errors.

Datasets that include microscopy image data are encouraged to pass through the image segmentation portion of the protocol, where SPARC investigators use MBF Bioscience software (MBF, MAP-Core) to create FAIR segmentations that can be retrieved by ABI for organ scaffold representations. For a detailed look at the MAP-Core SPARC Image Segmentation Workflow please refer to the following Google document.

Curation Status 5

For datasets that include image segmentation, the MBF Curation Team will reach out to SPARC investigators to provide curation review results and to help address any errors in the segmentation. To initiate the image segmentation workflow, the MBF Curation Team will provide investigators with access to MBF Bioscience segmentation software for FAIR neural, vascular, and anatomical reconstruction. Investigators can request a license of MBF Bioscience software by emailing

As SPARC investigators use MBF Bioscience software to segment images within their dataset(s), they will send completed segmentation files directly to an assigned MBF SPARC segmentation assistant for curation. Files can be shared with MBF via MBF Bioscience’s file sharing mechanism or Pennsieve. The MBF SPARC segmentation assistant will review each file and communicate with the investigator directly via email or Slack if files need revision (e.g., the investigator needs to include subject metadata, annotate additional fiducials, and/or address inaccuracies or incompleteness).

Curation Status 6

In this step, the MBF Curation Team finalizes with the researcher all necessary edits to the segmentation file(s) and images, so that the researcher can upload the files into the “derivative” folder of the BIDS-based structure for the respective dataset on the Pennsieve platform. Once all files within the dataset are curated, segmentation files and image files can be used as staging for scaffold building and other portal representations/simulations. Image files are converted to include minimum metadata and written in the standard JPEG 2000 file format, which permits efficient viewing on the Portal.

Curation Status 7

The Auckland Bioengineering Institute (ABI) downloads the segmentation files (in MBF format) from Pennsieve and configures 3D organ scaffolds for the species and organ in the Physiome Model Repository (PMR). ABI uses non-image data from Pennsieve, together with its annotations, and registers the embedded data to geometric scaffolds and flatmaps.

Curation Status 8

ABI uploads the transformation matrix for each set of registered data to Pennsieve as an annotation. ABI also uploads the Uniform Resource Identifier (URI) for the average scaffold in PMR and for the specific scaffold with registered data to Pennsieve as derived data with one or more parent Pennsieve IDs.

Curation Status 9

Anytime your input is needed during the curation process, you will receive email communication from the respective curation group. Please respond to the Curation Team’s inquiries in a timely manner. Your dataset may go through multiple iterations before it is ready for publishing.

  1. Review the curation feedback letter that you received from the Curation Team. Work with the Curation Team to implement all necessary changes to the dataset that were listed in the feedback letter, and provide any missing information and/or files. Once you have uploaded all changes to the Pennsieve platform, please change the dataset status back to Curation Status 3 so the Curation Team can pick up the dataset for curation again.

    Note: Your dataset can iterate between the status “Ready for Curation” and “Needs Attention” multiple times, until all SPARC mandated requirements are met.

    When the curation process is finished, you will receive an email from the Curation Team with their final signoff. The dataset status will then be switched to Curation Status 10.

  2. Work with the Curation Team on reviewing and approving all edits and changes that were implemented during data alignment, annotation, and visualization. At this time the SPARC Data Curation team will work with you to finalize the dataset within Pennsieve, adding the finalized description and authors, selecting the license and provisioning a DOI.

    Please verify the detailed description of your dataset that the Curation Team entered on your behalf using the description editor. This description will be highly visible once your dataset is published.

    Description Dataset Datcore

Upload a banner image for your dataset on Pennsieve. This can be done by clicking Upload Banner Image in the Settings or Overview page. This image should have a minimum resolution of 512px and will be associated with the dataset and used as a thumbnail once the dataset is published.
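If you want to check an image before uploading it, the dimensions of a PNG file can be read from its header with the Python standard library. This is a rough sketch under two assumptions that this guide does not confirm: that your banner is saved as a PNG, and that the 512px minimum applies to both width and height. The function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE_PX = 512  # from the guideline above; applying it to both sides is an assumption


def png_dimensions(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk at the start of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def banner_is_large_enough(data: bytes, min_side: int = MIN_SIDE_PX) -> bool:
    """Return True if both sides of the PNG are at least min_side pixels."""
    width, height = png_dimensions(data)
    return min(width, height) >= min_side
```

To use it, read the file in binary mode, for example `banner_is_large_enough(open("banner.png", "rb").read())`, where `banner.png` is a placeholder file name.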

Banner Dataset

The Curation Team will assign your dataset the Creative Commons Attribution (CC-BY) license. You may also select this option yourself using the dropdown menu on the Dataset Settings page.

License Datacore

The Curation Team will add dataset contributors in the order in which they appear in the dataset_description file you have uploaded. Contributors will be listed in this same order on the public dataset landing page (SPARC Portal and Discover). If you need to make changes, you can add contributors by selecting names from a drop-down menu. More information on how to add contributors can be found here.

SPARC Dataset Settings

3. Sharing datasets with the SPARC Consortium as Embargoed dataset

  1. CONGRATULATIONS! Now your dataset is ready to be shared with the SPARC Consortium. You can share your dataset with the SPARC Embargoed Data Sharing Group with Viewer permissions. This allows any SPARC investigator who has signed the SPARC non-disclosure form to see your data.

    Change the dataset status to: Curation Status 11

4. Publishing datasets

  1. One year after the initial upload of your dataset, you must publish your dataset to Pennsieve Discover, which populates the SPARC Portal. To do this: 1) navigate to the Publishing section in the left-hand menu in Pennsieve, 2) click Submit Dataset for Review, 3) select the dataset to be submitted for review, 4) click the appropriate checkbox (the second checkbox is only available when releasing a revision or new version of a dataset), and 5) click Submit. Once submitted, your dataset will be locked, moved to the Pending Review section, and sent to the Publisher Team for review before it is ultimately accepted or rejected.

    Change the dataset status to: Curation Status 12
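The status changes that this guide asks the investigator to make can be collected into a small lookup for a lab checklist or script. This is an illustrative sketch only: the event names are hypothetical labels, the status numbers are the ones given in the steps above, and statuses set by the curation teams are deliberately left out.

```python
# Status number to set after each investigator action, as stated in this guide.
# The event names are hypothetical labels chosen for this sketch.
INVESTIGATOR_STATUS_AFTER = {
    "files_uploaded": 2,
    "ready_for_curation": 3,
    "feedback_addressed": 3,  # hands the dataset back to the curation queue
    "shared_under_embargo": 11,
    "submitted_for_publication": 12,
}


def status_after(event: str) -> int:
    """Return the Curation Status number the investigator sets after an action."""
    try:
        return INVESTIGATOR_STATUS_AFTER[event]
    except KeyError:
        raise ValueError(f"unknown investigator action: {event!r}") from None
```

The lookup does not talk to Pennsieve; the status itself is still changed by hand in the web application.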


This document outlined the steps required to submit and publish a SPARC dataset. Please feel free to reach out to the DAT-Core or Curation Team with specific questions about the workflow.