You are welcome to submit feedback to add, augment, or refine terminology for the SPARC glossary.
The addition of analysis information, knowledge, or commentary to a dataset or part of a dataset. The identification of a signal in a micrograph as a mitochondrion would be considered a type of annotation. To increase interoperability among datasets, SPARC encourages and enables annotation with the SPARC Vocabularies. These annotations on top of data then become part of the SPARC Knowledge Graph.
An automated methodology that produces simple anatomy schematics in a consistent manner, and provides for the overlay of anatomy-related information onto the same diagram. ApiNATOMY draws upon the topology of anatomy ontology graphs to automatically lay out treemaps representing body parts as well as semantic metadata that link to such ontologies.
ApiNATOMY is used in SPARC to build routing and connectivity graphs for anatomical entities. Such graphs support queries that, for instance, identify neural connections that course through a tract, nerve or ganglion. ApiNATOMY-based knowledge allows the SPARC user to determine the nuclei/grey matter regions affected by the transection of a nerve or the stimulation of a ganglion.
In addition, the same routing information leveraged by the flatmap GUI may be used to discover SPARC experimental data or simulation models located along the route of a tract, nerve or ganglion. For an example of an ApiNATOMY map, see the diagram on the SAWG page.
BIDS is a standard endorsed by the INCF that prescribes a formal way to name and organize neuroimaging data and metadata in a file system, simplifying communication and collaboration between users. BIDS also enables easier data validation and software development through consistent paths and naming for data files. The SPARC Data Set Structure is based on BIDS.
Bioelectronic medicine is the convergence of molecular medicine, neuroscience, engineering and computing to develop devices to diagnose and treat diseases (Olafsson and Tracey, 2017).
A cloud-based platform for scientific data curation and management. This platform is used to prepare SPARC datasets for publication and to share and leverage datasets privately before data is made public.
A cloud-based platform where datasets from the Blackfynn Data Management platform are published and made publicly available. Blackfynn Discover provides tools to allow anyone to interact with public datasets, as well as an API (Application Programming Interface) to programmatically navigate, browse, and discover new data.
All SPARC public data will be released under the Creative Commons Attribution license. The terms of the license require that you must give appropriate credit to the provider, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
An XML format for encoding mathematical models in a shareable, modular, and reusable manner. A core standard of the COMBINE network. Primarily created, edited, and simulated using the OpenCOR software tool.
These are the principal building blocks for computational studies on o²S²PARC. Services accept inputs and produce outputs (which can be stored or used as input for other services). The functions of computational services are manifold and depend largely on the author’s intention. These functions can span from predicting cardiac contractile force to neural spike rates to simply summing two inputs. Most computational services have parameters that are editable by the user to explore the effects of these parameters on the simulation outputs.
A computational study is essentially a simulation project on o²S²PARC. It is visualized as a nested graph that represents a workflow of modeling services, how they are pipelined, and what the service parameters are. Some of the nodes might also represent other operations, such as retrieving or storing data from/to DAT-Core. A study encapsulates a full simulation from input to final output, including how the intermediate processing or computational steps are linked, and should be reproducible. It primarily captures the setup, but can also include links to results.
The organization, annotation, publication and presentation of data according to a set of standards enforced by the SPARC data repository. The goal of curation is to ensure that data are organized in a consistent and machine-readable format and that the necessary metadata are available to interpret and reuse the data. Data curation includes both manual and automated processes.
One of the 3 cores comprising the SPARC Data & Resource Center, responsible for storing, organizing, managing, and tracking access to data and resources generated by SPARC. See also SIM-Core and MAP-Core.
A dataset is a repository of data and metadata that can be selectively shared with users of the Blackfynn platform. Datasets can be private to their creator, shared selectively with users or teams in an organization, or accessible to all users of an organization. Datasets can be published to Blackfynn Discover, and ultimately the SPARC Portal, where data is publicly available.
Digital Object Identifier (DOI)
A DOI is a globally unique, Persistent Identifier that uniquely identifies a digital object such as an article, data set or protocol. SPARC uses DOIs to identify data sets and protocols. SPARC DOIs for data sets are managed by datacite.org and are assigned when users publish a version of a dataset, or when they reserve a DOI prior to publishing on the Blackfynn platform. DOIs for protocols are issued by Protocols.io when a protocol is made public. The DOI of a dataset is the standard way to reference it in publications.
SPARC data sets are subject to a 1-year embargo, during which time the data sets are visible only to members of the SPARC consortium. During embargo, the public can view basic metadata about these data sets as well as their release date.
FAIR Data Principles
High level principles designed to make data Findable, Accessible, Interoperable and Reusable for both humans and machines (Wilkinson et al., 2016). The principles encompass 15 guidelines designed to improve the usability of digital data. More details can be found at the GO-FAIR initiative. SPARC is adopting these principles, e.g., the use of persistent identifiers, FAIR vocabularies and community standards to ensure that SPARC data is FAIR.
A self-contained package of information used by computer systems and applications, containing both input and output data.
Two dimensional representations of anatomical structures and connectivity that serve as the query interfaces and visual representations of the SPARC knowledge graph.
Platform developed and maintained by the FAIR Data Informatics Lab at UCSD to make it easier for researchers to use and build FAIR vocabularies for data annotation and search. Interlex allows researchers to add their own terms and link them to existing ontologies. SPARC is using Interlex to enhance existing ontologies for use in SPARC.
Knowledge graphs allow users to search for a particular entity, e.g., a gene, as well as for related entities, all represented in machine-processable form.
A web page that provides basic metadata about a data set or other digital object that is the first place a user “lands” when clicking on the DOI for that data set. The landing page provides basic information such as a title, description, author and license, but also provides information about what files are included in the data and how they are organized.
The means by which the copyright holder grants specific rights to the general public for reuse of the digital object. SPARC data is released under a public license, the CC-BY license: provided that licensees obey the terms and conditions of the license and attribute the creator of the data, the copyright holders give permission for others to reuse or adapt their work.
One of the 3 cores comprising the SPARC Data & Resource Center, responsible for building interactive, modular, continually updated visualizations of nerve-organ anatomy and function. See also: SIM-Core and DAT-Core.
“Data about data.” Metadata provide additional information about a dataset. Metadata range from descriptive metadata, i.e., information about the source or context of the data, e.g., title, description, techniques; to structural information about the data set, e.g., how many files, what formats, data sizes; to administrative information, e.g., who owns the data and the license under which they are released.
The elements of metadata collected about a digital object and their organization. For example, the SPARC metadata model for a data set includes a title, description, author, etc. The SPARC Minimal Information Standard includes information such as the organism used and attributes of the organism such as age, sex, etc.
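As an illustration of a minimal metadata model, a descriptive record like the one above might be represented as a simple dictionary. This is a sketch only; the field names are examples and not the normative SPARC MIS schema:

```python
# Illustrative sketch of a minimal dataset metadata record.
# Field names are examples only, not the normative SPARC MIS schema.

def validate_minimal_metadata(record, required=("title", "description", "author")):
    """Return the list of required fields missing from a metadata record."""
    return [field for field in required if not record.get(field)]

dataset_metadata = {
    "title": "Vagal afferent recordings in rat",    # hypothetical example dataset
    "description": "Electrophysiology of vagal afferents.",
    "author": "Jane Doe",
    "organism": "Rattus norvegicus",                # MIS-style organism attribute
    "age": "12 weeks",
    "sex": "female",
}

missing = validate_minimal_metadata(dataset_metadata)
print(missing)  # an empty list means all required fields are present
```

A curation pipeline would perform checks of this kind automatically, flagging datasets whose required descriptive fields are absent before publication.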
A metadata standard developed by the European Human Brain Project through the INCF that provides a set of standardized fields for describing neuroscience data sets. MINDS has not yet been formally released to the public, but as it represents a reasonable set of metadata fields, SPARC has adopted it to organize some of the descriptive metadata that accompanies SPARC data sets.
A set of concepts and categories in a subject area or domain that shows their properties and the relations between them. When these are encoded using formal logic-based computer languages, e.g., OWL (Web Ontology Language), a computer can perform some of the same types of reasoning as a human. For example, a computer would be able to reason that a dorsal root ganglion was part of the peripheral nervous system. Examples of ontologies in use in SPARC include UBERON and the Gene Ontology. SPARC uses ontologies for annotation of data, to enhance search and to encode knowledge arising from the SPARC project.
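The kind of reasoning described above, inferring that a dorsal root ganglion is part of the peripheral nervous system, can be sketched as following part-of links transitively. The relation table below is a toy illustration, not actual UBERON content:

```python
# Toy part-of hierarchy; real SPARC annotations use ontologies such as UBERON,
# and a full OWL reasoner handles far richer logic than this sketch.
PART_OF = {
    "dorsal root ganglion": "spinal nerve",   # illustrative edges only
    "spinal nerve": "peripheral nervous system",
    "peripheral nervous system": "nervous system",
}

def is_part_of(entity, ancestor):
    """Follow part-of links transitively, as an ontology reasoner would."""
    while entity in PART_OF:
        entity = PART_OF[entity]
        if entity == ancestor:
            return True
    return False

print(is_part_of("dorsal root ganglion", "peripheral nervous system"))  # True
```

Encoding such relations formally is what lets a search for "peripheral nervous system" also retrieve data annotated with more specific structures.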
ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between you and your professional activities ensuring that your work is recognized. SPARC encourages users to link their ORCID account to their Blackfynn profile as this information is associated with the dataset contributors when datasets are published. SPARC requires the corresponding contributor (the person who publishes the dataset) to associate their ORCID account.
An account on the Blackfynn Data Management platform (e.g., the SPARC Consortium).
The online platform that hosts simulations of physiological processes contributed by SPARC groups and maintained by SIM-Core. Through the SPARC portal, any user can directly investigate a particular contributed simulation through a “Run Simulation” link. They will then be able to change simulation parameters and run the simulation pipeline. Registered users who have an o²S²PARC account have access to greater functionality such as editing existing simulation workflows and creating their own simulations.
One of the core principles of FAIR is to use a persistent identifier (PID) to identify digital objects such as articles, data sets and protocols. Globally unique, persistent identifiers ensure that digital objects can be reliably found, that is, no broken links! PIDs are identifiers that are unique and never change: they point to one and only one digital object, e.g. a particular scientific article, and the persistent identifier may never be reused for another object. PIDs are issued by registries to ensure that the identifier is unique and that metadata describing the identified object are available. Persistence of the ID is essentially a social contract: if you request a PID for an object, then you agree to keep the registry updated should the object move locations. For example, if a data set identified by a DOI moved to a new URL, the registry would have to be informed of the move. The FAIR principles require that the metadata describing the object persist even if the underlying object has been removed, i.e., if the data are no longer available for some reason, the identifier still resolves to a page, usually called a tombstone page, that describes the object and its disposition. A digital object identifier (DOI) is a well-known example of a PID and is used in SPARC.
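In practice, resolving a DOI means prepending the global doi.org resolver, which redirects to the object's current landing page. The DOI below is a placeholder for illustration, not a real SPARC identifier:

```python
def doi_to_url(doi):
    """Build the resolver URL for a DOI; doi.org redirects to the landing page."""
    return f"https://doi.org/{doi}"

# Placeholder DOI for illustration only.
print(doi_to_url("10.12345/example-dataset"))
```

Because the resolver, not the URL in a citation, tracks the object's location, links built this way survive moves of the underlying data.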
A model repository that includes some of the physics-based models, as well as the scaffolds, generated for the SPARC project. Each model has a unique ID, including a DOI, and the repository includes version control. The repository is hosted on AWS servers located in the USA.
A data property refers to a property of a model. For example, a model "Person" may have the properties “First name”, and “Last name”.
Online platform for sharing, creating and managing experimental protocols. Data sets in SPARC are accompanied by detailed experimental protocols to provide details about how the data were collected. SPARC investigators are required to make these experimental protocols available through the SPARC group, and the protocols are then linked to the data set in the Blackfynn Discover portal. Protocols are private to the consortium until the accompanying data are made public. Protocols released to the public are assigned a DOI so they can be appropriately referenced. Many journals allow links to protocols in Protocols.io to be included in research articles.
Information on the origins and history of a data set or other digital object, e.g., a description of the workflow that led to the data: who generated or collected it, how it was processed, whether it contains data from someone else that was transformed or completed, and who to cite and/or how the creators wish to be acknowledged. Provenance is one of the FAIR principles, as understanding this type of information about a data set enhances its reusability and also allows credit and attribution for those that contributed to the data. As per FAIR principle R1.2, this information should ideally be described in a machine-readable format.
The act of making a data set or other digital object available to those outside of the SPARC community. A data set is considered published in SPARC when it is released to the Blackfynn Discover portal with a DOI and CC-BY license for reuse. For data sets still under embargo, the descriptive metadata about the data set are published with a DOI but no license, as access to the data requires permission from the author.
A data record is an instance of a data model. If there is a model “Person” with properties “First name”, and “Last name”, a sample data record could be: [Person 1: “First name = John”, “Last name = Smith”]
A data relationship refers to a connection between records in the knowledge graph. For example, one can define a relationship between a “Study” and an “Experiment” by stating that a particular experiment “belongs-to” a particular study. These relationships are also called “edges” between the record “nodes” in the graph.
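A minimal sketch of these three ideas together, a model with properties, records instantiating it, and a named relationship between records, might look like the following. The "Person" example comes from the glossary itself; the "Study"/"Experiment" records and the data structures are illustrative, not the Blackfynn API:

```python
# Models define properties; records are instances; relationships are edges.
models = {
    "Person": ["First name", "Last name"],
    "Study": ["Title"],              # hypothetical model for illustration
}

records = {
    "Person 1": {"model": "Person", "First name": "John", "Last name": "Smith"},
    "Study 1": {"model": "Study", "Title": "Example study"},   # hypothetical
    "Experiment 1": {"model": "Experiment"},                   # hypothetical
}

# Edges of the knowledge graph: (source record, relationship, target record).
relationships = [("Experiment 1", "belongs-to", "Study 1")]

def related(record_id, relation):
    """Return the records that record_id points to via the named relationship."""
    return [t for s, r, t in relationships if s == record_id and r == relation]

print(related("Experiment 1", "belongs-to"))  # ['Study 1']
```

Queries over the knowledge graph amount to traversing such edges, e.g., finding every experiment that belongs to a given study.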
A dataset revision refers to an update made to dataset metadata (i.e. title, subtitle, description, etc.) that does not require an updated DOI.
A mathematical model providing a 3D coordinate framework for defining anatomical shape and other embedded anatomical data. The model uses high-order (cubic Hermite) finite element basis functions to capture complex geometry with a small number of parameters (defined at the ‘nodes’ of the finite element mesh) that can be optimised to fit the scaffold to anatomical measurements. A wide range of anatomical data (including models derived from that data) can be embedded in the scaffold as fields that are defined by additional nodal parameters – for example, muscle tissue structure, collagen density and orientation, vascular structure, neural connectivity, etc. The 3D shape of a scaffold can change with time (beating heart, breathing lung, filling and emptying of the bladder, peristalsis in the colon, etc.), but because the anatomical structures embedded within the scaffold are defined in terms of material coordinates, they move with the deforming scaffold. A 3D finite element mesh, of any type and at any specified spatial resolution, can be generated automatically from the scaffold for use with physics-based modelling of physiological function.
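The cubic Hermite basis functions mentioned above interpolate both nodal values and nodal derivatives along each element, which is why so few parameters can capture smooth anatomical geometry. A one-dimensional sketch (the full scaffold uses tensor products of these functions in 3D):

```python
def hermite_interpolate(x0, d0, x1, d1, t):
    """1D cubic Hermite interpolation between two nodes.

    x0, x1: field values at the element's two nodes
    d0, d1: derivatives of the field at the nodes
    t: local element coordinate in [0, 1]
    """
    # Standard cubic Hermite basis functions.
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * x0 + h10 * d0 + h01 * x1 + h11 * d1

# The interpolant passes exactly through the nodal values at t = 0 and t = 1.
print(hermite_interpolate(0.0, 1.0, 2.0, 1.0, 0.0))  # 0.0
print(hermite_interpolate(0.0, 1.0, 2.0, 1.0, 1.0))  # 2.0
```

Fitting a scaffold to anatomical measurements then amounts to optimising the nodal values and derivatives so the interpolated surface matches the data.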
Platform developed and maintained by the FAIR Data Informatics Lab at UCSD for providing unified search across independently maintained databases and other data resources. The platform includes data ingestion, curation tools and vocabulary services.
SciCrunch Knowledge Graph
Comprises the markup of SPARC data and models with the SPARC vocabularies. The SciCrunch Knowledge Graph is used to enhance search across SPARC data.
An open source Neo4J ontology store that serves the SPARC vocabularies and will house the SPARC Knowledge Graph.
An XML format for encoding descriptions of simulation experiments (basic workflows) independent of the modelling format used to encode the models used in the experiment. A core standard of the COMBINE network.
The process of partitioning a digital image into multiple segments, generally to extract specific signals or structures from a complex image for the purpose of analysis or communication.
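A minimal form of the partitioning described above is intensity thresholding, which splits pixels into foreground and background. This pure-Python sketch is for illustration; real segmentation pipelines use dedicated imaging libraries and far more sophisticated methods:

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (foreground) or 0 (background) by intensity."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]

micrograph = [        # toy 3x3 intensity values standing in for a micrograph
    [10, 200, 12],
    [11, 220, 210],
    [9, 14, 205],
]
print(threshold_segment(micrograph, 128))  # [[0, 1, 0], [0, 1, 1], [0, 0, 1]]
```

The resulting label map isolates a bright structure, which could then be annotated (e.g., as a mitochondrion) and analysed separately from the rest of the image.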
One of the 3 cores comprising the SPARC Data & Resource Center, responsible for developing an online framework capable of hosting and connecting simulations to create predictive, multiscale, multiphysics models spanning from modulation sources acting at feasible access points through to organ functional responses. See also DAT-Core and MAP-Core.
Software for Organizing Data Automatically (SODA)
SODA is software intended to simplify the organization and submission of SPARC datasets by handling complex and/or repetitive tasks through an intuitive and interactive interface. The idea for SODA arose during the December 2018 SPARC Hackathon. SODA will provide an interactive interface that, without requiring any coding knowledge, walks SPARC investigators step-by-step through the data organization and sharing workflow, all the while automating repetitive, complex and/or time-consuming tasks. SODA is distributed as a desktop application for Windows, macOS, and Linux. It is currently under development and will be released progressively as features are incorporated.
SPARC Anatomical Working Group (SAWG)
A working group comprised of anatomical experts who assist in the creation and vetting of SPARC vocabularies, flatmaps and scaffolds.
SPARC Data and Resource Center (DRC)
Supports the creation of the SPARC data portal, a multifunctional online hub facilitating coordination, synthesis, and prediction via three Core functionalities: Data Coordination, Map Synthesis, and Modeling & Simulation. Funded SPARC investigators closely coordinate with the DRC in order to achieve the following core functions:
- Data Coordination Core (DAT-Core) - Store, organize, manage, and track access to data and resources generated by SPARC;
- Map Synthesis Core (MAP-Core) - Build interactive, modular, continually updated visualizations of nerve-organ anatomy and function;
- Modeling and Simulation Core (SIM-Core) - Develop an online framework capable of hosting and connecting simulations to create predictive, multiscale, multiphysics models spanning from modulation sources acting at feasible access points through to organ functional responses.
SPARC Data Set
A collection of related data and metadata generally produced by a single laboratory supported by SPARC, uploaded to the SPARC data platform and made available through the SPARC portal.
The standard means for organizing and naming files for the diverse data being generated by the SPARC Consortium. The standard is based on the BIDS format, developed originally for neuroimaging. Files are organized into folders and accompanied by a set of descriptive files that contain information on subjects, experimental metadata, and data set descriptions. Folders and files are named according to a standard naming convention. The SPARC Data Set Structure also provides a means to extend the core structure to accommodate most data acquisitions. The use of a common standard facilitates data reuse and integration.
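As a sketch of that layout, the snippet below builds a skeleton of the kind of folder structure described above. The specific folder and file names follow the commonly circulated SPARC template, but the normative list is defined by the SPARC Data Set Structure specification itself:

```python
import pathlib
import tempfile

def make_skeleton(root):
    """Create an illustrative SPARC-style dataset skeleton under root.

    Folder/file names follow the published template (primary data folder,
    subjects/samples metadata files), but consult the current SPARC Data Set
    Structure specification for the authoritative, required set.
    """
    root = pathlib.Path(root)
    for folder in ("primary", "derivative", "docs", "code"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    for name in ("dataset_description.xlsx", "subjects.xlsx", "samples.xlsx"):
        (root / name).touch()
    # Subject folders inside primary/ use a sub-<label> naming convention.
    (root / "primary" / "sub-1").mkdir(exist_ok=True)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

with tempfile.TemporaryDirectory() as tmp:
    for entry in make_skeleton(tmp):
        print(entry)
```

Tools such as SODA automate exactly this kind of scaffolding and naming so investigators do not have to build the structure by hand.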
SPARC Knowledge Graph
Knowledge + Data + Models produced by SPARC. It comprises the following:
- Data sets annotated to the SPARC Minimal Information Standard (MIS) for data and SPARC Vocabularies;
- MIS for models and simulations;
- Reference knowledge encoded from community ontologies and extended by SPARC investigators, plus knowledge extracted from the literature.
SPARC Material Sharing Policy
SPARC Minimal Information Standard (MIS)
The minimal metadata and data model for SPARC research objects:
SPARC Dataset MIS: The minimal information model for SPARC data sets, developed by the SPARC Standards Committee. The MIS is encoded in TTL/OWL and is viewable using the Protege Ontology Editor.
The integrated online platform where users can browse datasets generated by SPARC groups, interact with and discover data with flatmaps, and run simulations of physiological processes. This is the main entry point for users to access contributions of the SPARC teams. Services within the portal are provided by MAP-Core, DAT-Core and SIM-Core.
Set of community ontologies used by SPARC to annotate data and models + custom extensions produced specifically for SPARC. Current ontologies used by SPARC include UBERON, the multi-species anatomy ontology, supplemented by terms from the Foundational Model of Anatomy, multiple brain atlases and others. These community ontologies have been imported into the NIFSTD ontology, which provides the backbone of the SPARC vocabularies.
A group of users on the Blackfynn Data Management platform organized as a single team (e.g., the SPARC Data Curation Team).
A diverse group of data analysts and simulations experts based in four different countries who provide feedback regarding the usability and functionality of the o²S²PARC platform. This feedback is incorporated into the o²S²PARC 4-week development cycles.
A group of SPARC program scientists who provide feedback on the design and functionality of the SPARC portal at 6-week development cycles.
A dataset version refers to a DOI-specific, version-controlled iteration of a dataset. A new version of a dataset must be released when there are any changes to the files or scientific metadata made within a dataset.