2021 SPARC FAIR Codeathon

The SPARC Data and Resource Center, with funding from the National Institutes of Health (NIH), hosted an in-the-cloud codeathon from July 12-26, 2021.
Updated at: 07/30/2021

Eligibility: Competitors from anywhere in the world were eligible.

Prizes: Several prizes were awarded to teams at the conclusion of the codeathon:

  • Grand Prize: US$20,000
  • First Prize: US$10,000
  • Second Prize: US$7,000
  • Third Prize: US$3,000

In addition to the cash prizes, SPARC is supporting publication costs for manuscripts describing the codeathon projects of all participating teams.

Optional Workshop: Thanks to the AWS team, a free AWS Technical Essentials workshop was available to codeathon participants prior to the codeathon.

Completed July 26, 2021 - Winners announced

A team of judges from the NIH was given the difficult task of evaluating the codeathon projects against the following criteria:

  1. Creativity of team solution;
  2. Impact on SPARC research community;
  3. Feasibility based on prototype demonstration; and
  4. Completeness of documentation.

Congratulations to all the participants, who delivered incredible projects that made the judges' decision very difficult. Following the final presentations and demonstrations on July 26, 2021, and a review of the GitHub repositories and associated documentation, the results of the 2021 SPARC FAIR Codeathon are as follows.

Grand Prize: KnowMore

KnowMore is an automated knowledge discovery tool envisioned to be integrated within the SPARC Portal that allows users of the portal to visualize, in just a few clicks, potential similarities, differences, and connections between multiple SPARC datasets of their choice.

Team: Bhavesh Patel (California, USA), Ryan Quey (Cambodia), Anmol Kiran (Malawi), Matthew Schiefer (Florida, USA).

First Prize: HuBMAP & SPARC Linkages

The HuBMAP & SPARC linkages project enables visualization and reporting on the interlinkages between HuBMAP and SPARC ontologies and data, which could be very valuable in longitudinal studies and could further the goals of both HuBMAP and SPARC in cross-consortium interoperability.

Team: Bruce Herr (Indiana, USA), Samuel O'Blenes (USA)

Second Prize: SPARClink

The goal of SPARClink is to provide a system that queries all external publications using open-source tools and platforms and creates an interactive visualization that helps anyone (researcher or otherwise) see the impact that SPARC has on the overall scientific research community.

Team: Sanjay Soundarajan (California, USA), Monalisa Achalla (USA), Jongchan Kim (Virginia, USA), Ashutosh Singh (Massachusetts, USA), Sachira Kuruppu (Auckland, New Zealand)

Third Prize: AQUA

AQUA (Advanced QUery Architecture for the SPARC Portal) is an application that aims to improve the search capabilities of the SPARC Portal, using natural language processing to make the search engine better at reading and understanding the keywords users enter.

Team: Tram Ngo (California, USA), Laila Bekhet (Texas, USA), Yuda Munarko (Auckland, New Zealand), Niloofar Shahidi (Auckland, New Zealand), Xuanzhi (China)

Closed July 6, 2021 - Participant Applications

Please be aware: 1) there is no registration fee associated with attending either of these events, 2) you must have your own laptop/computer and access to the internet, and 3) we do not offer financial support for participating in this event.

How to apply? To apply, please complete this application form. Applications are due July 6, 2021 by 10 p.m. EDT. We will select participants based on their experience and their motivation to attend.

When and where are the codeathon and workshop? The in-the-cloud codeathon runs from July 12-26, 2021. The optional workshop will be before the codeathon on July 9, 2021.

What will the pre-codeathon workshop cover? AWS Technical Essentials introduces you to AWS products, services, and common solutions. It provides you with fundamentals to become more proficient in identifying AWS services so that you can make informed decisions and get started working on AWS.

Do I need to participate in the workshop to participate in the codeathon? No. If you would like to participate in the codeathon without attending the workshop you can indicate this in the application process.

Who can participate? We encourage researchers and data scientists at any stage of their data science journey to apply. Teams will greatly benefit from people who possess any of the following skills:

  • Data mining, image and text analysis
  • Working knowledge of scripting languages (e.g., Shell, Python, R)
  • Familiarity with methods for manipulating and/or analyzing large datasets (AI/ML, computational modeling, etc.)
  • Developing bioinformatics code, pipelines or tools
  • Data visualization
  • Knowledge graphs
  • Web development
  • Understanding of neuroscience and/or neurostimulation

What are some of the potential team projects?

  • SPARC Data Conversion to Neurodata Without Borders (NWB) Format
  • Visualizing Sample and Ontology linkages between HuBMAP and SPARC
  • AQUA - an Advanced QUery Architecture for the SPARC Portal
  • o²S²PARC: Reusable Models of Visceral Nerve Stimulation
  • SPARClink: Visualizing the Impact of SPARC
  • Development of an Automated Knowledge Discovery Tool for SPARC Datasets
  • A software tool to automate the construction of energy-based models for Physiome modelling
  • Visualising experimental protocols
  • Quantitative Analysis of Image Data
  • SPARC Portal Slackbot
  • Integration of the SPARC Portal in Jupyter notebooks
  • Other - See project examples here

How are teams formed? Before the event, we will create up to thirty teams of five to six individuals each, with various backgrounds and expertise. Each team will be led by an experienced leader.

What will a typical day be like? We will meet regularly as a group throughout the codeathon, with exact timing to be determined as teams are formed. SPARC Data & Resource Center teams will host regular office hours so that teams can seek guidance.

Teams will present short talks to introduce their project (July 12), discuss project progress (July 13-25), and present the results (July 26). We will also gather as a group for a few short presentations on relevant topics of interest to the bioelectronic medicine community (such as data curation, mapping and simulations as well as best practices, coding styles, related neuromodulation topics, etc.) and then break out to work on team project pipelines and tools within a cloud infrastructure.

What will we build? We will make all pipelines, other scripts, software, and programs generated in this codeathon available on a dedicated public GitHub organisation.

Teams may submit manuscripts describing the design and use of the software tools they created to an appropriate journal such as the F1000Research hackathons channel, GigaScience, or PLoS Computational Biology.

We will notify the first round of accepted applicants on July 2, 2021. Accepted applicants have until July 6 at noon EDT to confirm their participation. International applicants or those with particular skill sets may be accepted early. If you confirm, please make sure that you can participate, as confirming and then not participating prevents other scientists from attending this event. Please provide a monitored email address in case we have follow-up questions.

Legal: Participants retain ownership of all intellectual property rights (including moral rights) to the code submitted to, as well as developed in, the codeathon. Employees of the U.S. Government attending as part of their official duties retain no copyright to their work, and their work is in the public domain in the U.S. The Government disclaims any rights to the code submitted or developed in the codeathon. Participants agree to publish the code and any related data on GitHub.

Closed May 10, 2021 - Project Proposal Submissions

In this codeathon, we are looking for exciting projects which use SPARC data and/or SPARC tools and resources in novel ways, particularly in enhancing, demonstrating, or measuring the findability, accessibility, interoperability, or reusability (FAIR) of the data. Several prizes of up to US$10,000 each will be awarded.

Projects must 1) demonstrate the value of SPARC’s public data and/or 2) directly integrate into or with any of the following SPARC tools and resources to improve their existing capabilities via the various APIs and services available:

There are several tools and resources available that may be leveraged when designing the projects. Codeathon projects should result in code, tools (see SODA for an example), datasets, or other outputs which are open and freely available. See here for example project ideas.

SPARC investigators are generating large amounts of FAIR data from a range of species, spanning all the major visceral organs and peripheral nerves, as they seek to better understand the autonomic nervous system. The data being collected includes microscopy, electrophysiological and mechanical time series, single-cell RNASeq, functional MRI, and more.

SPARC data is highly curated, ensuring the data is published in a FAIR manner and backed by a semantic knowledge base. SPARC data is further enriched by being mapped to 2D “flatmaps” that enable visual exploration of the topological anatomy of the peripheral nervous system. Where possible, data is also mapped to 3D organ scaffolds to provide a common coordinate system enabling comparison across subjects, species, and protocols as well as interactive environments to aid understanding and interpretation of SPARC data.

One special data resource, called “simulations”, consists of computational models and data analysis pipelines. These simulations can be run on o²S²PARC, which was designed to host, modularize, and ensure the reproducibility of simulations contributed by researchers. This is achieved by archiving and versioning contributed simulation code along with the code’s execution environment.

SPARC data, simulations, and maps are published on the SPARC Portal, an open-source platform for finding, exploring, visualizing, interacting with, and accessing SPARC data and associated computational models and analyses.

Each team will have access to training and computational resources in the cloud to turn its idea into a working prototype. Various SPARC experts will be available to help with technical advice as needed.

Here are examples of codeathon projects that may be of interest to potential applicants.

What is FAIR? The volume of publicly available data continues to rise exponentially, but the capacity for fully employing this data is being hampered by a series of limitations. FAIR is a very powerful initiative that has taken root worldwide. The initiative has the potential to significantly increase the value of life science data sets. While the concept shares some commonality with the semantic web, FAIR data goes further to expand opportunities for knowledge-sharing and value. Here are four foundation papers on this exploding field:

FAIR Codeathon FAQs:

Do I have to lead a team? You can choose to lead your project team, recommend someone, or we can try to find a suitable team lead. Providing a designated team lead dramatically increases the probability that we will select the project for the codeathon.

Do I need to assemble a team? No. We will create working groups of five to six individuals who have various backgrounds and relevant expertise to work on each project.

What are my responsibilities as a team lead? The team lead will coordinate a group of 5-6 people in defining the project and producing a clear vision for developing a solution. To accomplish this goal, the team lead must define and delegate tasks, incorporate team members’ ideas, and ensure the team’s success.

What if I only want to participate? Applications for those who would like to participate in the codeathon will be available in early June.

