SPOKE For Space Health Challenge

KnowHax 2025

Background:
NASA’s Open Science Data Repository (OSDR) is home to biological data and associated metadata from space-relevant experiments, including both physiological data (via the Ames Life Science Data Archive, ALSDA) and 'omics data (via GeneLab). The primary goal of OSDR is to increase collaborative sharing and analysis of scientific data, leading to more rapid scientific advancement on issues related to astronaut health such as Spaceflight Associated Neuro-ocular Syndrome (SANS). Through the National Science Foundation Proto-OKN effort, OSDR teamed up with SPOKE (spoke.ucsf.edu), a heterogeneous knowledge graph (KG) connecting biological and clinical data from over 60 databases, to help advance this goal by integrating molecular 'omics changes associated with spaceflight, including transcriptomics, into the SPOKE fabric.

This KnowHax challenge seeks to expand the OSDR/SPOKE integration to include physiological, phenotypic, and environmental data related to the space environment to help address critical issues related to astronaut health.

Challenge Question:

How can we integrate physiological, phenotypic, and environmental data types associated with Spaceflight Associated Neuro-ocular Syndrome (SANS) into the SPOKE fabric?

Description:
The SPOKE challenge is organized into three primary objectives, plus an optional bonus objective. Together they bring different data types from the Open Science Data Repository (OSDR) into the SPOKE fabric, allowing users to address critical questions related to astronaut health.

1. Data Readiness: Evaluate all physiological and phenotypic data types associated with eye-related OSD datasets to determine their level of SPOKE readiness (a full list of datasets will be provided to KnowHax participants), including identifying relevant node types that match the OSD data. Develop a method to programmatically extract relevant information from the data files and format the data structure to be compatible with the SPOKE fabric.

2. Environmental Data API and Formatting: Use the Environmental Data Application (EDA) to identify environmental datasets associated with the OSD datasets evaluated in objective 1. Develop an API to programmatically extract the relevant environmental data, then format the data structure to be compatible with the SPOKE fabric, including identifying relevant node types.

3. SPOKE Integration: Create a beta version of the SPOKE KG fabric that integrates the physiological, phenotypic, and environmental data (node) types that were assessed and formatted in objectives 1 and 2.

4. BONUS: Proof-Of-Principle Concept: Use the SPOKE beta version developed in objective 3 to evaluate transcriptomics, physiological, phenotypic, and environmental data types from relevant eye datasets hosted on OSDR to determine SANS risk and identify potential mitigation strategies.
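
Example (illustrative only):
Objectives 1 and 2 both end with formatting extracted data into a graph-compatible structure. As a rough sketch of what that step might look like, the Python below turns a small table of per-subject measurements into node and edge records. The SPOKE ingestion schema is not specified in this brief, so the node labels (Study, Subject, Phenotype), the edge types (INCLUDES, HAS_MEASUREMENT), the column names, the dataset ID OSD-EXAMPLE, and the sample values are all hypothetical placeholders, not actual SPOKE or OSDR identifiers.

```python
import csv
import io

# Made-up sample table; real OSD files will have different columns and values.
SAMPLE_CSV = """\
subject,retinal_thickness_um,iop_mmHg
M1,201.5,14.2
M2,198.0,
"""


def rows_to_graph(csv_text, dataset_id, subject_col, measure_cols):
    """Convert a wide CSV of measurements into lists of node and edge dicts.

    All labels and edge types here are hypothetical; replace them with the
    node and edge types agreed on for the SPOKE fabric.
    """
    nodes = {dataset_id: {"id": dataset_id, "label": "Study"}}
    edges = []

    for row in csv.DictReader(io.StringIO(csv_text)):
        # Prefix subject IDs with the dataset ID so they stay unique across datasets.
        subject_id = f"{dataset_id}:{row[subject_col]}"
        nodes[subject_id] = {"id": subject_id, "label": "Subject"}
        edges.append({"source": dataset_id, "target": subject_id,
                      "type": "INCLUDES", "properties": {}})

        for col in measure_cols:
            raw = row.get(col)
            if raw is None or raw.strip() == "":
                continue  # skip missing measurements instead of inventing a value
            try:
                value = float(raw)
            except ValueError:
                value = raw.strip()  # keep non-numeric entries as text
            nodes[col] = {"id": col, "label": "Phenotype"}
            edges.append({"source": subject_id, "target": col,
                          "type": "HAS_MEASUREMENT",
                          "properties": {"value": value}})

    return list(nodes.values()), edges


nodes, edges = rows_to_graph(
    SAMPLE_CSV, "OSD-EXAMPLE", "subject",
    ["retinal_thickness_um", "iop_mmHg"],
)
print(len(nodes), "nodes,", len(edges), "edges")
```

Keeping measured values as edge properties rather than as separate nodes is one possible design choice; whether it suits the SPOKE fabric depends on the node types identified during the readiness evaluation.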