SPRAS Tutorial
Purpose of this tutorial
This tutorial will introduce participants to SPRAS and demonstrate how it can be used to explore biological pathways from omics data.
Together, we will cover:
How to set up and run SPRAS
Running multiple algorithms with different parameters across one datasets
Using the post-analysis tools to evaluate and compare results
Building datasets for analysis
Other things you can do with SPRAS
Prerequisites for this tutorial
Required knowledge
Ability to run command line operations and modify YAML files.
Basic biology concepts
Option 1: Running SPRAS in a GitHub Codespace
SPRAS ships with a dev container, and the quickest way to use it is through GitHub Codespaces.
A Codespace builds the dev container on GitHub’s infrastructure and
opens it in your browser, so you do not need to install Docker or set up
a local Python environment. The .devcontainer configuration in SPRAS
sets up the environment for you.
Prerequisites
A GitHub account. Sign up at github.com if you do not have one.
Step 1: Create a Codespace
Go to github.com/codespaces.
Select New codespace.
In the repository field, search for and select
Reed-CompBio/spras.Select Create codespace.
GitHub builds the container from the SPRAS .devcontainer
configuration (the first build takes around 15 minutes) and opens a VS
Code environment in your browser with the SPRAS dependencies already
installed. Once the build finishes, you are ready to run SPRAS.
Note
All GitHub personal accounts include a quota of free compute time and storage for GitHub Codespaces. Usage beyond the included amounts is billed to the personal account. See the GitHub Codespaces billing documentation for details.
Account plan |
Storage per month |
Compute time per month |
|---|---|---|
GitHub Free for personal accounts |
15 GB-month |
120 hrs |
GitHub Pro |
20 GB-month |
180 hrs |
You will not be charged for codespace usage unless you exceed your quota. If you hit the limit, the free option is to switch to the local SPRAS setup.
Step 2: Set up the SPRAS environment
From the root directory of the SPRAS repository, create and activate the Conda environment, then install the SPRAS Python package.
First, create the environment:
conda env create -f environment.yml
conda init
Open a new terminal and then run:
conda activate spras
python -m pip install .
Note
The first command performs a one-time installation of the SPRAS dependencies by creating a Conda environment (an isolated space that keeps all required packages and versions separate from your system).
The second command activates the newly created environment so you can use these dependencies when running SPRAS; this step must be done each time you open a new terminal session.
The last command is a one-time installation of the SPRAS package into the environment.
Note
You may see the following error during installation:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dsub 0.4.13 requires tenacity<=8.2.3, but you have tenacity 9.1.4 which is incompatible.
This is safe to ignore. We do not use dsub as a container option in this tutorial, and a fix is currently in progress. SPRAS and this tutorial will run correctly without a working dsub installation.
Step 3: Test the installation
Run the following command to confirm that SPRAS has been set up successfully from the command line:
python -c "import spras; print('SPRAS import successful')"
Option 2: Running SPRAS locally
Required software:
Conda : for managing environments
Docker : for containerized runs
Git: for cloning the SPRAS repository
A terminal or code editor (VS Code is recommended, but any terminal will work)
(Optional) Cytoscape for visualizing networks (download locally, the web version will not suffice)
Note
Mac users who experience performance issues with Docker Desktop can try OrbStack as an alternative.
Step 1: Clone the SPRAS repository
Visit the SPRAS GitHub repository and clone it locally
Note
If you are using the dev container, you can skip this step
Step 2: Set up the SPRAS environment
From the root directory of the SPRAS repository, create and activate the Conda environment and install the SPRAS python package:
conda env create -f environment.yml
conda activate spras
python -m pip install .
Note
The first command performs a one-time installation of the SPRAS dependencies by creating a Conda environment (an isolated space that keeps all required packages and versions separate from your system).
The second command activates the newly created environment so you can use these dependencies when running SPRAS; this step must be done each time you open a new terminal session.
The last command is a one-time installation of the SPRAS package into the environment.
Step 3: Test the installation
Run the following command to confirm that SPRAS has been set up successfully from the command line:
python -c "import spras; print('SPRAS import successful')"
Step 4: Start Docker
Before running SPRAS, make sure Docker Desktop is running.
Launch Docker Desktop and wait until it says “Docker is running”.
Note
SPRAS itself does not run inside a Docker container. However, Docker is required because SPRAS uses it to execute individual pathway reconstruction algorithms and certain post-analysis steps within isolated containers. These containers include all the necessary dependencies to run each algorithm or post analysis.
Note
Running tutorial locally will require downloading approximately 7 GB of Docker images and running many Docker containers.
SPRAS does not automatically clean up these containers or images after execution, so users will need to remove them manually if desired.
To stop all running containers: docker stop $(docker ps -a -q)
To remove all stopped containers: docker container prune
To remove unused Docker images: docker image prune
SPRAS Overview
What is pathway reconstruction?
A pathway is a type of graph that describes how different molecules interact with one another for a biological process.
Curated pathway databases provide useful well studied references of pathways but are often general or incomplete. This means they may miss context-specific details relevant to a particular condition or experiment.
Pathway reconstruction algorithms address this by mapping molecules of interest onto large-scale interaction networks (interactomes) to generate candidate context-specific subnetworks that better reflect the condition or experiment.
These algorithms allow researchers to propose computational-backed hypothetical subnetworks that capture the unique characteristics of a given context without having to experimentally test every individual interaction.
Running a single pathway reconstruction algorithm on a single dataset can be challenging, as each algorithm often requires its own input format, software environment, or even a full reimplementation. These challenges only grow when scaling up to using multiple algorithms and datasets.
What is SPRAS?
Signaling Pathway Reconstruction Analysis Streamliner (SPRAS) is a computational framework that unifies and simplifies the use of diverse pathway reconstruction algorithms.
SPRAS allows users to run multiple datasets across multiple algorithms and many parameter settings in a single scalable workflow. The framework automatically handles data preprocessing, algorithm execution, and post-processing, allowing users to run multiple algorithms seamlessly without manual setup. Built-in analysis tools enable users to explore, compare, and evaluate reconstructed pathways with ease.
SPRAS is implemented in Python and leverages two technologies for workflow automation:
Snakemake: a workflow management system that defines and executes jobs automatically, removing the need for users to write complex scripts
Docker: runs algorithms and post analysis in a containerized environment.
A key strength of SPRAS is automation. From provided input data and configurations, SPRAS can generate and execute complete workflows without requiring users to write complex scripts. This lowers the barrier to entry, allowing researchers to apply, evaluate, and compare multiple pathway reconstruction algorithms without deep computational expertise.