xmootoo/sss-official
A framework for variable-length time series classification (VTSC) with local signal interpretability.
Stochastic Sparse Sampling (SSS)
Overview
This is the official repository for the paper:
This work extends our workshop paper, accepted to the NeurIPS 2024 Workshop on Time Series in the Age of Large Models, found below:
Description
Stochastic Sparse Sampling (SSS) is a novel time series classification method for processing variable-length sequences. On the Epilepsy iEEG Multicenter Dataset for seizure onset zone (SOZ) localization, SSS outperforms many state-of-the-art machine learning and deep learning methods.
Table of Contents
- Overview
- Installation
- Project Structure
- Data
- Usage
- Method
- Visualization
- Dataset Description
- License
- Citations
- Contact
- Acknowledgments
Installation
Dependencies
- Python $\geq$ 3.10
- Additional dependencies listed in requirements.txt
Using conda (recommended)
# Create and activate conda environment
conda create -n sss python=3.10
conda activate sss
# Install requirements
pip install -r requirements.txt
pip install -e .
Using pip
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install requirements
pip install -r requirements.txt
pip install -e .
Project Structure
project_root/
│
├── sss/                   # Main package directory
│   ├── analysis/          # Visualization and analysis tools
│   │   └── soz/           # SOZ heatmap visualization
│   │
│   ├── config/            # Pydantic configuration classes
│   │
│   ├── exp/               # Experiment class
│   │
│   ├── jobs/              # Job configurations
│   │
│   ├── layers/            # Layers and model components
│   │
│   ├── models/            # Machine learning full models
│   │
│   ├── tuning/            # Hyperparameter optimization
│   │
│   └── utils/             # Helper functions, preprocessing, logging
│
├── data/                  # Dataset files
├── download_data.sh       # Data download script
├── main.py                # Main execution script
├── setup.py               # Package installation script
├── requirements.txt       # Python dependencies
└── README.md              # Documentation
Data
To download the required datasets, run:
chmod u+x download_data.sh
./download_data.sh
Usage
Running Experiments
Execute the main script with the desired model:
python main.py <model>
Available models:
- sss: Stochastic Sparse Sampling
- Finite Context Models:
  - finite-context/dlinear: DLinear
  - finite-context/patchtst: PatchTST
  - finite-context/timesnet: TimesNet
  - finite-context/moderntcn: ModernTCN
- Infinite Context Models:
  - infinite-context/mamba: Mamba
  - infinite-context/gru: GRUs
  - infinite-context/lstm: LSTMs
  - infinite-context/rocket: ROCKET
Results are saved in the logs folder. For Distributed Data Parallel (DDP) or other configurations, modify sss/jobs/exp/<model>/args.yaml.
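To run several of the models listed above back to back, the documented `python main.py <model>` command can be wrapped in a small driver. The following is a minimal sketch and not part of the repository: the helper names `build_command` and `run_models` are hypothetical, and it assumes the script is launched from the project root, where main.py lives.

```python
import subprocess
import sys

# Model identifiers copied from the "Available models" list above.
MODELS = [
    "sss",
    "finite-context/dlinear",
    "finite-context/patchtst",
    "finite-context/timesnet",
    "finite-context/moderntcn",
    "infinite-context/mamba",
    "infinite-context/gru",
    "infinite-context/lstm",
    "infinite-context/rocket",
]


def build_command(model: str) -> list[str]:
    """Return the argv for `python main.py <model>` using the current interpreter."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return [sys.executable, "main.py", model]


def run_models(models: list[str]) -> dict[str, int]:
    """Run each model sequentially and collect the process return codes."""
    results = {}
    for model in models:
        # check=False: a failing run is recorded instead of aborting the batch.
        completed = subprocess.run(build_command(model), check=False)
        results[model] = completed.returncode
    return results
```

Calling `run_models(["sss", "finite-context/patchtst"])` from the project root would then run the two experiments in sequence and return their exit codes, which makes it easy to spot a failed run before inspecting the logs folder.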
Method
While the majority of time series classification research has focused on modeling fixed-length sequences, variable-length time series classification (VTSC) remains critical in healthcare, where sequence length may vary among patients and events. To address this challenge, we propose Stochastic Sparse Sampling (SSS).
Algorithm
The SSS training algorithm is outlined below:
Visualization
The project includes a visualization tool (sss/analysis/soz/visualize.py) for generating SOZ (Seizure Onset Zone) heatmap analysis of trained models. This tool integrates with Neptune.ai for model management and visualization.
Prerequisites
- Neptune.ai account and API token
- Trained model saved to Neptune (via main.py)
- Neptune run ID for the model you want to analyze
Setup
- Set your Neptune API token as an environment variable:
export NEPTUNE_API_TOKEN='your-neptune-api-token'
- Modify the sss/analysis/soz/plot.yaml configuration file. Example:
run_id: "SOZ-33" # Your Neptune run ID
project_name: "your-project" # Your Neptune project name
mode: "train" # Dataset to visualize
...
For a complete list of configuration options and their descriptions, refer to the Config class in visualize.py.
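Before launching the visualization, it can help to confirm that both setup steps are in place. The sketch below is an assumption-laden illustration rather than repository code: the function `preflight` is hypothetical, it checks only the three keys shown in the example (the real plot.yaml may require more), and it uses a naive line scan instead of a YAML parser to stay dependency-free.

```python
import os
from pathlib import Path

# Keys shown in the plot.yaml example above; the real file may define more.
EXAMPLE_KEYS = ("run_id", "project_name", "mode")


def preflight(env, config_path):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not env.get("NEPTUNE_API_TOKEN"):
        problems.append("NEPTUNE_API_TOKEN is not set")
    path = Path(config_path)
    if not path.is_file():
        problems.append(f"config file not found: {path}")
        return problems
    lines = path.read_text(encoding="utf-8").splitlines()
    for key in EXAMPLE_KEYS:
        # Naive top-level key check; a YAML parser would be more robust.
        if not any(line.startswith(f"{key}:") for line in lines):
            problems.append(f"missing key in config: {key}")
    return problems
```

From the project root, `preflight(os.environ, "sss/analysis/soz/plot.yaml")` returns an empty list when the token is exported and the three example keys are present.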
Running the Visualization
cd sss/analysis/soz
python visualize.py
This loads the model from Neptune, then generates and saves an SOZ heatmap.
Examples
Dataset Description
The Epilepsy iEEG Multicenter Dataset consists of iEEG signals with clinical SOZ annotations from four medical centers: the Johns Hopkins Hospital (JHH), the National Institutes of Health (NIH), the University of Maryland Medical Center (UMMC), and the University of Miami Jackson Memorial Hospital (UMH). Since UMH contains only a single patient with clinical SOZ annotations, we did not consider it in our main evaluations. We did, however, include UMH in the training set of the multicenter evaluation, both for the all-cluster evaluation and for the out-of-distribution (OOD) experiments for SOZ localization on unseen medical centers.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citations
If you use this code in your research or work, please cite our paper:
@article{mootoo2024stochastic,
title = {Stochastic Sparse Sampling: A Framework for Variable-Length Medical Time Series Classification},
author = {Mootoo, Xavier and D\'{i}az-Montiel, Alan A. and Lankarany, Milad and Tabassum, Hina},
journal = {arXiv preprint arXiv:2410.06412},
year = {2024},
url = {https://arxiv.org/abs/2410.06412},
eprint = {2410.06412},
archivePrefix = {arXiv},
primaryClass = {cs.LG}
}
Contact
For queries, please contact the corresponding author through: xmootoo at gmail dot com.
Acknowledgments
Xavier Mootoo is supported by Canada Graduate Scholarships - Master's (CGS-M) funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Vector Scholarship in Artificial Intelligence, provided through the Vector Institute, Canada, and the Ontario Graduate Scholarship (OGS) granted by the provincial government of Ontario, Canada.
We extend our gratitude to Commune AI for generously providing the computational resources needed to carry out our experiments; in particular, we thank Luca Vivona (@LVivona) and Sal Vivona (@salvivona). Many thanks as well to Anastasios Angelopoulos (@aangelopoulos) and Daniele Grattarola (@danielegrattarola) for their valuable feedback and comments on our work.




