MILESTONE

Milestone is a local, reproducible workflow for bacterial wgMLST analysis:

schema creation or public schema import
allele calling
profile comparison (pairwise, batch, matrix)
schema QC and benchmark workflows
publication-ready packaging and reporting
Streamlit web UI for interactive runs

Live static site: https://fatmakahveci.github.io/milestone/

Quick Start

CLI setup

python -m venv .venv
source .venv/bin/activate
pip install -r webapp/requirements.txt

Run web app

streamlit run webapp/app.py

Run tests

pytest tests

Core Commands

1) Compare two profiles

python workflow/milestone.py profile_compare \
  --profile-a output/run_a/sample_a_wgmlst.tsv \
  --profile-b output/run_b/sample_b_wgmlst.tsv \
  --output-dir output/compare/sample_a_vs_b

Outputs:

wgmlst_profile_summary.json
wgmlst_profile_comparison.tsv

2) Build profile distance matrix

python workflow/milestone.py profile_matrix \
  --profiles output/run_a/sample_a_wgmlst.tsv output/run_b/sample_b_wgmlst.tsv output/run_c/sample_c_wgmlst.tsv \
  --output output/compare/matrix_abc

Outputs:

wgmlst_distance_matrix.tsv
wgmlst_distance_summary.json
wgmlst_distance_report.html

3) Benchmark against truth profiles

python workflow/milestone.py profile_benchmark \
  --predicted-dir output/predicted_profiles \
  --truth-dir validation/truth_profiles \
  --output output/benchmark_run

Outputs:

wgmlst_benchmark_summary.json
wgmlst_benchmark_per_sample.tsv
wgmlst_benchmark_per_locus.tsv
wgmlst_benchmark_report.html

4) Schema QC

python workflow/scripts/schema_qc.py \
  --schema-dir tests/fixtures/smoke_schema \
  --output-dir output/schema_qc

Outputs:

schema_qc_summary.json
schema_qc_issues.tsv

Public Schema / Benchmark Import

PubMLST scheme import

python workflow/scripts/import_pubmlst_scheme.py \
  --database pubmlst_neisseria_seqdef \
  --scheme-id 1 \
  --output-dir schemas/neisseria_wgmlst \
  --include-profiles

PubMLST benchmark pack import

python workflow/scripts/import_pubmlst_benchmark_pack.py \
  --scheme-database pubmlst_neisseria_seqdef \
  --isolate-database pubmlst_neisseria_isolates \
  --scheme-id 1 \
  --isolate-id 101 \
  --isolate-id 102 \
  --species "Neisseria meningitidis" \
  --output-dir benchmark_packs/neisseria_demo

EnteroBase scheme import

python workflow/scripts/import_enterobase_scheme.py \
  --database senterica \
  --scheme-name wgMLST \
  --output-dir public_schemes/enterobase_senterica

If EnteroBase requires authentication, set one of:

export ENTEROBASE_TOKEN="..."

export ENTEROBASE_USERNAME="..."
export ENTEROBASE_PASSWORD="..."

Publication / Validation Utilities

python workflow/milestone.py publication_readiness \
  --schema-qc-summary results/schema_qc/schema_qc_summary.json \
  --benchmark-summary results/benchmark/wgmlst_benchmark_summary.json \
  --schema-manifest results/schema/schema_manifest.json \
  --output results/publication_readiness

python workflow/milestone.py publication_package \
  --schema-qc-summary results/schema_qc/schema_qc_summary.json \
  --benchmark-summary results/benchmark/wgmlst_benchmark_summary.json \
  --schema-manifest results/schema/schema_manifest.json \
  --output results/milestone_publication_package.zip \
  --title "Milestone validation package"

Docker

Full stack

docker compose up --build

UI only

docker compose up --build milestone-web

Runtime image with full bioinformatics stack

docker compose up --build milestone-runtime

External tool smoke benchmarking

docker compose run --rm milestone-benchmark

Project Documentation

Notes

Milestone outputs *_wgmlst.tsv profile files.
Profile comparison is a schema-based discriminator; it is not a direct phylogeny engine.
Species-specific validation is still required before strong biological claims.

fatmakahveci/milestone

MILESTONE

Quick Start

CLI setup

Run web app

Run tests

Core Commands

1) Compare two profiles

2) Build profile distance matrix

3) Benchmark against truth profiles

4) Schema QC

Public Schema / Benchmark Import

PubMLST scheme import

PubMLST benchmark pack import

EnteroBase scheme import

Publication / Validation Utilities

Docker

Full stack

UI only

Runtime image with full bioinformatics stack

External tool smoke benchmarking

Project Documentation

Notes

Citation Metadata

On this page

Languages

Contributors