Agentic Spatial Pathologist Public Product Roadmap#
This repo is the public-facing product layer for the agentic spatial pathologist workflow.
Phase 1: Wrapper Product#
Current strategy:
keep histoseg as the execution engine
keep this repo lightweight and public-facing
expose a stable package name: spatho
provide a simple CLI and README-first onboarding
What is already wrapped:
OpenAI-driven cluster annotation
structure discovery
H&E overlay generation
pathology review report generation
formal organ-pack metadata
workflow config schema export
artifact manifest generation
Why this repo exists#
histoseg began as a segmentation and contour-generation library.
The public product experience now needs a clearer identity:
disease-focused workflows
case-level reporting
reproducible workflow bundles
user-facing documentation
This repo becomes that layer.
Phase 2: Product Stabilization#
Near-term work:
Expand packaged organ packs beyond lung and breast
Add small regression tests with tiny fixtures
Add GitHub Actions for pytest, package build, and CLI smoke tests
Stabilize config schema versioning and compatibility rules
Stabilize artifact manifest and report schema versions
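As a rough illustration of what a compatibility rule for versioned schemas could look like, the sketch below assumes a MAJOR.MINOR version string and the rule "same major, file minor not newer than the reader". Neither the version format nor the names SchemaVersion and is_compatible come from spatho; they are hypothetical and only show the shape of such a check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaVersion:
    """Hypothetical MAJOR.MINOR schema version (assumed format, not spatho's)."""

    major: int
    minor: int

    @classmethod
    def parse(cls, text: str) -> "SchemaVersion":
        # Raises ValueError if the string is not exactly "MAJOR.MINOR".
        major, minor = text.strip().split(".")
        return cls(int(major), int(minor))


def is_compatible(file_version: str, supported_version: str) -> bool:
    """Assumed rule: same major version, and the file is not newer than the reader."""
    found = SchemaVersion.parse(file_version)
    supported = SchemaVersion.parse(supported_version)
    return found.major == supported.major and found.minor <= supported.minor
```

Under this assumed rule, a reader that supports 1.3 would accept a 1.2 config but reject 1.4 and 2.0. The same check could apply to the artifact manifest and report schemas if they are versioned the same way.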
Phase 3: Dependency Inversion#
The long-term goal is to reduce direct runtime coupling to sibling repos.
Planned moves:
move public-safe workflow code from histoseg into spatho
keep only geometry/segmentation primitives in histoseg
define organ packs under spatho.organ_packs
support multiple providers: OpenAI, Anthropic, local models
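One possible way to support multiple providers is a small interface plus a registry, so workflow code never depends on a specific vendor SDK. The sketch below is an assumption about how that could be structured: AnnotationProvider, StaticProvider, register, and get_provider are hypothetical names, and no real OpenAI or Anthropic calls are made.

```python
from typing import Protocol


class AnnotationProvider(Protocol):
    """Hypothetical interface that each backend (hosted or local) would implement."""

    name: str

    def annotate(self, marker_genes: list[str]) -> str:
        ...


class StaticProvider:
    """Offline stand-in that returns a fixed label; useful for tests and smoke runs."""

    def __init__(self, name: str, label: str) -> None:
        self.name = name
        self.label = label

    def annotate(self, marker_genes: list[str]) -> str:
        return self.label


_REGISTRY: dict[str, AnnotationProvider] = {}


def register(provider: AnnotationProvider) -> None:
    """Make a provider selectable by name."""
    _REGISTRY[provider.name] = provider


def get_provider(name: str) -> AnnotationProvider:
    """Look up a provider, failing with the list of known names."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY)) or "none"
        raise KeyError(f"unknown provider {name!r}; registered: {known}") from None
```

With this shape, a workflow would call get_provider(config_value).annotate(genes) and stay unchanged when a new backend is registered.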
Phase 4: Community Release#
Before broad public release:
rewrite README around spatho, not just legacy wrappers
publish example datasets and example reports
document license boundaries clearly
add issue templates and contribution guide
Phase 5: stGPT Foundation Model -> Evidence Workbench#
The next AI upgrade should add an optional spatial transcriptomics foundation-model layer rather than replacing the existing workflow. The narrative is:
stGPT learns reusable contour/region morpho-molecular representations; spatho plans, validates, and turns them into auditable spatial pathology evidence.
The product should be described as a closed loop:
Model -> Evidence -> Agent -> Human Review -> Better Model
Planned moves:
define stGPT Foundation: training, model architecture, checkpoint loading, embedding, and model packaging
define stGPT Evidence Suite: QC, deterministic splits, benchmark tables, ablations, domain-shift checks, and failure analysis
define stGPT Runtime / Tool API: embed_cells, evaluate_checkpoint, package_model, and export_spatho_artifacts first; retrieval, imputation, niche scoring, region comparison, and structure explanation only after tested outputs exist
define spatho Agentic Workbench: guardrailed workflow orchestration that checks QC before biological conclusions
define spatho Reports: report sections that separate measured expression, model-derived evidence, warnings, and human review
Implementation guardrails:
precomputed stGPT artifacts must work without importing stgpt
local_stgpt is optional and should fail clearly when the package or model paths are missing
fatal stGPT QC blocks a run when stgpt_require_qc_pass=true
warning-only stGPT QC becomes cautionary report language, not a hard failure
imputation, reconstruction, or embeddings must never be described as measured expression
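The guardrails above could be sketched roughly as follows. This is a minimal illustration, not the actual spatho code: QCResult, StGPTQCError, gate_on_qc, and require_local_stgpt are hypothetical names, and only stgpt_require_qc_pass and the stgpt package name are taken from the list. How fatal QC is handled when stgpt_require_qc_pass is false is not stated in the roadmap; the sketch assumes it is downgraded to a caution.

```python
import importlib.util
from dataclasses import dataclass, field
from pathlib import Path


class StGPTQCError(RuntimeError):
    """Raised when fatal stGPT QC must block the run (hypothetical error type)."""


@dataclass
class QCResult:
    """Hypothetical QC summary: messages split by severity."""

    fatal: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


def gate_on_qc(qc: QCResult, stgpt_require_qc_pass: bool = True) -> list[str]:
    """Return cautionary report notes, or raise if fatal QC must block the run."""
    if qc.fatal and stgpt_require_qc_pass:
        raise StGPTQCError("stGPT QC failed: " + "; ".join(qc.fatal))
    # Assumption: unenforced fatal findings still surface as cautions.
    notes = [f"Caution (stGPT QC failure, not enforced): {m}" for m in qc.fatal]
    notes += [f"Caution (stGPT QC warning): {m}" for m in qc.warnings]
    return notes


def require_local_stgpt(checkpoint_path: str, module_name: str = "stgpt") -> None:
    """Fail with a clear message when the optional package or model path is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"local_stgpt needs the optional package {module_name!r}; "
            "install it or use precomputed stGPT artifacts instead"
        )
    if not Path(checkpoint_path).exists():
        raise FileNotFoundError(f"stGPT model path not found: {checkpoint_path}")
```

Because the package check uses importlib.util.find_spec rather than a top-level import, a module written this way can still be imported when stgpt is absent, which keeps the precomputed-artifact path free of that dependency.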
See stGPT Upgrade Plan for the detailed implementation route.
Repository Roles#
spatho: public product and user experience layer
histoseg: geometry and segmentation engine
example web apps: optional deployment surfaces, not the core product
Agentic Spatial Pathologist v0.1#
The v0.1 platform target is a Xenium-native auditable evidence loop:
stGPT learns Xenium-native morpho-molecular representations; spatho turns them into auditable spatial pathology evidence.
The fixed demo question is: “Which H&E-defined structures in this Xenium case show reproducible morpho-molecular programs, and do those findings pass QC?”
The workbench contract is intentionally conservative:
stGPT evidence is model-derived support, not measured expression or diagnosis.
QC fatal errors block biological conclusions.
QC warning-only evidence is labelled cautionary.
Every report claim should carry an evidence ID, artifact path or ID, model/checkpoint provenance, QC status, and human-review state.
pyXenium LazySlide/PLIP/mTM summaries are optional evidence sources consumed by spatho; the LR benchmark scaffold remains a separate backlog item.
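To make the contract concrete, a report claim could be modelled as a record that refuses to exist without its provenance fields. The sketch below is one assumed shape: ReportClaim, QCStatus, and the field names are hypothetical, chosen to mirror the items listed in the contract rather than any existing spatho schema.

```python
from dataclasses import dataclass
from enum import Enum


class QCStatus(Enum):
    PASS = "pass"
    WARNING = "warning"
    FATAL = "fatal"


@dataclass(frozen=True)
class ReportClaim:
    """Hypothetical record for one model-derived claim in a report."""

    text: str
    evidence_id: str
    artifact: str       # artifact path or ID
    checkpoint: str     # model/checkpoint provenance
    qc_status: QCStatus
    human_review: str   # e.g. "pending", "accepted", "rejected" (assumed values)

    def __post_init__(self) -> None:
        required = ("evidence_id", "artifact", "checkpoint", "human_review")
        missing = [name for name in required if not getattr(self, name)]
        if missing:
            raise ValueError(f"claim is missing required fields: {missing}")
        if self.qc_status is QCStatus.FATAL:
            # QC fatal errors block biological conclusions.
            raise ValueError("cannot state a claim backed by fatal-QC evidence")

    def render(self) -> str:
        """Format the claim, labelling warning-only evidence as cautionary."""
        prefix = "[cautionary] " if self.qc_status is QCStatus.WARNING else ""
        return (
            f"{prefix}{self.text} "
            f"(model-derived; evidence {self.evidence_id}; review: {self.human_review})"
        )
```

Validating at construction time means a report generator written against this record cannot emit a claim that lacks an evidence ID or rests on fatal-QC evidence.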