Specialised agent

ML / AI Agent

Omi's specialised co-pilot for ML / AI work

I learn the associations between your single-cell data and clinical outcomes — response vs non-response, progression-free survival, disease subtype — and build models that actually generalise. Think of me as the friendly statistician who also knows deep learning, single-cell quirks, and how to not overfit your 12-patient cohort.

What I can do for you

I train classifiers (logistic regression, random forests, XGBoost, MLPs) on cell-type proportions, pseudobulk expression, or learned embeddings to predict clinical outcomes — with proper cross-validation, calibration, and confidence intervals so you don't fool yourself.
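As a rough illustration of what "proper cross-validation and confidence intervals" can look like, here is a minimal sketch using scikit-learn. The cohort is synthetic and every name in it (the 40 patients, the 6 cell types, the label rule) is an assumption made up for the example, not part of any real dataset or of the agent's internal code. Calibration is left out to keep it short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical toy cohort: 40 patients x 6 cell-type proportions (rows sum to 1).
n_patients, n_cell_types = 40, 6
X = rng.dirichlet(np.ones(n_cell_types), size=n_patients)

# Synthetic labels loosely tied to the first cell type, purely for illustration.
score = X[:, 0] + rng.normal(0, 0.1, n_patients)
y = (score > np.median(score)).astype(int)  # balanced: 20 vs 20

# Scaling lives inside the pipeline so it is refit on each training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold probabilities: each patient is scored by a model that never saw them.
oof_prob = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
auc = roc_auc_score(y, oof_prob)

# Percentile bootstrap over patients for a rough 95% interval on the AUC.
boot = []
for _ in range(1000):
    idx = rng.integers(0, n_patients, n_patients)
    if len(np.unique(y[idx])) < 2:  # AUC is undefined with a single class
        continue
    boot.append(roc_auc_score(y[idx], oof_prob[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"OOF AUC = {auc:.2f} (95% bootstrap CI {ci_low:.2f} to {ci_high:.2f})")
```

With a cohort this small the interval is usually wide, which is the point: the width tells you how much to trust the headline number.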

I build patient-level representations from single-cell data using MIL (multiple-instance learning), scPoli, scFoundation, or Geneformer embeddings, and use them to predict response, survival, or subtype in a way that respects the hierarchical (cells-in-patients) structure.
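To make the cells-in-patients idea concrete, the sketch below uses mean pooling, which is the simplest baseline for turning a bag of cell embeddings into one vector per patient. It is not an attention-based MIL model and not the scPoli, scFoundation, or Geneformer API; the embeddings are random numbers and the function name `pool_by_patient` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: 500 cells with 16-dim embeddings, each tagged with a patient ID.
n_cells, n_dims = 500, 16
cell_embeddings = rng.normal(size=(n_cells, n_dims))
patient_ids = rng.integers(0, 10, size=n_cells)

def pool_by_patient(embeddings, patients):
    """Average cell embeddings within each patient (mean pooling over the bag)."""
    unique_patients, inverse = np.unique(patients, return_inverse=True)
    sums = np.zeros((len(unique_patients), embeddings.shape[1]))
    np.add.at(sums, inverse, embeddings)  # accumulate cell vectors per patient
    counts = np.bincount(inverse)         # number of cells per patient
    return unique_patients, sums / counts[:, None]

patients, patient_matrix = pool_by_patient(cell_embeddings, patient_ids)
print(patient_matrix.shape)  # one row per patient, one column per embedding dimension
```

The resulting patient-by-feature matrix can then go into any patient-level classifier, so the unit of analysis (and of cross-validation) is the patient rather than the cell.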

I run feature importance analyses (SHAP, permutation importance), surface the genes and cell populations driving the prediction, and translate them into testable biological hypotheses, so the model is explainable rather than a black box.
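Here is a small sketch of the permutation-importance half of that workflow, using scikit-learn's `permutation_importance` on held-out data. SHAP is skipped only to avoid an extra dependency. The feature names and the data are invented for illustration, with the signal deliberately planted in two of the five features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical cell-population features; values are synthetic.
feature_names = ["CD8_T", "CD4_T", "NK", "B", "Monocyte"]
X = rng.normal(size=(200, len(feature_names)))
# Planted signal: outcome depends on CD8_T (positively) and Monocyte (negatively).
y = (X[:, 0] - X[:, 4] + rng.normal(0, 0.5, 200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time on held-out data and measure the drop in AUC.
result = permutation_importance(
    model, X_test, y_test, scoring="roc_auc", n_repeats=20, random_state=0
)
ranking = sorted(
    zip(feature_names, result.importances_mean), key=lambda t: t[1], reverse=True
)
for name, importance in ranking:
    print(f"{name:10s} {importance:+.3f}")
```

Importance measured on held-out data reflects what the model actually relies on to generalise, which is the quantity worth turning into a hypothesis.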

I handle small-cohort reality with stratified CV, nested CV for hyperparameter tuning, leakage checks, and class-imbalance strategies — and I'll tell you honestly when your sample size just isn't enough rather than letting you publish a 0.99 AUC that won't replicate.
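A minimal sketch of nested cross-validation, assuming scikit-learn and a synthetic 60-patient cohort (all sizes and the parameter grid are illustrative choices). The inner loop picks the hyperparameter, the outer loop estimates performance, and the scaler sits inside the pipeline so no statistics leak from test folds into training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical cohort: 60 patients, 20 features, weak signal in the first three.
X = rng.normal(size=(60, 20))
y = np.repeat([0, 1], 30)
X[y == 1, :3] += 1.0

# Preprocessing inside the pipeline is refit per fold, which avoids leakage.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)  # tuning
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation

search = GridSearchCV(
    pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=inner_cv, scoring="roc_auc"
)

# Each outer fold reruns the full inner search on its own training split.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
print(f"Nested CV AUC: {outer_scores.mean():.2f} +/- {outer_scores.std():.2f}")
```

Reporting the outer-loop scores, not the best inner-loop score, is what keeps the estimate from being optimistically biased by tuning.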

Examples of what you can ask me

Copy any of these straight into the demo, or adapt them to your data.

  • "Train a model to predict immunotherapy response from baseline scRNA."
  • "Build a classifier for COVID severity using PBMC composition."
  • "Which cell populations are most predictive of relapse?"
  • "Use Geneformer embeddings to predict patient subtype."
  • "Run SHAP on my XGBoost model and explain the top features biologically."
  • "Cross-validate my classifier with leave-one-patient-out CV."
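For a sense of what the last prompt involves, here is a sketch of leave-one-patient-out cross-validation with scikit-learn's `LeaveOneGroupOut`, assuming cell-level features where every cell carries its patient's label. The 8 patients, 50 cells each, and the effect size are all made up for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)

# Hypothetical data: 8 patients x 50 cells, 10 features per cell.
n_patients, cells_per_patient = 8, 50
groups = np.repeat(np.arange(n_patients), cells_per_patient)  # patient ID per cell
patient_label = np.array([0, 1] * 4)                          # one label per patient
y = patient_label[groups]
X = rng.normal(size=(len(groups), 10)) + 0.5 * y[:, None]

held_out_score = {}
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    # All cells from the held-out patient are excluded from training.
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    patient = int(groups[test_idx][0])
    # Average cell-level probabilities into a single score for that patient.
    held_out_score[patient] = float(clf.predict_proba(X[test_idx])[:, 1].mean())

for patient, prob in sorted(held_out_score.items()):
    print(f"patient {patient}: true={patient_label[patient]} score={prob:.2f}")
```

Splitting by patient rather than by cell matters because cells from the same patient are correlated; a random cell-level split would let the model recognise the patient instead of the outcome.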

How I work

I run real Scanpy (Python) or Seurat (R) code on the secure MCP server — no hallucinations, no made-up gene lists. Every result comes with the exact code I executed and the parameters I used, so your analysis is fully reproducible and ready for the Methods section.

Best for

Translational researchers, clinician-scientists with patient cohorts, biomarker discovery teams, and computational biologists building predictive models from single-cell data who want rigour, not hype.

References

  • XGBoost (Chen & Guestrin, 2016) – KDD
  • SHAP (Lundberg & Lee, 2017) – NeurIPS
  • scPoli (De Donno et al., 2023) – Nature Methods
  • Geneformer (Theodoris et al., 2023) – Nature
  • scFoundation (Hao et al., 2024) – Nature Methods
  • scikit-learn (Pedregosa et al., 2011) – JMLR

Try ML / AI now

Jump into the demo with a starter prompt already loaded. Upload your data, or play with our example dataset first.

Other agents you might like

TCR Agent

Track clonotypes and link them to gene expression.


Multiomics Agent

Integrate scRNA + scATAC and more.


Mutation Calling Agent

Detect variants in single cells.
