Quality Control & Smart Preprocessing Agent

Omi's specialised co-pilot for quality control work

I'm the first agent you'll meet — and the one that saves you the most heartache. I run smart, dataset-aware QC on your raw data, catch problems early (empty droplets, dying cells, doublets, ambient RNA, failed samples), and recommend preprocessing steps tailored to your specific data, not generic defaults.

What I can do for you

I run EmptyDrops or CellBender to separate real cells from empty droplets, detect doublets with Scrublet, scDblFinder, or DoubletFinder, and remove ambient RNA contamination with SoupX or DecontX — and explain what each one is doing in plain English.

I compute per-cell QC metrics (UMI count, gene count, % mitochondrial, % ribosomal, % hemoglobin) and recommend thresholds based on your dataset's own distributions rather than the textbook cutoffs (at most 5% mitochondrial reads, between 200 and 2,500 genes per cell) that almost never fit your data.
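To make "thresholds based on your dataset's distribution" concrete, here is a minimal sketch of one common data-driven approach: flagging cells that sit more than a few median absolute deviations (MADs) from the dataset median. The MAD rule, the 3-MAD cutoff, the function names, and the toy numbers are all assumptions chosen for illustration; this is not necessarily the rule the agent applies.

```python
from statistics import median

def mad_bounds(values, n_mads=3.0):
    """Return (lower, upper) bounds at median +/- n_mads * scaled MAD."""
    med = median(values)
    # Median absolute deviation, scaled by 1.4826 so that it approximates
    # the standard deviation for normally distributed data.
    mad = 1.4826 * median(abs(v - med) for v in values)
    return med - n_mads * mad, med + n_mads * mad

def flag_low_quality(cells, n_mads=3.0):
    """Flag cells with unusually few genes or unusually high % mito."""
    gene_lo, _ = mad_bounds([c["n_genes"] for c in cells], n_mads)
    _, mito_hi = mad_bounds([c["pct_mito"] for c in cells], n_mads)
    return [c["n_genes"] < gene_lo or c["pct_mito"] > mito_hi for c in cells]

# Toy per-cell metrics (made-up numbers, for illustration only).
cells = [
    {"n_genes": 2100, "pct_mito": 3.1},
    {"n_genes": 1950, "pct_mito": 4.0},
    {"n_genes": 2300, "pct_mito": 2.7},
    {"n_genes": 2050, "pct_mito": 3.6},
    {"n_genes": 2200, "pct_mito": 3.3},
    {"n_genes": 310, "pct_mito": 27.5},  # likely a dying cell
]
flags = flag_low_quality(cells)
print(flags)
```

Because the bounds come from the data itself, the same code adapts to tissues with naturally high mitochondrial content, where a fixed 5% cutoff would discard healthy cells.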

I flag failed samples, suspicious clusters (stress signatures, cell-cycle dominance, contaminating tissues), and technical artefacts before they pollute your downstream analysis — a 30-second sanity check that saves you a week of confused re-clustering.
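As a rough illustration of a failed-sample check, the sketch below compares each sample's median genes per cell with the median across all samples and flags any sample that falls below a chosen fraction of it. The 0.5 fraction, the function name, and the sample values are hypothetical; a real check would likely look at several metrics together.

```python
from statistics import median

def flag_failed_samples(genes_per_cell, min_fraction=0.5):
    """Return names of samples whose median genes/cell is far below the cohort's."""
    sample_medians = {name: median(v) for name, v in genes_per_cell.items()}
    cohort_median = median(sample_medians.values())
    return sorted(
        name
        for name, m in sample_medians.items()
        if m < min_fraction * cohort_median
    )

# Toy genes-per-cell values for four samples (made-up numbers).
genes_per_cell = {
    "sample_1": [2100, 1900, 2400, 2250],
    "sample_2": [2000, 2150, 1850, 2300],
    "sample_3": [450, 380, 520, 610],
    "sample_4": [1950, 2050, 2200, 2350],
}
failed = flag_failed_samples(genes_per_cell)
print(failed)
```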

I recommend the right normalization (log1p vs scran vs SCT vs Pearson residuals), HVG selection strategy, and dimensionality choices for your data size and complexity, with a one-paragraph explanation of why.
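For readers new to these options, here is a minimal sketch of the simplest one, library-size scaling followed by log1p, written in plain Python so the arithmetic is visible. The function name and the target of 10,000 counts per cell are assumptions for illustration; scran, SCTransform, and Pearson residuals work differently and are not shown.

```python
import math

def normalize_log1p(counts, target_sum=10_000):
    """Scale one cell's counts to target_sum, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to scale; return zeros.
        return [0.0 for _ in counts]
    return [math.log1p(c * target_sum / total) for c in counts]

# Two toy cells with the same composition but different sequencing depth.
cell_a = [10, 0, 90]     # 100 UMIs in total
cell_b = [100, 0, 900]   # 1,000 UMIs in total
norm_a = normalize_log1p(cell_a)
norm_b = normalize_log1p(cell_b)
print(norm_a)
print(norm_b)
```

After scaling, the two cells have matching values, which is the point of this step: differences in sequencing depth are removed so that the remaining variation reflects expression.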

Examples of what you can ask me

Copy any of these straight into the demo, or adapt them to your data.

  • "Run full QC on my raw 10x output and recommend filtering thresholds."
  • "Detect and remove doublets in my dataset."
  • "Is sample 3 a failed run? Compare its QC metrics to the others."
  • "Remove ambient RNA contamination with SoupX."
  • "Recommend normalization and HVG strategy for my 200k-cell dataset."
  • "Flag any clusters dominated by stress or cell-cycle signatures."

How I work

I run real Scanpy (Python) or Seurat (R) code on the secure MCP server — no hallucinations, no made-up gene lists. Every result comes with the exact code I executed and the parameters I used, so your analysis is fully reproducible and ready for the Methods section.

Best for

Anyone running single-cell analysis, especially first-timers, core facility staff handling many users' data, and PIs who want a sanity check when reviewing student or postdoc analyses.

References

  • EmptyDrops (Lun et al., 2019) – Genome Biology
  • CellBender (Fleming et al., 2023) – Nature Methods
  • Scrublet (Wolock et al., 2019) – Cell Systems
  • scDblFinder (Germain et al., 2021) – F1000Research
  • DoubletFinder (McGinnis et al., 2019) – Cell Systems
  • SoupX (Young & Behjati, 2020) – GigaScience
  • DecontX (Yang et al., 2020) – Genome Biology
  • scran (Lun et al., 2016) – F1000Research
  • SCTransform (Hafemeister & Satija, 2019) – Genome Biology

Try Quality Control now

Jump into the demo with a starter prompt already loaded. Upload your data, or play with our example dataset first.
