Benchmarks
SpatialPerturb currently fixes its benchmarks to three main tracks:
- shen_2026_core: reproduce the intrinsic / neighbor / ligand-receptor / power / figure main chain on spatial perturbation data.
- cross_platform_concordance: compare perturbation signatures and programs between spatial and dissociated reference data.
- breast_reference_projection: project breast Perturb-seq reference programs onto unperturbed Xenium WTA tissue and output cell-level, cell-type/ROI-level, and neighborhood-level program scores.
Browsing the catalog
import spatialperturb as sp
sp.available_datasets()
sp.available_benchmarks()
Public benchmark backbone
gse241115_breast_cropseq
- accession: GSE241115
- role: primary breast cancer CROP-seq reference for reference projection
- raw format: flat GEO RAW.tar with 10x mtx/tsv files and protospacer_calls_per_cell.csv.gz
- status: automatic fetch -> prepare -> load supported
- note: sgRNA / intergenic guide features are tracked as barcode_columns and excluded from expression DE/program genes
gse281048_pathway_atlas
- accession: GSE281048
- role: optional pathway Perturb-seq atlas; default downstream filter is cell_line == "MCF7"
- raw format: Seurat .rds.gz
- status: automatic fetch supported; prepare requires Rscript and Seurat
shen_2026_scrnaseq
- accession: GSE274058
- role: reference / cross-platform track
- raw format: nested 10x tar.gz
- status: automatic fetch -> prepare -> load supported
shen_2026_stereoseq
- accession: GSE274447
- role: spatial core track
- raw format: tar of GEF
- status: automatic fetch and extraction supported; final prepare still expects a preconverted .h5ad or tabular cell-level export
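The four entries above can be condensed into a small lookup table, which makes it easy to see which datasets need an extra manual or external step before they load. This is only a sketch: `DatasetEntry`, its field names, and the `fully_automatic` flag are hypothetical and not part of the spatialperturb API; only the dataset names, accessions, and status notes are taken from the catalog above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical record mirroring one catalog entry (not a spatialperturb class)."""
    name: str
    accession: str
    role: str
    fully_automatic: bool  # True when fetch -> prepare -> load needs no extra tooling
    caveat: str = ""


CATALOG = [
    DatasetEntry("gse241115_breast_cropseq", "GSE241115",
                 "primary breast cancer CROP-seq reference", True),
    DatasetEntry("gse281048_pathway_atlas", "GSE281048",
                 "optional pathway Perturb-seq atlas", False,
                 "prepare requires Rscript and Seurat"),
    DatasetEntry("shen_2026_scrnaseq", "GSE274058",
                 "reference / cross-platform track", True),
    DatasetEntry("shen_2026_stereoseq", "GSE274447",
                 "spatial core track", False,
                 "prepare expects a preconverted .h5ad or tabular export"),
]


def needs_manual_step(catalog):
    """Return the names of datasets whose prepare step has an external requirement."""
    return [entry.name for entry in catalog if not entry.fully_automatic]


print(needs_manual_step(CATALOG))
```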
Running the core benchmark
import spatialperturb as sp

results = sp.run_core_benchmark(
    "demo_spatialperturb",
    config={
        "cache_dir": ".spatialperturb-cache",
        "method": "pseudobulk",
        "sample_col": "sample",
        "reference_dataset": "demo_spatialperturb",
        "concordance_level": "both",
    },
    output_dir="reports/demo_spatialperturb",
)
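This entry point automatically:
- loads the prepared dataset
- builds the spatial graph (if it has not been built yet)
- runs intrinsic_de
- runs neighbor_de
- runs differential_lr
- runs power_curve
- runs platform_concordance, if a reference was provided
- writes tables, figures, manifest.json, and input.h5ad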
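The order of these steps, including the two conditional ones, can be written down as a short planning function. This is an illustrative sketch, not spatialperturb code: `plan_core_steps` and its two boolean arguments are hypothetical, and the step labels simply follow the list above.

```python
def plan_core_steps(has_spatial_graph, has_reference):
    """Return the ordered step names the core benchmark is described as running.

    Hypothetical helper: the labels follow the documented list, but this
    function is not part of spatialperturb.
    """
    steps = ["load_prepared_dataset"]
    if not has_spatial_graph:
        # The graph is only built when the dataset does not already carry one.
        steps.append("build_spatial_graph")
    steps += ["intrinsic_de", "neighbor_de", "differential_lr", "power_curve"]
    if has_reference:
        # Concordance needs a reference dataset to compare against.
        steps.append("platform_concordance")
    steps.append("write_outputs")
    return steps


print(plan_core_steps(has_spatial_graph=False, has_reference=True))
```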
Running the cross-platform benchmark
import spatialperturb as sp

# Load the paired demo data: one spatial dataset and one dissociated reference.
spatial, reference = sp.load_demo_dataset(paired=True)

# Run the same intrinsic DE contrast on both platforms.
spatial_de = sp.intrinsic_de(
    spatial,
    perturbation="Lrrk2",
    control="control",
    method="pseudobulk",
    sample_col="sample",
)
reference_de = sp.intrinsic_de(
    reference,
    perturbation="Lrrk2",
    control="control",
    method="pseudobulk",
    sample_col="sample",
)

# Compare the two DE results.
concordance = sp.run_cross_platform_benchmark(
    spatial_de,
    reference_de,
    config={"top_n": 50, "level": "both"},
)
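To make the idea of signature concordance concrete, here is a minimal gene-level sketch in plain Python. It is not what `run_cross_platform_benchmark` computes: the two metrics (Jaccard overlap of the top-ranked genes and sign agreement on shared genes), the function names, and the toy values are all assumptions chosen for illustration.

```python
def top_genes(effects, top_n):
    """Return the top_n genes ranked by absolute effect size."""
    ranked = sorted(effects, key=lambda gene: abs(effects[gene]), reverse=True)
    return set(ranked[:top_n])


def signature_concordance(spatial_effects, reference_effects, top_n):
    """Compare two {gene: log2 fold-change} signatures.

    Returns the Jaccard overlap of the top_n genes and the fraction of shared
    genes whose effect has the same sign on both platforms.
    """
    top_spatial = top_genes(spatial_effects, top_n)
    top_reference = top_genes(reference_effects, top_n)
    union = top_spatial | top_reference
    jaccard = len(top_spatial & top_reference) / len(union) if union else 0.0

    shared = set(spatial_effects) & set(reference_effects)
    same_sign = sum(
        1 for gene in shared
        if spatial_effects[gene] * reference_effects[gene] > 0
    )
    sign_agreement = same_sign / len(shared) if shared else 0.0
    return {"jaccard_top_n": jaccard, "sign_agreement": sign_agreement}


# Toy values, invented purely for illustration.
spatial = {"GeneA": 2.0, "GeneB": -1.5, "GeneC": 0.2, "GeneD": -0.1}
reference = {"GeneA": 1.8, "GeneB": -0.3, "GeneC": -1.0, "GeneD": -0.2}
print(signature_concordance(spatial, reference, top_n=2))
```

With these toy values only GeneA is in both top-2 sets, and three of the four shared genes change in the same direction.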
Running the breast reference projection benchmark
import spatialperturb as sp

results = sp.run_reference_projection_benchmark(
    "/data/taobo.hu/SpatialPerturb/prepared/xenium_wta_breast.h5ad",
    reference_datasets=["gse241115_breast_cropseq"],
    config={
        "cache_dir": "/data/taobo.hu/SpatialPerturb/cache",
        "k": 15,
        "groupby": ["cell_type", "roi"],
        "reference_effect_size_only": True,
    },
    output_dir="/data/taobo.hu/SpatialPerturb/reports/breast_reference_projection",
)
The outputs include:
- tables/program_scores_cell_level.tsv.gz
- tables/program_scores_by_group.tsv
- tables/neighbor_program_scores_cell_level.tsv.gz
- tables/neighbor_program_scores_by_group.tsv
- tables/reference_de.tsv
- tables/reference_program_membership.tsv
- figures/program_scores_heatmap.png
- figures/neighbor_program_scores_heatmap.png
- manifest.json
- biological_interpretation.md
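One plausible reading of the neighbor_program_scores tables is that each cell's neighborhood score averages the program scores of its k nearest spatial neighbors. The sketch below illustrates that reading only; how spatialperturb actually defines neighborhoods and aggregates scores is not stated here, so `neighbor_scores`, the Euclidean distance, and the exclusion of the cell itself are assumptions. It needs at least two cells and k >= 1.

```python
import math


def neighbor_scores(coords, scores, k):
    """Average each cell's k nearest neighbors' scores (the cell itself excluded)."""
    result = []
    for i, point in enumerate(coords):
        # Distances from cell i to every other cell, paired with that cell's index.
        distances = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(coords)
            if j != i
        )
        nearest = [j for _, j in distances[:k]]
        result.append(sum(scores[j] for j in nearest) / len(nearest))
    return result


# Toy coordinates and per-cell program scores, invented for illustration.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
scores = [1.0, 3.0, 5.0, 7.0]
print(neighbor_scores(coords, scores, k=2))
```

This brute-force version is quadratic in the number of cells; a real tissue section would need a spatial index or a prebuilt neighbor graph.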
In full-scale runs, reference_effect_size_only=True builds programs from a log2 fold-change ranking; this is suitable for program projection, but the p-value/FDR columns in reference_de.tsv should not be used for significance claims.
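A ranking-only program can be sketched as follows: genes are ordered by log2 fold-change and the top ones form the program, with no p-value consulted at any point. This is a simplified illustration rather than the library's procedure; the `gene` and `log2fc` column names, the choice of upregulated genes only, and the mean-expression score are assumptions.

```python
def build_program(reference_de, top_n):
    """Pick the top_n genes by log2 fold-change, ignoring p-values entirely."""
    ranked = sorted(reference_de, key=lambda row: row["log2fc"], reverse=True)
    return [row["gene"] for row in ranked[:top_n]]


def program_score(expression, program):
    """Score one cell as the mean expression of the program genes it measures."""
    values = [expression[gene] for gene in program if gene in expression]
    return sum(values) / len(values) if values else 0.0


# Toy rows, invented for illustration; the column names are assumptions.
reference_de = [
    {"gene": "GeneA", "log2fc": 2.5},
    {"gene": "GeneB", "log2fc": -1.0},
    {"gene": "GeneC", "log2fc": 1.2},
    {"gene": "GeneD", "log2fc": 0.1},
]
program = build_program(reference_de, top_n=2)
cell = {"GeneA": 4.0, "GeneC": 2.0, "GeneD": 9.0}
print(program, program_score(cell, program))
```

Because membership depends only on rank, a gene can enter a program without any evidence that its change is statistically significant, which is why the significance columns should not be quoted from such a run.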
Benchmark output directory
run_core_benchmark(..., output_dir=...) produces a fixed directory structure:
- tables/intrinsic_de.tsv
- tables/neighbor_de.tsv
- tables/differential_lr.tsv
- tables/power_curve.tsv
- tables/platform_concordance.tsv (if a reference is provided)
- figures/workflow_schema.png
- figures/assignment_qc.png
- figures/own_vs_neighbor.png
- figures/lr_differential.png
- figures/platform_concordance.png
- figures/power_curve.png
- manifest.json
- config.json
- input.h5ad
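Since the layout is fixed, a finished run can be sanity-checked with a few lines of `pathlib`. The helper below is a hypothetical convenience, not something spatialperturb ships; it covers only the tables and top-level files from the list above and leaves the figures out.

```python
from pathlib import Path

# Relative paths copied from the documented layout (figures omitted).
ALWAYS = [
    "tables/intrinsic_de.tsv",
    "tables/neighbor_de.tsv",
    "tables/differential_lr.tsv",
    "tables/power_curve.tsv",
    "manifest.json",
    "config.json",
    "input.h5ad",
]
# Only written when a reference dataset was provided.
WITH_REFERENCE = ["tables/platform_concordance.tsv"]


def missing_outputs(output_dir, has_reference=False):
    """Return the expected relative paths that do not exist under output_dir."""
    expected = ALWAYS + (WITH_REFERENCE if has_reference else [])
    root = Path(output_dir)
    return [rel for rel in expected if not (root / rel).is_file()]
```

For example, `missing_outputs("reports/demo_spatialperturb", has_reference=True)` would return an empty list for a complete run that included a reference.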