Semi-parametric Empirical Bayes Method for Multiplet Detection in snATAC-seq with Probabilistic Multi-omic Integration.
Yuntian Wu, Haoran Hu, Wei Chen, Johann E Gudjonsson, Lam C Tsoi, Xiaoquan Wen
Abstract
Multiplets, formed when multiple cells are captured in the same droplet, produce hybrid molecular profiles that confound single-cell analyses. Detecting multiplets in single-nucleus ATAC-seq (snATAC-seq) data is particularly challenging due to the sparsity and overdispersion of chromatin accessibility measurements. We introduce SEBULA, a semi-parametric empirical Bayes model that yields well-calibrated posterior probabilities for multiplet detection, enabling principled false discovery rate control. SEBULA also integrates probabilistic evidence with complementary signals from other modalities, such as scRNA-seq. Benchmarking on simulations and seven annotated trimodal DOGMA-seq datasets demonstrates SEBULA's superior performance. The open-source software is computationally efficient.
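
To illustrate how calibrated posterior probabilities can translate into false discovery rate control, the following is a minimal sketch of the standard Bayesian FDR selection rule. It is not taken from the SEBULA software: the function name `bayesian_fdr_select`, its parameters, and the example probabilities are hypothetical. The rule treats one minus the posterior multiplet probability as a local false discovery rate, ranks droplets from most to least confident, and calls the largest set whose average local false discovery rate stays at or below the target level.

```python
import numpy as np


def bayesian_fdr_select(post_prob, alpha=0.05):
    """Flag droplets as multiplets at a target Bayesian FDR.

    Hypothetical helper, not part of SEBULA.
    post_prob: posterior probability that each droplet is a multiplet.
    alpha: target false discovery rate among the called multiplets.
    """
    post_prob = np.asarray(post_prob, dtype=float)
    lfdr = 1.0 - post_prob  # local false discovery rate per droplet
    order = np.argsort(lfdr, kind="stable")  # most confident calls first
    # Mean lfdr of the top-k droplets estimates the FDR of calling those k.
    running_fdr = np.cumsum(lfdr[order]) / np.arange(1, lfdr.size + 1)
    # The running mean of sorted values is non-decreasing, so the last
    # index within the target gives the largest admissible set.
    passing = np.nonzero(running_fdr <= alpha)[0]
    k = passing[-1] + 1 if passing.size else 0
    selected = np.zeros(lfdr.size, dtype=bool)
    selected[order[:k]] = True
    return selected


# Made-up posterior probabilities for five droplets (illustration only).
example_probs = [0.99, 0.98, 0.90, 0.50, 0.10]
calls = bayesian_fdr_select(example_probs, alpha=0.05)
```

This rule controls the expected proportion of false calls only to the extent that the posterior probabilities are well calibrated, which is why calibration matters for the approach described above.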