Like other droplet-based single-cell assays, single-nucleus ATAC-seq (snATAC-seq) data harbor multiplets that confound downstream analyses. Detecting multiplets in snATAC-seq data is particularly challenging because the data are sparse and trinary (0 reads: closed chromatin; 1 read: open on one allele; 2 reads: open on both alleles). The same property offers a unique opportunity: multiplets can be inferred when >2 uniquely aligned reads are observed at multiple loci. Here, we implemented the first read count-based multiplet detection method, ATAC-DoubletDetector, which detects multiplets independently of cell type. On PBMC and pancreatic islet datasets, ATAC-DoubletDetector captured simulated heterotypic multiplets (formed from different cell types) with ~0.60 recall, a ~24% improvement over the state of the art. It detected homotypic multiplets with ~0.61 recall, making it the first method to detect multiplets originating from the same cell type. Using our novel clustering-based algorithm, multiplets were annotated to their cellular origins with ~85% accuracy. Applying ATAC-DoubletDetector will improve downstream analysis of snATAC-seq data.
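The read-count idea described above can be illustrated with a minimal sketch. This is not the ATAC-DoubletDetector implementation; it only shows the underlying logic under stated assumptions: reads are already uniquely aligned and deduplicated, each read is given as a hypothetical `(barcode, chrom, start, end)` tuple with half-open coordinates, and the function names and the `min_loci` cutoff are illustrative choices rather than values from the paper.

```python
from collections import defaultdict


def count_overcovered_loci(reads, max_copies=2):
    """Count, per cell barcode, the loci where more than `max_copies` reads overlap.

    `reads` is an iterable of (barcode, chrom, start, end) tuples with
    half-open coordinates. This input layout is an assumption of the sketch.
    """
    by_cell_chrom = defaultdict(list)
    for barcode, chrom, start, end in reads:
        by_cell_chrom[(barcode, chrom)].append((start, end))

    counts = defaultdict(int)
    for (barcode, _chrom), intervals in by_cell_chrom.items():
        # Sweep line: +1 where a read starts, -1 where it ends.
        events = []
        for start, end in intervals:
            events.append((start, 1))
            events.append((end, -1))
        # At equal positions, -1 sorts before +1, so reads that merely
        # touch end-to-start are not counted as overlapping.
        events.sort()

        depth = 0
        in_locus = False
        for _pos, delta in events:
            depth += delta
            if depth > max_copies and not in_locus:
                counts[barcode] += 1  # entered a new over-covered locus
                in_locus = True
            elif depth <= max_copies:
                in_locus = False
        counts.setdefault(barcode, 0)  # keep cells with no such loci
    return dict(counts)


def flag_multiplets(loci_counts, min_loci=2):
    """Flag barcodes with at least `min_loci` over-covered loci.

    `min_loci` is a hypothetical cutoff chosen for illustration only.
    """
    return {bc for bc, n in loci_counts.items() if n >= min_loci}


# Toy data: cell "A" has three overlapping reads at two separate loci,
# cell "B" never exceeds two overlapping reads.
toy_reads = [
    ("A", "chr1", 100, 200), ("A", "chr1", 120, 220), ("A", "chr1", 150, 250),
    ("A", "chr1", 1000, 1100), ("A", "chr1", 1050, 1150), ("A", "chr1", 1080, 1180),
    ("B", "chr1", 100, 200), ("B", "chr1", 120, 220),
]
toy_counts = count_overcovered_loci(toy_reads)
toy_flags = flag_multiplets(toy_counts)
```

On the toy data, only cell "A" is flagged, because a diploid nucleus should contribute at most two reads at any one locus, so repeated excess coverage suggests more than one nucleus behind the barcode. Requiring several such loci rather than one is a design choice of this sketch to tolerate isolated alignment artifacts.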