[Special Lecture] 110/5/13 (Thu) 15:30-16:30, 朱是鍇, Assistant Research Fellow (Medicine)

Abstract

Normalization and batch correction are critical steps in processing single-cell RNA sequencing (scRNA-seq) data: they remove unwanted technical effects and systematic biases that mask the biological signal of interest. Although numerous computational methods have been developed, there is no clear guidance on choosing appropriate procedures for different scenarios. In this study, we assess the performance of popular scRNA-seq noise reduction procedures (i.e., combinations of normalization and batch correction methods) to help users select a suitable method for each scenario. We use both synthetic and real datasets to set up multiple scenarios that vary the relative magnitude of the batch effect, the imbalance of cell group composition, the number of batches and cell groups, the dropout rate, and the library size. After adjustment, we calculate multiple quantitative metrics that measure batch effect removal, retention of within-batch cell structure, and retention of within-batch gene structure. The results show that most procedures can remove batch effects when these are not confounded with biological effects, even when batch effects are the major contributor to the variation. Moreover, imbalanced composition of cell groups differentiates the performance of the methods. The results also show that the choice of normalization method can greatly affect the overall performance of the correction. The results of this study serve as a guideline for selecting suitable noise reduction procedures.