Neural networks and supervised machine learning (ML) techniques can characterise cells studied using single-cell RNA sequencing (scRNA-seq), scientists from Carnegie Mellon University (CMU) have found. This could help other researchers identify new cell subtypes and distinguish diseased cells from healthy ones.

The new method, detailed in Nature Communications, analyses all of the scRNA-seq data and selects the parameters that distinguish one cell from another. This means all cell types can be analysed and compared. The scQuery web server makes the method available to all researchers.

The researchers said scRNA-seq studies profile thousands of cells in heterogeneous environments: “Current methods for characterising cells perform unsupervised analysis followed by assignment using a small set of known marker genes. Such approaches are limited to a few, well-characterised cell types. We developed an automated pipeline to download, process, and annotate publicly available scRNA-seq datasets to enable large-scale supervised characterisation.

“We extend supervised neural networks to obtain efficient and accurate representations for scRNA-seq data. We apply our pipeline to analyse data from over 500 different studies with over 300 unique cell types and show that supervised methods outperform unsupervised methods for cell type identification.”
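The supervised idea described in the quote can be illustrated with a toy sketch: learn from cells whose type is already annotated, then assign a type to a new cell. The sketch below is not the researchers' pipeline. It uses a nearest-centroid classifier as a deliberately simple stand-in for their neural networks, and the cell-type names, expression values and function names are all hypothetical.

```python
import math
import random

# Hypothetical expression levels over four genes for three made-up cell types.
CENTRES = {
    "type_a": [5.0, 0.5, 0.5, 1.0],
    "type_b": [0.5, 5.0, 0.5, 1.0],
    "type_c": [0.5, 0.5, 5.0, 1.0],
}


def simulate_cells(rng, centre, n, noise=0.5):
    """Draw n synthetic expression profiles scattered around one centre."""
    return [[max(0.0, g + rng.gauss(0.0, noise)) for g in centre] for _ in range(n)]


def fit_centroids(profiles, labels):
    """Supervised step: average the labelled profiles of each cell type."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, profile):
    """Assign a query cell to the cell type with the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))


# Build a labelled training set and a held-out test set from synthetic data.
rng = random.Random(0)
train_x, train_y, test_x, test_y = [], [], [], []
for label, centre in CENTRES.items():
    for cell in simulate_cells(rng, centre, 40):
        train_x.append(cell)
        train_y.append(label)
    for cell in simulate_cells(rng, centre, 10):
        test_x.append(cell)
        test_y.append(label)

centroids = fit_centroids(train_x, train_y)
correct = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the labels guide the model directly, no marker genes need to be chosen by hand. Real scRNA-seq data has thousands of genes per cell and far noisier measurements than this synthetic example.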

The new analysis method is set to support the National Institutes of Health’s Human BioMolecular Atlas Program, which is building a 3D map of the human body that shows how tissues differ at the cellular level.