
Integration of single-cell multiomics data allows a more precise identification of rare cell types and states

Researchers at the Josep Carreras Leukaemia Research Institute have demonstrated that combining data from different origins enables a more precise characterisation of cell type diversity in tissues and organs. The team also introduced scOMM, a new machine learning tool for classifying and comparing cell types across distinct data modalities. The tool is openly available to researchers worldwide. It has been successfully tested in tissue from the kidney, one of the most cellularly diverse organs in the human body, and validated in blood and heart tissue. As the results demonstrate, when it comes to understanding complex tissues, the whole is greater than the sum of its parts.


Characterising all cell types in the human body is essential to understand how our tissues and organs work. This information can drive major advances in healthcare and medicine through a deep understanding of the subtle interplay among the different components of every organ. This is the core objective of the Human Cell Atlas, an international collaborative research consortium with 18 scientific networks across 103 countries.

This is a challenging task because organs contain many kinds of cells, some of them in small numbers. To distinguish one cell type from another, researchers must profile molecular features, such as which genes are active or which regions of the DNA are accessible for gene regulation. To do so, they use several single-cell analysis methods, such as scRNAseq or snATACseq, each capturing part of the story but none capturing all of it. Wouldn’t it be great to integrate different methodologies to gain insight and resolution?

This is the subject of the latest publication from the Cellular Systems Genomics Group at the Josep Carreras Leukaemia Research Institute, published in the prestigious open-access journal Genome Biology. The team, led by Dr. Elisabetta Mereu, also developed scOMM, an interpretable machine learning algorithm capable of classifying cell types consistently across different single-cell methods and of evaluating how well different integration strategies perform. Together, the integration strategies and scOMM establish a robust approach for cell atlas generation in highly complex tissues.

As a proof of concept, the team, spearheaded by Mario Acera Mateos and Jessica Kanglin Li together with colleagues from MIT and Harvard University, characterized kidney samples from 19 donors. After analysing 199,744 cells, they identified two rare cell types that are known to be relevant in diseased organs and had previously gone undetected in the available kidney cell atlases.

After replicating the benchmarking analysis in heart and independent kidney datasets, the team concluded that the improvements gained from data integration generalize, underscoring the robustness and transferability of their framework across tissues and experimental protocols.

Studies like this open the door to using integrative tools to better characterize the diversity of cell types and functional cellular states in the bone marrow of leukaemia patients or the lymph nodes of lymphoma patients, providing a deeper understanding of the cellular heterogeneity of these diseases.

Reference article:

Acera-Mateos, M., Adiconis, X., Li, JK. et al. “Systematic evaluation of single-cell multimodal data integration enhances cell type resolution and discovery of clinically relevant states in complex tissues”. Genome Biol 27, 64 (2026). https://doi.org/10.1186/s13059-026-04002-4


