Integrated machine learning and multimodal data fusion for patho-phenotypic feature recognition in iPSC models of dilated cardiomyopathy


Wali R, Xu H, Cheruiyot C, Saleem HN, Janshoff A, Habeck M, Ebert A


Biological Chemistry


Biol Chem. 2024 Apr 24.


Integrating multiple data sources remains a challenge for the accurate prediction of molecular patho-phenotypic features in automated analyses of data from human model systems. Here, we applied machine learning-based data integration to distinguish patho-phenotypic features at the subcellular level in dilated cardiomyopathy (DCM). We employed a human induced pluripotent stem cell-derived cardiomyocyte (iPSC-CM) model carrying a DCM mutation in the sarcomere protein troponin T (TnT), TnT-R141W, and compared it to isogenic healthy (WT) control iPSC-CMs. We established a multimodal data fusion (MDF)-based analysis to integrate source datasets for Ca²⁺ transients, force measurements, and contractility recordings. Data were acquired for three additional layer types: single cells, cell monolayers, and 3D spheroid iPSC-CM models. A non-negative blind deconvolution (NNBD)-based method was applied for data analysis, numerical conversion, and fusion of the data from Ca²⁺ transients, force measurements, and contractility recordings. Using an XGBoost algorithm, we found a high prediction accuracy for fused single cell, monolayer, and 3D spheroid iPSC-CM models (≥92 ± 0.08 %), as well as for fused Ca²⁺ transient, beating force, and contractility models (>96 ± 0.04 %). Integrating MDF and XGBoost provides a highly effective analysis tool for the prediction of patho-phenotypic features in complex human disease models such as DCM iPSC-CMs.
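
To illustrate the general idea of fusing several measurement modalities and reporting classification accuracy as mean ± standard deviation, the following is a minimal sketch and not the authors' pipeline. It makes several assumptions: the data are synthetic random numbers, the fusion step is plain per-feature normalization followed by concatenation (standing in for the NNBD-based method), and a nearest-centroid classifier stands in for XGBoost so the example needs only the Python standard library. All function names and feature counts are hypothetical.

```python
import random
import statistics


def zscore(column):
    """Scale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column) or 1.0  # avoid dividing by zero
    return [(value - mean) / sd for value in column]


def fuse(*modalities):
    """Feature-level fusion: normalize every feature, then concatenate per sample.

    Simplification: normalization uses all samples, including later test folds.
    """
    fused_columns = []
    for samples in modalities:  # samples: one feature vector per cell
        fused_columns.extend(zscore(col) for col in zip(*samples))
    return [list(row) for row in zip(*fused_columns)]


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (NOT XGBoost): assign the label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda lab: squared_distance(x, centroids[lab]))
            for x in test_x]


def cross_validated_accuracy(x, y, folds=5, seed=0):
    """Return (mean, sample SD) of accuracy across k folds."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    scores = []
    for k in range(folds):
        test_idx = set(indices[k::folds])
        train = [i for i in indices if i not in test_idx]
        test = sorted(test_idx)
        predicted = nearest_centroid_predict(
            [x[i] for i in train], [y[i] for i in train], [x[i] for i in test])
        correct = sum(p == y[i] for p, i in zip(predicted, test))
        scores.append(correct / len(test))
    return statistics.fmean(scores), statistics.stdev(scores)


def synthetic_modality(rng, n_per_class, n_features, shift):
    """Synthetic data only: two Gaussian groups whose means differ by `shift`."""
    control = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_per_class)]
    mutant = [[rng.gauss(shift, 1.0) for _ in range(n_features)]
              for _ in range(n_per_class)]
    return control + mutant


rng = random.Random(42)
n = 50  # hypothetical number of cells per genotype
calcium = synthetic_modality(rng, n, 3, 1.0)        # e.g. Ca2+ transient features
force = synthetic_modality(rng, n, 2, 1.0)          # e.g. force features
contractility = synthetic_modality(rng, n, 2, 1.0)  # e.g. contractility features
labels = ["WT"] * n + ["TnT-R141W"] * n

fused = fuse(calcium, force, contractility)
mean_acc, sd_acc = cross_validated_accuracy(fused, labels)
print(f"fused accuracy: {mean_acc:.2f} ± {sd_acc:.2f}")
```

Passing a single modality to fuse() instead of all three gives a simple way to compare fused and unfused accuracy on the same folds.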

