Author affiliations: La Pitie Salpetriere Hosp Brain & Spine Inst Bioinformat & Biostat Core Facil Paris France; Pierre & Marie Curie Univ Paris France; Inst Cardiometab & Nutr ICANalyt Dept Paris France; MedDay Pharmaceut Co SpectMet Platform Paris France; CEA Saclay LEMM Lab Gif Sur Yvette France; Inst Pasteur Stat Genet Grp Paris France; Bioinformat Biostat Core Facil Paris France; Pitie Salpetriere Univ Hosp Genet Paris France; Univ Pierre & Marie Curie UPMC Genet Paris France; Pitie Salpetriere Univ Hosp Paris France; CentraleSupelec Lab L2S Gif Sur Yvette France
Publication: BRIEFINGS IN BIOINFORMATICS
Year, volume, issue: 2018, Vol. 19, No. 6
Pages: 1356-1369
Subject classification: 0710 [Science - Biology]; 07 [Science]; 09 [Agriculture]
Funding: Assistance-Publique des Hopitaux de Paris; French Ministry of Health (PHRC BIOSCA) [RCB: 2010-A01324-35]; Cognacq-Jay foundation; program 'Investissements d'avenir' [ANR-10-IAIHU-06]; patients' association Connaitre les Syndromes Cerebelleux (CSC)
Keywords: data integration; Regularized Generalized Canonical Correlation Analysis; biomarker discovery; spinocerebellar ataxia
Abstract: The growing number of modalities (e.g. multi-omics, imaging and clinical data) characterizing a given disease provides physicians and statisticians with complementary facets reflecting the disease process but emphasizes the need for novel statistical methods of data analysis able to unify these views. Such data sets are indeed intrinsically structured in blocks, where each block represents a set of variables observed on a group of individuals. Therefore, classical statistical tools cannot be applied without altering their organization, with the risk of information loss. Regularized generalized canonical correlation analysis (RGCCA) and its sparse generalized canonical correlation analysis (SGCCA) counterpart are component-based methods for exploratory analyses of data sets structured in blocks of variables. Rather than operating sequentially on parts of the measurements, the RGCCA/SGCCA-based integrative analysis method aims at summarizing the relevant information between and within the blocks. It processes a priori information defining which blocks are supposed to be linked to one another, thus reflecting hypotheses about the biology underlying the data blocks. It also requires the setting of extra parameters that need to be carefully ***, we provide practical guidelines for the use of RGCCA/SGCCA. We also illustrate the flexibility and usefulness of RGCCA/SGCCA on a unique cohort of patients with four genetic subtypes of spinocerebellar ataxia, in which we obtained multiple data sets from brain volumetry and magnetic resonance spectroscopy, and metabolomic and lipidomic analyses. As a first step toward the extraction of multimodal biomarkers, and through the reduction to a few meaningful components and the visualization of relevant variables, we identified possible markers of disease progression.