Author affiliations: Center for Genetics, Children’s Hospital Oakland Research Institute, Oakland, CA 94609, United States; Knowledge Synthesis Inc., Berkeley, CA 94710, United States; Department of Computer Science, University of San Francisco, San Francisco, CA 94117, United States; Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, United States; Division of Bioinformatics, National Marrow Donor Program, Minneapolis, MN 55413, United States
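Publication: Human Immunology
Year, volume, issue: 2014, Vol. 75, Issue 6
Pages: 481
Subject classification: 1001 [Medicine - Basic Medicine (medical or science degrees may be awarded)]; 10 [Medicine]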
Abstract: Aim: Analyses of highly polymorphic HLA and KIR locus data require specialized tools and methods, but most modern analytical programs are tailored for SNPs. A typical analytical workflow for HLA and KIR data requires moving data between several programs, which is time-intensive and error-prone and limits reproducibility. The IDAWG has been developing an integrated data-management and analysis system tailored to these important genetic systems. Methods: We have developed the Toolkit for Immunogenomic Data Exchange and Storage (TIDES) and Bridging ImmunoGenomic Data-Analysis Workflow Gaps (BIGDAWG) systems to address the unmet need for consistent HLA and KIR data management and analysis. TIDES is a free, open-source system that converts HLA and KIR genotypes derived from widely used genotyping platforms into Genotype List (GL) Strings, which are registered with the NMDP’s GL Service (***). BIGDAWG is an automated pipeline, scripted in R, that performs common data analyses of the multi-locus, highly polymorphic genetic data characteristic of HLA and KIR genes. Results: TIDES stores GL String-encoded data with project-related data and exports these to PED and POP data-analysis formats. TIDES can be deployed on an AWS EC2 or equivalent Linux environment, on a local network, or on a standalone machine, per user preference. A prototype TIDES implementation is deployed at ***. Starting with unambiguous multi-locus genotype data for case-control groups, BIGDAWG estimates user-specified haplotypes, bins low-frequency haplotypes to enable chi-squared testing, calculates odds ratios, confidence intervals and p-values for each haplotype, and generates figures and tables for each comparison. Conclusions: The integration of TIDES and BIGDAWG will create a workflow that accepts ambiguous genotype data and returns case-control analysis results. This approach streamlines data analysis and allows the consistent storage and exchange of HLA and KIR genot
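Illustrative sketch (not part of the catalog record or of BIGDAWG itself): the case-control steps named in the abstract, binning low-frequency haplotypes and then computing an odds ratio, confidence interval and chi-squared p-value per haplotype, might look roughly like the Python below. BIGDAWG is written in R; the function names `bin_rare` and `two_by_two`, the `min_count` threshold, the "binned" label and the use of a Woolf (log-based) 95% interval are all assumptions made for this example, and the haplotype counts are invented toy data.

```python
import math
from collections import Counter


def bin_rare(cases, controls, min_count=5):
    """Pool haplotypes seen fewer than min_count times in either group.

    The threshold rule is an assumption for illustration; the abstract only
    says that low-frequency haplotypes are binned.
    """
    case_n, ctrl_n = Counter(cases), Counter(controls)
    binned_case, binned_ctrl = Counter(), Counter()
    for hap in set(case_n) | set(ctrl_n):
        rare = min(case_n[hap], ctrl_n[hap]) < min_count
        key = "binned" if rare else hap
        binned_case[key] += case_n[hap]
        binned_ctrl[key] += ctrl_n[hap]
    return binned_case, binned_ctrl


def two_by_two(a, b, c, d):
    """Odds ratio, Woolf 95% CI, chi-squared statistic and p-value (1 df).

    Table layout: a = cases with the haplotype, b = cases without,
                  c = controls with the haplotype, d = controls without.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-squared survival, 1 df
    return odds_ratio, ci, chi2, p_value


# Toy haplotype observations (hypothetical labels and counts).
cases = ["A"] * 30 + ["B"] * 10 + ["C"] * 2
controls = ["A"] * 20 + ["B"] * 20 + ["D"] * 2

case_n, ctrl_n = bin_rare(cases, controls)
case_total, ctrl_total = sum(case_n.values()), sum(ctrl_n.values())
for hap in sorted(case_n):
    a, c = case_n[hap], ctrl_n[hap]
    or_, ci, chi2, p = two_by_two(a, case_total - a, c, ctrl_total - c)
    print(f"{hap}: OR={or_:.2f} CI=({ci[0]:.2f}, {ci[1]:.2f}) p={p:.3f}")
```

Each haplotype is tested against all others pooled, which is one common way to build the 2x2 table; the source does not say which contrast BIGDAWG uses.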