ISBN: 9781509016129 (print)
MicroRNAs (miRNAs) play significant biological roles at the molecular level by regulating genes post-transcriptionally. To understand the functional effects of miRNAs in different biological contexts, it is essential to elucidate miRNA-mRNA regulatory modules (MRMs). Inferring MRMs is computationally expensive because of the many-to-many relationships between miRNAs and mRNAs, and it remains a challenging, unresolved problem. In this paper, we propose a novel iterative segmented least squares method for functional MRM discovery. Our method operates in two steps: 1) grouping and ordering the miRNAs and mRNAs to build per-sample matrices representing miRNA-mRNA regulations, and 2) determining maximum-sized modules from the structured miRNA-mRNA matrices. In experiments with human breast cancer data sets from TCGA, we show that our method outperforms existing methods in terms of both GO similarity and cluster evaluation. In addition, we show that modules determined by our method can be used for breast cancer survival prediction and subtype classification.
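The abstract does not spell out the iterative segmented least squares procedure. As background only, the sketch below shows the classic textbook segmented least squares dynamic program: it splits ordered points into contiguous segments, fits a line to each segment, and pays a fixed penalty per segment. The function names, the `penalty` parameter, and the toy data are assumptions made for illustration; this is not the authors' method.

```python
def line_fit_error(xs, ys):
    """Sum of squared residuals of the least-squares line through the points."""
    n = len(xs)
    if n < 2:
        return 0.0  # a single point is fitted exactly
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def segmented_least_squares(xs, ys, penalty):
    """Return (total_cost, segments); segments are inclusive (start, end) index pairs."""
    n = len(xs)
    # best[j] is the minimum cost of covering the first j points.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)  # back[j] is the 1-based start of the last segment
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            cost = best[i - 1] + line_fit_error(xs[i - 1:j], ys[i - 1:j]) + penalty
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers to recover the segment boundaries.
    segments = []
    j = n
    while j > 0:
        i = back[j]
        segments.append((i - 1, j - 1))
        j = i - 1
    segments.reverse()
    return best[n], segments


# Toy data (hypothetical): a rising line followed by a falling line.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 1, 2, 3, 3, 2, 1, 0]
total_cost, segments = segmented_least_squares(xs, ys, penalty=0.5)
print(segments)
```

A larger `penalty` favors fewer, longer segments; a smaller one lets the fit break more often. How the paper iterates this idea over miRNA-mRNA matrices is not described in the abstract.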
Quality control (QC) is becoming increasingly important in the pre-processing of high-dimensional omics data. Several routine QC procedures have become standard in omics data analysis. The standard QC analysis includes c...
Universally conserved 16S rRNA gene sequences generated by high-throughput sequencing have become a powerful tool for analyzing diversity and characterizing microbial ***, even full-length 16S rRNA sequences can be obtained from the PacBio(R) SMRT sequencer with high yield and *** partial-region sequences have been used as sequence tags in microbial community analysis, with sequencing bias and no taxonomic classification below the genus level, the analysis using full-length 16S rRNA is expected to improve the result *** this study, soil metagenome, fecal metagenome, and synthetic mock community DNA were profiled for bacterial 16S with SMRT sequencing using P6/*** 16S rDNA PCR and the corresponding SMRT sequencing were performed five times *** SMRT cell of full-length 16S rDNA reads was analyzed using three different CCS filtering conditions: CCS with a minimum of 6 full passes, minimum 90%, and 99% predicted *** sorting and 12~18 kbp length filtering were followed by primer trimming *** checked as error correction, and UCHIME from the USEARCH program was used to detect ***-chimeric CCS reads were analyzed for community profiling through in-house *** read accuracy was evaluated by the number of mismatches and insertion/deletion *** community profile was evaluated by classification rate at taxonomic levels and its accuracy, taxon *** soil and fecal data, we were able to sort out non-chimeric sequences based on the reproduction of highly similar sequences from multiple PCR ***, we demonstrate the usefulness of full-length 16S rRNA gene amplicon sequencing in microbial ecology, and suggest the optimal method for generation and analysis of barcoded full-length 16S rDNA sequence data.
Steganography is the science of unnoticeably concealing a secret message within a certain image, called a cover image. The cover image with the secret message is called a stego image. Steganography is commonly used fo...
New in silico tools that make use of genome-scale metabolic flux modeling are improving the design of metabolic engineering strategies. This review highlights the latest developments in this area, explains the interface between these in silico tools and the experimental implementation tools of metabolic engineers, and provides a way forward so that in silico predictions can better mimic reality and more experimental methods can be considered in simulation studies. The several methodologies for solving genome-scale models (e.g., flux balance analysis [FBA], parsimonious FBA, flux variability analysis, and minimization of metabolic adjustment) each have unique advantages and applications. There are two basic approaches to designing metabolic engineering strategies in silico, and both have demonstrated success in the literature. The first involves: 1) making a genetic manipulation in a model; 2) testing for improved performance through simulation; and 3) iterating the process. The second approach has been used in more recently designed in silico tools and involves: 1) comparing the metabolic flux profiles of a wild-type state and an ideally engineered state and 2) designing engineering strategies based on the differences between these flux profiles. Improvements in genome-scale modeling are anticipated in areas such as the inclusion of all relevant cellular machinery, the ability to understand and anticipate the results of combinatorial enrichment experiments, and the construction of dynamic and flexible biomass equations that can respond to environmental and genetic manipulations.
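For readers unfamiliar with the methods named above, the standard textbook formulations of FBA and minimization of metabolic adjustment (MOMA) can be written as follows. The symbols are assumptions introduced here and do not come from the review: $S$ is the stoichiometric matrix, $v$ the flux vector, $c$ the objective weights (typically selecting the biomass reaction), $v^{\min}$ and $v^{\max}$ the flux bounds, $v^{\mathrm{wt}}$ a wild-type flux distribution, and $K$ the set of knocked-out reactions.

```latex
% Flux balance analysis: a linear program at steady state
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{s.t.} \quad & S v = 0, \\
                  & v^{\min} \le v \le v^{\max}.
\end{aligned}

% MOMA: a quadratic program that keeps the mutant flux
% distribution close to the wild-type one
\begin{aligned}
\min_{v} \quad & \lVert v - v^{\mathrm{wt}} \rVert_2^{2} \\
\text{s.t.} \quad & S v = 0, \\
                  & v^{\min} \le v \le v^{\max}, \\
                  & v_k = 0 \quad \text{for all } k \in K.
\end{aligned}
```

In the first design approach described above, a genetic manipulation amounts to changing the bounds (for a knockout, forcing $v_k = 0$) and re-solving; in the second, two solutions $v$ obtained under different objectives or bounds are compared.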
Network biology has been successfully used to help reveal complex mechanisms of disease, especially cancer. On the other hand, network biology requires in-depth knowledge to construct disease-specific networks, but our...
ISBN: 9781728118680 (print)
Drugs are classified according to their biological and chemical reactions, and the systems that they target. Thus, an accurate and efficient prediction method for drug class discovery would reveal key properties of candidate drugs, significantly conserving time and resources in drug repositioning and design. Previous approaches, based on data mining or statistics, required complicated feature construction in advance. Knowing that deep learning can identify patterns in high-dimensional datasets without elaborate feature selection or engineering, we constructed a model for predicting drug classes using deep neural networks with biological and chemical structure data. Our proposed model outperforms previous learning-based methods in terms of prediction accuracy.
The fixation index (Fst) is one of the most widely used measures of genetic distance between populations. The data set from the International HapMap Project has served as a reference data set for population differentiation studies, and Fst is commonly used to compare sample data with HapMap data. In this study, however, we show that Fst computed without consideration of sample sizes can give misleading results. In particular, we first demonstrate that Fst suffers from imbalance of sample sizes through simulation studies and through the analysis of a large-scale Korean genome-wide association (GWA) data set. Then, we propose a modified version of Fst which is shown to be more robust to imbalance of sample sizes. In addition, the chi-square test commonly used for testing homogeneity is shown to perform similarly to the modified version of Fst.
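To make the sample-size issue concrete, here is a minimal sketch using the common heterozygosity-based definition Fst = (H_T - H_S) / H_T for a single biallelic locus and two populations. The function name, the weighting switch, and the allele frequencies are assumptions for illustration; the abstract does not give the authors' estimator or their modified Fst, so this only shows one way in which sample-size imbalance can change the value.

```python
def fst_two_populations(p1, n1, p2, n2, weight_by_sample_size=True):
    """Fst = (H_T - H_S) / H_T for one biallelic locus and two populations.

    p1, p2 are allele frequencies; n1, n2 are sample sizes.
    """
    # Population weights: proportional to sample size, or equal.
    w1 = n1 / (n1 + n2) if weight_by_sample_size else 0.5
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean of the within-population heterozygosities.
    h_within = w1 * 2.0 * p1 * (1.0 - p1) + w2 * 2.0 * p2 * (1.0 - p2)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall
    return (h_total - h_within) / h_total


# Same allele frequencies, different sample-size balance (hypothetical values).
balanced = fst_two_populations(0.2, 500, 0.4, 500)
imbalanced = fst_two_populations(0.2, 5000, 0.4, 100)
equal_weights = fst_two_populations(0.2, 5000, 0.4, 100, weight_by_sample_size=False)
print(balanced, imbalanced, equal_weights)
```

With sample-size weights, the pooled frequency is dominated by the larger sample, so the imbalanced comparison yields a smaller Fst than the balanced one even though the two allele frequencies are unchanged. With equal weights the value no longer depends on the sample sizes.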
ISBN: 9781467368001 (print)
There is increasing interest in the pathway analysis of multiple genes and complex traits in association studies. Recently, a number of pathway analysis methods have been developed to detect novel pathways associated with human complex traits. In this paper, we propose a novel statistical approach for competitive pathway analysis based on Structural Equation Modeling (CPA-SEM), taking advantage of prior knowledge of existing relationships between genes in a pathway. CPA-SEM identifies pathways associated with traits of interest. It differs from previous SEM-based approaches in that it takes all possible sub-pathways into account and performs a permutation-based robust analysis. We applied the proposed CPA-SEM method to gene expression data of gastric cancer (GSE27342) and found that the mTOR signaling pathway was significantly associated with gastric cancer. This pathway has previously been reported to be associated with gastric cancer. In conclusion, our CPA-SEM analysis provides a better understanding of biological mechanisms by identifying pathways associated with a trait of interest.
Multifactor dimensionality reduction (MDR) has been successfully applied to the identification of gene-gene interactions for complex traits. Generalized MDR (GMDR) is an extension of MDR that allows adjustment for covariates. The current GMDR software mainly focuses on candidate gene association studies with a relatively small number of genetic markers and is difficult to extend to genome-wide association studies (GWAS) with a large number of genetic markers. We develop GWAS-GMDR, an effective parallel computing program package with special features for GWAS with a large number of genetic markers, using a distributed job scheduling method and/or CUDA-enabled high-performance graphics processing units (GPUs). First, GWAS-GMDR implements an effective memory handling algorithm and efficient procedures for GMDR to make joint analysis of multiple genes feasible for GWAS. Second, a weighted version of cross-validation consistency based on 'top-K selection' (WCVC_K) is proposed to report multiple candidates for causal gene-gene interactions. Third, various performance measures are implemented to evaluate MDR classifiers, including balanced accuracy, tau-b, likelihood ratio, and normalized mutual information. Fourth, some popular methods for handling missing genotypes are implemented. Finally, our applications support both CPU-based and GPU-based parallel computing systems. We applied GWAS-GMDR to the WTCCC Crohn's disease genome-wide data set to identify two-way interaction models at genome-wide scale. The GWAS-GMDR package is a powerful tool for gene-gene interaction analysis at genome-wide scale. High-performance implementations are provided as native binaries for Linux, Mac OS X, and Windows systems.
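As a rough illustration of the basic MDR step and of balanced accuracy, one of the performance measures listed above, the sketch below labels each two-locus genotype cell as high or low risk and scores the resulting classifier on the same data. It is a simplified, assumed version of plain MDR: there is no cross-validation, covariate adjustment, or missing-genotype handling, both classes are assumed to be present, and the function name and toy data are hypothetical rather than taken from the GWAS-GMDR package.

```python
from collections import Counter


def mdr_balanced_accuracy(genotypes_a, genotypes_b, is_case):
    """Label two-locus genotype cells high/low risk and return balanced accuracy."""
    cases = Counter()
    controls = Counter()
    for a, b, case in zip(genotypes_a, genotypes_b, is_case):
        if case:
            cases[(a, b)] += 1
        else:
            controls[(a, b)] += 1
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    # A cell is high risk when its case:control ratio exceeds the overall
    # ratio; cross-multiplying avoids dividing by an empty control count.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] * n_controls > controls[cell] * n_cases
    }
    true_pos = sum(count for cell, count in cases.items() if cell in high_risk)
    true_neg = sum(count for cell, count in controls.items() if cell not in high_risk)
    sensitivity = true_pos / n_cases
    specificity = true_neg / n_controls
    return (sensitivity + specificity) / 2.0


# Toy XOR-style interaction (hypothetical): neither SNP alone separates the
# classes, but the genotype pair does.
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
status = [False, True, True, False, False, True, True, False]
print(mdr_balanced_accuracy(snp_a, snp_b, status))
```

In a genome-wide setting this scoring would be repeated for every SNP pair, which is the workload the abstract describes distributing across CPU jobs or GPUs.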