Sequence-to-function models can predict gene expression from sequence data and be used to link genetic information with transcriptomics data to understand regulatory processes and their effects on complex phenotypes. ...
Drug synergy prediction is a challenging and important task in the treatment of complex diseases including cancer. In this manuscript, we present a unified model, known as BAITSAO, for tasks related to drug synergy pr...
Single-cell RNA-seq (scRNA-seq) has become a prominent tool for studying human biology and disease. The availability of massive scRNA-seq datasets and advanced machine learning techniques has recently driven the devel...
We introduce ChromActivity, a computational framework for predicting and annotating regulatory activity across the genome through integration of multiple epigenomic maps and various functional characterization dataset...
Background. Convenient programmatic access to different biological databases allows automated integration of scientific knowledge. Many databases support a function to download files or data snapshots, or a webservice that offers "live" data. However, the functionality that a database offers cannot be represented in a static data download file, and webservices may consume considerable computational resources from the host server. Results. MetNetAPI is a versatile Application Programming Interface (API) to the MetNetDB database. It abstracts, captures and retains operations away from a biological network repository and website. A range of database functions, previously only available online, can be immediately (and independently from the website) applied to a dataset of interest. Data is available in four layers: molecular entities, localized entities (linked to a specific organelle), interactions, and pathways. Navigation between these layers is intuitive (e.g. one can request the molecular entities in a pathway, as well as request in what pathways a specific entity participates). Data retrieval can be customized: Network objects allow the construction of new and integration of existing pathways and interactions, which can be uploaded back to our server. In contrast to webservices, the computational demand on the host server is limited to processing data-related queries only. Conclusions. An API provides several advantages to a systems biology software platform. MetNetAPI illustrates an interface with a central repository of data that represents the complex interrelationships of a metabolic and regulatory network. As an alternative to data dumps and webservices, it allows access to a current and "live" database and exposes analytical functions to application developers. Yet it only requires limited resources on the server side (thin server/fat client setup). The API is available for Java, *** and R programming environments and offers flexible query and bro
Transcriptomes provide highly informative molecular phenotypes that, combined with gene perturbation, can connect genotype to phenotype. An ultimate goal is to perturb every gene and measure transcriptome changes; however, this is challenging, especially in whole animals. Here, we present ‘Worm Perturb-Seq (WPS)’, a method that provides high-resolution RNA-sequencing profiles for hundreds of replicate perturbations at a time in living animals. WPS introduces multiple experimental advances combining strengths of Caenorhabditis elegans genetics and multiplexed RNA-sequencing with a novel analytical framework, EmpirDE. EmpirDE leverages the unique power of large transcriptomic datasets and improves statistical rigor by using gene-specific empirical null distributions to identify differentially expressed genes (DEGs). We apply WPS to 103 nuclear hormone receptors (NHRs) and find a striking ‘pairwise modularity’ in which pairs of NHRs regulate shared target genes. We envision the advances of WPS to be useful not only for C. elegans, but broadly for other models, including human cells.
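The idea of a gene-specific empirical null can be illustrated with a minimal sketch. This is not the EmpirDE implementation, whose details are not given in the abstract. It assumes that, for a single gene, most perturbations have no effect, so the bulk of that gene's test statistics across perturbations can be used to fit a null via the median and the median absolute deviation (MAD). The function name `empirical_null_pvalues`, the normal form of the null, and the toy numbers are all assumptions made for illustration.

```python
import math
import statistics


def empirical_null_pvalues(stats):
    """Two-sided p-values for ONE gene across many perturbations.

    Assumption (not from the source): the null is normal, with its
    center and scale estimated robustly from the gene's own statistics.
    """
    values = list(stats)
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    # 1.4826 rescales the MAD to match a normal standard deviation.
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("cannot fit a null: MAD of the statistics is zero")
    # Two-sided normal tail probability: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return [math.erfc(abs((v - center) / scale) / math.sqrt(2)) for v in values]


# Toy statistics for one gene: seven near-null perturbations and one outlier.
toy_stats = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 8.0]
toy_pvalues = empirical_null_pvalues(toy_stats)
```

Because the null is fitted per gene, a gene whose statistics are noisy across all perturbations needs a larger effect to be called differentially expressed than a gene whose statistics are tightly clustered.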
A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to fr...
Background. There are two main technologies for transcriptome profiling, namely, tiling microarrays and high-throughput sequencing. Recently there has been a tremendous amount of excitement about the latter because of...
Complex diseases are often the downstream event of a number of risk factors, including both environmental and genetic variables. To better understand the mechanism of disease onset, it is of great interest to systematically investigate the crosstalk among various risk factors. Bayesian networks provide an intuitive graphical interface that captures not only the association but also the conditional independence and dependence structures among the variables, resulting in sparser relationships between risk factors and the disease phenotype than traditional correlation-based methods. In this paper, we apply a Bayesian network to dissect the complex regulatory relationships among disease traits and various risk factors for the Genetic Analysis Workshop 17 simulated data. We use the Bayesian network as a tool for the risk prediction of disease outcome.
ISBN: 9781713845393 (print)
Label-free alignment between datasets collected at different times, locations, or by different instruments is a fundamental scientific task. Hyperbolic spaces have recently provided a fruitful foundation for the development of informative representations of hierarchical data. Here, we take a purely geometric approach for label-free alignment of hierarchical datasets and introduce hyperbolic Procrustes analysis (HPA). HPA consists of new implementations of the three prototypical Procrustes analysis components: translation, scaling, and rotation, based on the Riemannian geometry of the Lorentz model of hyperbolic space. We analyze the proposed components, highlighting their useful properties for alignment. The efficacy of HPA, its theoretical properties, stability and computational efficiency are demonstrated in simulations. In addition, we showcase its performance on three batch correction tasks involving gene expression and mass cytometry data. Specifically, we demonstrate high-quality unsupervised batch effect removal from data acquired at different sites and with different technologies that outperforms recent methods for label-free alignment in hyperbolic spaces.
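For readers unfamiliar with the Lorentz model mentioned above, the following minimal sketch shows its two basic primitives: the Lorentzian inner product and the geodesic distance on the unit hyperboloid. It does not implement HPA or any of its translation, scaling, or rotation components. The function names, including the helper `lift_to_hyperboloid`, are hypothetical and chosen only for illustration.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + x1*y1 + ... + xn*yn."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift_to_hyperboloid(v):
    """Map a Euclidean vector v to the point (sqrt(1 + |v|^2), v).

    The result x satisfies lorentz_inner(x, x) == -1 with x[0] > 0,
    i.e. it lies on the upper sheet of the unit hyperboloid.
    """
    return [math.sqrt(1.0 + sum(a * a for a in v))] + list(v)


def lorentz_distance(x, y):
    """Geodesic distance between hyperboloid points: arccosh(-<x, y>_L)."""
    # Clamp to 1.0 so rounding error cannot push the argument out of
    # the domain of acosh when x and y are (nearly) the same point.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


origin = lift_to_hyperboloid([0.0, 0.0])          # the point (1, 0, 0)
point = lift_to_hyperboloid([math.sinh(1.0), 0.0])  # (cosh 1, sinh 1, 0)
dist = lorentz_distance(origin, point)
```

In this example `point` is constructed to sit at hyperbolic distance 1 from `origin`, since its coordinates are `(cosh 1, sinh 1, 0)`.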