Motivation: Drug discovery practitioners in industry and academia use semantic tools to extract information from online scientific literature and generate new insights into targets, therapeutics and diseases. However, because it is complex to access and analyse, patent literature is often overlooked as a source of information. Drug discovery is a highly competitive field, so tools that tap into patent literature can give any actor in the field an advantage through better-informed decision-making. Hence, we aim to facilitate access to patent literature by creating an automatic tool for extracting information from patents described in existing public *** Here, we present PEMT, a novel patent enrichment tool that takes advantage of public databases such as ChEMBL and SureChEMBL to extract relevant patent information linked to chemical structures and/or gene names described through FAIR principles and metadata annotations. PEMT aims to support drug discovery and research by establishing a patent landscape around genes of interest. The pharmaceutical focus of the tool is mainly due to the subselection of International Patent Classification codes; in principle, it can be used for other patent fields, provided that a link between a concept and a chemical structure is investigated. Finally, we demonstrate a use case in rare diseases by generating a gene-patent list based on the epidemiological prevalence of these diseases and exploring their underlying patent *** and implementation: PEMT is an open-source Python tool; its source code and PyPI package are available at https://***/fraunhofer-ITMP/PEMT and https://***/project/PEMT/, *** information: Supplementary data are available at Bioinformatics online.
ISBN: 9783950353709 (print)
In many applications, particularly in engineering, the need to consider uncertainties is recognized. Metamodels can be used to reduce the number of simulations required. We present a novel metamodel-based method that models not only the deterministic responses but also the propagation of uncertainty over the entire design space. Our procedure makes it possible to determine the robust optimum quickly with common multi-criteria optimization algorithms. The approach also allows the tolerance of the metamodel to be included in the calculation. We introduce a new class of robustness measures that characterizes the propagation of uncertainty more accurately than the usual measures: the median as the measure of central tendency, and the difference between the median and a high quantile q as the measure of dispersion. This lets users adjust the degree of robustness to their needs via q, and it handles even extremely skewed distributions appropriately. To determine the quantiles, we use a novel combination of a sampling scheme and nonparametric quantile estimation, which enables fast computation at a local level. The suitability of the proposed procedure is shown on several examples. Its applicability is demonstrated on a real-life example from the automotive industry.
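The two robustness measures named in this abstract (the median, and the distance from the median to a high quantile q) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy response `x**2` standing in for a metamodel, the Gaussian input scatter, and plain Monte Carlo sampling with a linear-interpolation empirical quantile. It is not the sampling scheme or quantile estimator of the paper.

```python
import random


def empirical_quantile(sorted_values, p):
    """Empirical quantile with linear interpolation, for 0 <= p <= 1."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def robustness_measures(response, x, sigma, q=0.95, n_samples=2000, seed=0):
    """Return (median, q-quantile minus median) of the response near x.

    The input is scattered as a Gaussian with standard deviation sigma
    (an assumption of this sketch); `response` stands in for a metamodel.
    """
    rng = random.Random(seed)
    outputs = sorted(response(rng.gauss(x, sigma)) for _ in range(n_samples))
    median = empirical_quantile(outputs, 0.5)
    dispersion = empirical_quantile(outputs, q) - median
    return median, dispersion


def toy_response(x):
    """Hypothetical stand-in for a metamodel of an expensive simulation."""
    return x ** 2


# Same input scatter at a flat and at a steep region of the response.
med_flat, disp_flat = robustness_measures(toy_response, x=0.0, sigma=0.1)
med_steep, disp_steep = robustness_measures(toy_response, x=2.0, sigma=0.1)
print(f"flat  region: median={med_flat:.4f}, dispersion={disp_flat:.4f}")
print(f"steep region: median={med_steep:.4f}, dispersion={disp_steep:.4f}")
```

In this toy setting the dispersion measure is larger in the steep region, so a multi-criteria optimizer that minimizes both the median and the dispersion would trade performance against robustness; raising q makes the dispersion measure more conservative.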
Knowledge graphs play a central role in big data integration, especially for connecting data from different domains. Bringing unstructured texts, e.g. from scientific literature, into a structured, comparable format i...
ISBN: 9789609999465 (print)
Robust optimization determines how the dispersion of the input variables propagates to the output variables. This is of great practical relevance: for example, the quality of a product is decisively influenced by production tolerances. In industrial applications it is important to characterize the range of variation with appropriate measures. Industry is particularly interested in accurate limits of the output distribution or of its central part. This article discusses the mathematical characterization of robustness from the viewpoint of its practical applicability. It is shown that the commonly used robustness measures, the mean for central tendency and the standard deviation for dispersion, produce inaccurate limits. Instead, several measures based on quantiles are proposed. The median is used as the measure of central tendency, while different quantile ranges are used as measures of dispersion. They are compared for the robust optimization of mathematical functions and industrial applications, and the advantages of the quantile measures are pointed out. Computing quantiles is expensive because it requires many function evaluations, and because of the long runtime of the simulations only a few can be executed in practice. A methodology tailored to this situation is proposed, based on the use of metamodels. Starting from a few real simulations, a metamodel of the system is built. Further metamodels for the median and the dispersion of the output variables are derived from it. This enables the user to perform a full multicriteria robust optimization over the whole parameter range. The methodology takes the tolerances of the metamodels into account. A new measure for the tolerance of metamodels that are themselves derived from metamodels is presented and used to estimate the accuracy of the quantile models. The tolerance can easily be integrated into the robust optimization process.
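The claim that mean and standard deviation can give inaccurate limits is easy to reproduce on a skewed distribution. The sketch below is only an illustration under stated assumptions: the output is taken to be lognormal, the "usual" limits are assumed to be mean ± 2 standard deviations, and the quantile range is the 2.5% to 97.5% interval. None of these choices comes from the abstract itself.

```python
import random
import statistics

rng = random.Random(1)
# Assumed skewed, strictly positive output distribution (lognormal).
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Moment-based limits: mean +/- 2 standard deviations (assumed convention).
mean = statistics.fmean(sample)
std = statistics.stdev(sample)
lower_moment, upper_moment = mean - 2 * std, mean + 2 * std

# Quantile-based limits: n=40 gives 39 cut points at 2.5%, 5%, ..., 97.5%.
cuts = statistics.quantiles(sample, n=40)
lower_q, median, upper_q = cuts[0], cuts[19], cuts[-1]


def coverage(lo, hi):
    """Fraction of the sample that lies inside [lo, hi]."""
    return sum(lo <= v <= hi for v in sample) / len(sample)


print(f"mean={mean:.3f}  median={median:.3f}")
print(f"moment limits:   [{lower_moment:.3f}, {upper_moment:.3f}]"
      f"  coverage={coverage(lower_moment, upper_moment):.3f}")
print(f"quantile limits: [{lower_q:.3f}, {upper_q:.3f}]"
      f"  coverage={coverage(lower_q, upper_q):.3f}")
```

For this sample the lower moment-based limit is negative although every output is positive, whereas the quantile limits stay inside the support and cover the intended central share of the data.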
Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the field of life sciences or bioinformatics. Although a lot of research has been done on the field o...
Quantum energies used in applications are usually composed of repulsive and attractive terms. The objective of this study is to fit the repulsive energy accurately and efficiently instead of relying on standard parametrizations. The investigation is based on Density Functional Theory and Tight Binding simulations. We aim not only to capture the values of the repulsive terms but also to reproduce the elastic properties and the forces efficiently. The elasticity values determine the rigidity of a material when traction or load is applied to it. The pair potential is based on an exponential term corrected by B-spline terms. To accelerate the computations, we use a hierarchical optimization of the B-splines on different levels. Carbon graphene structures constitute the configurations used in the simulations. We report some results that show the efficiency of the B-splines on different levels.
Accurate quantum energy computations in nanotechnological applications are usually very computationally intensive, which makes them difficult to use in subsequent quantum simulations. In this paper, we present some preliminary results on stochastic methods for reducing the numerical expense of quantum estimations. The initial information about the quantum energy originates from Density Functional Theory. The parameters are determined using methods stemming from machine learning. We survey the covariance method using marginal likelihood for the statistical simulation. Particular emphasis is placed on the equilibrium position, where the total atomic energy attains its minimum. The originally expensive data can be reproduced efficiently without losing accuracy. A significant acceleration is observed with the proposed method.
Incorporating distant information via manually selected skip-chain templates has been shown to improve the performance of conditional random field models compared with a simple linear-chain structure (Sutton and McCallum, 2007; Galley, 2006; Liu et al., 2010). The set of properties to be captured by a template is typically chosen manually with respect to the application domain. In this paper, a search strategy for finding meaningful skip chains independently of the application domain is proposed. From a huge set of potentially beneficial templates, some can be shown to have a positive impact on performance. The search for a meaningful graphical structure demonstrates the usefulness of the approach with an increase of nearly 2% in F1 measure on a publicly available data set (Klinger et al., 2008).
ISBN: 9789898425782 (print)
We give an overview of methods for nonlinear metamodeling of a simulation database, featuring continuous exploration of simulation results, tolerance prediction, sensitivity analysis, robust multiobjective optimization and rapid interpolation of bulky FEM data. Large scatter of simulation results, caused in crash-test simulations by buckling, for example, remains a challenge for increasing the predictability of simulation and the accuracy of optimization results. For industrially relevant simulations with large scatter, novel stochastic methods are introduced and their efficiency is demonstrated on benchmark cases.