Several classification methods assume that the underlying distributions follow tree-structured graphical models. Indeed, trees capture statistical dependencies between pairs of variables, which may be crucial to attaining low classification errors. In this setting, the optimal classifier is linear in the log-transformed univariate and bivariate densities that correspond to the tree edges. In practice, observed data may not be well approximated by trees. Yet, motivated by the importance of pairwise dependencies for accurate classification, here we propose to approximate the optimal decision boundary by a sparse linear combination of the univariate and bivariate log-transformed densities. Our proposed approach is semi-parametric in nature: we non-parametrically estimate the univariate and bivariate densities, remove pairs of variables that are nearly independent using the Hilbert-Schmidt independence criterion (HSIC), and finally construct a linear SVM using the retained log-transformed densities. We demonstrate on synthetic and real data sets that our classifier, named SLB (sparse log-bivariate density), is competitive with other popular classification methods.
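The pair-screening step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Gaussian kernel with a fixed bandwidth of 1 and the biased empirical HSIC estimator normalised by n^2, and the function names (`gaussian_gram`, `center`, `hsic`) are hypothetical. A pair of variables whose HSIC value is close to zero would be a candidate for removal.

```python
import math
import random


def gaussian_gram(values, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel on scalar samples (bandwidth is an assumption)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H with H = I - 11^T/n."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(x, y, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(x)
    k_centered = center(gaussian_gram(x, bandwidth))
    l_gram = gaussian_gram(y, bandwidth)
    total = sum(k_centered[i][j] * l_gram[i][j]
                for i in range(n) for j in range(n))
    return total / n ** 2


# Illustrative data: one strongly dependent pair and one independent pair.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y_dependent = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]
y_independent = [rng.gauss(0.0, 1.0) for _ in range(100)]

hsic_dependent = hsic(x, y_dependent)
hsic_independent = hsic(x, y_independent)
print(hsic_dependent, hsic_independent)
```

In a full pipeline one would compare each pair's HSIC value against a threshold (or a permutation-based null) and keep only the pairs that exceed it; how the abstract's method sets that cutoff is not stated here.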
The stability and contraction properties of positive integral semigroups on Polish spaces are investigated. Our novel analysis is based on the extension of V-norm contraction methods, associated to functionally weigh...
This work proposes a new fault diagnosis approach for a wind energy conversion (WEC) system. The proposed technique merges the benefits of feature extraction based on Gaussian Process Regression (GPR) and Multi-Class Random Forest (MCRF)-based fault classification, where instances are classified into one or more classes. In the developed GPR-MCRF approach, the nonlinear statistical features, including the mean vector M_GPR and the variance matrix C_GPR, are computed using the GPR model with the aim of extracting the most relevant features from the WEC system. Then, these features are fed to the RF classifier for classification and diagnosis purposes. The application of the GPR-MCRF technique to WEC systems thus aims to improve on the classical raw-data-based MCRF and to increase diagnosis accuracy. Three kinds of faults (wear-out, open-circuit, and short-circuit faults) are considered in this work. Different case studies are investigated in order to illustrate the effectiveness and robustness of the developed technique compared to state-of-the-art methods. The obtained results show that the developed GPR-MCRF technique is an effective feature extraction and fault diagnosis technique for WEC systems.
The development of powerful natural language models has increased the ability to learn meaningful representations of protein sequences. In addition, advances in high-throughput mutagenesis, directed evolution, and ne...
In this paper we analyze the dynamical behavior of the tumor suppressor protein p53, an essential player in the cellular stress response, which prevents a cell from dividing if severe DNA damage is present. When this response system is malfunctioning, e.g. due to mutations in p53, uncontrolled cell proliferation may lead to the development of cancer. Understanding the behavior of p53 is thus crucial to preventing its failure. It has been shown in various experiments that periodicity of the p53 signal is one of the main descriptors of its dynamics, and that its pulsing behavior (regular vs. spontaneous) indicates the level and type of cellular stress. In the present work, we introduce an algorithm to score the local periodicity of a given time series (such as the p53 signal), which we call Detrended Autocorrelation Periodicity Scoring (DAPS). It applies pitch detection (via autocorrelation) on sliding windows of the entire time series to describe the overall periodicity by a distribution of localized pitch scores. We apply DAPS to the p53 time series obtained from single cell experiments and establish a correlation between the periodicity scoring of a cell’s p53 signal and the number of cell division events. In particular, we show that high periodicity scoring of p53 is correlated with a low number of cell divisions and vice versa. We show similar results with a more computationally intensive state-of-the-art periodicity scoring algorithm based on topology, known as Sw1PerS. This correlation has two major implications: it demonstrates that periodicity scoring of the p53 signal is a good descriptor for cellular stress, and it connects the high variability of p53 periodicity observed in cell populations to the variability in the number of cell division events.
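The idea of scoring local periodicity with autocorrelation on sliding windows can be sketched as follows. This is only an illustration in the spirit of the description above, not the published DAPS algorithm: it assumes linear detrending within each window, uses the largest lagged Pearson correlation as the window's score, and the names and parameter values (`window=80`, `step=10`, `min_lag=5`) are hypothetical choices.

```python
import math
import random


def detrend_linear(window):
    """Subtract the least-squares straight line from a window of samples."""
    n = len(window)
    t_mean = (n - 1) / 2.0
    x_mean = sum(window) / n
    slope_num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(window))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    slope = slope_num / slope_den
    return [x - x_mean - slope * (t - t_mean) for t, x in enumerate(window)]


def lagged_correlation(values, lag):
    """Pearson correlation between a window and its copy shifted by `lag`."""
    head, tail = values[:-lag], values[lag:]
    m = len(head)
    head_mean, tail_mean = sum(head) / m, sum(tail) / m
    cov = sum((a - head_mean) * (b - tail_mean) for a, b in zip(head, tail))
    var_head = sum((a - head_mean) ** 2 for a in head)
    var_tail = sum((b - tail_mean) ** 2 for b in tail)
    if var_head == 0.0 or var_tail == 0.0:
        return 0.0  # a constant segment carries no periodicity information
    return cov / math.sqrt(var_head * var_tail)


def local_periodicity_scores(series, window=80, step=10, min_lag=5):
    """One score per sliding window: the strongest autocorrelation over candidate lags."""
    scores = []
    for start in range(0, len(series) - window + 1, step):
        detrended = detrend_linear(series[start:start + window])
        scores.append(max(lagged_correlation(detrended, lag)
                          for lag in range(min_lag, window // 2 + 1)))
    return scores


# Illustrative signals: a sine with period 20 on a linear drift, and white noise.
rng = random.Random(0)
periodic = [math.sin(2 * math.pi * t / 20) + 0.05 * t for t in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

periodic_scores = local_periodicity_scores(periodic)
noise_scores = local_periodicity_scores(noise)
print(min(periodic_scores), sum(noise_scores) / len(noise_scores))
```

The list of per-window scores plays the role of the "distribution of localized pitch scores": a regularly pulsing signal yields scores concentrated near 1, while an irregular signal yields lower and more dispersed scores.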
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep...
We introduce a new intrinsic measure of local curvature on point-cloud data called diffusion curvature. Our measure uses the framework of diffusion maps, including the data diffusion operator, to structure point cloud...
The aim of the present study is to contribute to knowledge about the functioning of neuronal circuits. We built a mathematical-computational model using graph theory for a complex neurophysiological circuit consisting of a reverberating neuronal circuit and a parallel neuronal circuit, which could be coupled. Implementing our model in C++ and applying neurophysiological values found in the literature, we studied the discharge pattern of the reverberating circuit and the parallel circuit separately for the same input signal pattern, examining the influence of the refractory period and the synaptic delay on the respective output signal patterns. Then, the same study was performed for the complete circuit, in which the two circuits were coupled, so that the parallel circuit could influence the functioning of the reverberating circuit. The results showed that the refractory period played an important role in forming the pattern of the output spectrum of a reverberating circuit. The inhibitory action of the parallel circuit was able to regulate the reverberation frequency, suggesting that parallel circuits may be involved in the control of reverberation circuits related to motive activities underlying precision tasks and perhaps underlying neural work processes and immediate memories.
This paper examines the reproducibility of big data analytics under particular conditions. The paper proposes the “performing Scalable Inference” technique to cope with scalability problems and to exploit current big data platforms for efficient computation and data storage. In particular, the paper describes how to perform leak-free, parallelizable visual analytics over massive datasets using existing big data analytics frameworks such as Apache Flink. This method provides an automated way to execute analytics that preserves reproducibility and the ability to make changes without re-running the entire process. The paper also demonstrates how these analytics may support several real-world use cases, such as exploring patient cohorts for research and developing stratified patient cohorts for hospital treatment. Finally, the paper considers how the proposed method may be applied in practice. Scalable inference for big data analytics is pivotal in optimizing decision-making and the allocation of resources. Typically, such inferences are based on information gathered from numerous sources, including databases, unstructured data, and other digital sources. To ensure scalability, a complete cloud-based platform has to be employed. This solution will permit the ***, deploying the essential data collection and analysis algorithms is key here. It could allow the platform to recognize patterns in the data and discover any potential correlations or trends. Additionally, predictive analytics and machine learning techniques may be incorporated to provide insights into the results of the data. Ultimately, by leveraging these techniques, the platform can draw efficient inferences and accurately evaluate situations in an agile and efficient way.
The hard-to-predict bursts of the COVID-19 pandemic revealed the significance of statistical modeling that resolves spatio-temporal correlations over geographical areas, for example the spread of the infection over a city at census-tract granularity. In this manuscript, we provide algorithmic answers to the following two inter-related public health challenges of immense social impact which have not been adequately addressed. (1) Inference Challenge: assuming that there are N census blocks (nodes) in the city, and given an initial infection at any set of nodes, e.g. any of the N possible single-node infections, any of the N(N-1)/2 possible two-node infections, etc., what is the probability for a subset of census blocks to become infected by the time the spread of the infection burst has stabilized? (2) Prevention Challenge: what is the minimal control action one can take to minimize the infected part of the stabilized state footprint? To answer the challenges, we build a Graphical Model of the pandemic of the attractive Ising (pair-wise, binary) type, where each node represents a census tract, each edge factor represents the strength of the pairwise interaction between a pair of nodes (e.g. representing inter-node travel, road closures and the like), and each local bias/field represents the community level of immunization, acceptance of social distancing and mask-wearing practice, etc. Resolving the Inference Challenge requires finding the Maximum-A-Posteriori (MAP), i.e. most probable, state of the Ising Model constrained to the set of initially infected nodes. (An infected node is in the +1 state and a node which remained safe is in the -1 state.) We show that almost all attractive Ising Models on dense graphs result in either of the two possibilities (modes) for the MAP state: either all nodes which were not infected initially became infected, or all the initially uninfected nodes remain uninfected (susceptible). This bi-modal solution of the Inference Challenge allows us to re-sta
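The constrained MAP problem described in this abstract can be illustrated on a toy graph by exhaustive search. The sketch below is not the authors' algorithm: it assumes the log-probability is, up to a constant, the sum of J_ij s_i s_j over edges plus the sum of h_i s_i over nodes, enumerates every state of the free nodes (feasible only for a handful of nodes), and uses hypothetical names (`map_state`, `complete_graph`) and made-up coupling and field values.

```python
from itertools import product


def map_state(couplings, fields, infected):
    """Brute-force MAP of an Ising model with the `infected` nodes clamped to +1.

    couplings[i][j] (for i < j) is the pairwise strength J_ij, fields[i] is the bias h_i.
    The score maximised is sum_{i<j} J_ij s_i s_j + sum_i h_i s_i (assumed sign convention).
    """
    n = len(fields)
    free = [i for i in range(n) if i not in infected]
    best_state, best_score = None, float("-inf")
    for assignment in product((-1, 1), repeat=len(free)):
        state = [1] * n  # clamped nodes stay at +1
        for node, spin in zip(free, assignment):
            state[node] = spin
        score = sum(fields[i] * state[i] for i in range(n))
        score += sum(couplings[i][j] * state[i] * state[j]
                     for i in range(n) for j in range(i + 1, n))
        if score > best_score:
            best_state, best_score = state, score
    return best_state


def complete_graph(n, strength):
    """Uniform attractive couplings (strength > 0) on a complete graph."""
    return [[strength if i != j else 0.0 for j in range(n)] for i in range(n)]


# Five census tracts, tract 0 initially infected; values are illustrative only.
couplings = complete_graph(5, 1.0)
weak_protection = map_state(couplings, [-0.1] * 5, infected={0})
strong_protection = map_state(couplings, [-3.0] * 5, infected={0})
print(weak_protection)    # every tract ends up infected
print(strong_protection)  # only the initially infected tract is infected
```

On this dense toy graph the two runs land in the two modes named in the abstract: with weak local fields the infection takes over all nodes, and with strong protective fields all initially uninfected nodes stay uninfected.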