Chord progressions are the building blocks from which tonal music is constructed. The choice of a particular representation for chords has a strong impact on statistical modeling of the dependence between chord symbols and the actual sequences of notes in polyphonic music. Melodic prediction is used in this paper as a benchmark task to evaluate the quality of four chord representations using two probabilistic model architectures derived from Input/Output Hidden Markov Models (IOHMMs). Likelihoods and conditional and unconditional prediction error rates are used as complementary measures of the quality of each of the proposed chord representations. We observe empirically that different chord representations are optimal depending on the chosen evaluation metric. Also, representing chords only by their roots appears to be a good compromise in most of the reported experiments. (C) 2009 Elsevier B.V. All rights reserved.
Mobile ad hoc networks rely on the opportunistic interaction of autonomous nodes to form networks without the use of infrastructure. Given the radically decentralized nature of such networks, their potential for autonomous communication is significantly improved when the need for a priori consensus among the nodes is kept to a minimum. This paper addresses an issue within the domain of semantic content discovery, namely, its current reliance on the preexisting agreement between the schema of content providers and consumers. We present OntoMobil, a semantic discovery model for ad hoc networks that removes the assumption of a globally known schema and allows nodes to publish information autonomously. The model relies on the randomized dissemination and replication of metadata through a gossip protocol. Given schemas with partial similarities, the randomized metadata dissemination mechanism facilitates eventual semantic agreement and provides a substrate for the scalable discovery of content. A discovery protocol can then utilize the replicated metadata to identify content within a predictable number of hops using semantic queries. A stochastic analysis of the gossip protocol presents the different trade-offs between discoverability and replication. We evaluate the proposed model by comparing OntoMobil against a broadcast-based protocol and demonstrate that semantic discovery with proactive replication provides good scalability properties, resulting in a high discovery ratio with less overhead than a reactive nonreplicated discovery approach.
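The randomized dissemination idea can be illustrated with a plain push-gossip simulation. This is only a minimal sketch under stated assumptions, not OntoMobil's protocol: the function name `gossip_rounds`, the `fanout` parameter, and the uniform peer selection are all hypothetical choices made for illustration.

```python
import random


def gossip_rounds(num_nodes, fanout, rounds, seed=0):
    """Simulate push gossip of one metadata item (illustrative assumption).

    Node 0 starts with the item. In every round, each node that already
    holds it forwards a copy to `fanout` peers chosen uniformly at random.
    Returns the number of nodes holding the item after each round,
    starting with the initial state.
    """
    rng = random.Random(seed)
    informed = {0}
    history = [len(informed)]
    for _ in range(rounds):
        newly_informed = set()
        for _node in informed:
            # A peer may already hold the item; that copy is simply redundant.
            newly_informed.update(rng.sample(range(num_nodes), fanout))
        informed |= newly_informed
        history.append(len(informed))
    return history


if __name__ == "__main__":
    print(gossip_rounds(num_nodes=50, fanout=2, rounds=10))
```

Raising `fanout` spreads the item in fewer rounds at the price of more redundant copies, which is a toy version of the discoverability versus replication trade-off mentioned in the abstract.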
Economists building econometric or forecasting models are frequently constrained by data scarcity (short time series), parameter nonconstancy, and under-specification. In such cases, a realistic alternative is often to guess rather than to estimate the parameters of such models. An algorithm that repeatedly guesses (draws) parameters from iteratively changing distributions, with the objective of minimizing the squares of ex-post prediction errors, weighted by penalty weights and subject to a learning process, has recently been introduced. Despite its obvious advantages, especially when applied to undersized empirical models with a large number of parameters, applications of Repetitive Stochastic Guesstimation have so far been limited. This has presumably been caused by the lack of a rigorous proof of its convergence. Such a proof for a class of linear models, both identifiable (in the economic sense) and not, is provided in this article.
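The guess-and-learn loop described above can be sketched roughly as follows. This is a simplified illustration and not the published Repetitive Stochastic Guesstimation algorithm: the two-parameter linear model, the Gaussian draws, the geometric `shrink` schedule standing in for the learning process, and all function names are assumptions.

```python
import random


def weighted_sse(params, xs, ys, weights):
    """Weighted sum of squared ex-post prediction errors for y = a + b*x."""
    a, b = params
    return sum(w * (y - (a + b * x)) ** 2 for x, y, w in zip(xs, ys, weights))


def guesstimate(xs, ys, weights, init, spread,
                rounds=200, draws=50, shrink=0.97, seed=0):
    """Repeatedly draw parameter guesses and keep the best one found.

    `init` is the initial guess and `spread` the initial standard deviation
    of the draws for each parameter (both assumed to come from prior beliefs).
    After every round the spreads are multiplied by `shrink`, so the draws
    concentrate around the best guess so far.
    """
    rng = random.Random(seed)
    best = list(init)
    best_loss = weighted_sse(best, xs, ys, weights)
    widths = list(spread)
    for _ in range(rounds):
        for _ in range(draws):
            candidate = [rng.gauss(m, s) for m, s in zip(best, widths)]
            loss = weighted_sse(candidate, xs, ys, weights)
            if loss < best_loss:
                best, best_loss = candidate, loss
        widths = [s * shrink for s in widths]
    return best, best_loss


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0 + 2.0 * x for x in xs]   # synthetic data, for illustration only
    weights = [1.0] * len(xs)          # equal penalty weights
    print(guesstimate(xs, ys, weights, init=[0.0, 0.0], spread=[1.0, 1.0]))
```

Because a candidate is accepted only when it lowers the weighted error, the returned loss can never exceed the loss of the initial guess. Whether such a scheme converges to the right parameters is the question the article addresses for linear models.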
We exhibit a probabilistic symbolic algorithm for solving zero-dimensional sparse systems. Our algorithm combines a symbolic homotopy procedure, based on a flat deformation of a certain morphism of affine varieties, with the polyhedral deformation of Huber and Sturmfels. The complexity of our algorithm is cubic in the size of the combinatorial structure of the input system. This size is mainly represented by the cardinality and mixed volume of Newton polytopes of the input polynomials and an arithmetic analogue of the mixed volume associated to the deformations under consideration.
An ensemble is a group of learners that work together as a committee to solve a problem. Existing ensemble learning algorithms often generate unnecessarily large ensembles, which consume extra computational resources and may degrade generalization performance. Ensemble pruning algorithms aim to find a good subset of ensemble members to constitute a small ensemble, which saves computational resources and performs as well as, or better than, the unpruned ensemble. This paper introduces a probabilistic ensemble pruning algorithm that prunes the ensemble by choosing a set of "sparse" combination weights, most of which are zero. To obtain the sparse combination weights while satisfying their nonnegativity constraint, a left-truncated, nonnegative Gaussian prior is adopted over every combination weight. The expectation propagation (EP) algorithm is employed to approximate the posterior of the weight vector. The leave-one-out (LOO) error can be obtained as a by-product of EP training without extra computation and is a good indicator of the generalization error. Therefore, the LOO error is used together with the Bayesian evidence for model selection in this algorithm. An empirical study on several regression and classification benchmark data sets shows that our algorithm uses far fewer component learners but performs as well as, or better than, the unpruned ensemble. Our results are very competitive compared with other ensemble pruning algorithms.
We present a stochastic sequence evolution model to obtain alignments and estimate mutation rates between two homologous sequences. The model allows two possible evolutionary behaviors along a DNA sequence in order to determine conserved regions and take its heterogeneity into account. In our model, the sequence is divided into slow and fast evolution regions. The boundaries between these sections are not known. It is our aim to detect them. The evolution model is based on a fragment insertion and deletion process working on fast regions only and on a substitution process working on fast and slow regions with different rates. This model induces a pair hidden Markov structure at the level of alignments, thus making efficient statistical alignment algorithms possible. We propose two complementary estimation methods, namely, a Gibbs sampler for Bayesian estimation and a stochastic version of the EM algorithm for maximum likelihood estimation. Both algorithms involve the sampling of alignments. We propose a partial alignment sampler, which is computationally less expensive than the typical whole alignment sampler. We show the convergence of the two estimation algorithms when used with this partial sampler. Our algorithms provide consistent estimates for the mutation rates and plausible alignments and sequence segmentations on both simulated and real data.
We discuss the complexity of robust symbolic algorithms solving a significant class of zero-dimensional square polynomial systems with rational coefficients over the complex numbers, called generalized Pham systems, which represent the class of zero-dimensional homogeneous complete-intersection systems with "no points at infinity". Our notion of robustness models the behavior of all known universal methods for solving (parametric) polynomial systems, avoiding unnecessary branchings and allowing the solution of certain limit problems. We first show that any robust algorithm solving generalized Pham systems has complexity at least polynomial in the Bézout number of the system under consideration. Then we exhibit a robust probabilistic algorithm which solves generalized Pham systems with quadratic complexity in the Bézout number of the input system. The algorithm consists of a series of homotopies deforming the input system into a system which is "easy to solve", together with a "projection algorithm" which allows the solutions of the known instance to be moved to the solutions of an arbitrary instance along the parameter space.
ISBN: 9780769537658 (print)
A recent probabilistic approach for searching in high-dimensional metric spaces is based on predicting the distances between database elements according to how they order their distances towards some set of distinguished elements, called permutants. In the preprocessing phase, a set of permutants is chosen, and for every database element the permutants are sorted (permuted) by their distance to that element. The permutations form the index. When a query is given, its corresponding permutation is computed, and since similar elements will (probably) have similar permutations, the database is compared in the order induced by the similarity between permutations. This works well but has relatively high CPU time due to computing the distances between permutations and (partially) sorting the database by the similarity. We improve this by identifying and solving this as another metric space problem. This avoids many distance computations between the permutants. The experimental results show that this works extremely well in practice.
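The baseline permutation index described above can be sketched as follows. This is an illustrative sketch of the basic scheme only, not of the speed-up the paper proposes. The Spearman footrule as the permutation similarity, the `budget` parameter, and all function names are assumptions made for the example.

```python
def permutation(obj, permutants, dist):
    """Permutant indices ordered by increasing distance to `obj`."""
    return sorted(range(len(permutants)), key=lambda i: dist(obj, permutants[i]))


def footrule(p, q):
    """Spearman footrule: total displacement of each permutant's rank."""
    rank_in_q = {perm: rank for rank, perm in enumerate(q)}
    return sum(abs(rank - rank_in_q[perm]) for rank, perm in enumerate(p))


def build_index(db, permutants, dist):
    """Preprocessing: one permutation per database element."""
    return [permutation(x, permutants, dist) for x in db]


def search(query, db, index, permutants, dist, budget):
    """Approximate nearest neighbor search.

    Database elements are visited in order of permutation similarity to the
    query, and only the first `budget` of them are compared with the real
    distance function. Returns the position in `db` of the best one found.
    """
    qperm = permutation(query, permutants, dist)
    order = sorted(range(len(db)), key=lambda i: footrule(qperm, index[i]))
    return min(order[:budget], key=lambda i: dist(query, db[i]))


if __name__ == "__main__":
    def dist(a, b):
        return abs(a - b)

    db = [0.0, 1.0, 2.0, 5.0, 9.0]
    permutants = [0.5, 4.0, 8.0]
    index = build_index(db, permutants, dist)
    print(search(4.6, db, index, permutants, dist, budget=2))
```

The `sorted` call over the whole database in `search` is the CPU cost the abstract refers to: every query computes a permutation distance to every indexed element before any real distance is evaluated.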
ISBN: 9781424450589 (print)
A new method for tracking multiple objects in an intelligent space is proposed in this paper. The observation model is based on a camera ring statically mounted on the ceiling of the environment in order to obtain all relevant information about the different objects that wander (enter and leave) in the space of interest. The paper describes the two subsystems used to track all static and dynamic entities wandering in the intelligent space: a three-dimensional reconstruction of these entities, and individual tracking of each entity as it moves through the environment using probabilistic techniques. Finally, the reliability and robustness of the proposal are demonstrated with several tests.
ISBN: 9783642002014 (print)
In order to determine the similarity between two planar shapes, which is an important problem in computer vision and pattern recognition, it is necessary to first match the two shapes as well as possible. As sets of allowed transformations to match shapes we consider translations, rigid motions, and similarities. We present a generic probabilistic algorithm based on random sampling for matching shapes that are modelled by sets of curves. The algorithm is applicable to the three considered classes of transformations. We analyze which similarity measure is optimized by the algorithm and give rigorous bounds on the number of samples necessary to obtain a prespecified approximation to the optimal match with a prespecified probability.