Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.
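The retrieval view of visualization can be illustrated with a small sketch. It assumes hard k-nearest-neighbor sets under Euclidean distance, which the abstract does not specify; the function names `knn_sets` and `retrieval_quality` and the parameters `k_data` and `k_vis` are hypothetical and are not taken from the paper.

```python
import math


def knn_sets(points, k):
    """For each point, return the index set of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        # Sort the remaining points by Euclidean distance to point i.
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def retrieval_quality(data, projection, k_data, k_vis):
    """Mean precision and recall of neighbors retrieved from the projection.

    Assumption: the "relevant" items for point i are its k_data nearest
    neighbors in the data space, and the "retrieved" items are its k_vis
    nearest neighbors among the projected points.
    """
    relevant = knn_sets(data, k_data)
    retrieved = knn_sets(projection, k_vis)
    precisions, recalls = [], []
    for rel, ret in zip(relevant, retrieved):
        hits = len(rel & ret)
        precisions.append(hits / len(ret))  # share of retrieved that are relevant
        recalls.append(hits / len(rel))     # share of relevant that were retrieved
    n = len(data)
    return sum(precisions) / n, sum(recalls) / n
```

Under these assumptions, a projection that preserves every neighborhood scores 1.0 on both measures, while retrieving fewer neighbors than are relevant (`k_vis < k_data`) caps recall even when precision stays perfect.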
ISBN: 2930307099 (print)
Any order parameter quantifying the degree of organisation in a physical system can be studied in connection to source extraction algorithms. Independent component analysis (ICA) by minimising the mutual information of the sources falls into that line of thought, since it can be interpreted as searching for components with low complexity. Complexity pursuit, a modification minimising Kolmogorov complexity, is a further example. In this article a specific case of order in complex networks of self-sustained oscillators is discussed, with the objective of recovering the original synchronisation pattern between them. The approach is put in relation with ICA.
Failure management in the process industry involves difficult tasks. Decision support is needed in the control rooms of nuclear power plants. A prototype that uses the Self-Organizing Map (SOM) method is under development in an industrial project. This paper focuses on failure detection and separation. A literature survey outlines the state of the art and relates our study to other work. Different SOM visualizations are used. Failure management scenarios are carried out to test the methodology and the Man-Machine Interface (MMI). U-matrix trajectory analysis and quantization error are discussed in more detail. The experiments show the usefulness of the chosen approach. The next step will be to add more practical views by analyzing real and simulated industrial data with the control room tool and by collecting feedback from the end users.
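As a rough illustration of how quantization error can support failure detection, the sketch below measures the distance from a sample to its best-matching unit in an already trained SOM codebook and flags samples that exceed a threshold. The thresholding rule, the function names, and the `threshold` parameter are assumptions made for illustration; the abstract does not describe the prototype's actual decision logic.

```python
import math


def quantization_error(sample, codebook):
    """Euclidean distance from a sample to its best-matching unit."""
    return min(math.dist(sample, unit) for unit in codebook)


def flag_anomalies(samples, codebook, threshold):
    """Mark samples whose quantization error exceeds the threshold.

    Assumption: the codebook was trained on normal operating data, so a
    large error suggests a state the map has not seen before.
    """
    return [quantization_error(s, codebook) > threshold for s in samples]
```

In practice the threshold would have to be chosen from the error distribution observed on normal operating data.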
We introduce a method that learns a class-discriminative subspace or discriminative components of data. Such a subspace is useful for visualization, dimensionality reduction, feature extraction, and for learning a reg...
We introduce a new search strategy, in which the information retrieval (IR) query is inferred from eye movements measured when the user is reading text during an IR task. In the training phase, we know the users' interest, that is, the relevance of training documents. We learn a predictor that produces a "query" given the eye movements; the target of learning is an "optimal" query that is computed based on the known relevance of the training documents. Assuming the predictor is universal with respect to the users' interests, it can also be applied to infer the implicit query when we have no prior knowledge of the users' interests. The result of an empirical study is that it is possible to learn the implicit query from a small set of read documents, such that relevance predictions for a large set of unseen documents are ranked significantly better than by random guessing.
In this work, the problem of real-time monitoring of products' properties from spectrophotoscopic measurements is presented. Light absorbance spectra are used as inputs to soft sensors that estimate outputs otherwise difficult to measure on-line. To overcome the issues associated with calibrating the estimation models from very high-dimensional inputs and a reduced number of observations, we propose to select only a subset of relevant inputs emerging from the topological structure of the data. The topologically preserving representation is performed using the Self-Organizing Map (SOM) and the relevance is measured from the U-matrices. Being based on a selection of original spectral variables, the resulting models retain the chemical interpretability of the underlying system. Moreover, the approach is independent of the regression model to be embedded in the soft sensors. In this paper, the utility of the Measures of Topological Relevance (MTR) over the SOM is discussed on two full-scale problems from the refining and pharmaceutical industries.
ISBN: 2930307099 (print)
In this paper, we address the problem of deriving bounds for the moments of nearest neighbor distributions. The bounds are formulated for the general case and specifically applied to the problem of noise variance estimation with the Delta test and the Gamma test. For this problem, we focus on the rate of convergence and the bias of the estimators and validate the theoretical achievements with experimental results.
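For readers unfamiliar with the Delta test, the following minimal sketch shows the usual first-nearest-neighbor form of the estimator: half the mean squared difference between each output and the output of its nearest neighbor in input space. It is a brute-force illustration assuming Euclidean distance and at least two samples; the function name `delta_test` is a placeholder, and the bounds studied in the paper are not reproduced here.

```python
import math


def delta_test(xs, ys):
    """Estimate the noise variance of y = f(x) + noise with the Delta test.

    xs: list of input vectors (tuples), ys: list of scalar outputs.
    Returns (1 / (2N)) * sum_i (y_i - y_nn(i))^2, where nn(i) is the index
    of the nearest neighbor of x_i among the other inputs.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Brute-force nearest neighbor search, excluding the point itself.
        nn = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(xs[i], xs[j]),
        )
        total += (ys[i] - ys[nn]) ** 2
    return total / (2 * n)
```

The brute-force search costs O(N^2) distance evaluations, which is acceptable for a demonstration but would usually be replaced by a spatial index for large data sets.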
Peer-to-peer (P2P) overlay networks are currently being used to build large-scale distributed systems running various decentralized applications like distributed storage, content distribution, collaborative scheduling, and leader election. Although protocols such as Byzantine agreement and voting schemes exist for building resilient distributed applications, very few solutions are available for safeguarding these distributed protocols from Sybil attacks. In a Sybil attack, an adversary can forge multiple identities and create multiple, distinct nodes in the system, thereby defeating any upper bound on the number of malicious nodes that these protocols assume. In this paper, we present a multipath routing protocol that uses a graph-theoretic approach to group the Sybil nodes first and then to poll them using the Host Identity Protocol (HIP) to decide whether they really belong to a Sybil group. HIP clearly separates participating users from overlay nodes. It overcomes P2P network challenges like stability over time and identity differentiation. We also use a social network in which the number of attack edges is minimal. An attack edge between a malicious user and an honest user indicates that the malicious user is able to establish a trust relationship with the honest user by some means. We perform simulations to show the feasibility of our distributed protocol.
In this study, a method for hierarchical examination and visualization of GSM data using the Self-Organizing Map (SOM) is described. The data is examined in a few phases. At first temporally averaged data is used and th...
The paper presents an algorithm for identifying the independent subspace analysis model based on source dynamics. We propose to separate subspaces by decoupling their dynamic models. Each subspace is extracted by mini...