Observer agreement is often regarded as the sine qua non of observational research. Cohen's kappa is a widely used index and is appropriate when discrete entities, such as a turn of talk or a demarcated time interval, are presented to pairs of observers to code. Kappa-like statistics and agreement matrices are also used for the timed-event sequential data produced when observers first segment and then code events detected in the stream of behavior, noting onset and offset times. Such kappas are of two kinds: time-based and event-based. Available for download is a computer program (OASTES; Observer Agreement for Simulated Timed Event Sequences) that simulates the coding of observers of a stated accuracy and then computes agreement statistics for two time-based kappas (with and without tolerance) and three event-based kappas (one implemented in The Observer, one in INTERACT, and one in GSEQ). On the basis of the simulation results presented here, and because each kind provides somewhat different information, reporting both a time-based and an event-based kappa is recommended.
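As a concrete illustration of the time-based variant, the following minimal sketch computes Cohen's kappa over two coders' interval-by-interval code sequences. The function name and the ten-interval data are invented for illustration; this is not the OASTES implementation and applies no tolerance window.

```python
def cohens_kappa(seq_a, seq_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    assert len(seq_a) == len(seq_b)
    n = len(seq_a)
    codes = sorted(set(seq_a) | set(seq_b))
    # Observed agreement: proportion of units coded identically.
    p_o = sum(a == b for a, b in zip(seq_a, seq_b)) / n
    # Chance agreement: product of each coder's marginal proportions.
    p_e = sum((seq_a.count(c) / n) * (seq_b.count(c) / n) for c in codes)
    return (p_o - p_e) / (1 - p_e)

# Two coders labeling ten one-second intervals (hypothetical data):
a = list("AABBBCCAAB")
b = list("AABBCCCAAB")
print(round(cohens_kappa(a, b), 3))
```

Here observed agreement is 0.9 and chance agreement 0.34, giving kappa ≈ 0.848; a tolerance-based variant would additionally count near-misses in time as agreements.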
The join operation combines information from multiple data sources. Efficient processing of join queries is a pivotal issue in most database systems. My PhD research focuses on joins in two categories of novel applications. The first is continuous joins over data streams. Specifically, we exploit two key properties of the streaming join. First, the initial plan of a long-running query may gradually become inefficient due to changes in data characteristics. This necessitates dynamic plan migration: an online transition from the old plan to a more efficient one generated from current statistics. The only known solutions, MS and PT, have serious shortcomings. Hence, we propose HybMig, which combines their merits and outperforms both in every respect. Another important property is that an output tuple from an upstream join (the producer) may never generate any result in downstream operators (the consumers) during its entire lifespan. Motivated by this, we propose just-in-time (JIT) processing, a novel methodology that enables a producer to selectively generate outputs based on feedback from consumers expressing their current demand. Extensive experiments show that JIT achieves significant savings in both CPU time and memory consumption. The second class of joins studied in this thesis is authenticated joins in outsourced databases. In particular, database outsourcing requires that the query server construct a proof of result correctness, which the client can verify using the data owner's signature. Addressing such queries, we propose a comprehensive set of new solutions that cover the entire spectrum of index availability. Furthermore, we extend them to authenticate complex queries involving multi-way joins and other relational operators. Our experiments demonstrate that the proposed methods outperform two existing benchmark solutions, often by orders of magnitude.
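For background on the continuous joins discussed above, the sketch below shows the classic symmetric hash join that underlies most streaming join operators: each arriving tuple is inserted into its own side's hash table and immediately probed against the opposite side, so results stream out as input arrives. This is generic illustration only, not HybMig or JIT; the class and data names are invented.

```python
from collections import defaultdict

class SymmetricHashJoin:
    """Pipelined equi-join over two unbounded streams (no windowing shown)."""

    def __init__(self):
        self.left = defaultdict(list)   # join key -> left tuples seen so far
        self.right = defaultdict(list)  # join key -> right tuples seen so far

    def insert(self, side, key, tup):
        build, probe = (self.left, self.right) if side == "L" else (self.right, self.left)
        build[key].append(tup)
        # Probe the opposite table and emit matches immediately.
        for other in probe[key]:
            yield (tup, other) if side == "L" else (other, tup)

j = SymmetricHashJoin()
out = []
out += j.insert("L", 1, ("l1",))   # no match yet
out += j.insert("R", 1, ("r1",))   # matches l1
out += j.insert("R", 2, ("r2",))   # no match yet
out += j.insert("L", 1, ("l2",))   # matches r1
print(out)
```

A plan-migration scheme such as MS, PT, or HybMig would transition between operator trees built from units like this one while preserving exactly these incremental outputs.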
ISBN: (Print) 9780769535166
To obtain the greatest economic benefit from the synthesis loop at the minimum energy consumption per unit of methanol, this paper analyzes the mechanism of methanol synthesis, proposes a control algorithm based on a chain system, identifies the two major factors that affect energy consumption and output, builds a chain-system model composed of two related chains, and designs the prediction algorithm and the chain control system for the methanol synthesis process. Simulation indicates that the chain-system model has high stability, anti-interference ability, and fault tolerance, so the process can be controlled effectively.
An adaptive denoising algorithm based on local sparse representation (local SR) is proposed. The basic idea is to apply SR locally to clusters of signals embedded in a high-dimensional space of delayed coordinates. The clusters of signals are represented by sparse linear combinations of atoms that depend on the nature of the signal. The algorithm is applied to denoising noisy chaotic signals to test its performance. Compared with recently reported leading alternative denoising algorithms, such as kernel principal component analysis (kernel PCA), local independent component analysis (local ICA), local PCA, and wavelet shrinkage (WS), the proposed algorithm is more efficient.
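The embed-process-reconstruct pipeline common to such local methods can be sketched as follows. This is a skeleton only: plain overlap averaging stands in for the sparse-coding step, which in the actual algorithm requires a dictionary of atoms; the function names and the six-sample signal are invented.

```python
def delay_embed(x, dim):
    """Map a 1-D signal into overlapping delay-coordinate vectors."""
    return [x[i:i + dim] for i in range(len(x) - dim + 1)]

def reconstruct(windows, n, dim):
    """Average each sample over all delay windows covering it."""
    acc = [0.0] * n
    cnt = [0] * n
    for i, w in enumerate(windows):
        for j, v in enumerate(w):
            acc[i + j] += v
            cnt[i + j] += 1
    return [a / c for a, c in zip(acc, cnt)]

x = [0.0, 1.1, 1.9, 3.2, 4.0, 4.8]          # toy noisy signal
wins = delay_embed(x, 3)
# ...each window (or cluster of windows) would be sparse-coded here...
y = reconstruct(wins, len(x), 3)
```

With the coding step omitted, overlap averaging is lossless and `y` recovers `x` exactly; denoising arises only when the windows are first projected onto a sparse combination of atoms.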
ISBN: (Digital) 9781614707974
ISBN: (Print) 9781607412892
Data mining is the process of extracting hidden patterns from data. As more data is gathered, with the amount of data doubling every three years, data mining is becoming an increasingly important tool for transforming this data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection, and scientific discovery. Data management, in turn, is the development, execution, and supervision of plans, policies, programs, and practices that control, protect, deliver, and enhance the value of data and information assets; it comprises all the disciplines related to managing data as a valuable resource. This new and important book gathers the latest research from around the globe in these fields and related topics, such as: cognitive finance, data mining of the Indian mineral industry, managing building information models, a new co-training method for data mining, and others.
We propose a new approach to rigorously prove the existence of the steady-state degree distribution for the BA network. The approach is based on a vector Markov chain of vertex numbers in the network evolving process. This framework provides a rigorous theoretical basis for the rate equation approach which has been widely applied to many problems in the field of complex networks, e.g., epidemic spreading and dynamic synchronization.
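The steady-state degree distribution in question can be observed empirically with a short simulation of BA growth. The sketch below is a generic textbook implementation of preferential attachment, not the paper's vector-Markov-chain formalism; parameter values and names are illustrative.

```python
import random

def ba_degrees(n, m=2, seed=1):
    """Simulate BA growth from an (m+1)-clique; return each vertex's degree.
    `ends` holds every edge endpoint, so drawing uniformly from it
    implements degree-proportional (preferential) attachment."""
    random.seed(seed)
    ends = []
    for i in range(m + 1):                    # seed clique
        for j in range(i + 1, m + 1):
            ends += [i, j]
    for v in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:                # m distinct, degree-biased targets
            chosen.add(random.choice(ends))
        ends.extend(chosen)
        ends.extend([v] * m)
    deg = [0] * n
    for e in ends:
        deg[e] += 1
    return deg

deg = ba_degrees(3000, m=2)
print(round(sum(deg) / len(deg), 1))  # mean degree approaches 2m = 4
```

Tabulating `deg` shows the heavy tail whose steady-state form (the rate-equation result P(k) ~ k^-3) the proposed Markov-chain framework puts on a rigorous footing.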
The transmembrane permeation of eight small (molecular weight < 100) organic molecules across a phospholipid bilayer is investigated by multiscale molecular dynamics simulation. The bilayer and hydrating water are represented by simplified, efficient coarse-grain models, whereas the permeating molecules are described by a standard atomic-level force field. Permeability properties are obtained through a refined version of the z-constraint algorithm. By constraining each permeant at selected depths inside the bilayer, we have sampled free-energy differences and diffusion coefficients across the membrane. These data have been combined, according to the inhomogeneous solubility-diffusion model, to yield the permeability coefficients. The results are generally consistent with previous atomic-level calculations and available experimental data. Computationally, our multiscale approach proves two orders of magnitude faster than traditional atomic-level methods.
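The final combination step can be sketched numerically: in the inhomogeneous solubility-diffusion model the permeability coefficient is the reciprocal of an integrated resistance, 1/P = ∫ exp(ΔG(z)/RT) / D(z) dz across the bilayer. The free-energy and diffusion profiles below are toy functions in arbitrary units, not the paper's coarse-grain results.

```python
import math

RT = 2.494  # kJ/mol at ~300 K

def dG(z):
    """Toy free-energy profile: 20 kJ/mol Gaussian barrier at the bilayer center."""
    return 20.0 * math.exp(-z ** 2)

def D(z):
    """Toy depth-dependent diffusion coefficient (arbitrary units)."""
    return 5e-6 * (1.0 - 0.5 * math.exp(-z ** 2))

def permeability(zmin=-2.0, zmax=2.0, n=2000):
    """1/P = integral of exp(dG(z)/RT) / D(z), via the trapezoid rule."""
    h = (zmax - zmin) / n
    f = [math.exp(dG(zmin + i * h) / RT) / D(zmin + i * h) for i in range(n + 1)]
    resistance = h * (sum(f) - 0.5 * (f[0] + f[-1]))
    return 1.0 / resistance

P = permeability()
print(f"P = {P:.3e}")
```

Because the barrier enters exponentially, the resistance is dominated by the narrow region around the free-energy maximum, which is why accurate sampling at the bilayer center matters most.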
In this paper, based on a weighted projection of a bipartite user-object network, we introduce a personalized recommendation algorithm, called network-based inference (NBI), which has higher accuracy than the classical algorithm, namely collaborative filtering. In NBI, the correlation resulting from a specific attribute may be repeatedly counted in the cumulative recommendations from different objects. By considering higher-order correlations, we design an improved algorithm that can, to some extent, eliminate these redundant correlations. We test our algorithm on two benchmark data sets, MovieLens and Netflix. Compared with NBI, the algorithmic accuracy, measured by the ranking score, is further improved by 23% for MovieLens and 22% for Netflix. The present algorithm can even outperform the latent Dirichlet allocation algorithm, which requires much longer computational time. Furthermore, most previous studies considered algorithmic accuracy only; in this paper, we argue that diversity and popularity, as two significant criteria of algorithmic performance, should also be taken into account. With more or less the same accuracy, an algorithm giving higher diversity and lower popularity is more favorable. Numerical results show that the present algorithm can outperform the standard one simultaneously in all five adopted metrics: lower ranking score and higher precision for accuracy, larger Hamming distance and lower intra-similarity for diversity, and smaller average degree for popularity.
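The basic NBI resource-spreading step can be sketched on a toy bipartite network: each collected object distributes one unit of resource to its users, each user redistributes to all their objects, and the accumulated resource ranks the objects. The data are invented, and this omits the higher-order correction that is the paper's contribution.

```python
def nbi_scores(adj, user):
    """adj[u][o] = 1 if user u collected object o.
    Returns NBI recommendation scores for `user` via the two-step
    object -> user -> object resource-spreading process."""
    n_users, n_objs = len(adj), len(adj[0])
    k_user = [sum(row) for row in adj]                                  # user degrees
    k_obj = [sum(adj[u][o] for u in range(n_users)) for o in range(n_objs)]
    # W[a][b]: fraction of object b's unit resource that lands on object a
    W = [[(sum(adj[u][a] * adj[u][b] / k_user[u]
               for u in range(n_users) if k_user[u]) / k_obj[b])
          if k_obj[b] else 0.0
          for b in range(n_objs)]
         for a in range(n_objs)]
    # Final score: spread resource from everything `user` already collected.
    return [sum(W[a][b] * adj[user][b] for b in range(n_objs))
            for a in range(n_objs)]

adj = [[1, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 1, 1]]
scores = nbi_scores(adj, user=0)
# Uncollected objects (indices 2 and 3 here) are then ranked by score.
```

In this toy case object 2 outranks object 3 for user 0, since it shares users with both of the objects user 0 has collected.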
Imaging algorithms recently developed in ultrasonic nondestructive testing (NDT) have shown good potential for defect characterization. Many of them are based on the concept of collecting the full matrix of data, obtained by firing each element of an ultrasonic phased array independently while receiving with all elements. Because of the finite sound velocity in the test structure, 2 consecutive firings must be separated by a minimum time interval. Depending on the number of elements in a given array, this may become problematic if data must be collected within a short time, as is often the case in an industrial context. An obvious way to decrease the duration of data capture is to use a sparse transmit aperture, in which only a restricted number of elements transmit ultrasonic waves. This paper compares 2 approaches for producing an image from such restricted data: the common source method and the effective aperture technique. The effective aperture technique is based on the far-field approximation, and no similar approach exists for the near field. This paper investigates the performance of this technique in near-field conditions, where most NDT applications operate. First, these methods are described and their point spread functions are compared with that of the Total Focusing Method (TFM), which consists of focusing the array at every point in the image. Then, a map of efficiency is given for the different algorithms in the near field. The map can be used to select the most appropriate algorithm. Finally, this map is validated by testing the different algorithms on experimental data.
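The Total Focusing Method benchmark above can be sketched on synthetic full-matrix data: for each image point, every transmit-receive signal is sampled at the round-trip time through that point and summed, so the sum peaks where a scatterer actually sits. Element positions, the pulse shape, and the units below are all invented for illustration.

```python
import math

C = 1.0                                   # assumed wave speed (arbitrary units)
ELEMS = [(float(x), 0.0) for x in range(8)]  # 8-element linear array on z = 0
SCAT = (3.5, 4.0)                         # hypothetical point scatterer

def tof(a, b):
    """Time of flight between two points."""
    return math.dist(a, b) / C

def echo(t, t0, width=0.05):
    """Narrow Gaussian pulse standing in for a measured A-scan sample."""
    return math.exp(-((t - t0) / width) ** 2)

def tfm(px, pz):
    """Focus the full matrix of data at image point (px, pz)."""
    s = 0.0
    for tx in ELEMS:
        for rx in ELEMS:
            t_meas = tof(tx, SCAT) + tof(SCAT, rx)       # synthetic FMC echo time
            t_focus = tof(tx, (px, pz)) + tof((px, pz), rx)
            s += echo(t_focus, t_meas)
    return s

on = tfm(*SCAT)       # amplitude at the true scatterer position (all 64 pairs align)
off = tfm(1.0, 2.0)   # amplitude away from it
```

A sparse-transmit scheme such as the common source method would simply restrict the outer `tx` loop, trading the point-spread-function quality examined in the paper for faster capture.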