We study the following problem: Given an array A storing N real numbers, preprocess it to allow fast reporting of the K smallest elements in the subarray A[i, j] in sorted order, for any triple (i, j, K) with 1 ≤ i ...
详细信息
ISBN:
(纸本)9780898719932
We study the following problem: Given an array A storing N real numbers, preprocess it to allow fast reporting of the K smallest elements in the subarray A[i, j] in sorted order, for any triple (i, j, K) with 1 ≤ i ≤ j ≤ N and 1 ≤ K ≤ j - i + 1. We are interested in scenarios where the array A is large, necessitating an I/O-efficient solution. For a parameter f with 1 ≤ f ≤ log_mn, we construct a data structure that uses O((N/f) log_mn) space and achieves a query bound of O(log_BN + fK/B) I/Os, where B is the block size, M is the size of the main memory, n := N/B, and m := M/B. Our main contribution is to show that this solution is nearly optimal. To be precise, we show that achieving a query bound of O((logn)~α + fK/B) I/Os, for any constantα, requires Ω(N((f~(-1)log_M n)/log(f~(-1)log_M n)) space, assuming B = Ω(logN). For M ≥ B~(1+ε), this is within a log log_mn factor of the upper bound. The lower bound assumes indivisibility of records and holds even if we assume K is always set to j - 1 + 1. We also show that it is the requirement that the K smallest elements be reported in sorted order which makes the problem hard. If the K smallest elements in the query range can be reported in any order, then we can obtain a linear-size data structure with a query bound of O(log~B N + K/B) I/Os.
Unsupervised classification has emerged as a popular technique for pattern recognition, image processing and data mining. It has a crucial contribution in the resolution of the problems arising from content-based imag...
详细信息
We present a language independent approach for conflation that does not depend on predefined rules or prior knowledge of the target language. The proposed unsupervised method is based on an enhancement of the pure n-g...
详细信息
We present a language independent approach for conflation that does not depend on predefined rules or prior knowledge of the target language. The proposed unsupervised method is based on an enhancement of the pure n-gram model that is used to group related words based on a revised string-similarity measure. In order to detect and eliminate terms that are created by this process, but that are most likely not relevant for the query (”noisy terms”), an approach based on mutual information scores computed based on web statistical cooccurrences data is proposed. Furthermore, an evaluation of this approach is presented.
Particle Swarm Optimisation (PSO) algorithm is known to be better than Genetic Algorithm (GA) as fewer operators are needed in its algorithm. However, it still has some weaknesses such as immature convergence;a condit...
详细信息
This paper introduces an approach to build a geographical ontology of countries at the global scale. Our approach is based on the free online encyclopedia Wikipedia to extract lists of places, then rebuild a hierarchy...
详细信息
Firearms identification is a vital aim of firearm analysis. The firing pin impression image on a cartridge case from a fired bullet is one of the most significant clues in firearms identification. In this study, a set...
详细信息
Firearms identification is a vital aim of firearm analysis. The firing pin impression image on a cartridge case from a fired bullet is one of the most significant clues in firearms identification. In this study, a set of data which focused on selected 6 features of firing pin impression images before an entirety of five different pistols of South African made; the Parabellum Vector SPI 9mm model, were used. The numerical features are geometric moments of whole image computed from a total of 747 cartridge case images. Under pattern recognition theory, the supervised features of firing pin impression images were then trained and validated using a two-layer backpropagation neural network (BPNN) design with computed hidden layers. A two-layer 6-7-5 connections BPNN of sigmoid/linear transfer functions with `trainlm' algorithm was found to yield the best classification result using cross-validation, where 96% of the images were correctly classified according to the pistols used. Moreover, the network was trained under very small mean-square error (MSE=0.01). This means that neural network method is capable to learn and validate well the numerical features of whole firing pin impression with high precision and fast classification results.
Particle Swarm Optimisation (PSO) algorithm is known to be better than Genetic Algorithm (GA) as fewer operators are needed in its algorithm. However, it still has some weaknesses such as immature convergence; a condi...
详细信息
Particle Swarm Optimisation (PSO) algorithm is known to be better than Genetic Algorithm (GA) as fewer operators are needed in its algorithm. However, it still has some weaknesses such as immature convergence; a condition whereby PSO tends to get trapped in a local optimum. This condition prevents them from being converged towards a better position. Various techniques have been proposed to tackle this problem by many means. This paper attempts to integrate several velocity-based reinitialisation (VBR) approaches in PSO for solving feature selection problem. Five benchmark datasets of various features dimension were used to implement the approaches. The results were analysed based on classifier performance and the selected number of features. The findings show that the proposed VBR is generally significantly better than the existing VBR approaches.
This paper describes the CDRAS (Call Detail Records Analysis System) system, the motivation behind it, its approach and its background. The system aims at dealing with the notorious Man-in-the-Middle attack in the con...
详细信息
In this paper, we use strict mathematics reasoning to discover the relation between the threshold and reduction in Fuzzy Variable Precision Rough Sets (FVPRS), i.e., the reductions act as a nested structure with the m...
详细信息
In this paper, we use strict mathematics reasoning to discover the relation between the threshold and reduction in Fuzzy Variable Precision Rough Sets (FVPRS), i.e., the reductions act as a nested structure with the monotonously increasing threshold. By using the nested structure of reductions, we could design algorithms to quickly find different reductions when a reduction is required. Here `different' means the reductions obtained using different thresholds.
暂无评论