Data preprocessing is important in machine learning, data mining, and pattern recognition. In particular, selecting relevant features in high-dimensional data is often necessary to efficiently construct models that ac...
ISBN: 9781424409723 (print)
Feature selection is one of the most important issues in fields such as data mining, pattern recognition, and machine learning. In this study, a new feature selection approach that combines the Fisher criterion and principal feature analysis (PFA) is proposed to identify the important (relevant and non-redundant) feature subset. The Fisher criterion is used to remove noisy or irrelevant features, and PFA is then used to choose a subset of principal features. The proposed approach was evaluated in pattern classification on five publicly available datasets. The experimental results show that it can substantially reduce the feature dimensionality with little loss of classification accuracy.
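The filtering step described above can be illustrated with a minimal sketch. The abstract does not state which form of the Fisher criterion is used, so this assumes the common two-class score, (difference of class means)^2 / (sum of class variances), computed per feature. The function names `fisher_scores` and `select_top_k` are hypothetical, and the PFA step is not shown.

```python
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Return one two-class Fisher score per feature (assumed form)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        # Split the values of feature j by class label.
        a = [x[j] for x, y in zip(samples, labels) if y == classes[0]]
        b = [x[j] for x, y in zip(samples, labels) if y == classes[1]]
        gap = (mean(a) - mean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Zero within-class variance: perfectly separating or constant.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def select_top_k(samples, labels, k):
    """Return the indices of the k highest-scoring features, in index order."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

In this sketch a feature whose class means coincide scores 0 and would be dropped first, which matches the stated role of the criterion as a filter for noisy or irrelevant features.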
ISBN: 9780889866461 (print)
The field of time series data mining has seen an explosion of interest in recent years. This interest has flowed over into many application areas, including fiber manufacturing systems. The volume of time series data generated by a fiber monitoring system can be huge, which limits the applicability of data mining algorithms to this problem domain. A widely used solution is to reduce the data size through feature extraction. Four of the most commonly used feature extraction techniques are Fourier transforms, wavelets, Piecewise Aggregate Approximation, and Piecewise Linear Approximation (PLA). In this paper, we first empirically demonstrate that PLA techniques produce the highest quality features for this problem domain. We then introduce a novel PLA algorithm that is shown to produce higher quality features than any other currently available technique.
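To make the idea of reducing data size through feature extraction concrete, here is a minimal sketch of Piecewise Aggregate Approximation, one of the four techniques named above. It is not the paper's PLA algorithm, which the abstract does not describe. The function name `paa` and the handling of lengths that do not divide evenly are assumptions of this sketch.

```python
def paa(series, n_segments):
    """Reduce a series to n_segments values by averaging consecutive chunks."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    features = []
    for i in range(n_segments):
        # Integer boundaries keep every chunk non-empty even when
        # n is not a multiple of n_segments.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        chunk = series[start:end]
        features.append(sum(chunk) / len(chunk))
    return features
```

For example, `paa([1, 2, 3, 4, 5, 6], 3)` replaces six samples with three segment means, so downstream mining algorithms see a third of the original data.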
ISBN: 0769528740 (print)
Data mining is a rapidly evolving area of research at the intersection of several disciplines, including statistics, databases, pattern recognition, and high-performance and parallel computing. In this paper we propose a novel mining algorithm, called ARMAGA (Association Rules Mining Algorithm based on a novel Genetic Algorithm), to mine association rules from an image database, where every image is represented by the ARMAGA representation. First, we take advantage of a genetic algorithm designed specifically for discovering association rules. Second, we propose the ARMAGA algorithm itself. Compared to the algorithm in [1], ARMAGA avoids generating impossible candidates and is therefore more efficient in terms of execution time.
ISBN: 9781424410651 (print)
In this paper, the fractal property of rotating machinery vibration signals and the principle of fractal data compression are reviewed. Based on the fractal property, an approach for vibration signal data compression and reconstruction is proposed. In this method, a signal is represented by the parameters of affine maps and is reconstructed according to the self-similarity represented by the IFS (iterated function system) parameters. The total data size of such a representation is far less than the original time-domain data size. To demonstrate the effectiveness of this method in resolving the bottleneck in remote transmission of large volumes of signal data and in improving the capability of remote equipment fault diagnosis systems, the method has been applied to actual vibration signals as well as simulated signals.
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
The hybridization of optimization techniques can exploit the strengths of different approaches and avoid their weaknesses. In this work we present a hybrid optimization algorithm based on the combination of Evolution Strategies (ES) and Locally Weighted Linear Regression (LWLR). In this hybrid, a local algorithm (LWLR) proposes a new solution that is used by a global algorithm (ES) to produce new, better solutions. The hybrid is applied to an interesting and difficult problem in astronomy: the two-dimensional fitting of brightness profiles in galaxy images. The use of standardized fitting functions is arguably the most powerful method for measuring the large-scale features (e.g. brightness distribution) and structure of galaxies, specifying parameters that can provide insight into the formation and evolution of galaxies. Here we employ the hybrid algorithm ES+LWLR to find models that describe the two-dimensional brightness profiles for a set of optical galactic images. Models are created using two functions, de Vaucouleurs and exponential, which produce models expressed as sets of concentric generalized ellipses that represent the brightness profiles of the images. This is an optimization problem because we need to minimize the difference between the flux from the model and the flux from the original optical image, measured by a normalized Euclidean distance. We solved this optimization problem using our hybrid algorithm ES+LWLR. We have obtained results for a set of 100 galaxies, showing that the hybrid algorithm is very well suited to solving this problem.
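As an illustration of the local component only, the following is a minimal one-dimensional sketch of locally weighted linear regression. The abstract does not give the kernel, the bandwidth, or how the LWLR proposal is passed to the ES, so the Gaussian kernel, the `tau` parameter, and the function name `lwlr_predict` are all assumptions, and the ES loop and galaxy model are not shown.

```python
import math


def lwlr_predict(xs, ys, query, tau=1.0):
    """Predict y at `query` from a weighted line fitted around `query`."""
    # Gaussian weights (assumed kernel): nearby points count more.
    w = [math.exp(-((x - query) ** 2) / (2 * tau ** 2)) for x in xs]
    total = sum(w)
    # Weighted means of x and y.
    mx = sum(wi * x for wi, x in zip(w, xs)) / total
    my = sum(wi * y for wi, y in zip(w, ys)) / total
    # Weighted least-squares slope around the weighted means.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (query - mx)
```

A smaller `tau` makes the fit follow local structure more closely, while a larger `tau` approaches ordinary linear regression over all points.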
ISBN: 9781424410651 (print)
The S-transform is a time-frequency analysis technique combining properties of the short-time Fourier and wavelet transforms. It provides frequency-dependent resolution while maintaining a direct relationship with the Fourier spectrum. However, the frequency resolution of the S-transform at high frequencies is unsatisfactory. In this paper, we present a data-adaptive S-transform that optimizes the window width according to a measure of 'concentration'. The proposed method is tested on a set of synthetic signals. The results show that the proposed algorithm achieves higher resolution and energy concentration than the original S-transform and the short-time Fourier transform.
ISBN: 9781424410651 (print)
This paper proposes an effective data hiding scheme for a binary host image for the purposes of access control, authentication, and copyright protection of digital media. The binary image is divided into blocks of size 2x2. These blocks are classified as embeddable or non-embeddable according to their characteristic values, and a binary data sequence is embedded in the embeddable blocks by changing their characteristic values. The advantages of the method are low computational cost, high embedding capacity, good security, and preservation of the high quality of the host image.
Data clustering is a long-standing research problem in pattern recognition, computer vision, machine learning, and data mining, with applications in a number of diverse disciplines. The goal is to partition a set of n ...
ISBN: 9781424410651 (print)
Lumber moisture content is a key parameter for regulating and controlling the wood drying process. Its measurement precision directly affects drying quality, cost, and time. In this paper, a fusion model capable of on-line measurement of lumber moisture content is presented. Models for predicting lumber moisture content are established using both back-propagation neural networks (BPNN) and dynamical recurrent neural networks (DRNN). The two models are then integrated using an arithmetic average and a recursive estimation algorithm. Simulation results based on experimental data show that the fusion model has higher predictive precision than either the BPNN or the DRNN alone, which indicates that the method is feasible.
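The integration step can be sketched under stated assumptions. The abstract does not specify how the arithmetic average and the recursive estimation are combined, so this shows each in its simplest form: a pointwise mean of two prediction sequences, and a recursive running-mean update. The function names `fuse_average` and `recursive_mean` are hypothetical, and the two input lists stand in for BPNN and DRNN outputs that are not modeled here.

```python
def fuse_average(pred_a, pred_b):
    """Fuse two prediction sequences by their pointwise arithmetic mean."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


def recursive_mean(values):
    """Return the running estimate after each new value arrives.

    Uses the update estimate_k = estimate_{k-1} + (z_k - estimate_{k-1}) / k,
    which needs only the previous estimate, not the full history.
    """
    estimate = 0.0
    history = []
    for k, z in enumerate(values, start=1):
        estimate += (z - estimate) / k
        history.append(estimate)
    return history
```

The recursive form suits on-line measurement because each new reading updates the estimate in constant time.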