The growing volume of data produced in the real world makes classification of very large scale data a challenging task, so parallel processing of very large, high-dimensional data is important. Hyper-Surface Classification (HSC) has proven to be an effective and efficient classification algorithm for two- and three-dimensional data. Although HSC can be extended to high-dimensional data with dimension reduction or ensemble techniques, it is not trivial to tackle high-dimensional data directly. Inspired by the decision tree idea, this work proposes an improvement of HSC that handles high-dimensional data directly. Furthermore, we parallelize the improved HSC algorithm (PHSC) on the MapReduce framework, a current and powerful parallel programming model used in many fields, to handle large-scale high-dimensional data. Experimental results show that the parallel improved HSC algorithm can not only deal with high-dimensional data directly but also handle large-scale data sets. Furthermore, the evaluation criteria of scaleup, speedup and sizeup validate its efficiency.
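The MapReduce pattern the abstract relies on can be illustrated with a minimal sketch (hypothetical; this is not the authors' PHSC implementation, and the stand-in classifier is invented for illustration): each mapper classifies one partition of the data, and the reducer merges the per-partition results.

```python
from functools import reduce

def simple_classifier(point):
    # Hypothetical stand-in classifier: sign of the coordinate sum.
    return 1 if sum(point) >= 0 else 0

def mapper(partition):
    # Emit label counts for one data partition (one "map" task).
    out = {}
    for point in partition:
        label = simple_classifier(point)
        out[label] = out.get(label, 0) + 1
    return out

def reducer(acc, part_result):
    # Merge per-partition label counts (the "reduce" step).
    for label, count in part_result.items():
        acc[label] = acc.get(label, 0) + count
    return acc

data = [[1.0, 2.0], [-3.0, 0.5], [0.2, 0.1], [-1.0, -1.0]]
partitions = [data[:2], data[2:]]          # simulate two workers
counts = reduce(reducer, map(mapper, partitions), {})
print(counts)  # {1: 2, 0: 2}
```

In a real MapReduce deployment the partitions would live on different nodes and the framework would handle shuffling; the map/reduce decomposition itself is the same.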
With the rapid development of the XML language, which has good flexibility and interoperability, more and more log files of software runtime information are represented in XML format, especially for Web services. Fault diagnosis by analyzing semi-structured, XML-like log files is becoming an important issue in this area. Most related learning methods rest on the basic assumption that training data share an identical structure, which does not hold in many practical situations. In order to learn from training data with different structures, we propose a similarity-based Bayesian learning approach for fault diagnosis in this paper. Our method first estimates similarity degrees of structural elements from different log files. Then the basic structure of a combined Bayesian network (CBN) is constructed, and the similarity-based learning algorithm is used to compute probabilities in the CBN. Finally, test log data can be classified into possible fault categories based on the generated CBN. Experimental results show that our approach outperforms other learning approaches on training datasets with different structures.
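The first step, estimating similarity between structural elements of differently structured logs, can be sketched minimally. Jaccard similarity over element tag sets is a hypothetical stand-in here, not the paper's actual similarity measure, and the tag names are invented:

```python
def tag_similarity(tags_a, tags_b):
    # Jaccard similarity: |intersection| / |union| of the tag sets.
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b)

# Two hypothetical log structures sharing three of five distinct tags.
log_a = ["timestamp", "service", "operation", "status"]
log_b = ["timestamp", "service", "fault", "status"]
print(round(tag_similarity(log_a, log_b), 2))  # 0.6
```

A similarity degree like this is what lets elements from structurally different logs be aligned before the combined network is built.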
A variety of wavelet transform methods have been introduced to remove noise from images. However, many of these algorithms remove fine details and smooth image structures while removing noise. The wavelet coefficient magnitude sum (WCMS) algorithm can preserve edges, but at the expense of noise removal. The non-local means algorithm can remove noise effectively, but it tends to cause distortion (e.g., whitening), and when the noise level is high the method is less effective. In this paper, we propose an efficient denoising algorithm: we denoise the image with the non-local means algorithm in the spatial domain and the WCMS algorithm in the wavelet domain, then combine the two results with weights to obtain the final image. Experiments show that our algorithm improves PSNR by 0.6 dB to 1.0 dB, and image boundaries are clearer.
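The fusion step described above can be sketched with stand-in denoisers (a box blur in place of non-local means and a hard threshold in place of the WCMS rule; the weight is hypothetical):

```python
import numpy as np

def box_blur(img):
    # Stand-in for the spatial-domain (non-local means) denoiser.
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(p[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

def hard_threshold(img, t=0.3):
    # Stand-in for the wavelet-domain (WCMS) denoiser: zero small values.
    return np.where(np.abs(img) > t, img, 0.0)

def fused_denoise(noisy, w=0.5):
    # Weighted combination of the two denoised estimates.
    return w * box_blur(noisy) + (1.0 - w) * hard_threshold(noisy)

rng = np.random.default_rng(0)
noisy = np.ones((8, 8)) + 0.1 * rng.standard_normal((8, 8))
clean = fused_denoise(noisy)
print(clean.shape)  # (8, 8)
```

The design point is that the two estimates have complementary failure modes, so a convex combination can beat either alone.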
In the real world, several different objects usually exist in one image, which poses intractable challenges to traditional pattern recognition methods for image classification. In this paper, we introduce a Conditional Random Fields (CRFs) model to deal with the multi-label image classification problem. Considering the correlations among objects, a second-order CRF is constructed to capture the semantic associations between labels. Different initial feature weights are set to introduce voting techniques for better performance. We evaluate our methods on the MSRC dataset and demonstrate high precision, recall and F1 measure, showing that our method is competitive.
Cone-Beam Computed Tomography (CBCT) has always been at the forefront of medical image processing. Denoising, as an image pre-processing step, has a great effect on subsequent image analysis and recognition. In this paper, a new algorithm for image denoising is proposed. By thresholding the interscale wavelet coefficient magnitude sum (WCMS) within a cone of influence (COI), the wavelet coefficients are classified into two categories: irregular coefficients, and edge-related and regular coefficients, which are processed in different ways. Meanwhile, according to the characteristics of projection image sequences in the CBCT system, an effective noise variance estimation method is proposed. Experiments show that our algorithm improves PSNR by 1.3 dB to 2.6 dB, and image borders are clearer.
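The coefficient-classification step can be sketched minimally (the threshold, scales, and coefficient values here are hypothetical): sum coefficient magnitudes across scales at each position, then split positions into edge-related/regular versus irregular by thresholding that sum.

```python
import numpy as np

def classify_by_wcms(coeffs_per_scale, thresh):
    # coeffs_per_scale: list of same-shape arrays, one per wavelet scale.
    wcms = sum(np.abs(c) for c in coeffs_per_scale)
    # True where the interscale magnitude sum is large: edge-related
    # or regular coefficients. False: irregular (noise-like).
    return wcms >= thresh

# Two hypothetical detail bands; positions 1 and 3 persist across scales.
scale1 = np.array([0.1, 2.0, 0.05, 1.5])
scale2 = np.array([0.2, 1.8, 0.10, 1.2])
mask = classify_by_wcms([scale1, scale2], thresh=1.0)
print(mask)  # [False  True False  True]
```

The intuition is that edges produce coefficients that persist across scales, so their interscale magnitude sum stays large while noise decays.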
Support Vector Machine (SVM) is a classification technique in machine learning based on statistical learning theory. A quadratic optimization problem needs to be solved in the algorithm, and with the increase of the s...
This paper is concerned with blind separation of convolutive sources. The main idea is to explicitly exploit the block Toeplitz structure and block-inner diagonal structure in the autocorrelation matrices of the source signals at different time delays, as well as the inherent relations among these matrices. By implementing joint block diagonalization, a tri-quadratic cost function is introduced so that the mixing matrix can be extracted from a set of correlation matrices of the observed vector sequence without pre-whitening. In this novel one-stage algorithm, every iteration step involves finding the closed-form solution to the corresponding least squares problem. Once the estimate of the mixing matrix is obtained, the source signals are retrieved by classical least squares methods. The performance of the proposed algorithm is illustrated by simulation results.
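The statistics the method operates on are autocorrelation matrices of the observed mixture at several time lags; a minimal sketch of computing them (the mixing matrix, signal lengths, and lags are hypothetical, and the joint block diagonalization itself is not reproduced):

```python
import numpy as np

def autocorr_matrices(X, lags):
    # X: (channels, samples). One channel-by-channel correlation
    # matrix per time lag, estimated from the sample sequence.
    n = X.shape[1]
    return [X[:, : n - lag] @ X[:, lag:].T / (n - lag) for lag in lags]

rng = np.random.default_rng(1)
sources = rng.standard_normal((2, 1000))      # hypothetical sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])        # hypothetical mixing matrix
X = A @ sources                               # observed mixture
R = autocorr_matrices(X, lags=[0, 1, 2])
print(len(R), R[0].shape)  # 3 (2, 2)
```

In the paper, it is the shared structure across this set of matrices that the joint block diagonalization exploits to recover the mixing matrix without pre-whitening.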
Conventional pulse compression uses the periodic echo of a single receive antenna, which is modulated by a certain carrier frequency; in other words, only a single spectrum is exploited. In MIMO radar, however, multi-carrier-frequency signals are transmitted simultaneously. If the spectra of the target echoes after channel separation can be combined to form the full-band echo spectrum, the corresponding range resolution can improve several-fold compared with the conventional method, and follow-up detection and tracking become more convenient. Depending on the difference between the frequency modulation bandwidth and the interval between adjacent carrier frequencies, the joined spectra after channel separation will overlap or leave gaps. Methods of shifting the spectrum of each echo and extrapolating the spectrum with the Root-MUSIC algorithm are proposed, by which a high-resolution range profile of the target is obtained. Simulation results verify the validity of these methods.
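The core idea, placing each channel's sub-band spectrum in its slot of a wide band and inverse-transforming to get a finer range profile, can be sketched minimally (slot sizes and test signals are hypothetical; the Root-MUSIC extrapolation for gaps is not reproduced):

```python
import numpy as np

def join_spectra(subband_spectra, slot_len):
    # Place each channel's spectrum side by side in one wide spectrum,
    # i.e. the gap-free, non-overlapping case of the spectrum joint.
    wide = np.zeros(slot_len * len(subband_spectra), dtype=complex)
    for i, spec in enumerate(subband_spectra):
        wide[i * slot_len : i * slot_len + len(spec)] = spec
    return wide

# Two 4-bin sub-band spectra from two carrier channels.
ch0 = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))
ch1 = np.fft.fft(np.array([0.0, 1.0, 0.0, 0.0]))
wide = join_spectra([ch0, ch1], slot_len=4)
profile = np.fft.ifft(wide)  # 8-point profile: 2x finer sampling
print(len(profile))  # 8
```

Doubling the usable bandwidth doubles the number of range bins over the same window, which is the resolution gain the abstract describes.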
ISBN: (Print) 9781424475421
Although many methods of refining initialization have appeared, the sensitivity of K-Means to initial centers is still an obstacle in applications. In this paper, we investigate a new class of clustering algorithms, K-Alpha Means (KAM), which is insensitive to the initial centers. With K-Harmonic Means as a special case, KAM dynamically weights data points while iteratively updating centers, deemphasizing data points that are close to some center while emphasizing data points that are not close to any center. By replacing the minimum operator in K-Means with the alpha-mean operator, KAM significantly improves clustering performance.
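The dynamic-weighting idea can be sketched with the K-Harmonic Means special case (the alpha-mean generalization of the paper is not reproduced; this follows Zhang's KHM center update, and the data and p value are hypothetical):

```python
import numpy as np

def weighted_center_update(X, centers, p=3.5, eps=1e-9):
    # Distances from every point to every center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    # KHM soft membership: which center each point pulls on.
    m = d ** (-p - 2) / np.sum(d ** (-p - 2), axis=1, keepdims=True)
    # Per-point weight: small for points near a center, larger for
    # points not close to any center (the dynamic weighting).
    w = np.sum(d ** (-p - 2), axis=1) / np.sum(d ** (-p), axis=1) ** 2
    mw = m * w[:, None]
    return (mw.T @ X) / np.sum(mw, axis=0)[:, None]

# Two tight clusters; deliberately off-center initial guesses.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[1.0, 1.0], [4.0, 4.0]])
for _ in range(20):
    centers = weighted_center_update(X, centers)
print(np.round(centers, 1))
```

Because every point influences every center with a distance-dependent weight, the update is far less dependent on where the centers start than the hard-assignment K-Means step.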
Co-occurrence histograms of oriented gradients (CoHOG) are powerful descriptors for object detection. In this paper, we propose to utilize a very large pool of CoHOG features with variable-location and variable-size blocks to capture salient characteristics of the object structure. We consider a CoHOG feature as a block with a special pattern described by its offset. A boosting algorithm is further introduced to select appropriate locations and offsets to construct an efficient and accurate cascade classifier. Experimental results on public datasets show that our approach simultaneously achieves high accuracy and fast speed on both pedestrian and car detection tasks.
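Selecting features from a large pool with boosting can be sketched AdaBoost-style (the binary stump features and labels here are hypothetical toys, not CoHOG blocks): each round picks the feature with the lowest weighted error, then reweights the samples it gets wrong.

```python
import math

def boost_select(feature_values, labels, rounds):
    # feature_values[f][i]: 0/1 prediction of stump f on sample i.
    n = len(labels)
    w = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        # Pick the feature with the lowest weighted error.
        errs = [sum(wi for wi, v, y in zip(w, fv, labels) if v != y)
                for fv in feature_values]
        best = min(range(len(errs)), key=errs.__getitem__)
        err = max(errs[best], 1e-9)
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((best, alpha))
        # Reweight: emphasize samples the chosen feature misclassifies.
        w = [wi * math.exp(alpha if v != y else -alpha)
             for wi, v, y in zip(w, feature_values[best], labels)]
        s = sum(w)
        w = [wi / s for wi in w]
    return chosen

features = [[1, 1, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
labels = [1, 1, 0, 0]
picked = boost_select(features, labels, rounds=2)
print([f for f, _ in picked])  # [0, 1]
```

Note how round two picks a different feature: after reweighting, the sample that feature 0 misclassifies dominates, so a complementary feature wins, which is what lets boosting assemble a cascade from a huge pool.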