This paper presents an efficient and robust automatic process for large-scale sports video analysis. The proposed system firstly identifies the genre of the query video, and then accomplishes the interesting event det...
Roadmap methods are widely used in route planning, both for robots and for unmanned aircraft. A traditional roadmap is constructed by connecting the vertices of convex obstacles, which is related to the locations of...
Many theoretical and experimental results have appeared recently on the stability of T-S fuzzy systems and the convergence of the Particle Swarm Optimization (PSO) algorithm. In this paper, we present a T-S fuzzy stochastic PSO model in which the PSO algorithm is viewed as a time-invariant linear plant with a time-varying feedback controller that is embedded in the T-S fuzzy state system. The randomly weighted sum of the cognition component and social component is used as the state feedback controller in the local linear state system, and the PSO algorithm is theoretically improved from one that performs single stochastic optimization to one that performs fuzzy stochastic optimization. Conditions for asymptotic stability of the new model are given using the T-S fuzzy stability theory.
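As a point of reference for the abstract above, the following is a minimal sketch of a plain PSO loop in which each velocity update adds a randomly weighted sum of the cognition and social components. It does not implement the T-S fuzzy model or its stability conditions. The function name `pso_minimize`, the coefficients `w`, `c1`, `c2`, and the `sphere` test function are illustrative assumptions, not taken from the paper.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain PSO sketch; parameter values are assumptions for illustration."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Randomly weighted cognition and social components.
                cognition = c1 * r1 * (pbest[i][d] - pos[i][d])
                social = c2 * r2 * (gbest[d] - pos[i][d])
                vel[i][d] = w * vel[i][d] + cognition + social
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

In this sketch the term `cognition + social` plays the role of the feedback signal, while `w * vel[i][d]` carries the particle's own dynamics.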
Data indexing is common in data mining when working with high-dimensional, large-scale data sets. Hadoop, a cloud computing project that implements the MapReduce framework in Java, has attracted significant interest in distributed data mining. A feasible distributed data indexing algorithm is proposed for Hadoop data mining, based on ZSCORE binning, inverted indexing, and the Hadoop SequenceFile format. A data mining framework on Hadoop using the Java Persistence API (JPA) and MySQL Cluster is also proposed. The framework is illustrated through the implementation of a decision tree algorithm on Hadoop. We compare the data indexing algorithm with Hadoop MapFile indexing, which performs a binary search, in a modest cloud environment. The results show that the algorithm is more efficient than naïve MapFile indexing. We also compare the JDBC and JPA implementations of the data mining framework. The performance results show that the framework is efficient for data mining on Hadoop.
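The combination of z-score binning and inverted indexing mentioned above can be illustrated in memory with a short sketch. This is an assumption about what such a scheme might look like, not the paper's algorithm: it does not use Hadoop or SequenceFile, and the names `zscore_bin`, `build_inverted_index`, `lookup`, and the bin `width` parameter are hypothetical.

```python
import math
from collections import defaultdict


def zscore_bin(value, mean, std, width=1.0):
    """Map a value to an integer bin id from its z-score (assumed scheme)."""
    if std == 0:
        return 0
    return math.floor((value - mean) / (std * width))


def build_inverted_index(records):
    """Index record ids by (dimension, z-score bin)."""
    n = len(records)
    dims = len(records[0])
    means = [sum(r[d] for r in records) / n for d in range(dims)]
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in records) / n)
            for d in range(dims)]
    index = defaultdict(set)
    for rid, rec in enumerate(records):
        for d, v in enumerate(rec):
            index[(d, zscore_bin(v, means[d], stds[d]))].add(rid)
    return index, means, stds


def lookup(index, means, stds, dim, value):
    """Return ids of records whose value in `dim` falls in the same bin."""
    key = (dim, zscore_bin(value, means[dim], stds[dim]))
    return index.get(key, set())


records = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (10.0, 20.0)]
index, means, stds = build_inverted_index(records)
same_bin_dim0 = lookup(index, means, stds, 0, 2.5)   # records near 2.5 in dim 0
same_bin_dim1 = lookup(index, means, stds, 1, 20.0)  # records near 20 in dim 1
```

A lookup touches only one posting set per dimension, which is the property that makes an inverted index attractive compared with scanning or searching a sorted file.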
The growing volume of data produced in the real world makes the classification of very large-scale data a challenging task, so parallel processing of very large, high-dimensional data is very important. Hyper-Surface Classification (HSC) has proved to be an effective and efficient classification algorithm for two- and three-dimensional data. Although HSC can be extended to high-dimensional data through dimension reduction or ensemble techniques, handling high-dimensional data directly is not trivial. Inspired by the decision tree idea, this work proposes an improvement of HSC that deals with high-dimensional data directly. Furthermore, we parallelize the improved HSC algorithm (PHSC) to handle large-scale, high-dimensional data based on the MapReduce framework, a current and powerful parallel programming technique used in many fields. Experimental results show that the parallel improved HSC algorithm can not only deal directly with high-dimensional data but also handle large-scale data sets. Furthermore, the evaluation criteria of scaleup, speedup, and sizeup validate its efficiency.
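The three evaluation criteria named above are usually computed from wall-clock running times. The sketch below uses the definitions common in the parallel data mining literature; the abstract does not give its exact formulas, so these definitions, the function names, and the timings are assumptions for illustration only.

```python
def speedup(t_one_node, t_m_nodes):
    """Same data set: time on 1 node divided by time on m nodes."""
    return t_one_node / t_m_nodes


def scaleup(t_base, t_scaled):
    """Time for data D on 1 node divided by time for m*D on m nodes."""
    return t_base / t_scaled


def sizeup(t_base, t_m_times_data):
    """Same cluster: time for m*D divided by time for D."""
    return t_m_times_data / t_base


# Hypothetical timings in seconds, not results from the paper.
s = speedup(120.0, 40.0)     # 1 node vs. 4 nodes on the same data
c = scaleup(120.0, 150.0)    # D on 1 node vs. 4*D on 4 nodes
z = sizeup(120.0, 420.0)     # D vs. 4*D on the same cluster
```

Under these definitions, ideal behaviour is a speedup close to the number of nodes, a scaleup close to 1, and a sizeup no larger than the data growth factor.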
With the rapid development of XML, which offers good flexibility and interoperability, more and more log files recording software runtime information are represented in XML format, especially for Web services. Fault diagnosis by analyzing semi-structured, XML-like log files is becoming an important issue in this area. Most related learning methods rest on the basic assumption that the training data share an identical structure, which does not hold in many practical situations. To learn from training data with different structures, we propose a similarity-based Bayesian learning approach for fault diagnosis. Our method first estimates the degrees of similarity between structural elements from different log files. Then the basic structure of a combined Bayesian network (CBN) is constructed, and the similarity-based learning algorithm is used to compute the probabilities in the CBN. Finally, test log data are classified into possible fault categories based on the generated CBN. Experimental results show that our approach outperforms other learning approaches on training datasets that have different structures.
In the real world, a single image usually contains several different objects, which poses serious challenges for traditional pattern recognition methods when classifying images. In this paper, we introduce a Conditional Random Fields (CRFs) model to deal with the multi-label image classification problem. Considering the correlations among objects, a second-order CRF is constructed to capture the semantic associations between labels. Different initial feature weights are set to introduce voting techniques for better performance. We evaluate our method on the MSRC dataset and demonstrate high precision, recall, and F1 measure, showing that our method is competitive.
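The precision, recall, and F1 measure mentioned above can be computed for multi-label predictions as in the following sketch. The abstract does not say whether the scores are micro- or macro-averaged; micro-averaging over all image and label pairs is assumed here, and the label sets are made-up examples rather than MSRC data.

```python
def multilabel_prf(true_sets, pred_sets):
    """Micro-averaged precision, recall, and F1 over per-image label sets."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))  # correct labels
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))  # spurious labels
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))  # missed labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical ground truth and predictions for three images.
truth = [{"sky", "tree"}, {"cow", "grass"}, {"car"}]
preds = [{"sky"}, {"cow", "grass", "sky"}, {"car", "road"}]
precision, recall, f1 = multilabel_prf(truth, preds)
```

Here 4 of the 6 predicted labels are correct and 4 of the 5 true labels are recovered, so precision is 4/6, recall is 4/5, and F1 is 8/11.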
Artificial Neural Networks (ANNs), as nonlinear and adaptive information-processing systems, play an important role in machine learning, artificial intelligence, and data mining. However, the performance of ANNs is sensitive to the number of neurons, and achieving better network performance and simplifying the network topology are two competing objectives. Genetic Algorithms (GAs) are random search algorithms that simulate natural selection and evolution; they offer good global search ability and can learn an approximately optimal solution without gradient information about the error function. This paper gives a brief survey of ANN optimization with GAs. First, the basic principles of ANNs and GAs are introduced, and the advantages and disadvantages of each are analyzed to show the benefits of using GAs to optimize ANNs. Second, we briefly survey the basic theories and algorithms for optimizing the network weights, the network architecture, and the learning rules, and discuss the latest research progress. Finally, we consider the future development trends of the theory.
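To make the idea of gradient-free weight optimization concrete, the sketch below evolves the nine weights of a tiny 2-2-1 network on the XOR problem with a simple genetic algorithm. It is one illustrative variant (truncation selection with elitism, one-point crossover, Gaussian mutation) and is not drawn from the survey; the network shape, the population settings, and all function names are assumptions.

```python
import math
import random

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def forward(w, x):
    """2-2-1 network: two tanh hidden units and a sigmoid output."""
    h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return 1.0 / (1.0 + math.exp(-(w[6] * h1 + w[7] * h2 + w[8])))


def error(w):
    """Mean squared error on XOR; no gradient is ever computed."""
    return sum((forward(w, x) - y) ** 2 for x, y in XOR) / len(XOR)


def evolve(pop_size=30, generations=60, mutation_rate=0.2, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(9)] for _ in range(pop_size)]
    history = []                              # best error per generation
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        elite = pop[:pop_size // 5]           # elitism: the best survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, 9)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, 0.5) if rng.random() < mutation_rate else g
                     for g in child]          # Gaussian mutation
            children.append(child)
        pop = elite + children
    pop.sort(key=error)
    history.append(error(pop[0]))
    return pop[0], history


best_weights, history = evolve()
```

Because the elite individuals are carried over unchanged, the best error in `history` never increases from one generation to the next, although nothing guarantees that the search reaches zero error.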
The Back-Propagation (BP) neural network, one of the most mature and widely used algorithms, is capable of large-scale computation and has unique advantages when dealing with nonlinear, high-dimensional data. When a BP neural network is applied to high-dimensional data, the many feature variables provide ample information, but too many network inputs complicate the design of the hidden layer, consume considerable storage space and computing time, interfere with the convergence of training, and can ultimately reduce recognition accuracy. Factor analysis (FA) is a multivariate analysis method that transforms many feature variables into a few synthetic variables. Targeting samples with many feature variables, and taking the structure of the BP neural network into account, an FA-BP neural network algorithm is proposed. We first reduce the dimensionality of the features using FA, then use the reduced features as the input of the BP neural network and carry out network training and simulation on the resulting low-dimensional data. The algorithm simplifies the network structure, speeds up convergence, and reduces running time. We then apply the new algorithm to pest prediction. The results show that the new algorithm reduces the error of the predicted values without lowering prediction precision, and that the algorithm is therefore effective.
Using the properties of the ant colony algorithm and the genetic algorithm, a novel hybrid ant colony genetic algorithm, built on a genetic algorithm framework, is proposed to solve the traveling salesma...