Diagnosability indicates whether a fault can be detected in finite time, which is an important property in model-based diagnosis. Because diagnosis depends on sensor placement and on the model, it is hard to choose in practical applications whether to place more sensors for testing or to compute more diagnosis pathways. In this paper, a method is proposed to resolve this trade-off by defining key points. The key points take priority for testing among the existing sensors; at these key points, the observations can be optimized and diagnosis tests run efficiently. Experimental results indicate that this method achieves efficient fault discrimination and identification, and reduces both the sensor cost and the computational complexity of diagnosis in the case of normal behavior.
Feature selection is an effective technique for reducing the high dimensionality of data. It is prevalent in many application domains, such as text categorization and bio-informatics, and brings many advantages, such...
In this paper, we propose PSOfold, a particle swarm optimization algorithm for RNA secondary structure prediction. PSOfold is based on the recently published IPSO. We present two strategies to improve the performance of IPSO. F...
Filtering techniques are used in Constraint Satisfaction Problems to remove all the local inconsistencies during a processing step or prune the search tree efficiently during search. Local consistencies are used as pr...
Most existing ontology stores use relational databases as a backend to manage RDF data. This motivates us to translate SPARQL queries, the proposed standard for RDF querying, into equivalent SQL queries. At t...
Microarray data are highly redundant and noisy, and most genes are believed to be uninformative with respect to the studied classes, as only a fraction of genes may present distinct profiles for different classes of samples. This paper proposes a novel hybrid framework (NHF) for the classification of high-dimensional microarray data, which combines information gain (IG), F-score, a genetic algorithm (GA), particle swarm optimization (PSO), and support vector machines (SVM). In order to identify a subset of informative genes embedded in a large dataset contaminated with high-dimensional noise, the proposed method is divided into three stages. In the first stage, IG is used to construct a ranking list of features, and only the top 10% of features on the ranking list are passed to the second stage. In the second stage, PSO performs the feature selection task in combination with SVM, with the F-score included as a part of the PSO objective function. The feature subsets are filtered according to the ranking list from the first stage, and the results are supplied to the initialization of the GA. Both the SVM parameter optimization and the feature selection are dynamically executed by PSO. In the third stage, the GA initializes its population from the results of the second stage, and an optimal feature selection result is obtained by the GA integrated with SVM; both the SVM parameter optimization and the feature selection are dynamically performed by the GA. The performance of the proposed method was compared with that of PSO-based, GA-based, ant colony optimization (ACO)-based, and simulated annealing (SA)-based methods on five benchmark data sets: leukemia, colon, breast cancer, lung carcinoma, and brain cancer. The numerical results and statistical analysis show that the proposed approach is capable of selecting a subset of predictive genes from a large noisy data set, and can capture the correlated structure in the data. In addition, NHF performs significantly better than th...
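The first stage of the pipeline above (rank features by information gain, keep only the top fraction) can be sketched in pure Python. This is a minimal illustration, not the paper's implementation: the function names, the toy data, and the discrete-valued features are assumptions for the example; real microarray expression values would need discretization first.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(feature) = H(labels) - H(labels | feature), for a discrete feature."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional

def rank_top_fraction(data, labels, fraction=0.10):
    """Rank features by information gain and keep the top fraction,
    mirroring stage one (only the top 10% reach stage two)."""
    n_features = len(data[0])
    scores = [
        (j, information_gain([row[j] for row in data], labels))
        for j in range(n_features)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    keep = max(1, int(n_features * fraction))
    return [j for j, _ in scores[:keep]]

# toy example: feature 0 perfectly separates the classes, feature 1 is noise
data = [[0, 1], [0, 0], [1, 1], [1, 0]]
labels = [0, 0, 1, 1]
print(rank_top_fraction(data, labels, fraction=0.5))  # → [0]
```

The surviving indices would then seed the PSO/SVM search of stage two.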
The current GPM algorithm needs many iterations to obtain good process models with high fitness, which makes it time-consuming, and sometimes the result cannot be accepted. To mine higher-quality models in a shorter time, a heuristic solution is put forward that adds a log-replay-based crossover operator and a direct/indirect dependency-relation-based mutation operator. Experimental results on 25 benchmark logs are encouraging.
Loss assessment is an important operation in the claims process of the insurance industry. With the growing trend of making insurance information systems provide in-depth support for optimizing operations and serving the insured, a methodological framework for loss assessment is given based on SOA technology. Under this framework, the operation process design, the client design, the service design, and the database design are presented. These design results have been validated in an actual application system.
The core idea of a clustering algorithm is the division of data into groups of similar objects. Some clustering algorithms, such as k-means and UPGMA, have shown good performance on document clustering. However, few ...
Image annotation is a challenging problem due to the rapid growth of real-world image archives. In this paper, we propose a novel approach to solving this problem based on a variant of the support vector clustering (SVC) algorithm, i.e., the support vector description of clusters. The system has two major components: the training process and the annotating process. In the training process, clusters of images manually annotated with descriptive words are used as training instances, and each cluster is described by a one-cluster SVC model. The proposed model exploits the ability of SVC to delineate cluster boundaries of arbitrary shape. Moreover, the training of the one-cluster SVC model is formulated as building a density estimator for the underlying distribution of the cluster. In the annotating process, for a test image, the probability of the instance being generated by each model is computed, and the relevant words are then selected based on the obtained probabilities. Experiments were conducted on the Corel60k data set. The results demonstrate the performance of the proposed algorithm compared with that of other algorithms.
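The annotating process above (score a test image under each cluster's density model, then pick the words of the best-scoring clusters) can be illustrated with a minimal sketch. Note the substitution: the paper builds a one-cluster SVC model per word, whereas this sketch uses a simple Gaussian kernel density estimate as a stand-in density model; the function names, toy feature vectors, and word clusters are all assumptions for the example.

```python
import math

def density_score(point, cluster, bandwidth=1.0):
    """Average Gaussian kernel density of `point` under a cluster's training
    images (each image represented as a feature vector). A stand-in for the
    paper's one-cluster SVC density estimator."""
    def kernel(x, center):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
        return math.exp(-sq_dist / (2 * bandwidth ** 2))
    return sum(kernel(point, c) for c in cluster) / len(cluster)

def annotate(point, clusters_by_word, top_k=1):
    """Return the top_k words whose cluster models assign the test image the
    highest density, mirroring the annotating process."""
    ranked = sorted(
        clusters_by_word.items(),
        key=lambda item: density_score(point, item[1]),
        reverse=True,
    )
    return [word for word, _ in ranked[:top_k]]

# toy clusters: "sky" images near (0, 0), "grass" images near (5, 5)
clusters = {
    "sky": [(0.0, 0.1), (0.2, 0.0), (-0.1, 0.1)],
    "grass": [(5.0, 5.1), (4.9, 5.0), (5.2, 4.8)],
}
print(annotate((0.1, 0.0), clusters))  # → ['sky']
```

Replacing `density_score` with the probability output of a trained one-cluster model per word would recover the structure described in the abstract.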