Microarray data are highly redundant and noisy, and most genes are believed to be uninformative with respect to the studied classes, as only a fraction of genes may present distinct profiles for different classes of samples. This paper proposes a novel hybrid framework (NHF) for the classification of high-dimensional microarray data, which combines information gain (IG), F-score, genetic algorithm (GA), particle swarm optimization (PSO) and support vector machines (SVM). To identify a subset of informative genes in a large dataset contaminated with high-dimensional noise, the proposed method is divided into three stages. In the first stage, IG is used to construct a ranking list of features, and only 10% of the features in the ranking list are passed to the second stage. In the second stage, PSO performs feature selection in combination with SVM, with the F-score forming part of the PSO objective function. The feature subsets are filtered according to the ranking list from the first stage, and the results are used to initialize the GA. Both SVM parameter optimization and feature selection are performed dynamically by PSO. In the third stage, GA initializes the individuals of its population from the results of the second stage, and an optimal feature selection result is obtained using GA integrated with SVM. Both SVM parameter optimization and feature selection are performed dynamically by GA. The performance of the proposed method was compared with that of PSO-based, GA-based, ant colony optimization (ACO)-based and simulated annealing (SA)-based methods on five benchmark data sets: leukemia, colon, breast cancer, lung carcinoma and brain cancer. The numerical results and statistical analysis show that the proposed approach is capable of selecting a subset of predictive genes from a large noisy data set and can capture the correlated structure in the data. In addition, NHF performs significantly better than th
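The first stage described in this abstract (rank features by information gain, keep 10% of the ranking list) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and binarizing each continuous feature at its median before computing information gain is an assumption made only to keep the example short.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature with respect to the class labels.

    Assumption: the continuous feature is split into two bins at its
    (upper) median; the abstract does not say how features are discretized.
    """
    threshold = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def top_fraction_by_ig(X, y, fraction=0.10):
    """Return indices of the highest-ranked features, best first."""
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y) for j in range(n_features)]
    ranking = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    keep = max(1, int(n_features * fraction))  # always keep at least one feature
    return ranking[:keep]
```

In the framework described above, the indices returned by such a filter would define the reduced search space handed to the PSO stage.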
The current GPM algorithm needs many iterations to obtain process models with high fitness, which makes it time-consuming, and the result is sometimes unacceptable. To mine higher-quality models in less time, a heuristic solution is put forward that adds a log-replay-based crossover operator and a mutation operator based on direct/indirect dependency relations. Experiments on 25 benchmark logs show encouraging results.
Loss assessment is an important operation in the claims process of the insurance industry. Amid the growing trend of making insurance information systems provide in-depth support for optimizing operations and serving the insured, a methodological framework for loss assessment based on SOA technology is presented. Under this framework, the operation process design, client design, service design and database design are given. These designs have been validated in an actual application system.
Standard pattern classifiers operate on all data features. However, some of the features are redundant or irrelevant, which reduces prediction accuracy and increases the running time of the classifier. The purpose of this stud...
Most previous work on combinational equivalence checking uses BDDs and other Boolean-level representations to formulate and solve the problem, and therefore does not utilize the word-level information inherently present...
Based on an analysis of two improved YCH schemes and a multi-secret sharing scheme based on homogeneous linear recursion, we propose and implement a new verifiable multi-secret sharing model based on Shamir secret sharing. The time complexity of this model in the secret recovery phase is O(k×t²), which is superior to that of the two improved YCH models (O(t³) for t>k or O(k³) for t≤k, and O(k×(n+k)²)), and the secret synthesis time in the actual simulation is less than that of the other three models. Further, we compare the advantages and disadvantages of the four models in terms of time complexity, verifiability and open values. When n>k, the new model needs fewer open values than the two improved YCH models. The experimental results show that the new model outperforms the other three models in secret recovery time.
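For readers unfamiliar with the building block named in this abstract, the following is a minimal sketch of plain Shamir (t, n) secret sharing over a prime field. It does not implement the verifiable multi-secret model itself; the field modulus, function names and share layout are assumptions chosen for illustration.

```python
import secrets

# Assumption: a fixed Mersenne prime as the field modulus; any prime larger
# than both the secret and the number of shares would do.
PRIME = 2**127 - 1


def make_shares(secret, t, n):
    """Split `secret` into n shares, any t of which recover it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover_secret(shares):
    """Lagrange interpolation at x = 0 using the supplied shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Recovering one secret from t shares in this way takes on the order of t² field multiplications, which gives a sense of where a per-secret t² term can come from; how the proposed model actually arranges its recovery phase is not described in the abstract.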
Nowadays, e-mail is one of the most inexpensive and expeditious means of communication. However, a principal problem for any internet user is the increasing volume of spam, so an efficient spam filtering method is imperative. Feature selection is one of the most important factors influencing classification accuracy. To improve the performance of spam prediction, this paper proposes a new fuzzy adaptive multi-population parallel genetic algorithm (FAMGA) for feature selection. A few studies have reported multi-swarm strategies for maintaining population diversity, but dynamic parameter setting has not been considered further. The proposed method is based on multiple subpopulations, each of which runs in an independent memory space. To control the subpopulations adaptively, we put forward two regulation strategies, namely population adjustment and subpopulation adjustment. In subpopulation adjustment, a controller is designed to adjust the crossover rate of each subpopulation, and in population adjustment, a controller is designed to adjust the size of each subpopulation. Three publicly available benchmark corpora for spam filtering, PU1, Ling-Spam and SpamAssassin, are used in our experiments. The experimental results show that the proposed method improves the performance of spam filtering and is significantly better than other feature selection methods. This demonstrates that the multi-population regulation strategy can find the optimal feature subset and prevent premature convergence of the population.
A trusted cryptography module (TCM) is introduced into vehicular electronic systems to solve security issues of mobile terminals in the Internet of Vehicles, such as authentication and integrity assessment. TCM has many security features, including integrity measurement, integrity reporting and trusted storage, which enable a vehicular electronic system to achieve a trusted boot process and to encrypt data and information for storage protection, thereby improving the privacy and security of vehicular platform information. Referring to the TNC (trusted network connect) architecture, we design a TNC-compatible vehicular terminal access model that authenticates and authorizes a trusted vehicular platform equipped with TCM to access the Internet of Vehicles. The proposed vehicular terminal access model will improve the credibility and security of the Internet of Vehicles.
A new method for simulating the folding pathway of RNA secondary structure using the modified ant colony algorithm is [...] a given RNA sequence, the set of all possible stems is obtained and the energy of each stem is calculated and stored at the initial [...], a more realistic formula is used to compute the energy of multi-branch loop in the following [...] a folding pathway is simulated, including such processes as construction of the heuristic information, the rule of initializing the pheromone, the mechanism of choosing the initial and next stem and the strategy of updating the pheromone between two different [...] by testing RNA sequences with known secondary structures from the public databases, we analyze the experimental data to select appropriate values [...] measure indexes show that our procedure is more consistent with phylogenetically proven structures than software RNAstructure sometimes and more effective than the standard Genetic Algorithm.
To address the low accuracy and imprecision of the contour detection evaluation algorithm for natural images with ground truth, we propose a neighborhood matching algorithm, combine it with the original algorithm, and then modify the accuracy formula to obtain the contour detection evaluation algorithm of this paper. In an assessment of contour images produced by the Canny operator, the proposed algorithm improves the accuracy value of the contour detection assessment compared with the original algorithm.
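The abstract does not spell out the neighborhood matching rule, so the following is only a sketch of one common reading: a detected contour pixel counts as correct if some ground-truth pixel lies within a small square neighborhood of it, instead of requiring exact coincidence. The function name, the `radius` parameter and the use of precision, recall and F-measure as scores are all assumptions, and the sketch does not enforce one-to-one pixel matching.

```python
def neighborhood_scores(detected, truth, radius=1):
    """Score detected contour pixels against ground truth with a tolerance.

    `detected` and `truth` are collections of (row, col) pixel coordinates.
    A pixel is matched if any pixel of the other set lies within `radius`
    in both row and column (a square neighborhood; this is an assumption).
    Returns (precision, recall, f_measure).
    """

    def has_neighbor(p, points):
        return any(
            abs(p[0] - q[0]) <= radius and abs(p[1] - q[1]) <= radius
            for q in points
        )

    matched_detected = sum(has_neighbor(p, truth) for p in detected)
    matched_truth = sum(has_neighbor(q, detected) for q in truth)

    precision = matched_detected / len(detected) if detected else 0.0
    recall = matched_truth / len(truth) if truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

With `radius=0` this reduces to exact pixel matching, which makes it easy to see how a one-pixel localization offset in an edge map (for example from a Canny detector) is penalized far less under the neighborhood rule.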