Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. In this paper, a promising method for detecting the domain structure of a protein from sequence information alone is presented. The method is based on analyzing multiple sequence alignments derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence, and they are then combined into a single predictor using a support vector machine (SVM). More importantly, domain detection is first treated as an imbalanced data learning problem. A novel undersampling method based on distance-based maximal entropy in the SVM feature space is proposed. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems that deal with imbalanced datasets in general.
Full-text indices are data structures that can be used to find any substring of a given string. Many full-text indices require more space than the original string. In this paper, we introduce the canonical Huffman code to the wavelet tree of a string T[1..n]. Compared with a Huffman-code-based wavelet tree, no memory is needed to represent the shape of the wavelet tree. For a large alphabet, this part of the memory is not negligible. The wavelet tree operations are also simpler and more efficient because of the canonical Huffman code. Based on the resulting structure, the multi-key rank and select functions can be performed using at most nH0 + |Σ|(lg lg n + lg n lg|Σ|) + O(nH0) bits and in O(H0) time in the average case, where H0 is the zeroth-order empirical entropy of T. Finally, we present an efficient construction algorithm for this index, which is on-line and runs in linear time.
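The property the abstract relies on is that a canonical Huffman code is fully determined by the code length of each symbol, so the tree shape does not have to be stored. The following is a minimal sketch of that idea only, not the paper's index or construction algorithm; the function names `huffman_code_lengths` and `canonical_codes` are hypothetical, and ties are broken by symbol order as an assumption.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length} for a Huffman code built from symbol counts."""
    counts = Counter(text)
    if len(counts) == 1:
        # A single-symbol alphabet still needs one bit per symbol.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, syms1 + syms2))
        tie += 1
    return lengths


def canonical_codes(lengths):
    """Assign canonical codes: sort by (length, symbol) and count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        # Moving to a longer length appends zeros to the running code value.
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


lengths = huffman_code_lengths("abracadabra")
codes = canonical_codes(lengths)
```

Because the codes can be regenerated from the lengths alone, a decoder or a wavelet tree built on them only needs the per-symbol lengths, which is where the space saving for large alphabets would come from.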
To obtain the optimal partition of a data set, a hybrid clustering algorithm based on PSO, named PKPSO, is proposed. In PKPSO, the PSO algorithm is integrated with the K-means algorithm: selected candidate solutions in the population are further optimized by K-means to improve accuracy. By analyzing the algorithm, criteria for selecting the control parameters are determined. The partitional clustering results of PKPSO are compared with those of PSO and of K-means alone, and the results show that the global convergence of PKPSO is better than that of the other algorithms. PKPSO not only overcomes the tendency of K-means to become trapped in local minima, but also achieves better solution precision and stability than the other two algorithms.
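The refinement step described above, where selected candidate solutions are further optimized by K-means, can be sketched as follows. This is an illustration under stated assumptions, not the PKPSO algorithm itself: each particle is assumed to be a list of centroid tuples, the "selected" candidates are assumed to be the ones with the lowest sum of squared errors, the PSO velocity and position updates are omitted, and all function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans_step(points, centroids):
    """One K-means iteration: assign points, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        clusters[nearest].append(p)
    new_centroids = []
    for old, members in zip(centroids, clusters):
        if members:
            dim = len(old)
            new_centroids.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            )
        else:
            # Keep the centroid of an empty cluster where it was.
            new_centroids.append(old)
    return new_centroids


def refine_best_particles(points, swarm, n_refine, n_steps=5):
    """Refine the n_refine lowest-SSE particles with a few K-means steps."""
    ranked = sorted(swarm, key=lambda particle: sse(points, particle))
    refined = []
    for particle in ranked[:n_refine]:
        for _ in range(n_steps):
            particle = kmeans_step(points, particle)
        refined.append(particle)
    # Unrefined particles are returned unchanged, after the refined ones.
    return refined + ranked[n_refine:]
```

A K-means step never increases the SSE of a particle, so refining a candidate in this way cannot make it worse; the surrounding PSO search would be what supplies the global exploration.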
An efficient collision detection method based on separating bounding volume (SBV) is proposed. The positions and shapes of SBVs are determined by the optimal separating support hyper planes of two objects. SBVs not on...
In this paper, a clonal-selection-based memetic algorithm is proposed for solving job shop scheduling problems. In the proposed algorithm, a clonal selection mechanism and a local search mechanism are designed to enhance exploration and exploitation, respectively. In the clonal selection mechanism, the theories of clonal selection, hypermutation and receptor editing are used to construct an evolutionary search mechanism for exploration. In the local search mechanism, a simulated annealing local search algorithm based on Nowicki and Smutnicki's neighborhood is used to exploit local optima. The proposed algorithm is evaluated on several well-known benchmark problems. Numerical results validate the effectiveness of the proposed algorithm.
The distributed constraint optimization problem (DCOP) is a kind of optimization problem oriented to large-scale, open and dynamic network environments, and it has been widely applied in fields such as computational grids, multimedia networks, e-business and enterprise resource planning. Besides the features it shares with traditional optimization problems, such as non-linearity and constraint satisfaction, DCOP has distinct features including dynamic evolution, regional information, localized control and asynchronous updating of network states. Finding a large-scale, parallel and intelligent solution for DCOP has become a challenge for computer scientists. Many methods have been proposed for solving this problem. However, most of them are not completely decentralized and require prior knowledge, such as the global structure of the network, as input. Unfortunately, for many applications the assumption that a global view of the network can be obtained beforehand does not hold, because of the network's huge size, geographical distribution or decentralized control. To solve this problem, a divide-and-conquer algorithm based on self-organizing behavior is presented, in which multiple autonomous agents distributed in the network work together to solve the DCOP through a proposed self-organization mechanism. Compared with existing methods, this algorithm is completely decentralized and demonstrates good performance in both efficiency and effectiveness.
Reinforcement learning obtains an optimal policy through trial and error and interaction with a dynamic environment. Its properties of self-improvement and online learning make reinforcement learning one of the most important machine learning methods. To address the "curse of dimensionality" that has long troubled reinforcement learning, a heuristic contour-list method is proposed on the basis of relational reinforcement learning. The method represents states, actions and Q-functions using first-order predicates with contour lists, so that the advantages of Prolog lists can be fully exploited. The method combines logical predicate rules with reinforcement learning. A new logical reinforcement learning algorithm, CCLORRL, is formed and its convergence is proved. The method uses contour shape predicates to build shape state tables, which drastically reduces the state space; using heuristic rules to guide the choice of action reduces blind choices when the sample does not exist in the state space. The CCLORRL algorithm is applied to the game of Tetris. Experiments show that the method is more efficient.
ISBN: (print) 9781424440085
In this paper, we apply the least-squares support vector machine (LS-SVM) to operon prediction in Escherichia coli (***), using different combinations of intergenic distance, gene expression data, and phylogenetic profile. Experimental results demonstrate that WO pairs tend to have shorter intergenic distances, higher correlation coefficients and much stronger co-evolution relationships between their phylogenetic profiles. We also processed the data sets extracted from WOs and TUBs: the intergenic distances were processed with log-energy entropy, the Pearson correlation coefficients of the expression data of two genes were de-noised with a wavelet transform, and the Hamming distances between two phylogenetic profiles were computed. We then trained the LS-SVM on part of the data sets and tested the trained classifier on the remaining data. The results show that different combinations of features affect the prediction results. When the combination of intergenic distance, gene expression data and phylogenetic profile is taken as the input of an LS-SVM with a linear kernel, good results are obtained, with accuracy, sensitivity and specificity of 92.34%, 93.54% and 90.73%, respectively.
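Two of the pairwise features named above, the Pearson correlation of expression data and the Hamming distance between phylogenetic profiles, can be sketched as follows, together with a raw intergenic distance. This is a minimal illustration only: the log-energy entropy and wavelet de-noising steps are omitted, the gene record fields (`start`, `end`, `expression`, `profile`) are hypothetical, and the two genes are assumed to be adjacent on the same strand with `gene_a` upstream.

```python
import math


def hamming_distance(profile_a, profile_b):
    """Number of positions at which two equal-length profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def pair_features(gene_a, gene_b):
    """Feature vector for an adjacent gene pair (hypothetical record layout)."""
    # Assumption: gene_a lies upstream of gene_b on the same strand.
    intergenic = gene_b["start"] - gene_a["end"]
    correlation = pearson(gene_a["expression"], gene_b["expression"])
    profile_dist = hamming_distance(gene_a["profile"], gene_b["profile"])
    return (intergenic, correlation, profile_dist)
```

A vector like this, one per adjacent gene pair, is the kind of input a classifier such as an LS-SVM would be trained on; which features to include is exactly the combination the abstract varies.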
In this paper, an efficient watermarking algorithm for copyright protection is proposed. The watermarked image is obtained by embedding an orthogonal vector into the wavelet-tree structure of the host image. The human visual system is also taken into account to obtain perceptually acceptable results. We design an elaborate function for the blind watermarking scheme that dynamically determines the embedding position. Theoretical analysis and experimental results show that our method survives image processing operations, added noise, JPEG lossy compression and image cropping. In particular, the scheme is robust to image sharpening and image enhancement.