Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. This paper presents a promising method for detecting the domain structure of a protein from sequence information alone. The method analyzes multiple sequence alignments derived from a database search. Several measures are defined to quantify the domain information content of each position along the sequence, and these are combined into a single predictor using a support vector machine (SVM). More importantly, domain detection is first treated as an imbalanced data learning problem. A novel undersampling method is proposed, based on distance-based maximal entropy in the SVM feature space. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems on general imbalanced datasets.
Full-text indices are data structures that can be used to find any substring of a given string. Many full-text indices require more space than the original string. In this paper, we introduce the canonical Huffman code to the wavelet tree of a string T[1..n]. Compared with a Huffman-code-based wavelet tree, no memory is needed to represent the shape of the wavelet tree. For a large alphabet, this part of the memory is not negligible. The wavelet tree operations are also simpler and more efficient thanks to the canonical Huffman code. Based on the resulting structure, the multi-key rank and select functions can be performed using at most nH0 + |Σ|(lg lg n + lg n lg|Σ|) + O(nH0) bits and in O(H0) time in the average case, where H0 is the zeroth-order empirical entropy of T. Finally, we present an efficient construction algorithm for this index, which is on-line and linear.
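The reason the tree shape need not be stored is that a canonical Huffman code is fully determined by the code length of each symbol. The sketch below illustrates the standard canonical assignment only, not the paper's index; the function name `canonical_codes` and the tie-breaking by symbol order are assumptions made for this example.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codewords from code lengths alone.

    `lengths` maps each symbol to its code length in bits. The lengths are
    assumed to come from a valid Huffman construction (hypothetical input).
    """
    # Sort by (length, symbol): shorter codes first, ties broken by symbol.
    items = sorted(lengths.items(), key=lambda kv: (kv[1], kv[0]))
    codes = {}
    code = 0
    prev_len = items[0][1] if items else 0
    for i, (sym, length) in enumerate(items):
        if i > 0:
            # Next codeword: increment, then pad with zeros if the length grows.
            code = (code + 1) << (length - prev_len)
        codes[sym] = format(code, "0{}b".format(length))
        prev_len = length
    return codes


if __name__ == "__main__":
    print(canonical_codes({"a": 1, "b": 2, "c": 3, "d": 3}))
```

Because the codewords follow from the lengths, a decoder or a wavelet tree built on them only needs the per-symbol lengths, which is where the space saving for large alphabets comes from.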
To obtain the optimal partition of a data set, a hybrid clustering algorithm based on particle swarm optimization (PSO), called PKPSO, is proposed. PKPSO integrates the PSO algorithm with the K-means algorithm: selected candidate solutions in the population are further optimized by K-means to improve accuracy. Criteria for selecting the control parameters are determined by analyzing the algorithm. The partitional clustering results of PKPSO are compared with those of PSO and of K-means, and the results show that the global convergence of PKPSO is better than that of the other algorithms. PKPSO not only overcomes the tendency of K-means to become trapped in local minima, but also achieves better solution precision and algorithm stability than the other two algorithms.
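The refinement step described above, in which a candidate solution is polished by K-means, can be sketched as follows. This is a minimal illustration under assumptions: a candidate is taken to be a list of centroids, the function name `kmeans_refine` is hypothetical, and the PSO update and the rule for selecting candidates are not shown because the abstract does not specify them.

```python
def kmeans_refine(points, centroids, iters=10):
    """Refine one candidate set of centroids with plain K-means iterations."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(c)  # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(kmeans_refine(data, [(1, 1), (9, 1)]))
```

In a hybrid scheme of this kind, the refined centroids would be written back into the particle before the next swarm update.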
A clonal selection based memetic algorithm is proposed for solving job shop scheduling problems in this paper. In the proposed algorithm, the clonal selection and the local search mechanism are designed to enhance exploration and exploitation. In the clonal selection mechanism, clonal selection, hypermutation and receptor edit theories are presented to construct an evolutionary searching mechanism which is used for exploration. In the local search mechanism, a simulated annealing local search algorithm based on Nowicki and Smutnicki's neighborhood is presented to exploit local optima. The proposed algorithm is examined using some well-known benchmark problems. Numerical results validate the effectiveness of the proposed algorithm.
ISBN: (print) 9781424440085
In this paper, we apply the least-squares support vector machine (LS-SVM) to operon prediction of Escherichia coli (***), with different combinations of intergenic distance, gene expression data, and phylogenetic profile. Experimental results demonstrate that WO pairs tend to have shorter intergenic distances, higher correlation coefficients, and a much stronger co-evolution relationship between phylogenetic profiles. We also processed the data sets extracted from WOs and TUBs: the intergenic distances were processed with log-energy entropy, the Pearson correlation coefficients of the expression data of two genes were de-noised with a wavelet transform, and the Hamming distances between two phylogenetic profiles were computed. We then trained the LS-SVM on part of the data and tested the trained classifier on the remaining data. The results show that different combinations of features affect the prediction results. When the combination of intergenic distance, gene expression data, and phylogenetic profile is used as the input of an LS-SVM with a linear kernel, good results are obtained, with accuracy, sensitivity, and specificity of 92.34%, 93.54%, and 90.73%, respectively.
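Two of the pairwise features named above have simple textbook definitions, sketched here for a single gene pair. The function names and the toy inputs are assumptions for illustration; the log-energy entropy and wavelet de-noising steps are omitted because the abstract gives no parameters for them.

```python
import math


def hamming_distance(profile_a, profile_b):
    """Count positions where two equal-length phylogenetic profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def pearson(x, y):
    """Pearson correlation coefficient of two expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy presence/absence profiles and expression levels (made-up values).
    print(hamming_distance("1101", "1001"))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A small Hamming distance and a high correlation are the pattern the abstract associates with WO pairs.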
In this paper, an efficient watermarking algorithm for copyright protection is proposed. The watermarked image is obtained by embedding an orthogonal vector into the wavelet-tree structure of the host image. The human visual system is also taken into account to preserve perceptual quality. We design an elaborate function for the blind watermarking scheme that dynamically determines the embedding position. Theory and experimental results show that our method survives image processing operations, added noise, lossy JPEG compression, and image cropping. In particular, the scheme is robust to image sharpening and image enhancement.
Focused on a variation of the Euclidean traveling salesman problem (TSP), namely the prize-collecting traveling salesman problem with time windows (PCTSPTW), this paper presents a novel ant colony optimization solving method. The time window constraints are considered in the computation for the probability of selection of the next city. The parameters of the algorithm are analyzed by experiments. Numerical results also show that the proposed method is effective for the PCTSPTW problem.
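One common way to bring time windows into the next-city selection rule is to give zero weight to any city that cannot be reached before its window closes. The sketch below shows that idea only; it is not necessarily the rule used in the paper. The field names (`pheromone`, `travel_time`, `due`) and the exponents `alpha` and `beta` are assumptions, and travel times are assumed to be positive.

```python
def selection_probabilities(current_time, candidates, alpha=1.0, beta=2.0):
    """Return the probability of moving to each candidate city.

    Each candidate is a dict with hypothetical keys:
      pheromone   - pheromone level on the edge to that city
      travel_time - positive travel time from the current city
      due         - latest allowed arrival time (end of the time window)
    """
    weights = []
    for city in candidates:
        arrival = current_time + city["travel_time"]
        if arrival > city["due"]:
            weights.append(0.0)  # window already missed: never selected
        else:
            heuristic = 1.0 / city["travel_time"]
            weights.append(city["pheromone"] ** alpha * heuristic ** beta)
    total = sum(weights)
    if total == 0:
        return [0.0] * len(weights)  # no feasible city remains
    return [w / total for w in weights]


if __name__ == "__main__":
    cities = [
        {"pheromone": 1.0, "travel_time": 1.0, "due": 5.0},
        {"pheromone": 1.0, "travel_time": 2.0, "due": 1.0},
        {"pheromone": 1.0, "travel_time": 2.0, "due": 10.0},
    ]
    print(selection_probabilities(0.0, cities))
```

Since the problem is prize-collecting, a fuller version would also weigh each city's prize in the heuristic term; that is left out here because the abstract does not say how prizes enter the rule.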
The E-commerce information on the Surface Web is supported by the Deep Web, which cannot be accessed directly by search engines or web crawlers. The only way to access the backend database is through query in...
The flowshop scheduling problem has been widely studied in the literature and many techniques have been applied to it, but few algorithms based on particle swarm optimization (PSO) have been proposed to solve it. In this paper, an improved PSO algorithm (IPSO) based on the "all different" constraint is proposed to solve the flowshop scheduling problem with the objective of minimizing the makespan. It effectively combines the particle swarm optimization algorithm with genetic operators. When a particle is about to stagnate, the mutation operator is used to search its neighborhood. The proposed algorithm is tested on benchmarks of different scales and compared with recently proposed efficient algorithms. The results show that both the solution quality and the convergence speed of the IPSO algorithm surpass those of the two recently proposed algorithms. It can be used to solve large-scale flowshop scheduling problems effectively.
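The objective being minimized, the makespan, can be evaluated with the standard completion-time recurrence. The sketch assumes a permutation flowshop (every machine processes the jobs in the same order, which is what an "all different" constraint on job positions suggests); the function name and the toy processing times are made up for illustration.

```python
def makespan(permutation, proc_times):
    """Makespan of a job permutation in a permutation flowshop.

    proc_times[j][m] is the processing time of job j on machine m;
    every job visits machines 0, 1, ..., M-1 in that order.
    """
    n_machines = len(proc_times[0])
    finish = [0] * n_machines  # completion time of the latest job per machine
    for job in permutation:
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the
            # job has finished on the previous machine.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc_times[job][m]
    return finish[-1]


if __name__ == "__main__":
    times = [[3, 2], [1, 4]]  # two jobs, two machines (toy data)
    print(makespan([0, 1], times))
    print(makespan([1, 0], times))
```

A PSO particle that encodes a job permutation would be scored by calling a function like this once per evaluation.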