The study of codon usage bias is an important research area that contributes to our understanding of molecular evolution, phylogenetic relationships, respiratory lifestyle, and other characteristics. Translational eff...
详细信息
The study of codon usage bias is an important research area that contributes to our understanding of molecular evolution, phylogenetic relationships, respiratory lifestyle, and other characteristics. Translational efficiency bias is perhaps the most well-studied codon usage bias, as it is frequently utilized to predict relative protein expression levels. We present a novel approach to isolating translational efficiency bias in microbial genomes. There are several existent methods for isolating translational efficiency bias. Previous approaches are susceptible to the confounding influences of other potentially dominant biases. Additionally, existing approaches to identifying translational efficiency bias generally require both genomic sequence information and prior knowledge of a set of highly expressed genes. This novel approach provides more accurate results from sequence information alone by resisting the confounding effects of other biases. We validate this increase in accuracy in isolating translational efficiency bias on 10 microbial genomes, five of which have proven particularly difficult for existing approaches due to the presence of strong confounding biases.
Metaheuristic search techniques have been extensively used to automate the process of generating test cases, and thus providing solutions for a more cost-effective testing process. This approach to test automation, of...
详细信息
Metaheuristic search techniques have been extensively used to automate the process of generating test cases, and thus providing solutions for a more cost-effective testing process. This approach to test automation, often coined "Search-based Software Testing" (SBST), has been used for a wide variety of test case generation purposes. Since SBST techniques are heuristic by nature, they must be empirically investigated in terms of how costly and effective they are at reaching their test objectives and whether they scale up to realistic development artifacts. However, approaches to empirically study SBST techniques have shown wide variation in the literature. This paper presents the results of a systematic, comprehensive review that aims at characterizing how empirical studies have been designed to investigate SBST cost-effectiveness and what empirical evidence is available in the literature regarding SBST cost-effectiveness and scalability. We also provide a framework that drives the data collection process of this systematic review and can be the starting point of guidelines on how SBST techniques can be empirically assessed. The intent is to aid future researchers doing empirical studies in SBST by providing an unbiased view of the body of empirical evidence and by guiding them in performing well-designed and executed empirical studies.
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data...
详细信息
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
Random testing is a low cost strategy that can be applied to a wide range of testing problems. While the cost and straightforward application of random testing are appealing, these benefits must be evaluated against t...
详细信息
Random testing is a low cost strategy that can be applied to a wide range of testing problems. While the cost and straightforward application of random testing are appealing, these benefits must be evaluated against the reduced effectiveness due to the generality of the approach. Recently, a number of novel techniques, coined Adaptive Random Testing, have sought to increase the effectiveness of random testing by attempting to maximize the testing coverage of the input domain. This paper presents the novel application of an evolutionary search algorithm to this problem. The results of an extensive simulation study are presented in which the evolutionary approach is compared against the Fixed Size Candidate Set (FSCS), Restricted Random Testing (RRT), quasi-random testing using the Sobol sequence (Sobol), and random testing (RT) methods. The evolutionary approach was found to be superior to FSCS, RRT, Sobol, and RT amongst block patterns, the arena in which FSCS, and RRT have demonstrated the most appreciable gains in testing effectiveness. The results among fault patterns with increased complexity were shown to be similar to those of FSCS, and RRT;and showed a modest improvement over Sobol, and RT. A comparison of the asymptotic and empirical runtimes of the evolutionary search algorithm, and the other testing approaches, was also considered, providing further evidence that the application of an evolutionary search algorithm is feasible, and within the same order of time complexity as the other adaptive random testing approaches.
UCS is a s (u) under bar pervised learning (c) under bar lassifier (s) under bar ystem that was introduced in 2003 for classification in data mining tasks. The representation of a rule in UCS as a univariate classific...
详细信息
UCS is a s (u) under bar pervised learning (c) under bar lassifier (s) under bar ystem that was introduced in 2003 for classification in data mining tasks. The representation of a rule in UCS as a univariate classification rule is straightforward for a human to understand. However, the system may require a large number of rules to cover the input space. Artificial neural networks ( NNs), on the other hand, normally provide a more compact representation. However, it is not a straightforward task to understand the network. In this paper, we propose a novel way to incorporate NNs into UCS. The approach offers a good compromise between compactness, expressiveness, and accuracy. By using a simple artificial NN as the classifier's action, we obtain a more compact population size, better generalization, and the same or better accuracy while maintaining a reasonable level of expressiveness. We also apply negative correlation learning ( NCL) during the training of the resultant NN ensemble. NCL is shown to improve the generalization of the ensemble.
The complexity of the selection procedure of a genetic algorithm that requires reordering, if we restrict the class of the possible fitness functions to varying fitness functions, is O(N log N), where N is the size of...
详细信息
The complexity of the selection procedure of a genetic algorithm that requires reordering, if we restrict the class of the possible fitness functions to varying fitness functions, is O(N log N), where N is the size of the population. The quantum genetic optimization algorithm (QGOA) exploits the power of quantum computation in order to speed up genetic procedures. In QGOA, the classical fitness evaluation and selection procedures are replaced by a single quantum procedure. While the quantum and classical geneticalgorithms use the same number of generations, the QGOA requires fewer operations to identify the high-fitness subpopulation at each generation. We show that the complexity of our QGOA is o(1) in terms of number of oracle calls in the selection procedure. Such theoretical results are confirmed by the simulations of the algorithm.
The cDNA microarray is an important tool for generating large data sets of gene expression measurements. An efficient design is critical to ensure that the experiment will be able to address relevant biological questi...
详细信息
The cDNA microarray is an important tool for generating large data sets of gene expression measurements. An efficient design is critical to ensure that the experiment will be able to address relevant biological questions. Microarray experimental design can be treated as a multicriterion optimization problem. For this class of problems, evolutionaryalgorithms (EAs) are well suited, as they can search the solution space and evolve a design that optimizes the parameters of interest based on their relative value to the researcher under a given set of constraints. This paper introduces the use of EAs for optimization of experimental designs of spotted microarrays using a weighted objective function. The EA and the various criteria relevant to design optimization are discussed. Evolved designs are compared with designs obtained through exhaustive search with results suggesting that the EA can find just as efficient optimal or near-optimal designs within a tractable timeframe.
Providing efficient workload management is an important issue for a large-scale heterogeneous distributed computing environment where a set of periodic applications is executed. The considered shipboard distributed sy...
详细信息
Providing efficient workload management is an important issue for a large-scale heterogeneous distributed computing environment where a set of periodic applications is executed. The considered shipboard distributed system is expected to operate in an environment where the input workload is likely to change unpredictably, possibly invalidating a resource allocation that was based on the initial workload estimate. The tasks consist of multiple strings, each made up of an ordered sequence of applications. There is a quality of service (QoS) minimum throughput constraint that must be satisfied for each application in a string, and a maximum utilization constraint that must be satisfied on each of the hardware resources in the system. The challenge, therefore, is to efficiently and robustly manage both computation and communication resources in this unpredictable environment to achieve high performance while satisfying the imposed constraints. This work addresses the problem of finding a robust initial allocation of resources to strings of applications that is able to absorb some level of unknown input workload increase without rescheduling. The proposed hybrid two-stage method of finding a near-optimal allocation of resources incorporates two specially designed mapping techniques: (1) the Permutation Space Genitor-Based heuristic, and (2) the follow-up Branch-and-Bound heuristic based on an Integer Linear Programming (ILP) problem formulation. The performance of the proposed resource allocation method is evaluated under different simulation scenarios and compared to an iteratively computed upper bound. (C) 2007 Elsevier Inc. All rights reserved.
Computational Intelligence (CI) is an umbrella term for modern problem solvers such as evolutionaryalgorithms, Neural Networks and Fuzzy Logic. These methods have received increasing attention due to their simplicity...
详细信息
Computational Intelligence (CI) is an umbrella term for modern problem solvers such as evolutionaryalgorithms, Neural Networks and Fuzzy Logic. These methods have received increasing attention due to their simplicity, robustness and generality. The Collaborative Research Center "Computational Intelligence" (SFB 531) is concerned with the theoretical foundations and applications of CI methods. This article focuses on two examples from different research domains within SFB 531. First, exemplary results for the runtime analysis of evolutionaryalgorithms are summarized and evaluated. Second, applied research on mold temperature control is dealt with. Here it is stressed how the wide variety of CI methods leads to very efficient solutions to the problem and how still substantial improvements can be obtained by hybridization with expert's knowledge.
A hybrid evolutionary model is used to propose a hierarchical homology of protein sequences to identify protein functions systematically. The proposed model offers considerable potentials, considering the inconsistenc...
详细信息
A hybrid evolutionary model is used to propose a hierarchical homology of protein sequences to identify protein functions systematically. The proposed model offers considerable potentials, considering the inconsistency of existing methods for predicting novel proteins. Because some novel proteins might align without meaningful conserved domains, maximizing the score of sequence alignment is not the best criterion for predicting protein functions. This work presents a decision model that can minimize the cost of making a decision for predicting protein functions using the hierarchical homologies. Particularly, the model has three characteristics: (i) it is a hybrid evolutionary model with multiple fitness functions that uses genetic programming to predict protein functions on a distantly related protein family, (ii) it incorporates modified robust point matching to accurately compare all feature points using the moment invariant and thin-plate spline theorems, and (iii) the hierarchical homologies holding up a novel protein sequence in the form of a causal tree can effectively demonstrate the relationship between proteins. This work describes the comparisons of nucleocapsid proteins from the putative polyprotein SARS virus and other coronaviruses in other hosts using the model. (c) 2005 Elsevier Ltd. All rights reserved.
暂无评论