Background: RNA-binding proteins (RBPs) play crucial roles in the post-transcriptional control of RNA. RBPs recognize specific RNA sequences once the RNA has been transcribed from DNA. To satisfy diverse functional requirements, RBPs are composed of multiple RNA-binding domains (RBDs) presented in various structural arrangements. The ability to computationally predict the RNA-binding residues in an RNA-binding protein can help biologists choose targets for site-directed mutagenesis in wet-lab experiments. Results: The proposed prediction framework, named "ProteRNA", combines an SVM-based classifier with conserved residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues can be either functionally or structurally conserved, they provide clues to the important residues in a protein sequence. On the independent testing dataset, ProteRNA delivers an overall accuracy of 89.78%, an MCC of 0.2628, an F-score of 0.3075, and an F0.5-score of 0.3546. Conclusions: This article presents the design of a sequence-based predictor that identifies the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins perform diverse functions when interacting with different categories of RNA because they are composed of multiple copies of RNA-binding domains, presented in various structural arrangements that expand their functional repertoire. Predicting the RNA-binding residues in such a protein can therefore help biologists choose targets for site-directed mutagenesis in wet-lab experiments.
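The gap between the reported accuracy (89.78%) and the much lower MCC and F-scores is typical of imbalanced residue-level prediction, where most residues do not bind RNA. The following minimal sketch shows how these metrics are computed from a binary confusion matrix. The counts are hypothetical and are not taken from the ProteRNA evaluation; the function name `binary_metrics` is an assumption made for illustration.

```python
import math

def binary_metrics(tp, fp, tn, fn, beta=1.0):
    """Return (accuracy, MCC, F-beta) for a binary confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-beta weights recall beta times as much as precision,
    # so beta = 0.5 favours precision.
    b2 = beta * beta
    f_denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / f_denom if f_denom else 0.0
    return accuracy, mcc, fbeta

# Hypothetical counts: 1000 residues, of which 60 bind RNA.
acc, mcc, f1 = binary_metrics(tp=15, fp=20, tn=920, fn=45)
_, _, f05 = binary_metrics(tp=15, fp=20, tn=920, fn=45, beta=0.5)
print(f"accuracy={acc:.4f} MCC={mcc:.4f} F1={f1:.4f} F0.5={f05:.4f}")
```

With these made-up counts the accuracy is high simply because true negatives dominate, while MCC and the F-scores stay low, which is the same pattern as in the reported results.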
ISBN: 9783642030697 (print)
Traditional kernelised classification methods sometimes perform poorly because they use a single, fixed kernel, especially on some complicated data sets. In this paper, a novel optimal double-kernel combination (ODKC) method is proposed for complicated classification tasks. First, data sets are mapped by two basic kernels into different feature spaces, and then three kinds of optimal composite kernels are constructed by integrating information from the two feature spaces. Comparative experiments demonstrate the effectiveness of our methods.
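The abstract does not say which two basic kernels are used or how the three composite kernels are built. As a rough illustration of the general idea only, the sketch below combines an RBF kernel and a polynomial kernel through two standard closure constructions, a convex weighted sum and a product, both of which yield valid kernels. The kernel choices, parameter values, and function names are assumptions and should not be read as the ODKC method itself.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel (assumed first basic kernel)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (assumed second basic kernel)."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def sum_kernel(x, y, weight=0.5):
    """Convex combination of the two kernels; weight must lie in [0, 1]."""
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)

def product_kernel(x, y):
    """Pointwise product of the two kernels."""
    return rbf_kernel(x, y) * poly_kernel(x, y)

def gram_matrix(kernel, points):
    """Kernel matrix that a kernelised classifier would consume."""
    return [[kernel(p, q) for q in points] for p in points]

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
gram = gram_matrix(sum_kernel, points)
for row in gram:
    print([round(v, 4) for v in row])
```

In a setting like this, the mixing weight would normally be tuned on validation data, which is one plausible reading of "optimal" in the abstract.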
ISBN: 9783642030697 (print)
This work presents an image analysis framework driven by emerging evidence and constrained by the semantics expressed in an ontology. Human perception, apart from visual stimulus and pattern recognition, also relies on general knowledge and application context to understand visual content in conceptual terms. Our work attempts to imitate this behavior by devising an evidence-driven probabilistic inference framework using ontologies and Bayesian networks. Experiments conducted for two different image analysis tasks showed improved performance compared to the case where computer vision techniques act in isolation from any type of knowledge or context.
ISBN: 9783642030697 (print)
No-regret algorithms for online convex optimization are potent online learning tools and have proven successful in a wide range of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration into a single, common one in the form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process, and each time feeds back a copy of its own reward function to the voters, incurs sublinear regret itself. As a by-product, we obtain a simple method for constructing new no-regret algorithms out of known ones.
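The setup described above can be sketched as follows. This is only an illustration under stated assumptions: the voters are projected online gradient ascent learners on the interval [0, 1], the reward in each round is the concave function r(x) = -(x - c)^2 for a round-specific target c, and the merge weights are fixed. None of these choices comes from the paper, and the class and function names are hypothetical.

```python
def project(x, lo=0.0, hi=1.0):
    """Project a scalar decision back onto the feasible interval."""
    return max(lo, min(hi, x))

class GradientAscentVoter:
    """Projected online gradient ascent with step size scale / sqrt(t)."""

    def __init__(self, step_scale, start=0.5):
        self.step_scale = step_scale
        self.x = start
        self.t = 0

    def decide(self):
        return self.x

    def update(self, reward_grad):
        # The voter evaluates the fed-back reward function at its own decision.
        self.t += 1
        step = self.step_scale / self.t ** 0.5
        self.x = project(self.x + step * reward_grad(self.x))

def run_merged(voters, weights, targets):
    """Play the convex combination of the voters' decisions in every round."""
    total_reward = 0.0
    for target in targets:
        decision = sum(w * v.decide() for w, v in zip(weights, voters))
        total_reward += -(decision - target) ** 2
        # Feed a copy of the agent's reward function (via its gradient) to each voter.
        grad = lambda x, c=target: -2.0 * (x - c)
        for v in voters:
            v.update(grad)
    return total_reward

def best_fixed_reward(targets):
    """Reward of the best single decision in hindsight (the mean target)."""
    best = sum(targets) / len(targets)
    return sum(-(best - c) ** 2 for c in targets)

targets = [0.2, 0.8] * 1000
voters = [GradientAscentVoter(0.5), GradientAscentVoter(0.1)]
total = run_merged(voters, [0.5, 0.5], targets)
regret = best_fixed_reward(targets) - total
print(f"average external regret per round: {regret / len(targets):.5f}")
```

For concave rewards, the merged decision earns at least the weighted average of the voters' rewards, so its external regret is bounded by the weighted average of their regrets. The paper's result is stated more generally, including for affine regret.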
ISBN: 9783642030697 (print)
Prior knowledge about a problem domain can be utilized to bias Support Vector Machines (SVMs) towards learning better hypothesis functions. To this end, a number of methods have been proposed that demonstrate improved generalization performance after the application of domain knowledge, especially in the case of scarce training data. In this paper, we propose an extension to the Virtual Support Vectors (VSVs) technique in which only a subset of the Support Vectors (SVs) is utilized. Unlike previous methods, the purpose here is to compensate for noise and uncertainty in the training data. Furthermore, we investigate the effect of domain knowledge not only on the quality of the SVM model, but also on the rules extracted from it, and hence on the pattern learned by the SVM. Results on five benchmark data sets and one real-life data set show that domain knowledge can significantly improve both the quality of the SVM and the rules extracted from it.
ISBN: 9781424481835 (print)
Data clustering has been applied in multiple fields such as machine learning, data mining, wireless sensor networks and pattern recognition. One of the best-known clustering approaches is K-means, which has been used effectively in many clustering problems, but it suffers from convergence to local optima and sensitivity to the initial points. The artificial fish swarm algorithm (AFSA) is a swarm intelligence algorithm whose major application is solving optimization problems. Its characteristics include a high convergence rate and insensitivity to initial values. In this paper, a hybrid clustering method based on the artificial fish swarm algorithm and K-means, called KAFSA, is proposed. In the proposed algorithm, K-means is used as one of the behaviors of the artificial fish in AFSA. The proposed algorithm has been tested on five data sets and its efficiency was compared with particle swarm optimization (PSO), K-means and standard AFSA. Experimental results show that the proposed approach has acceptable efficacy in data clustering.
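To make the idea of "K-means as a behavior" more concrete, here is a minimal sketch under the assumption that each artificial fish encodes one candidate set of centroids and that its fitness is the sum of squared distances from every point to its nearest centroid. The K-means behavior is then a single assign-and-recompute step applied to that fish. The other AFSA behaviors are omitted, and the encoding, fitness, and function names are assumptions rather than details from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centroids):
    """Fitness of a candidate: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans_step(points, centroids):
    """One K-means iteration: assign points, then move centroids to cluster means."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    new_centroids = []
    for members, old in zip(clusters, centroids):
        if members:
            dim = len(old)
            mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            new_centroids.append(mean)
        else:
            new_centroids.append(old)  # an empty cluster keeps its centroid
    return new_centroids

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
fish = [(1.0, 1.0), (4.0, 4.0)]          # hypothetical candidate solution
improved = kmeans_step(points, fish)
print(sse(points, fish), "->", sse(points, improved))
```

A K-means step never increases this fitness value, which is why it can serve as a local refinement move inside a population-based search.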
ISBN: 9783642030697 (print)
In this paper we present a comparative analysis of two types of remote sensing satellite data using wavelet-based data mining techniques. The results reveal anomalous variations related to the earthquakes. The methods studied in this work include wavelet transformations and spatial/temporal continuity analysis of wavelet maxima. These methods have been used to analyze the singularities of seismic anomalies in remote sensing satellite data associated with the two recent earthquakes of Wenchuan and Puer in China.
ISBN: 9783642030697 (print)
We address the problem of novelty detection from stream data, that is, the identification of new or unknown situations in an ordered sequence of objects which arrive on-line at consecutive time points. We extend previous solutions by considering the case of objects modeled by multiple database relations. Frequent relational patterns are efficiently extracted at each time point, and a time window is used to filter out novelty patterns. An application of the proposed algorithm to the problem of detecting anomalies in network traffic is described, and quantitative and qualitative results obtained by analyzing a real stream of data collected from firewall logs are reported.
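The windowing idea can be sketched in a much simplified form. The example below replaces relational patterns with single attribute values, and it assumes that a pattern counts as a novelty at a time point when it is frequent there but was not frequent at any of the previous `window_size` time points. Both the simplification and this exact novelty criterion are assumptions made for illustration; the paper's definitions may differ.

```python
from collections import Counter, deque

def frequent_patterns(batch, min_support):
    """Items whose relative frequency in a non-empty batch reaches min_support."""
    counts = Counter(item for record in batch for item in set(record))
    return {item for item, n in counts.items() if n / len(batch) >= min_support}

def novelty_stream(batches, window_size, min_support):
    """Yield, per time point, the frequent patterns not seen in the recent window."""
    window = deque(maxlen=window_size)   # frequent sets of the last time points
    for batch in batches:
        current = frequent_patterns(batch, min_support)
        known = set().union(*window)
        yield current - known
        window.append(current)

# Hypothetical firewall-log records: (protocol, destination port).
batches = [
    [("tcp", "80"), ("tcp", "443"), ("tcp", "80")],
    [("tcp", "80"), ("tcp", "80"), ("udp", "53")],
    [("udp", "53"), ("udp", "53"), ("tcp", "80")],
]
for t, novel in enumerate(novelty_stream(batches, window_size=2, min_support=0.6)):
    print(t, sorted(novel))
```

The first time point reports everything as novel because the window is still empty; a real deployment would typically warm up the window before raising alerts.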
ISBN: 9783642030697 (print)
Data mining is the process of extracting interesting information from large sets of data. Outliers are defined as events that occur very infrequently. Detecting outliers before they escalate with potentially catastrophic consequences is very important for various real-life applications such as fraud detection, network robustness analysis, and intrusion detection. This paper presents a comprehensive analysis of three outlier detection methods, Extensible Markov Model (EMM), Local Outlier Factor (LOF) and LSC-Mine, covering time complexity and outlier detection accuracy. Experiments conducted with the Ozone Level Detection, IR video trajectories, and 1999 and 2000 DARPA DDoS datasets demonstrate that EMM outperforms both LOF and LSC-Mine in both time and outlier detection accuracy.
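Of the three methods compared, LOF has a compact standard definition: a point's score is the average ratio between the local reachability density of its k nearest neighbours and its own. The sketch below follows that textbook definition on a toy data set. It assumes distinct points (duplicates would make a density infinite) and is not the implementation evaluated in the paper; the function names and the data are illustrative.

```python
import math

def euclid(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def lof_scores(points, k):
    """Local Outlier Factor of every point, using k nearest neighbours."""
    n = len(points)
    dist = [[euclid(p, q) for q in points] for p in points]
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]               # distance to the k-th neighbour
        k_dist.append(kd)
        # Ties at the k-distance are all included in the neighbourhood.
        neighbours.append([j for j in order if dist[i][j] <= kd])
    lrd = []
    for i in range(n):
        # Reachability distance from i to each neighbour j.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))      # local reachability density
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = lof_scores(points, k=2)
for p, s in zip(points, scores):
    print(p, round(s, 3))
```

Scores close to 1 indicate points whose density matches that of their neighbours, while the isolated point receives a much larger score. The pairwise distance matrix makes this sketch quadratic in the number of points, which is relevant to the time comparison discussed in the abstract.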
ISBN: 9783642030697 (print)
Many applications require predictions with confidence. We are interested in Confidence Machines, which are algorithms that can provide some measure of how confident they are that their output is correct. Confidence Machines are quite general, and there are many algorithms solving the problem of prediction with confidence. As predictors we consider Venn Probability Machines and Conformal Predictors. Both of these algorithms rely on an underlying algorithm for prediction, and in this paper we use two simple algorithms, namely the Nearest Neighbours and Nearest Centroid algorithms. Our aim is to provide some guidelines on how to choose the most suitable algorithm for a practical application where confidence is needed.
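As an illustration of how a conformal predictor can be built on a nearest-neighbour rule, the sketch below uses a commonly used 1-NN nonconformity score: the distance to the nearest example of the same class divided by the distance to the nearest example of a different class. For each candidate label it computes a p-value, and the prediction set contains every label whose p-value exceeds the chosen significance level. The data, function names, and the specific score are assumptions for illustration and may differ from what the paper uses.

```python
def distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

def nonconformity(index, examples):
    """1-NN score: nearest same-class distance over nearest other-class distance."""
    x, y = examples[index]
    same = [distance(x, xo) for j, (xo, yo) in enumerate(examples)
            if j != index and yo == y]
    other = [distance(x, xo) for xo, yo in examples if yo != y]
    if not same:
        return float("inf")   # no same-class neighbour: maximally strange
    if not other:
        return 0.0            # no competing class: maximally conforming
    nearest_other = min(other)
    return float("inf") if nearest_other == 0 else min(same) / nearest_other

def p_values(train, x_new, labels):
    """Conformal p-value of each candidate label for x_new."""
    result = {}
    for label in labels:
        extended = train + [(x_new, label)]
        scores = [nonconformity(i, extended) for i in range(len(extended))]
        # Fraction of examples at least as nonconforming as the new one.
        result[label] = sum(s >= scores[-1] for s in scores) / len(extended)
    return result

def prediction_set(train, x_new, labels, significance):
    return {lab for lab, p in p_values(train, x_new, labels).items()
            if p > significance}

train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b")]
pvals = p_values(train, (0.5,), ["a", "b"])
print(pvals, prediction_set(train, (0.5,), ["a", "b"], significance=0.2))
```

Swapping in a different underlying algorithm, such as nearest centroid, only requires replacing the `nonconformity` function, which is what makes the choice of underlying algorithm a practical design question.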