Background: siRNAs are small RNAs that serve as sequence determinants during the gene silencing process called RNA interference (RNAi). It is well known that siRNA efficiency is crucial in the RNAi pathway and that the efficiency of siRNAs targeting different sites of a specific gene varies greatly. There is therefore high demand for reliable siRNA prediction tools and for design methods able to pick out siRNAs with high silencing potential. Results: In this paper, two systems have been established for the prediction of functional siRNAs: (1) a statistical model based on sequence information and (2) a machine learning model based on three features of siRNA sequences, namely binary description, thermodynamic profile and nucleotide composition. Both methods show high performance on the two datasets we constructed for training the model. Conclusion: Both methods studied in this paper emphasize the importance of sequence information for the prediction of functional siRNAs. Denoting a bio-sequence in a binary system in mathematical language might also be helpful in other analyses of fixed-length bio-sequences.
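The abstract does not spell out how the binary description or the nucleotide composition is computed. As a minimal sketch, assuming a four-bit indicator per base in the order A, C, G, U and composition expressed as per-base frequencies, two of the three sequence features could look like the following. The function names and the encoding order are hypothetical, and the thermodynamic profile is omitted because it depends on energy parameters not given here.

```python
from collections import Counter

# Assumption: bases are encoded in the fixed order A, C, G, U.
BASES = "ACGU"


def _normalize(seq):
    """Upper-case the sequence and map DNA-style T to RNA U."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("sequence must not be empty")
    for base in seq:
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
    return seq


def binary_description(seq):
    """Encode a sequence as a flat 0/1 vector, four bits per base."""
    vec = []
    for base in _normalize(seq):
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def nucleotide_composition(seq):
    """Return the frequency of A, C, G and U in the sequence."""
    seq = _normalize(seq)
    counts = Counter(seq)
    return [counts[b] / len(seq) for b in BASES]
```

Because every sequence of a given length maps to a vector of the same size (four times the sequence length), an encoding of this kind suits fixed-length inputs, which is the property the conclusion points to.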
Background: Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. Results: In the work presented here, our recognition system uses a feature set consisting of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider features obtained by matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed only slightly to improving the f-score. We attribute this to the possibility that the dictionary matching features overlap with other features in the current multiple feature setting. Conclusion: During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the view that the SVM algorithm is robust to the high dimensionality of the feature vector space and means that feature selection is not required.
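The internal resource features listed above can be illustrated with a small sketch. It assumes tokens and POS tags are already available, that the prefix and suffix are the first and last three characters, and that orthography is reduced to a handful of coarse classes. The function names, the affix length, and the orthography labels are hypothetical and are not taken from the system described in the abstract.

```python
def orthography(token):
    """Map a token to a coarse orthographic class (labels are illustrative)."""
    has_digit = any(c.isdigit() for c in token)
    has_alpha = any(c.isalpha() for c in token)
    if has_digit and has_alpha:
        return "ALPHANUM"
    if has_digit:
        return "DIGITS"
    if token.isupper():
        return "ALLCAPS"
    if token[:1].isupper():
        return "INITCAP"
    if token.islower():
        return "LOWER"
    return "OTHER"


def token_features(tokens, pos_tags, i, prev_class, affix_len=3):
    """Collect the internal resource features for the token at position i.

    prev_class is the class assigned to the preceding token, so features
    are built left to right during tagging.
    """
    token = tokens[i]
    return {
        "word": token,
        "pos": pos_tags[i],
        "orth": orthography(token),
        "prefix": token[:affix_len],
        "suffix": token[-affix_len:],
        "prev_class": prev_class,
    }
```

Each dictionary would then be turned into a sparse indicator vector before being passed to an SVM. A dictionary-matching feature could be added as one more key, which is where the overlap with the word and orthography features mentioned in the results may arise.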
Background: In this paper we discuss an efficient methodology for the image analysis and characterization of digital images containing skin lesions using support vector machines and present the results of a preliminary study. Methods: The methodology is based on the support vector machines algorithm for data classification and has been applied to the problem of recognizing malignant melanoma versus dysplastic naevus. Border and colour based features were extracted from digital images of skin lesions acquired under reproducible conditions, using basic image processing techniques. Two alternative classification methods, statistical discriminant analysis and neural networks, were also applied to the same problem and the results are compared. Results: The SVM (support vector machines) algorithm performed quite well, achieving 94.1% correct classification, which is better than the performance of the other two classification methodologies. Discriminant analysis correctly classified 88% of cases (71% of malignant melanoma and 100% of dysplastic naevi), while the neural networks performed approximately the same. Conclusion: The use of a computer-based system like the one described in this paper is intended to avoid human subjectivity and to perform specific tasks according to a number of criteria. However, the presence of an expert dermatologist is considered necessary for the overall visual assessment of the skin lesion and the final diagnosis.