Background: Combining clinical and molecular data types may improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient. Results: We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We turn to the relatively expensive molecular data only when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that need to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples. Conclusions: Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may reduce waiting times for diagnosis and hence lower the patients' distress. Stepwise classification is implemented in the R package stepwiseCM, available at the Bioconductor website.
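The stepwise decision rule described in this abstract can be illustrated with a minimal Python sketch. This is not the stepwiseCM implementation: the function name `stepwise_predict`, the uncertainty measure (distance of the clinical probability from 0.5) and the default limit of 0.3 are all assumptions made for illustration only.

```python
from typing import Callable, Tuple


def stepwise_predict(
    clinical_prob: float,
    molecular_predict: Callable[[], int],
    max_uncertainty: float = 0.3,
) -> Tuple[int, bool]:
    """Return (predicted label, whether the molecular classifier was used).

    clinical_prob is the clinical classifier's probability for class 1.
    molecular_predict is called only when the clinical result is too
    uncertain, mirroring the idea of ordering the expensive test on demand.
    """
    # Assumed uncertainty measure: 0 at probabilities 0 or 1, 1 at 0.5.
    uncertainty = 1.0 - 2.0 * abs(clinical_prob - 0.5)
    if uncertainty <= max_uncertainty:
        # Clinical data alone is confident enough; skip the molecular test.
        return (1 if clinical_prob >= 0.5 else 0), False
    # Otherwise fall back to the (hypothetical) molecular classifier.
    return molecular_predict(), True


confident = stepwise_predict(0.95, molecular_predict=lambda: 0)
uncertain = stepwise_predict(0.55, molecular_predict=lambda: 0)
```

In this sketch, lowering `max_uncertainty` sends more samples to the molecular classifier, so the threshold trades molecular test cost against expected accuracy.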
This paper presents methods for training pattern (prototype) selection, class-specific feature selection and classification for automated learning. For training pattern selection, we propose a sampling method that extracts a small number of representative training patterns (prototypes) from the dataset. The idea is to extract a set of prototype training patterns that represents each class region in a classification problem. In class-specific feature selection, we try to find, for each class, the feature set that best separates that class from the other classes. We then build a separate classifier for that class based on its own feature set. The paper also presents a new hypersphere classification algorithm. Hypersphere nets are similar to radial basis function (RBF) nets and belong to the group of kernel function nets. Polynomial time complexity of the methods is proven. Polynomial time complexity of learning algorithms is important to the field of neural networks. Computational results are provided for a number of well-known datasets. None of the parameters of the algorithm were fine-tuned for any of the problems solved, which supports the idea of automating learning methods. Automation of learning is crucial to wider deployment of learning technologies. (c) 2012 Elsevier Ltd. All rights reserved.
Three-level neutral point clamped (NPC) voltage source converters have recently emerged as important alternatives to conventional two-level converter topologies in high-power medium-voltage energy conversion applications, particularly in high-performance ac motor drive systems. An all-inclusive modulation strategy for NPC converters should be able to extend the operating range of the converter into the overmodulation region with a smooth and linear transition characteristic. This paper introduces an overmodulation switching strategy for three-level NPC converters based on the space vector classification technique. The proposed overmodulation modes make possible continuous control of the output voltage up to the maximum possible value, with a smooth linear transition characteristic and minimum distortion. A theoretical basis for the vector classification space vector modulation technique in the overmodulation region is presented, and the proposed overmodulation schemes are validated by analysis, simulation and experimentation on a 2-kVA three-level NPC converter laboratory prototype.
In this letter, we propose a multimodal method for improving radio frequency (RF) fingerprinting performance that uses multiple features extracted from RF signals. By combining multiple features, including a falling transient feature that has not previously been used in RF fingerprinting studies, we aim to demonstrate that the proposed method improves accuracy. We show that a sparse representation-based classification (SRC) scheme can be a good platform for combining multiple features. Experimental results on RF signals acquired from eight walkie-talkies show that the RF fingerprinting accuracy of the proposed method improves significantly as the number of features increases.
This paper aims to evaluate the feasibility and performance of two related applications in agricultural big data: hyperspectral imaging and parallel computing. After selecting two oilseed rape varieties (NingYou 22 and NingZa 19) as the objects of study, we captured hyperspectral images of their siliques following exposure to three different waterlogging stress levels (0, 3, and 6 days). The machine learning library for Spark was used to implement both artificial neural network (ANN) and support vector machine (SVM) classification algorithms, and to conduct hyperspectral classification analysis of the oilseed rape siliques under the various levels of waterlogging stress on the parallel computing platform. From the classification data sets, 70% of the data were randomly selected for training, and the remaining 30% were used for prediction. The experimental results indicate that, when the hyperspectral image of a region of interest (400-1000 nm) was extracted and combined with the spectrum image data, the oilseed rape waterlogging detection model based on the Spark parallel computing framework was feasible and efficient. For the multi-class classification problem, the accuracy of the ANN algorithm was superior to that of the SVM, but its convergence time exceeded that of the SVM algorithm. When the ANN and SVM algorithms were used for binary classification of the samples from both varieties, the SVM algorithm performed better. Meanwhile, of the two oilseed rape varieties, the NingZa 19 waterlogging samples yielded better classification results. Five optimal wavebands (512, 621, 689, 953, and 961 nm) were selected as the inputs for the classification algorithm. The results show that the classification accuracy of full-waveband hyperspectral imaging was slightly higher than that of optimal waveband imaging, while the ANN algorithm was more accurate than the SVM algorithm. Finally, the three in
Postural instability is a symptom of disorders such as ear diseases, high blood pressure, diabetes, psychiatric disorders, and neurodegenerative diseases such as Parkinson's disease, and is one of the main causes of falls. Continuous monitoring of a patient's postural stability during daily activities would be of strategic interest to doctors, giving them continuous information on the patient's pathology or its progression. This article focuses on the experimental assessment of a wearable inertial device for continuous monitoring of the user's postural sway, which exploits a novel postural sway classification algorithm that provides a classification index. Two performance indices have been proposed: 1) to assess the postural sway monitoring strategy and 2) to rate the reliability of the classification outcome. The experiments reported in this article involved "healthy" users mimicking typical instability dynamics under the supervision of neurologists. In the case study addressed in this article, full classification success was obtained with an overall reliability of about 70%.
Since label noise can hurt the performance of supervised learning (SL), how to train a good classifier in the presence of label noise is an emerging and meaningful topic in the machine learning field. Although many related methods have been proposed and have achieved promising performance, they have the following drawbacks: (1) they can lead to data waste and even performance degradation if the mislabeled instances are removed; and (2) the negative effect of extremely mislabeled instances cannot be completely eliminated. To address these problems, we propose a novel method based on the capped l1 norm and a graph-based regularizer to deal with label noise. The proposed algorithm uses the capped l1 norm instead of the l1 norm. The capped norm inherits the advantage of the l1 norm, which is robust to label noise to some extent. Moreover, the capped l1 norm can adaptively find extremely mislabeled instances and eliminate their negative influence. Additionally, the proposed algorithm makes full use of the mislabeled instances under the graph-based framework, which avoids wasting collected instance information. The solution is obtained through an iterative optimization approach. We report experimental results on several UCI datasets that include both binary and multi-class problems. The results verify the effectiveness of the proposed algorithm in comparison with existing state-of-the-art classification methods.
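To make the role of the cap concrete, the sketch below compares a plain l1 loss with a capped l1 loss, using the common definition min(|r|, cap). The abstract does not give the exact formulation or the graph-based regularizer, so the function names, the residuals and the cap value are illustrative assumptions only.

```python
from typing import List


def l1_loss(residual: float) -> float:
    """Plain l1 loss: grows without bound as the residual grows."""
    return abs(residual)


def capped_l1_loss(residual: float, cap: float) -> float:
    """Capped l1 loss (assumed form): the l1 loss truncated at `cap`."""
    return min(abs(residual), cap)


# Hypothetical residuals; the last one plays the role of an extremely
# mislabeled instance.
residuals: List[float] = [0.1, -0.4, 0.2, 5.0]
cap = 1.0

total_l1 = sum(l1_loss(r) for r in residuals)
total_capped = sum(capped_l1_loss(r, cap) for r in residuals)

# Instances at or above the cap contribute a constant to the objective,
# so they no longer pull on the fitted model.
flagged = [abs(r) >= cap for r in residuals]
```

Under the plain l1 loss the outlier accounts for most of the total, whereas under the capped loss its contribution is bounded by `cap`.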
In this paper, we suggest a new genetic programming approach for music emotion classification. Our approach is based on Thayer's arousal-valence plane, which is one of the representative models of human emotion. Thayer's plane holds that human emotions are determined by psychological arousal and valence. We map music pieces onto the arousal-valence plane and classify the music emotion in that space. We extract 85 acoustic features from music signals, rank them by information gain, and choose the top k features in the feature selection process. To map music pieces from the feature space onto the arousal-valence space, we apply genetic programming. The genetic programming is designed to find an optimal formula that maps given music pieces to the arousal-valence space so that music emotions are effectively classified. The k-NN and SVM methods, which are widely used in classification, are used to classify music emotions in the arousal-valence space. To verify our method, we compare it with six other existing methods on the same music data set. This experiment confirms that the proposed method is superior to the others.
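The feature selection step (rank by information gain, keep the top k) might be sketched as follows. The sketch assumes the features have already been discretized, which the abstract does not state, and the feature names and toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns: Dict[str, Sequence[Hashable]],
                   labels: Sequence[Hashable], k: int) -> List[str]:
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Toy, hypothetical data: four music pieces with discretized features.
labels = ["happy", "happy", "sad", "sad"]
columns = {
    "tempo": ["fast", "fast", "slow", "slow"],      # separates the classes
    "noise": ["a", "b", "a", "b"],                  # carries no information
    "mode": ["major", "major", "major", "minor"],   # partly informative
}
selected = top_k_features(columns, labels, k=2)
```

Continuous acoustic features would need a binning or thresholding step before this computation; that choice is left open here.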
The classification of scientific articles aligned to Sustainable Development Goals is crucial for research institutions and universities when assessing their influence in these areas. Machine learning enables the implementation of massive text data classification tasks. The objective of this study is to apply Natural Language Processing techniques to articles from peer-reviewed journals to facilitate their classification according to the 17 Sustainable Development Goals of the 2030 Agenda. This article compares the performance of multi-label text classification models based on a proposed framework with datasets of different characteristics. The results show that the combination of Label Powerset (a transformation method) with Support Vector Machine (a classification algorithm) can achieve an accuracy of up to 87% for an imbalanced dataset, 83% for a dataset with the same number of instances per label, and even 91% for a multiclass dataset.
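The Label Powerset transformation mentioned here turns a multi-label problem into a single multiclass problem by treating every distinct label combination as its own class. A minimal sketch follows; the helper names and the SDG label strings are illustrative assumptions, and the SVM that would be trained on the transformed targets is omitted.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def label_powerset_fit(
    label_sets: Iterable[Iterable[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Map each distinct label combination to one integer class id."""
    class_of: Dict[FrozenSet[str], int] = {}
    targets: List[int] = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)  # next unused class id
        targets.append(class_of[key])
    return targets, class_of


def label_powerset_inverse(
    class_ids: Iterable[int], class_of: Dict[FrozenSet[str], int]
) -> List[Set[str]]:
    """Convert predicted class ids back into label sets."""
    set_of = {cid: key for key, cid in class_of.items()}
    return [set(set_of[cid]) for cid in class_ids]


# Hypothetical article labels; any multiclass classifier (an SVM in the
# abstract) could then be trained on `targets`.
article_labels = [{"SDG3"}, {"SDG3", "SDG13"}, {"SDG13"}, {"SDG13", "SDG3"}]
targets, class_of = label_powerset_fit(article_labels)
decoded = label_powerset_inverse([1], class_of)
```

One consequence of this design is that the classifier can only predict label combinations seen during training, which matters for imbalanced datasets with rare combinations.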
The vestibulo-ocular reflex (VOR) plays an important role in our daily activities by enabling us to fixate on objects during head movements. Modeling and identification of the VOR improves our insight into the system behavior and improves diagnosis of various disorders. However, the switching nature of eye movements (nystagmus), including the VOR, makes dynamic analysis challenging. The first step in such analysis is to segment data into its subsystem responses (here slow and fast segment intervals). Misclassification of segments results in biased analysis of the system of interest. Here, we develop a novel three-step algorithm to classify the VOR data into slow and fast intervals automatically. The proposed algorithm is initialized using a K-means clustering method. The initial classification is then refined using system identification approaches and prediction error statistics. The performance of the algorithm is evaluated on simulated and experimental data. It is shown that the new algorithm performance is much improved over the previous methods, in terms of higher specificity.
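The K-means initialization step could look roughly like the sketch below, which splits samples into two clusters on a single scalar feature. The abstract does not say which feature is clustered; absolute eye speed is assumed here, the data are made up, and the later refinement by system identification and prediction error statistics is not shown.

```python
from typing import List, Sequence, Tuple


def kmeans_two_clusters(
    values: Sequence[float], max_iter: int = 100
) -> Tuple[List[int], Tuple[float, float]]:
    """One-dimensional K-means with K = 2.

    Returns per-sample labels (0 = low cluster, 1 = high cluster) and
    the two final cluster centres.
    """
    # Initialise the centres at the extremes of the data.
    low, high = min(values), max(values)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centre.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        slow = [v for v, lab in zip(values, labels) if lab == 0]
        fast = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: each centre moves to the mean of its members.
        new_low = sum(slow) / len(slow)
        new_high = sum(fast) / len(fast) if fast else high
        if (new_low, new_high) == (low, high):
            break  # centres stopped moving
        low, high = new_low, new_high
    return labels, (low, high)


# Hypothetical absolute eye speeds (deg/s): mostly slow phases with two
# fast-phase samples.
speeds = [2.0, 3.0, 2.5, 120.0, 150.0, 3.5, 2.0]
initial_labels, centres = kmeans_two_clusters(speeds)
```

Labels of 0 would mark candidate slow-phase samples and labels of 1 candidate fast-phase samples, to be refined by the subsequent steps.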