Wheat is one of the most important cereals worldwide for human nutrition. Tetraploid wheat (Triticum turgidum L. ssp. durum, 2n = 28, genomes AABB) is mainly used to produce pasta. The main objective of durum wheat breeding programs is to develop varieties with good quality and high yields. Yield is a very complex trait that depends on different yield components, which are genetically controlled and affected by environmental constraints. In this context, machine learning is an excellent alternative for analysing a large number of traits in order to extract the most relevant ones as reliable predictors of crop performance, allowing better agricultural planning. Thus, we propose the use of machine learning algorithms to classify yield components and to search for new rules to infer high yields at harvest of durum wheat. The main objective of this work was to obtain rules for predicting durum wheat yield with different machine learning algorithms, and to compare them in order to detect the one that fits best. To achieve this goal, the One-R, J48, IBk and Apriori algorithms were run on data collected by our research group from a RIL (recombinant inbred line) population grown in six different environments in the Province of Buenos Aires, Argentina. The results indicate that the Apriori method obtains the best performance for all locations, and that the classifiers generated by the different algorithms share a common set of selected traits. Moreover, the traits found to be most significant were the same as those identified in previous studies using different techniques, mainly QTL mapping. Analysis of the resulting rules shows the agronomic soundness of the extracted knowledge. (C) 2013 Elsevier B.V. All rights reserved.
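To make the idea of rule extraction concrete, the following is a minimal sketch of One-R, the simplest of the algorithms named in this abstract: for each attribute it builds one rule per attribute value (predict the majority class) and keeps the attribute whose rules make the fewest training errors. The trait names and the toy records are hypothetical and are not taken from the study, and the study's own implementation may differ from this sketch.

```python
from collections import Counter, defaultdict

def one_r(rows, target):
    """Return (best_attribute, rule, errors) for a list of dict records.

    rule maps each value of the chosen attribute to a predicted class.
    """
    best = None
    attributes = [key for key in rows[0] if key != target]
    for attr in attributes:
        # Count class frequencies for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical, discretised yield-component records (illustration only).
rows = [
    {"spikes_per_m2": "high", "grain_weight": "high", "yield": "high"},
    {"spikes_per_m2": "high", "grain_weight": "low", "yield": "high"},
    {"spikes_per_m2": "low", "grain_weight": "high", "yield": "low"},
    {"spikes_per_m2": "low", "grain_weight": "low", "yield": "low"},
    {"spikes_per_m2": "high", "grain_weight": "low", "yield": "high"},
    {"spikes_per_m2": "low", "grain_weight": "high", "yield": "high"},
]

attr, rule, errors = one_r(rows, "yield")
print(attr, rule, errors)
```

On this toy data the single attribute `spikes_per_m2` yields the rule set with the fewest errors, which is the kind of human-readable rule the abstract refers to.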
To enable rapid and nondestructive determination of pork storage time, which is associated with freshness, this work applied Fourier transform near infrared (FT-NIR) spectroscopy combined with classification algorithms. To investigate the effects of different linear and non-linear classification algorithms on the discrimination results, linear discriminant analysis (LDA), K-nearest neighbors (KNN), and a back propagation artificial neural network (BP-ANN) were each used to develop discrimination models. The number of principal components (PCs) and other parameters were optimized by cross-validation during model development. Experimental results showed that the BP-ANN model outperformed the others, and the optimal BP-ANN model was achieved when 5 PCs were included. The discrimination rates of the BP-ANN model were 99.26% and 96.21% in the training and prediction sets, respectively. The overall results demonstrate that FT-NIR spectroscopy combined with the BP-ANN classification algorithm has the potential to determine pork storage time associated with freshness. (C) 2011 Elsevier Ltd. All rights reserved.
Patients with cerebral haemorrhages need to have haematomas drained. Fresh blood may appear during the haematoma drainage process, so the process needs to be observed and fresh blood detected in real time. To solve this problem, this paper studies images produced during the haematoma drainage process and designs a framework for blood image feature selection, recognition and classification. First, because the colour differences in blood images are small, features in the general RGB colour space are not distinctive, so this study proposes an optimal colour channel selection method. The colour information extracted from the images is recombined into a 3 x 3 matrix, and the normalised 4-neighbourhood contrast and variance are calculated for quantitative comparison. The optimised colour channels are selected to overcome the problem of weak features caused by a single colour space. After that, the effective region in the image is cropped and transformed into the best colour channels. The first, second and third moments of the three best colour channels are extracted to form a nine-dimensional feature vector. K-means clustering is applied to the image feature vectors, outliers are removed, and the results are then passed to the hidden Markov model (HMM) and support vector machine (SVM) for classification. After the best colour channels are selected, the classification accuracy of HMM-SVM is greatly improved, and the proposed method compares favourably with other classification algorithms. Experiments show that the recognition accuracy of this method reaches 98.9%.
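The nine-dimensional feature described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the moments exactly, so the sketch assumes the common colour-moment convention (mean, standard deviation, and signed cube root of the third central moment per channel). The function name `colour_moments` and the sample pixels are hypothetical.

```python
import math

def colour_moments(pixels):
    """Compute a 9-dimensional colour-moment feature vector.

    pixels: list of (c1, c2, c3) tuples, one value per selected channel.
    Returns [mean, std, skew] for channel 1, then channel 2, then channel 3.
    """
    n = len(pixels)
    features = []
    for ch in range(3):
        values = [p[ch] for p in pixels]
        mean = sum(values) / n                              # first moment
        var = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(var)                                # second moment
        # Signed cube root keeps the third moment in the channel's units.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, std, skew])
    return features

# Three hypothetical pixels from the cropped region, already converted
# to the three selected channels.
features = colour_moments([(0, 10, 5), (2, 10, 5), (4, 10, 8)])
print(len(features), features)
```

Vectors like this one would then be clustered and classified; that stage is not sketched here because the abstract does not give enough detail about the HMM-SVM pipeline.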
In the present era of modern technology, efficient and accurate output is in demand. Based on a rigorous survey, Bharatkar and Patel (International Journal of Advanced Research in Computer Science 3(7):218-223, 2012) concluded that incorporating the Block Truncation Coding (BTC) approach into existing image classification algorithms can improve classification accuracy. Therefore, in the present study, an effort has been made to explore the Content Based Remote Sensing Image (CBRSI) classification algorithm with the BTC approach in order to enhance classification accuracy. The study reveals that the BTC-based maximum likelihood classifier gives better overall accuracy and kappa statistics.
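For readers unfamiliar with BTC, the sketch below shows the classic moment-preserving form of Block Truncation Coding on a single block of grey levels: the block is reduced to a one-bit map plus two reconstruction levels chosen so that the block mean and variance are preserved. How the study combines BTC with the maximum likelihood classifier is not stated in the abstract, so this illustrates only the coding step; the function names are hypothetical.

```python
import math

def btc_encode(block):
    """Encode one block (flat list of grey levels) as (bitmap, low, high)."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    # One bit per pixel: 1 if the pixel is at or above the block mean.
    bitmap = [1 if x >= mean else 0 for x in block]
    q = sum(bitmap)
    if q == m:
        # Flat block: every pixel equals the mean, so one level suffices.
        return bitmap, mean, mean
    # Levels chosen so the decoded block keeps the same mean and variance.
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return bitmap, low, high

def btc_decode(bitmap, low, high):
    """Rebuild the block from the bitmap and the two levels."""
    return [high if bit else low for bit in bitmap]

block = [1, 3, 9, 11]
bitmap, low, high = btc_encode(block)
decoded = btc_decode(bitmap, low, high)
print(bitmap, low, high, decoded)
```

In a classification setting, per-block quantities such as `low` and `high` are one plausible source of features, but that use is an assumption here rather than something the abstract confirms.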
Mould level fluctuation (MLF) is one of the main causes of surface defects in continuously cast slabs. In this study, large-scale mould level fluctuations are first categorized into three different cases based on actual plant data. A theoretical formulation is also investigated to better understand the underlying physics of the flow. Next, exploratory data analysis is used for a preliminary investigation of the phenomenon based on actual plant data. Finally, different classification algorithms are used to separate non-MLF cases from MLF cases in two scenarios: one in which all mould level fluctuation cases are considered, and another in which only a particular case of mould level fluctuation is considered. Classification algorithms such as recursive partitioning and random forest are used to identify the casting parameters that affect mould level fluctuation. 70% of the dataset is used for training and the remaining 30% for testing. The prediction accuracy of these classification algorithms, along with that of an ensemble model, is compared on a completely unseen test set. Ladle change operation and superheat temperature are identified as process parameters influencing large-scale mould level fluctuations.
Imbalance between positive and negative outcomes, so-called class imbalance, is a problem commonly found in medical data. Imbalanced data hinder the performance of conventional classification methods, which aim to improve the overall accuracy of the model without accounting for the uneven distribution of the classes. To rectify this, the data can be resampled by oversampling the positive (minority) class until the classes are approximately equally represented. After that, a prediction model such as a gradient boosting algorithm can be fitted with greater confidence. This classification method allows for non-linear relationships and deep interaction effects, while focusing on difficult areas by iteratively shifting towards problematic observations. In this study, we demonstrate the application of these methods to medical data and develop a practical framework for evaluating the features that contribute to the probability of stroke.
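The resampling step described above can be sketched as plain random oversampling with replacement. This is a minimal illustration, assuming binary labels; the function name `oversample_minority`, the `seed` parameter and the toy data are hypothetical, and the study may use a different resampling scheme.

```python
import random

def oversample_minority(rows, labels, minority=1, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size.

    Assumes binary labels: every label other than `minority` is the majority.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    if not minority_idx:
        raise ValueError("no minority-class rows to sample from")
    majority_count = len(labels) - len(minority_idx)
    # Number of extra minority rows needed to balance the classes.
    extra = max(0, majority_count - len(minority_idx))
    picks = [rng.choice(minority_idx) for _ in range(extra)]
    new_rows = list(rows) + [rows[i] for i in picks]
    new_labels = list(labels) + [minority] * extra
    return new_rows, new_labels

# Toy data: 8 negative and 2 positive cases.
rows = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Resampling is normally applied to the training portion only, so that duplicated minority cases do not leak into the data used to evaluate the fitted model.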
ISBN: 9781479975723 (print)
In this study, a classifier based on the Artificial Bee Colony (ABC) algorithm is used. In addition, some modifications are made to improve the effectiveness of the ABC algorithm; the new method is called the MABC algorithm. Both methods are applied to various real-life data sets such as IRIS, WINE, PIMA, BUPA and ECG, and the results are compared. These datasets were obtained from the UCI Machine Learning Repository and the MIT-BIH ECG database. In addition, validity indices and the effects of some control parameters, such as MCN and Limit, are examined. It is observed that the selected features have a significant effect on the classification success rate of the classifier. If there is high overlap between the classes, the success rate of the classifier decreases. However, the observed results indicate that the ABC algorithm can successfully be used for classification of multi-dimensional datasets. By means of the SCTR control parameter, the MABC-based classifier provides higher classification success rates than the ABC algorithm, independent of the Limit and MCN values.
ISBN: 9781479964109 (print)
This study aims to extract the set of affective variables most relevant to the level of user satisfaction with engine sounds, using classification algorithms. The affective variables for engine sounds were defined along three axes, and two classification algorithms were used to determine the prediction accuracy for those affective axes. The study consisted of three phases: 1) extracting sets of affective variables and the level of satisfaction with engine sounds, 2) preprocessing of engine sounds and experiment design, and 3) analysis of the sets of affective variables most relevant to user satisfaction. As a result, the PA (Powerful-Affective) variable set showed the highest prediction accuracy for user satisfaction compared with the other sets. Predicting the level of satisfaction with a classification algorithm could make it easier to generalize the relationship between user satisfaction and affective variables, beyond the limitation of a small number of subjects.
ISBN: 9781457714115 (print)
The paper describes an algorithm for the classification of digital modulations and its testing with disturbed signals. 2ASK, 2FSK, 4FSK, MSK, BPSK, QPSK, 8PSK and 16QAM were chosen for recognition as the best-known digital modulations used in modern communication technologies. The designed method uses ten features computed from parameters of the recognized signal, such as instantaneous amplitude, instantaneous phase, instantaneous frequency and spectrum characteristics. The GentleBoost algorithm was used to analyze the features and classify the modulations. To test the algorithm, we used a multipath fading channel to model signal propagation and disturbed the signal with white Gaussian noise.
ISBN: 9781665412414 (print)
As the world struggles against the COVID-19 pandemic, and unfortunately no definitive treatment has been discovered yet, preventing further transmission by isolating infected people has become an effective strategy to overcome this outbreak. That is why scaling up COVID-19 testing is strongly recommended. However, depending on when tests are performed, they may have a high rate of false-negative results. This inaccuracy of COVID-19 testing is a challenge for controlling the pandemic. Therefore, in this paper we propose a geometric classification algorithm that is fault-tolerant, so that it can handle the inaccuracy of tests. In a metropolis of n people, let w + r be the number of cases that are tested, where r is the number of positive cases, w is the number of negative COVID-19 cases, and k is an upper bound on the number of false-negative COVID-19 cases. The proposed algorithm takes 0(r " (log r log w) w3 w logfitrO) time for isolating all positive cases, together with at most k (according to the error rate of testing) possibly positive (false-negative) cases, from the rest of the people. The term h_R in the time complexity is the size of the convex hull of the set of positive cases, and obviously k ∈ O(w). For simplicity of this isolation, we consider a simple convex shape (a triangle) for this classification algorithm.