ISBN: 9781450365628 (print)
Breast cancer is a disease with high incidence and mortality worldwide. Data mining classification techniques can be effective tools for classifying cancer data to facilitate decision-making. The objective of this paper is to compare the performance of different machine learning algorithms in the diagnosis of breast cancer, that is, in determining whether a tumor is benign or malignant. Six machine learning algorithms were evaluated in this research: Bayes Network (BN), Support Vector Machine (SVM), k-nearest neighbors (KNN), Artificial Neural Network (ANN), Decision Tree (C4.5), and Logistic Regression. The algorithms are simulated using the WEKA tool (Waikato Environment for Knowledge Analysis) on the Wisconsin breast cancer dataset available in the UCI Machine Learning Repository.
ISBN: 9781728107882 (print)
Over the last few decades, sensor-based human activity recognition has gained considerable research attention due to its novel applications in health care, machine learning, human-computer interaction, etc. In this research, we investigate several data pre-processing methods to recognize static and dynamic activities more accurately from smartphone sensor data. The addition of magnitude and jerk-based features ensures that activities are recognized independently of the orientation of the smartphone in daily life. In addition, we propose a new feature, Average peak-trough-distance (APTD), which improves recognition accuracy. We also discuss a sliding window technique without overfitting, which uses observations of test data from previous windows with 50% overlap to detect activity in real time. Moreover, we evaluate the proposed method by comparing four classifiers. We obtain the best accuracy on the HASC2010corpus dataset using our proposed feature together with a multi-class Support Vector Machine, whose hyper-parameters are tuned with Grid Search Cross Validation. We also show that the proposed approach outperforms naive methods in detecting activity with smartphones in real-time applications.
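The pre-processing ideas named in this abstract (magnitude, jerk, and windows with 50% overlap) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of mean magnitude and mean absolute jerk as window features, and the sampling interval `dt` are assumptions made for illustration, and the APTD feature is omitted because the abstract does not define it.

```python
import math


def magnitude(samples):
    """Orientation-independent magnitude of each (x, y, z) sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def jerk(signal, dt):
    """First difference of a 1-D signal divided by the sampling interval dt."""
    return [(b - a) / dt for a, b in zip(signal, signal[1:])]


def sliding_windows(signal, size, overlap=0.5):
    """Yield fixed-size windows; consecutive windows share `overlap` of their samples."""
    step = max(1, int(size * (1 - overlap)))
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def window_features(window, dt):
    """Hypothetical feature set: mean magnitude and mean absolute jerk.

    The window must hold at least two samples so that jerk is defined.
    """
    mag = magnitude(window)
    jrk = jerk(mag, dt)
    return {
        "mean_magnitude": sum(mag) / len(mag),
        "mean_abs_jerk": sum(abs(j) for j in jrk) / len(jrk),
    }
```

Because the magnitude discards direction, rotating the phone leaves these features unchanged, which is the orientation independence the abstract refers to.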
This paper presents a general framework to automatically generate rules that produce given spatial patterns in complex systems. The proposed framework integrates Genetic Algorithms with Artificial Neural Networks or Support Vector Machines. Here, it is tested on a well-known 3-value, 6-neighbor, k-totalistic cellular automaton rule called the "burning paper" rule. Results are encouraging and should pave the way for the use of the proposed framework in real-life complex systems models.
ISBN: 9781450394079 (print)
Graphs, which encode pairwise relations between entities, are a kind of universal data structure for much real-world data, including social networks, transportation networks, and chemical molecules. Many important applications on these data can be treated as computational tasks on graphs. Recently, machine learning techniques have been widely developed and used to tame graphs effectively, discovering actionable patterns and harnessing them to advance various graph-related computational tasks. Huge success has been achieved, and numerous real-world applications have benefited from it. However, since data are now generated and gathered in a much faster and more diverse way, real-world graphs are becoming increasingly large-scale and complex. More dedicated efforts are needed to propose more advanced machine learning techniques and to deploy them properly for real-world applications in a scalable way. Thus, we organize the 3rd International Workshop on Machine Learning on Graphs (MLoG), held in conjunction with the 16th ACM Conference on Web Search and Data Mining (WSDM), which provides a venue for academic researchers and industry researchers and practitioners to present recent progress on machine learning on graphs.
Outlier explanation approaches are used to support analysts in investigating outliers, especially those detected by methods which are not intuitively interpretable such as deep learning or ensemble approaches. There h...
ISBN: 9781728107882 (print)
Designing robust features for human activity recognition (HAR) that perform well across a wide range of users is a hard task. Therefore, more attention is being given to feature learning techniques, which automatically learn features from raw data. In this paper, we present a comparison study of feature learning methods for HAR. Using accelerometer data, we compare four methods for feature learning from raw sensor data (PCA-based, clustering, matrix factorization, and LSTM networks) to the traditional hand-crafted feature extraction method. We focus on the performance degradation when each model is evaluated on a new user. According to our results, features learned with Principal Component Analysis are the most robust in the new-user scenario. Our results show the importance of evaluating on unseen users, since the performance difference compared to random-split testing is large.
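The contrast this abstract draws between unseen-user evaluation and random-split testing can be sketched as two ways of choosing sample indices. This is an illustrative assumption about how such splits are commonly built, not the authors' protocol; the function names, the 25% test fraction, and the seed are hypothetical.

```python
import random


def leave_one_user_out(user_ids):
    """Yield (held_out_user, train_idx, test_idx), holding out each user once."""
    for held_out in sorted(set(user_ids)):
        train = [i for i, u in enumerate(user_ids) if u != held_out]
        test = [i for i, u in enumerate(user_ids) if u == held_out]
        yield held_out, train, test


def random_split(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices and cut once; a user may land on both sides."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(n_samples * (1 - test_fraction))
    return idx[:cut], idx[cut:]
```

With a random split, samples from the same user usually appear in both the training and test sets, so the score does not reflect performance on a new user. Holding out a whole user at a time removes that overlap.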
ISBN: 9781728107882 (print)
Template matching algorithms are well suited for gesture recognition, but unlike other machine learning approaches, there are no established methods to optimize their parameters. We present WLCSSLearn, an optimization approach for the WarpingLCSS algorithm based on a genetic algorithm. We demonstrate that WLCSSLearn makes the optimization procedure automatic, fast, and suitable for new recognition problems even when there is no a priori knowledge about a suitable range of parameter values. We evaluate WLCSSLearn on three different gesture datasets and show that our method increases accuracy and F1 score by up to 20% compared to previous literature.
ISBN: 9798350372977; 9798350372984 (print)
This research paper presents a brief review of ten popular unsupervised algorithms widely used in pattern recognition publications. The algorithms are assessed based on their popularity, strengths, limitations, and resource requirements. Considering these factors, we propose the two most-preferred algorithms for adoption in Intrusion Detection Systems (IDS) to address the problems associated with zero-day exploits or attacks. Our review of the surveyed algorithms led to the recommendation of specific algorithms that can enhance IDS capabilities in detecting and mitigating zero-day attacks and anomalous intrusion attempts. These algorithms leverage unsupervised learning techniques to overcome the limitations of traditional signature-based approaches. By incorporating these algorithms, IDS can better handle sophisticated and evolving attacks that often evade detection. In conclusion, this research provides insights into the strengths, limitations, and resource requirements of popular unsupervised algorithms used in pattern recognition. It highlights the potential of adopting these algorithms in IDS to bolster their ability to detect and respond to zero-day attacks. By recommending the integration of these algorithms, we contribute to the development of intelligent IDS solutions that can adapt to dynamic threat landscapes.
ISBN: 9783642111631 (print)
In this paper, we present a novel algorithm for learning oblique decision trees. Most current decision tree algorithms rely on impurity measures to assess the goodness of hyperplanes at each node. These impurity measures do not properly capture the geometric structure in the data. Motivated by this, our algorithm uses a strategy, based on some recent variants of SVM, to assess the hyperplanes in such a way that the geometric structure in the data is taken into account. We show through empirical studies that our method is effective.
ISBN: 0780388232 (print)
This paper discusses a new approach for developing a service-oriented infrastructure for distributed data mining applications. The proposed architecture hides the complexity of implementation details and enables users to perform data mining in a utility-like fashion. The service-oriented architecture provides an autonomic data mining framework where self-describing data mining services can be automatically discovered on the Internet. Moreover, this structure allows for the implementation of data mining algorithms for processing data on more than one site in a distributed manner. The performance of the proposed distributed data mining framework is compared to a standard data mining approach to demonstrate its effectiveness.