ISBN (print): 3540287574
Hierarchical hidden Markov models (HHMMs) can be used for time series segmentation. However, it is difficult to obtain a desirable segmentation result because learning for HHMMs is unsupervised. In this paper, we present a semisupervised learning algorithm for HHMMs. It is semisupervised in the sense that the supervisor teaches segmentation boundaries but not segment labels. The learning performance of the proposed algorithm is demonstrated through an experiment using music data.
ISBN (print): 9781665417907
As the market economy becomes more diversified, competition among enterprises is increasing. As a technological product of the information age, data mining technology is applied to enterprise supply chain management to help improve management efficiency and enhance the core competitiveness of enterprises. Taking a tobacco company as an example, this article briefly explains the basic principles of data mining technology as applied to supply chain management, outlines a supply chain management process based on data mining, and analyzes and explores partner selection based on fuzzy data mining. It aims to provide useful references for applying data mining technology more effectively in enterprise supply chain management.
ISBN (print): 9781538678084
Nowadays, people face a variety of diseases owing to environmental conditions and their living habits, so predicting disease at an early stage has become an important task. Accurate prediction on the basis of symptoms alone, however, is difficult for doctors and remains one of the most challenging tasks. Data mining plays an important role in addressing this problem. The volume of medical data grows considerably every year, and accurate analysis of this growing body of medical and healthcare data benefits early patient care. Using disease data, data mining finds hidden patterns in the huge amount of medical data. We propose a general disease prediction system based on the patient's symptoms. For prediction, we use the K-Nearest Neighbor (KNN) and convolutional neural network (CNN) machine learning algorithms, which require a dataset of disease symptoms. The system also considers the person's living habits and checkup information to improve accuracy. The accuracy of general disease prediction using CNN is 84.5%, which is higher than that of the KNN algorithm, and KNN also requires more time and memory than CNN. After predicting the general disease, the system reports the associated risk, indicating whether the risk of the disease is lower or higher.
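As a rough illustration of the KNN side of this comparison, the sketch below classifies a binary symptom vector by majority vote among its nearest training records. The symptom columns, the labels, the Hamming distance, and the value of k are all assumptions made for illustration; the abstract does not describe the dataset or the distance measure used.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions where two binary symptom vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training records nearest to query."""
    nearest = sorted(train, key=lambda rec: hamming(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy records: four yes/no symptom flags -> placeholder label.
TRAIN = [
    ((1, 1, 0, 0), "disease_a"),
    ((1, 1, 0, 1), "disease_a"),
    ((0, 1, 0, 0), "disease_b"),
    ((0, 1, 0, 1), "disease_b"),
    ((1, 0, 1, 0), "disease_c"),
    ((1, 0, 1, 1), "disease_c"),
]

prediction = knn_predict(TRAIN, (1, 1, 0, 0), k=3)
print(prediction)
```

Because KNN keeps every training record and scans all of them for each query, its time and memory costs grow with the dataset, which is consistent with the higher resource use reported for KNN above.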
Lung cancer is one of the most fatal types of cancer; early and accurate diagnosis can therefore greatly improve patients’ quality of life, as well as the survival rate. In this research, the progress and th...
ISBN (print): 9781538642382
The availability of huge remote sensing image datasets calls for powerful content-based image retrieval techniques for archiving and mining. This paper proposes descriptors based on SIFT (scale-invariant feature transform) combined with linear SVM (support vector machine) classification. Building a powerful image classifier from very little training data usually requires image augmentation to boost classification performance. For this reason, data augmentation is used to enlarge the training set for the SVM classifier. The training data are created using several augmentation techniques, including anisotropic filtering. We report a first evaluation of the CBIR (content-based image retrieval) system; a second evaluation compares deep learning with the boosted SVM classification.
This paper analyses the main research methods of power text mining technology in detail, and discusses the research hotspots of power text named entity recognition and named entity relationship extraction based on mac...
ISBN (print): 3540405046
Combining classifiers is a powerful way to improve accuracy: the predictions of multiple models are merged into one. Many practical and useful combination techniques work by using the output of several classifiers as the input of a second-layer classifier. The problem with this and other multi-classifier approaches is that huge amounts of memory are required to store the set of classifiers and, more importantly, the comprehensibility of a single classifier is lost, so no knowledge or insight can be acquired from the model. To overcome these limitations, in this work we analyse the idea of "mimicking" the semantics of an ensemble of classifiers. More precisely, we use the combination of classifiers to label an invented random dataset, and then use this artificially labelled dataset to re-train one single model. This model has the following advantages: it behaves very similarly to the highly accurate combined model; as a single solution, it requires far less memory; no additional validation set needs to be reserved for the procedure; and, more importantly, the resulting model is expressed as a single classifier in terms of the original attributes and can therefore be comprehensible. First, we illustrate this methodology using a popular data-mining package, showing that it can spread into common practice, and then we use our system SMILES, which automates the process and takes advantage of its ensemble method.
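The mimicking procedure described above can be sketched in a few lines. This is a toy illustration only, not the SMILES system: the ensemble here is a hypothetical majority vote of three fixed threshold rules on one attribute, and the single re-trained model is a one-threshold rule fitted by exhaustive search. All names and constants are assumptions.

```python
import random

# Hypothetical ensemble: three threshold rules on one attribute,
# combined by majority vote (a stand-in for trained classifiers).
THRESHOLDS = (0.3, 0.5, 0.7)


def ensemble_predict(x):
    """Return 1 if at least two of the three rules vote for class 1."""
    votes = sum(x > t for t in THRESHOLDS)
    return int(votes >= 2)


def fit_stump(xs, ys):
    """Fit the rule 'predict 1 if x > t' by choosing the cut point t
    (midpoint between sorted neighbours) with the fewest training errors."""
    pts = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]

    def errors(t):
        return sum(int(x > t) != y for x, y in zip(xs, ys))

    return min(candidates, key=errors)


rng = random.Random(0)

# Step 1: invent a random, unlabelled dataset.
invented = [rng.random() for _ in range(500)]
# Step 2: label it with the ensemble.
labels = [ensemble_predict(x) for x in invented]
# Step 3: re-train one single, readable model on the artificial labels.
threshold = fit_stump(invented, labels)

# Fidelity: how often the single model agrees with the ensemble on fresh data.
fresh = [rng.random() for _ in range(1000)]
fidelity = sum(int(x > threshold) == ensemble_predict(x) for x in fresh) / len(fresh)
print(f"learned threshold={threshold:.3f}, fidelity={fidelity:.3f}")
```

No held-out labelled data is used at any step: the invented points are labelled by the ensemble itself, which is why the procedure needs no extra validation set.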
ISBN (print): 9781665428644
Human activity recognition (HAR) is the problem of predicting an individual's actions from evidence of their movements captured by sensors such as accelerometers and gyroscopes. It plays a major role in diverse sectors such as personal biometric signatures, daily life monitoring, anti-terrorism and anti-crime security, medical applications, and so on. Today's smartphones are equipped with powerful processors and built-in sensors, which opens up a new arena for data mining. This paper presents an analysis of HAR based on data collected via the accelerometer sensors of smartphones. It further illustrates the use of time-domain features extracted with an overlapping windowing approach, using a window size of 250 ms and an overlap of 25%. Numerous machine learning classifiers were evaluated: k-nearest neighbors, linear discriminant analysis, bagging classifier, gradient boosting classifier, decision tree, random forest, and support vector machine with three different kernels. The results show that random forest with 5-fold cross-validation achieves the highest accuracy (92.71%) in recognizing human activities.
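The overlapping-window step can be sketched as follows. The 250 ms window and 25% overlap come from the abstract; the 80 Hz sampling rate, the synthetic sine signal, and the particular feature set (mean, standard deviation, min, max, RMS) are assumptions chosen for illustration, since the abstract does not list them.

```python
import math
from statistics import mean, pstdev


def sliding_windows(samples, window_len, overlap_fraction):
    """Yield fixed-length windows; consecutive windows share
    overlap_fraction of their samples."""
    step = window_len - int(window_len * overlap_fraction)
    if step <= 0:
        raise ValueError("overlap_fraction must be below 1")
    for start in range(0, len(samples) - window_len + 1, step):
        yield samples[start:start + window_len]


def time_domain_features(window):
    """Compute a few common time-domain statistics for one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "min": min(window),
        "max": max(window),
        "rms": math.sqrt(sum(v * v for v in window) / len(window)),
    }


SAMPLE_RATE_HZ = 80                        # assumed sampling rate
WINDOW_LEN = int(SAMPLE_RATE_HZ * 0.250)   # 250 ms -> 20 samples at 80 Hz
OVERLAP = 0.25                             # 25% overlap, as in the abstract

# Synthetic single-axis accelerometer trace: 2 seconds of a 2 Hz sine wave.
signal = [math.sin(2 * math.pi * 2 * t / SAMPLE_RATE_HZ) for t in range(160)]

windows = list(sliding_windows(signal, WINDOW_LEN, OVERLAP))
features = [time_domain_features(w) for w in windows]
print(len(windows), "windows of", WINDOW_LEN, "samples")
```

Each feature dictionary would become one row of the table fed to a classifier; with three accelerometer axes the same extraction would be repeated per axis.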
A variety of techniques from statistics, signal processing, pattern recognition, machine learning, and neural networks have been proposed to understand data by discovering useful categories. However, research in data ...
ISBN (print): 9781538642382
Intrusion detection systems play a crucial role in an era where networks reach almost every sector. Unfortunately, intrusion detection systems are far from perfect, so researchers have never stopped working to improve them. In this context, data mining techniques have been heavily exploited for intrusion detection. In this paper, we present a comparative study of data mining techniques for intrusion detection. Specifically, we study the overall performance of these methods as well as the impact of training data size on their results. As a benchmark for our experiments we use ISCX2012, a realistic dataset that represents, to a certain extent, today's network traffic. The study shows that relatively old methods outperform some of the techniques currently in wide use by the community. Regarding the impact of training dataset size, the investigated methods react differently from each other when more data are added to the training dataset. In addition, the results highlight the importance of attack traffic in the training dataset. Moreover, they strongly suggest the use of Random Forest for intrusion detection, because its performance scales linearly with the size of the training dataset.