ISBN: 9781450355490 (print)
Temporal regularity in trajectory data is an important basis for traffic management, public services, and marketing. Although many efforts have been made to study temporal regularity, almost all existing works select the time granularity intuitively. User-specified time granularity and other parameters may lead to biased results. Moreover, as datasets grow, the cost of parameter tuning also increases. To solve these problems, we propose the Automatic Multi-granularity Temporal Regularity Detection algorithm (auto-MTRD) for trajectory data. Our approach clusters time series from the trajectory data using automatic parameter selection and generates a temporal regularity tree to indicate multi-granularity temporal regularity. It can not only avoid the negative effects of human intervention but also evaluate the relative importance of multiple time granularities at the same time. Two real-life datasets are used to validate the effectiveness of our method.
This work seeks to explore the current scenario of news related to the Olympic Games of Rio de Janeiro 2016, using the process of knowledge discovery in databases. With the increase of the news divulged in the scope o...
ISBN: 9783319534800; 9783319534794 (print)
This paper discusses the results of applied research on fishing activity, based on a monitoring system developed by a company and on fishing reports produced at the end of each fishing activity. Because economic interests are combined with fishing limitations, there is a natural tendency toward incorrect reporting. We apply data mining (DM) methodologies to find fishing patterns. Implemented in an SQL tool, these DM techniques make it possible to handle the high volume of this data set and to determine the major factors that influence fishing activity.
ISBN: 9783319520155; 9783319520148 (print)
In recent years, many new applications, such as location-based services, sensor monitoring systems, and data integration, have shown the growing importance of uncertain data mining. In addition, due to instrument errors, the imprecision of sensor monitoring systems, and so on, real-world data tend to be numerical data with inherent uncertainty. Thus, mining association rules from an uncertain, especially probabilistic, numerical dataset has been studied recently. However, a probabilistic numerical dataset often grows as new data are appended, so developing a mining algorithm that can incrementally maintain discovered information is quite important. In this paper, we design an efficient, incremental mining algorithm to mine association rules from a probabilistic numerical dataset using estimated-frequent uncertain-itemsets. Given a user-specified support threshold, estimated-frequent uncertain-itemsets act as a gap that prevents small itemsets from becoming large in the updated dataset when new transactions are inserted. As a result, the algorithm runs faster than previous methods. An illustrative example is given to demonstrate the procedures of the algorithm.
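The role of the gap between "large" and "small" itemsets can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that each transaction stores an existence probability per item, that support is measured as expected support (the average, over transactions, of the product of the item probabilities), and that two hypothetical thresholds, `upper` and `lower`, delimit the estimated-frequent band. The class name `IncrementalSupport` and all data are invented for illustration.

```python
from math import prod


class IncrementalSupport:
    """Keeps running expected-support sums so that a new batch of
    transactions can be absorbed without rescanning the old ones."""

    def __init__(self, itemsets):
        # itemsets are tuples of item names (hypothetical tracked candidates)
        self.sums = {itemset: 0.0 for itemset in itemsets}
        self.n = 0

    def add_batch(self, transactions):
        # Each transaction maps item -> existence probability (an assumption).
        for t in transactions:
            for itemset in self.sums:
                self.sums[itemset] += prod(t.get(x, 0.0) for x in itemset)
        self.n += len(transactions)

    def expected_support(self, itemset):
        return self.sums[itemset] / self.n

    def label(self, itemset, upper, lower):
        # Itemsets between the two thresholds form the buffer band.
        s = self.expected_support(itemset)
        if s >= upper:
            return "frequent"
        if s >= lower:
            return "estimated-frequent"
        return "infrequent"


tracker = IncrementalSupport([("a",), ("a", "b"), ("b", "c")])
tracker.add_batch([
    {"a": 0.9, "b": 0.8},
    {"a": 0.6, "c": 0.5},
    {"a": 0.8, "b": 0.4},
    {"b": 0.5, "c": 0.9},
])
before = {i: tracker.label(i, upper=0.5, lower=0.25) for i in tracker.sums}

# New transactions arrive; only the new batch is scanned.
tracker.add_batch([
    {"b": 1.0, "c": 1.0},
    {"b": 0.9, "c": 1.0},
])
after = {i: tracker.label(i, upper=0.5, lower=0.25) for i in tracker.sums}
print(before)
print(after)
```

Because the running sums are kept for every tracked itemset, an itemset can move between the three bands after an update without the original transactions being read again, which is the kind of saving an incremental method aims for.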
In today's world, the number of elderly cardiac patients is growing tremendously, which gives rise to the need for Point-of-Care (PoC) systems. Therefore, the Body Sensor Networks (BSNs) collect ECG signal from remote patien...
ISBN: 9781538606971 (print)
Water management and irrigation scheduling have become the main subjects of many studies in recent decades because of their strong influence on crop performance indicators. This study presents the most important parameters that have to be monitored in an irrigation management system: air moisture and temperature, soil air and moisture, and evapotranspiration. Based on the monitoring of these parameters, different control strategies and methods can be applied to optimize irrigation systems and improve their efficiency. The synthesis in this paper starts with classical control systems and continues with advanced methods such as fuzzy concepts, decision support systems, and model predictive control. Considering the current need for integration into the Cyber-Physical Systems (CPS) concept, the paper finally proposes an irrigation control system for vineyards. The SCADA architecture is suitable for the Romanian context: it allows flexibility and ease of use, and it reduces both energy consumption and irrigation operation costs.
ISBN: 9789811054273; 9789811054266 (print)
Multilevel association rule mining is one of the important data mining techniques for analyzing sales data. Multilevel association rules provide more detailed information than single-level association rules. In today's era of e-commerce and e-business, online marketing sites and social networking sites generate tremendous amounts of data in the form of sales, tweets, text mails, web usage, and more. The data generated from these sources is so large that processing and analyzing it with traditional approaches becomes a tedious task. This paper overcomes the drawback of single-node computing by distributing the task to a cluster of nodes. The performance of the system is analyzed using a reduced minimum support threshold at different levels of the concept hierarchy and by varying the database size. In this experiment, the transactional dataset is generated from a big sales dataset; then the distributed multilevel frequent pattern mining algorithm (DMFPM) is implemented to generate level-crossing frequent itemsets using the Hadoop MapReduce framework. The multilevel association rules are generated from the frequent itemsets. Hierarchically redundant rules affect the efficiency of the system, so hierarchical redundancy is removed. Finally, the time efficiency of the proposed algorithm is compared with that of the existing Multilevel Frequent Pattern Mining algorithm (MFPM).
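The idea of a reduced minimum support threshold at lower levels of a concept hierarchy can be sketched on a single machine. The sketch below is not DMFPM and does not use Hadoop: the two-level hierarchy `PARENT`, the transactions, and the thresholds are all hypothetical, and it only counts same-level item pairs rather than level-crossing itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-level concept hierarchy: leaf item -> category.
PARENT = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
}


def generalize(transaction, level):
    """Level 1 maps every item to its category; level 2 keeps leaf items."""
    if level == 1:
        return frozenset(PARENT[item] for item in transaction)
    return frozenset(transaction)


def frequent_itemsets(transactions, level, min_support, size):
    """Return {itemset: support} for itemsets of the given size whose
    relative support at the given level reaches min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(generalize(t, level))
        for combo in combinations(items, size):
            counts[combo] += 1
    n = len(transactions)
    return {c: k / n for c, k in counts.items() if k / n >= min_support}


transactions = [
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk", "wheat bread"},
    {"skim milk"},
    {"white bread", "wheat bread"},
]

# A high threshold at the category level, a reduced one at the leaf level.
level1 = frequent_itemsets(transactions, level=1, min_support=0.5, size=2)
level2 = frequent_itemsets(transactions, level=2, min_support=0.2, size=2)
print(level1)
print(level2)
```

With the same threshold at both levels, no leaf-level pair would survive in this toy data, which is why multilevel approaches typically lower the threshold as the items become more specific.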
ISBN: 9789811054273; 9789811054266 (print)
Data mining models are an excellent tool to help companies that live from the sale of the items they produce. Combining these models with Lean Production makes it easier to remove waste and optimize industrial production. This project is based on the phases of the CRISP-DM methodology. Several methods were applied to the data, namely average, mean and standard deviation, quartiles, and Sturges' rule. Classification techniques were used to understand which model has the best probability of predicting the correct result. After the tests were performed, model M1 performed best, reaching a classification accuracy of 99.52%.
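For reference, Sturges' rule suggests the number of histogram classes for n observations as k = ceil(log2 n) + 1. The short sketch below shows the rule together with equal-width class edges; the function names are hypothetical and the abstract does not say how the authors applied the rule to their data.

```python
import math


def sturges_bins(n):
    """Number of classes suggested by Sturges' rule for n observations."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(math.log2(n)) + 1


def equal_width_edges(values):
    """Equal-width class edges, using Sturges' rule for the class count."""
    k = sturges_bins(len(values))
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]


sample = list(range(100))  # illustrative data only
print(sturges_bins(len(sample)))
print(equal_width_edges(sample))
```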
ISBN: 9789811054273; 9789811054266 (print)
Data mining techniques are widely used to analyze large amounts of data. Classification is an important technique that classifies data in various real-world applications. This paper aims to compare the performance of classification algorithms on weather data using the Waikato Environment for Knowledge Analysis (WEKA). Performance analysis was done using the cross-fold validation and training-set methods. The best algorithm found was the J48 decision tree classifier, which had the highest accuracy and the lowest error compared with the others.
ISBN: 9783319534800; 9783319534794 (print)
Association rule mining is one of the most common data mining techniques used to identify and describe interesting relationships between patterns in large quantities of data. Whereas much research has focused on extracting patterns that appear frequently in order to obtain general information, in some scenarios it can also be interesting to extract unexpected phenomena. Rare association rule mining is a recent field that aims to discover sporadic rules whose items appear with low frequency but occur together with high confidence. This field is especially useful for Big Data, where abnormal behavior is more interesting than normal behavior. In this sense, our aim is to propose a new algorithm to obtain rare association rules on Big Data using MapReduce by means of Spark and Hadoop. The experimental study includes more than 30 datasets and reveals promising efficiency results when more than 60,000 million instances and file sizes of 500 GBytes are considered.
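The definition of a rare rule (low support, high confidence) can be made concrete with a small single-machine sketch. It is not the proposed Spark/Hadoop algorithm: it only enumerates item pairs by brute force, and the thresholds `max_support` and `min_confidence`, the function name, and the transactions are hypothetical.

```python
from itertools import combinations


def rare_rules(transactions, max_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for single-item rules whose
    support is positive but at most max_support and whose confidence is
    at least min_confidence."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        # Skip pairs that never co-occur and pairs that are too frequent.
        if pair_support == 0 or pair_support > max_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


transactions = [{"bread", "milk"} for _ in range(8)]
transactions.append({"caviar", "vodka"})
transactions.append({"caviar", "vodka", "bread"})

rules = rare_rules(transactions, max_support=0.3, min_confidence=0.9)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

A frequent-pattern miner with a typical minimum support would discard the caviar/vodka pair, whereas the upper bound on support used here keeps it and drops the dominant bread/milk pair instead.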