Sequential pattern mining is an important problem in mining continuous, fast, dynamic, and unbounded data streams. Recently proposed approximate mining algorithms consume too many system resources and capture only partial features of the stream. In this paper, we present a multi-level evolving sequential pattern mining model, ESPMM, which addresses this problem and captures nearly all of the stream's features. Furthermore, because sequential patterns have smaller support at each level, we propose a mining method, BMLA, based on Levenshtein automata, which builds a state-transition model to compute sequence similarity in linear time. The experimental results show that the model is effective and efficient.
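The abstract does not describe how BMLA constructs its automaton. As a rough illustration of the question such an automaton answers (is a sequence within edit distance k of a pattern?), the sketch below uses a banded dynamic program instead. This is a swapped-in technique, not a Levenshtein automaton and not the paper's method; the function name `within_edit_distance` and the threshold parameter `k` are assumptions made for this example.

```python
def within_edit_distance(a, b, k):
    """Return True if the Levenshtein distance between sequences a and b is <= k.

    Only cells within k of the main diagonal are evaluated; values are
    capped at k + 1, which stands for "already too far apart".
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return False  # the length difference alone exceeds the budget
    too_far = k + 1
    # prev[j] = capped distance between a[:i-1] and b[:j]
    prev = [j if j <= k else too_far for j in range(m + 1)]
    for i in range(1, n + 1):
        # Row allocation is kept simple here; only the band is filled in.
        cur = [too_far] * (m + 1)
        lo = max(0, i - k)
        hi = min(m, i + k)
        if lo == 0:
            cur[0] = i  # i <= k whenever column 0 is inside the band
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = prev[j - 1] + cost         # match or substitution
            best = min(best, prev[j] + 1)     # deletion from a
            best = min(best, cur[j - 1] + 1)  # insertion into a
            cur[j] = min(best, too_far)
        prev = cur
    return prev[m] <= k
```

The function accepts any indexable sequences, so event sequences stored as lists work as well as strings.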
ISBN: 9781424432004; 9780769531854 (print)
Recently there has been growing interest in applications of wireless sensor networks such as traffic tracking, environmental surveillance, and network monitoring. In these applications, exploring the relationship between sensing data and other data sources can be naturally expressed as an external join, in which the sensory tuples join with an external table at the base station. However, executing such join queries in a highly distributed and resource-constrained sensor network is a challenging task. In this paper, we propose a partition-based algorithm called NEJA (in-network external join algorithm) for external join processing in sensor networks. NEJA organizes the sensory data of the network through an optimized "value-to-storage" mapping, under which each storage point stores the tuples that belong to the same subrange of the join attribute. The subrange of each storage point is then further partitioned into unit ranges, and the tuples in each unit range choose the joining point that incurs the least communication cost, based on a cost metric computed from the latest historical statistics. NEJA also adopts optimization techniques to handle changes in the sensory data and uses approximate approaches to cut down the maintenance cost of the mechanism. The experimental results indicate that our scheme is effective in reducing the amount of transmission for real-time external join processing, especially when the external table is relatively large.
This paper focuses on the problem of how to select the optimal service among many Web services that all meet the functional requirements, and establishes an index system for Web service selection from four aspects, namely the supply side, the user, the product and *** on this, we collect the views of 30 experts using the Analytic Hierarchy Process (AHP) method and calculate the weight of each index at all levels based on the data collected from questionnaire *** the overall sample data analysis, we put two types of sample data, namely business operation experts and academics, for comparative *** Web services selection model proposed in this article can provide a reference for Web services managers when they select Web services, and also contributes to in-depth research on the adoption of Web services based information systems.
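The abstract does not give the pairwise comparison matrices collected from the experts. As a minimal sketch of how AHP turns one such matrix into index weights, the code below uses the row geometric mean method, a common approximation of the principal-eigenvector weights. The function name `ahp_weights` and the sample values are assumptions for illustration only, not data from the paper.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method.

    matrix[i][j] states how strongly criterion i is preferred over
    criterion j; a valid AHP matrix is positive and reciprocal.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent comparisons among four criteria,
# generated from made-up weights so the result is easy to check.
assumed = [0.4, 0.3, 0.2, 0.1]
comparisons = [[wi / wj for wj in assumed] for wi in assumed]
weights = ahp_weights(comparisons)
```

For a perfectly consistent matrix the method recovers the generating weights; with real questionnaire data the matrix is usually only approximately consistent, so a consistency check is normally applied as well.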
ISBN: 9780769532639 (print)
Frequent itemset mining is an important problem in data mining. Frequent closed itemset mining provides complete and condensed information for frequent pattern analysis and thus reduces memory cost without loss of accuracy. As stream applications become more common, more research focuses on stream mining. Because a stream is fast and unbounded, its data must be stored in limited memory, so reducing running time and memory usage is the main goal. In this paper, we propose CLIMB (closed itemset mining with bitmap), an improved frequent closed itemset mining method over a stream's sliding window that adds bitmap coding to the traditional stream mining algorithm CFI-stream. The distinct items are maintained in memory in lexicographic order, and each itemset is coded as a bit sequence following the order of the items; moreover, the bit sequence is split into sections that are recoded to reduce memory cost. The experimental results on real-life data show that the CLIMB algorithm is effective and efficient.
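The coding step described above can be sketched as follows: items are ordered lexicographically, each itemset becomes an integer bitmask, and subset tests reduce to a bitwise AND. This is only an illustration of the general idea; the abstract does not specify how CLIMB splits and recodes the bit sequence, so that step is omitted. All function names are assumptions made for this example.

```python
def build_index(transactions):
    """Map each distinct item to a bit position, in lexicographic order."""
    items = sorted({item for t in transactions for item in t})
    return {item: pos for pos, item in enumerate(items)}


def encode(itemset, index):
    """Code an itemset as an integer whose set bits mark the items present."""
    bits = 0
    for item in itemset:
        bits |= 1 << index[item]
    return bits


def is_subset(a_bits, b_bits):
    """True if every item coded in a_bits is also coded in b_bits."""
    return (a_bits & b_bits) == a_bits


def support(itemset_bits, window_bits):
    """Count the coded transactions in the window that contain the itemset."""
    return sum(1 for t in window_bits if is_subset(itemset_bits, t))


# Hypothetical window of three transactions.
window = [["a", "c"], ["b", "c"], ["a", "b", "c"]]
index = build_index(window)
coded_window = [encode(t, index) for t in window]
```

With this coding, the support of an itemset in the window is a single pass of AND operations over the coded transactions.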
In the research field of supply chain coordination, many coordination contracts have been well *** chain members still feel confused about which contract should be chosen for their specific needs and *** paper starts from the essential analysis of supply chain coordination, and summarizes four important affecting factors as well as the attributes in coordination, including market demand, competitors' relationship, supply chain structure, and decisions *** importantly this paper studies the related products' characteristics, such as storage life, customer loyalty, etc., which were seldom discussed in coordination before, and analyzes the influence in *** on this research, the chain members could analyze their specific product's characteristics and affecting factors, then choose the proper coordination contracts.
ISBN: 9781605581934 (print)
In this paper, we present a system called CRO (Chinese Review Observer) for online product review structurization. By structurization, we mean identifying, extracting, and summarizing information from unstructured review text into a structured table. The core tasks include review collection, extraction of product features and user opinions, and polarity analysis of opinions. Existing research in this area is mainly oriented toward English text. To deal with Chinese effectively, we propose several novel approaches for the core tasks. We then integrate these approaches and implement the whole review structurization procedure in the CRO system. Results on reviews of real products show that its performance is satisfactory.
Most traditional approaches to mining frequent itemsets target databases, so they can rely on secondary storage and multiple scans, which makes them unsuitable for stream mining. Some algorithms over a stream's sliding window have been presented recently; they perform addition and deletion over the stream independently, so when the window slides they use the common deletion strategy of removing the earliest transaction. This paper considers both operations together to reduce the computation cost and, consequently, proposes three deletion strategies that improve performance with little loss of precision. The experimental results show that these strategies are effective and efficient compared with the current method.
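The baseline that the abstract contrasts itself with, removing the earliest transaction whenever the window slides, can be sketched as below. It shows only that common strategy with single-item counts; the paper's three deletion strategies are not described in the abstract and are not reproduced here. The class name `SlidingWindowCounts` and its methods are assumptions made for this example.

```python
from collections import Counter, deque


class SlidingWindowCounts:
    """Item counts over the most recent `size` transactions of a stream."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        """Append a transaction; evict the earliest one if the window is full."""
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.size:
            oldest = self.window.popleft()
            self.counts.subtract(oldest)
            # Drop items whose count fell to zero so the table stays small.
            for item in oldest:
                if self.counts[item] == 0:
                    del self.counts[item]

    def frequent_items(self, min_count):
        """Return the items that occur in at least min_count window transactions."""
        return {item for item, c in self.counts.items() if c >= min_count}
```

Here addition and deletion are handled one after the other inside `add`, which is the independent treatment of the two operations that the abstract sets out to improve on.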
Compared with traditional magnetic disks, flash memory has many advantages and has been used as an external storage medium for a wide spectrum of electronic devices (such as PDAs, MP3 players, digital cameras, and mobile phones). As its capacity increases and its price drops, it looks like a perfect alternative to magnetic disks. However, due to the hardware limitations of flash memory, techniques originally designed for magnetic disks, including storage subsystems and indexing, cannot run smoothly on flash memory without modification. In this paper, we explore the problems of indexing flash-resident data and present a new dynamic hash index for flash memory in two schemas. The analysis and experimental results validate the efficiency of our design.
Nowadays more and more people publish their comments on products on the Web. Mining such unstructured data (product reviews) is an exciting, popular, and challenging research and application topic. In this paper, we focus on mining product reviews written in Chinese. We aim to extract structural information from Chinese product reviews. By structural information, we mean the product features and the corresponding opinion words expressed in each review text. Some work has already been done on reviews written in English, but less on reviews in Chinese. In this paper, we propose an effective method to extract candidate features and some effective rules to prune them. We also introduce a pattern extraction and matching step to improve our results. The experimental results show that our approach is effective and achieves good recall and precision.
In this paper, we discuss the energy-efficient multicast problem for discrete power levels in ad hoc wireless sensor networks. The problem is as follows: given n nodes, where each node v has l(v) transmission power levels, and a multicast request (s, D), find a multicast tree rooted at s and spanning all destinations in D such that the total energy cost of the multicast tree is minimized. This problem is NP-hard. We propose an algorithm, NWM_DST, with a theoretically guaranteed approximation ratio, and two efficient heuristics, MNJT and g-D-MIP, for the multicast tree problem. Simulation results show the efficiency of the proposed algorithms.