Search Results - Inner Mongolia University Library

A New Algorithm of Mining High Utility Sequential Pattern in Streaming Data

INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, Vol. 12, No. 1, pp. 342-350

Authors: Tang, Huijun; Liu, Yangguang; Wang, Le. Affiliation: Ningbo Dahongying Univ, Sch Informat Engn, 899 XueYuan Rd, Ningbo 315175, Zhejiang, Peoples R China

High utility sequential pattern (HUSP) mining has emerged as a novel topic in data mining; its computational complexity is higher than that of frequent sequence mining and high utility itemset mining. A number of algorithms have been proposed to solve this problem, but they mainly focus on mining HUSPs in static databases and do not take streaming data into account, where unbounded data arrive continuously and often at high speed. The efficiency of mining algorithms is still the main research topic in this field. In view of this, this paper proposes an efficient HUSP mining algorithm named HUSP-UT (Utility on Tail Tree), based on a tree structure over data streams. Substantial experiments on real datasets show that HUSP-UT identifies high utility sequences efficiently. Compared with the state-of-the-art algorithm HUSP-Stream (HUSP mining over data streams) in our experiments, the proposed HUSP-UT outperformed its counterpart significantly, especially in time efficiency, where it was up to one order of magnitude faster on some datasets. (c) 2019 The Authors. Published by Atlantis Press SARL.

Keywords: high utility sequential pattern; Data streaming; Sliding windows; Tree structure; Header table
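
The sliding-window setting named in the abstract and keywords above can be illustrated with a minimal sketch. It is a generic sliding window that keeps per-item utility totals up to date as sequences enter and leave; it is not HUSP-UT and does not reproduce its tail tree or header table. The class name `SlidingWindow`, the item names, and the utility values are all hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Dict, List

# A quantitative sequence: ordered itemsets, each mapping item -> utility.
QSequence = List[Dict[str, int]]


class SlidingWindow:
    """Keeps the most recent `size` sequences and their per-item utility totals."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.window: Deque[QSequence] = deque()
        self.item_utility: Counter = Counter()

    def _apply(self, qseq: QSequence, sign: int) -> None:
        # Add (sign=+1) or subtract (sign=-1) every item utility of one sequence.
        for itemset in qseq:
            for item, util in itemset.items():
                self.item_utility[item] += sign * util

    def push(self, qseq: QSequence) -> None:
        # Evict the oldest sequence first when the window is already full.
        if len(self.window) == self.size:
            self._apply(self.window.popleft(), -1)
        self.window.append(qseq)
        self._apply(qseq, +1)


# Hypothetical stream of three sequences, window of size 2.
w = SlidingWindow(size=2)
w.push([{"a": 2}, {"b": 3}])
w.push([{"a": 1, "c": 4}])
w.push([{"b": 5}])  # the first sequence is evicted here
```

Updating the totals incrementally on arrival and eviction avoids rescanning the whole window for every new sequence, which is the general motivation for stream-oriented structures.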

HUPSP-LAL: Efficiently mining utility-driven sequential patterns in uncertain sequences

EXPERT SYSTEMS WITH APPLICATIONS, 2025, Vol. 270

Authors: Li, Gufeng; Xiang, Jiawei; Fang, Weiyi; Wang, Jialong; Shang, Tao. Affiliations: Xidian Univ, Hangzhou Inst Technol, Hangzhou 311231, Zhejiang, Peoples R China; Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Shaanxi, Peoples R China

Data mining encompasses various subfields, among which an important branch is high utility itemset mining. Within this domain, high utility sequential pattern mining (HUSPM), which identifies high utility sequential patterns (HUSPs) within databases, is an emerging field of interest. In practice, HUSPM has applications in many fields, including DNA sequence analysis and network intrusion detection. However, most HUSPM algorithms assume that the data in the database are accurate, which is not consistent with real-world conditions. Data uncertainty inevitably arises from the collection process, which involves sensors of varying precision. Although methods for high utility probability sequential pattern mining (HUPSPM) over uncertain sequences have been proposed, their performance is unsatisfactory when dealing with a low utility/probability threshold or large-scale datasets. Therefore, we propose an efficient HUPSPM algorithm called HUPSP-LAL. We propose a new probability calculation framework to mathematically represent the collected uncertain data. We design a compact structure, PUL - IA - EL, which HUPSP-LAL uses for projection to accelerate the calculation of the utility, probability, and upper bounds of the candidates. This paper introduces two probability-based pruning strategies, complemented by two additional utility-based pruning strategies, all aimed at reducing the search space. The experimental findings on real datasets indicate that HUPSP-LAL significantly outperforms the leading algorithms in terms of patterns, runtime, candidates, and memory consumption.

Keywords: Data mining; Uncertain data; high utility sequential pattern; Database projection

A fast algorithm for hiding high utility sequential patterns

IEEE Int Conf on Parallel and Distributed Processing with Applications, Big Data and Cloud Computing, Sustainable Computing and Communications, Social Computing and Networking (ISPA/BDCloud/SocialCom/SustainCom)

Authors: Zhang, Chunkai; Zu, Yiwen; Nie, Junli; Du, Linzi; Du, Jingqi; Hong, Siyuan; Wu, Wenping. Affiliations: Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen, Guangdong, Peoples R China; Yunnan Elect Power Res Inst Grp Co Ltd, Kunming, Yunnan, Peoples R China; China Elect Cyberspace Great Wall Co Ltd, Beijing, Peoples R China

ISBN: (Print) 9781728143286

High utility sequential patterns (HUSPs) are common patterns that can be discovered from the data collected in many domains (e.g., retail, bioinformatics, mobile commerce). To extract these patterns, high utility sequential pattern mining (HUSPM) has been proposed in the recent decade. Although HUSPM algorithms provide a special perspective for analyzing the knowledge behind the collected data, they also raise the risk of privacy leakage and underlying security issues. This has led to the emergence of high utility sequential pattern hiding (HUSPH), whose purpose is to hide all HUSPs in the sequence database under a specified threshold. Many algorithms have been proposed on this topic. However, the existing algorithms are very time-consuming, which makes them unable to process massive real-world data quickly. In this paper, we propose an efficient algorithm named FH-HUSP (fast algorithm for hiding high utility sequential patterns) for HUSPH. Substantial experimental results show that the proposed algorithm can hide all high utility sequential patterns quickly under the specified minimum utility with relatively small modifications.

Keywords: data mining; privacy preserving data mining; high utility sequential pattern

Two efficient algorithms for mining high utility sequential patterns 17

Authors: Zhang, Chunkai; Zu, Yiwen; Nie, Junli; Du, Linzi. Affiliation: Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen, Peoples R China

ISBN: (Print) 9781728143286

High utility sequential pattern mining (HUSPM) is an emerging topic in data mining. Compared with earlier topics (sequential pattern mining and high utility itemset mining), HUSPM can provide more applicable knowledge, because it jointly considers utility, which indicates business value, and sequential order, which indicates the causality of different items. However, the combination of utility and sequential order brings dramatic challenges and makes HUSPM more difficult than the earlier problems. In this paper, we propose two efficient algorithms, HUS-UT and HUS-Par, for HUSPM. The proposed HUS-UT algorithm adopts a novel data structure named Utility-Table to facilitate utility calculation, so it can find the desired patterns quickly. The HUS-Par algorithm is a parallel version of HUS-UT based on the thread model, which also exploits two balance strategies to improve efficiency. We also conduct substantial experiments to evaluate the performance of our algorithms. The experimental results show that our algorithms are much faster than the state-of-the-art algorithms.

Keywords: high utility sequential pattern mining; high utility sequential pattern; Data mining

Mining Regular High Utility Sequential Patterns in Static and Dynamic Databases

13th International Conference on Ubiquitous Information Management and Communication (IMCOM)

Authors: Ishita, Sabrina Zaman; Ahmed, Chowdhury Farhan; Leung, Carson K.; Hoi, Calvin H. S. Affiliations: Univ Dhaka, Dhaka, Bangladesh; Univ Manitoba, Winnipeg, MB, Canada

ISBN: (Print) 9783030190637; 9783030190620

Regular pattern mining has emerged as one of the important sub-domains of data mining, with numerous applications. Although patterns that occur at a regular interval throughout the whole database can lead to interesting knowledge, examining the utility values of these patterns can unveil more interesting and useful information. In a sequence database, the task of mining regular high utility patterns can be more challenging. In this paper, we first propose a new algorithm for mining regular high utility sequential patterns from static databases. Because handling the incremental nature of big data brings useful results in many applications, we then extend our algorithm to mine regular high utility sequential patterns from dynamic databases. Evaluation results on several real-life datasets show the effectiveness of our two algorithms.

Keywords: Data mining; Regular pattern mining; high utility sequential pattern; Incremental mining; Information management; Information processing management
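
The idea of a pattern that is both regular and high utility, as described in the abstract above, can be sketched as follows. The abstract does not define regularity, so this sketch assumes one common convention: regularity is the largest gap between consecutive occurrences, counting the gaps to the start and end of the database. The function names, thresholds, and sample values are hypothetical and are not taken from the paper.

```python
from typing import List


def regularity(occurrence_ids: List[int], db_size: int) -> int:
    """Largest gap between consecutive occurrences, including both database boundaries.

    Assumes sequence IDs run from 1 to db_size.
    """
    points = [0] + sorted(occurrence_ids) + [db_size]
    return max(b - a for a, b in zip(points, points[1:]))


def is_regular_high_utility(
    occurrence_ids: List[int],
    utilities: List[int],
    db_size: int,
    max_gap: int,
    min_utility: int,
) -> bool:
    """True when the pattern recurs within `max_gap` and its total utility reaches `min_utility`."""
    return (
        regularity(occurrence_ids, db_size) <= max_gap
        and sum(utilities) >= min_utility
    )


# Hypothetical pattern occurring in sequences 2, 4, 7 and 9 of a 10-sequence database.
ids = [2, 4, 7, 9]
utils = [5, 3, 6, 4]
gap = regularity(ids, db_size=10)
```

Under these assumptions a pattern must pass both checks, so a high utility pattern that appears only in one part of the database is rejected.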

On efficiently mining high utility sequential patterns

KNOWLEDGE AND INFORMATION SYSTEMS, 2016, Vol. 49, No. 2, pp. 597-627

Authors: Wang, Jun-Zhe; Huang, Jiun-Long; Chen, Yi-Cheng. Affiliations: Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan; Tamkang Univ, Dept Comp Sci & Informat Engn, New Taipei, Taiwan

High utility sequential pattern mining is an emerging topic in pattern mining, which refers to identifying sequences with high utilities (e.g., profits) but probably with low frequencies. Because this problem lacks the downward closure property, most existing algorithms first generate candidate sequences with high sequence-weighted utilities (SWUs), where the SWU is an upper bound on the utilities of a sequence and all its supersequences, and then calculate the actual utilities of these candidates. This produces a large number of candidates, since the SWU is usually much larger than the real utilities of a sequence and all its supersequences. In view of this, we propose two tight utility upper bounds, prefix extension utility and reduced sequence utility, as well as two companion pruning strategies, and devise the HUS-Span algorithm to identify high utility sequential patterns by employing these two pruning strategies. In addition, since setting a proper utility threshold is usually difficult for users, we also propose the algorithm TKHUS-Span to identify top-k high utility sequential patterns by using these two pruning strategies. Three searching strategies, guided depth-first search (GDFS), best-first search (BFS), and a hybrid search of BFS and GDFS, are also proposed to improve the efficiency of TKHUS-Span. Experimental results on some real and synthetic datasets show that HUS-Span and TKHUS-Span with strategy BFS generate fewer candidate sequences and thus outperform other prior algorithms in terms of mining efficiency.

Keywords: high utility sequential pattern; high utility sequential pattern mining; Top-k high utility sequential pattern; utility mining
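
The gap between a pattern's actual utility and its sequence-weighted utility (SWU), which the abstract above identifies as the source of excess candidates, can be illustrated with a small brute-force sketch. This is not HUS-Span; it only evaluates the definitions the abstract builds on, assuming that a pattern's utility in a sequence is the maximum over its occurrences and that the SWU is the summed total utility of every sequence containing the pattern. The database `DB`, the item names, the utility values, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Set

QSequence = List[Dict[str, int]]  # ordered itemsets, each mapping item -> utility
Pattern = List[Set[str]]          # ordered itemsets without utilities


def max_utility(pattern: Pattern, qseq: QSequence) -> Optional[int]:
    """Maximum utility of `pattern` over all its occurrences in `qseq`, or None if absent."""

    def best(p: int, start: int) -> Optional[int]:
        if p == len(pattern):
            return 0
        result: Optional[int] = None
        for pos in range(start, len(qseq)):
            itemset = qseq[pos]
            if pattern[p].issubset(itemset):
                rest = best(p + 1, pos + 1)
                if rest is not None:
                    total = sum(itemset[i] for i in pattern[p]) + rest
                    result = total if result is None else max(result, total)
        return result

    return best(0, 0)


def utility(pattern: Pattern, db: List[QSequence]) -> int:
    """Actual utility: sum of the per-sequence maximum utilities."""
    found = [max_utility(pattern, q) for q in db]
    return sum(u for u in found if u is not None)


def swu(pattern: Pattern, db: List[QSequence]) -> int:
    """SWU: total utility of every sequence that contains the pattern."""
    return sum(
        sum(sum(itemset.values()) for itemset in q)
        for q in db
        if max_utility(pattern, q) is not None
    )


DB: List[QSequence] = [
    [{"a": 2, "b": 6}, {"c": 3}, {"a": 4, "c": 1}],  # total utility 16
    [{"b": 3}, {"a": 1}, {"c": 5}],                  # total utility 9
    [{"c": 2}, {"b": 4}],                            # total utility 6
]
pattern: Pattern = [{"a"}, {"c"}]

print(utility(pattern, DB))  # 11
print(swu(pattern, DB))      # 25
```

In this toy database the SWU (25) is more than twice the actual utility (11), so a threshold between the two values would keep the pattern as a candidate even though it is not a high utility pattern. Tighter upper bounds aim to shrink that gap.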
