Search Results - Inner Mongolia University Library

Scalable Mining of High-Utility Sequential Patterns With Three-Tier MapReduce Model

ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, Vol. 16, No. 3, pp. 60-60

Authors: Lin, Jerry Chun-Wei; Djenouri, Youcef; Srivastava, Gautam; Li, Yuanfa; Yu, Philip S. Affiliations: Western Norway Univ Appl Sci, Inndalsveien 28, N-5063 Bergen, Norway; SINTEF, SINTEF Digital, Forskningsveien 1, N-0373 Oslo, Norway; Brandon Univ, 270 18th St, Brandon, MB R7A 6A9, Canada; China Med Univ, Taichung, Taiwan; Harbin Inst Technol Shenzhen, HIT Campus, Univ Town Shenzhen, Shenzhen 518055, Peoples R China; Univ Illinois, 1200 West Harrison St, Chicago, IL 60607, USA

High-utility sequential pattern mining (HUSPM) has been a hot research topic in recent decades because it combines sequential and utility properties, revealing more information and knowledge than traditional frequent itemset mining or sequential pattern mining. Several HUSPM methods have been presented, but most of them rely on main memory to speed up mining. This assumption is unrealistic in large-scale environments: in real industrial settings, the collected data is too large to fit into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns from large-scale databases. Two properties are then developed to ensure the correctness and completeness of the patterns discovered by the framework. In addition, two data structures, called sidset and utility-linked list, are used in the framework to accelerate the computation for mining the required patterns. The results show that the designed model performs well on large-scale datasets in terms of runtime, memory, efficiency with respect to the number of distributed nodes, and scalability, compared to the serial HUSP-Span approach.

Keywords: high-utility sequential pattern mining; MapReduce; large-scale; parallel and distributed

Multi-core parallel algorithms for hiding high-utility sequential patterns

KNOWLEDGE-BASED SYSTEMS, 2022, Vol. 237, pp. 107793-107793

Authors: Ut Huynh; Bac Le; Duy-Tai Dinh; Fujita, Hamido. Affiliations: Univ Sci, Fac Informat Technol, Dept Comp Sci, Ho Chi Minh City, Vietnam; Vietnam Natl Univ, Ho Chi Minh City, Vietnam; Japan Adv Inst Sci & Technol, Nomi, Ishikawa 9231292, Japan; Iwate Prefectural Univ, Reg Res Ctr, Takizawa, Iwate 0200693, Japan

High-utility sequential pattern mining (HUSPM) can be applied in many domains such as retail, market basket analysis, click-stream analysis, healthcare data analysis, and bioinformatics. HUSPM algorithms discover useful information from data. On the downside, sensitive patterns can also be exposed to competitors who run a HUSPM algorithm on leaked data. High-utility sequential pattern hiding (HUSPH) is therefore used to protect private information from HUSPM algorithms. This paper proposes three algorithms for hiding high-utility sequential patterns in quantitative sequence datasets: High Utility Sequential Pattern Hiding Using Pure Array Structure (USHPA), High Utility Sequential Pattern Hiding Using Parallel Strategy (USHP), and High Utility Sequential Pattern Hiding Using Random Distribution Strategy (USHR). These algorithms use a proposed data structure named Pattern Utility Set for Hiding (PUSH) to speed up the hiding process. We also introduce a metric called Privacy Factor to evaluate the quality of hiding results. Comparative experiments were conducted on real datasets to evaluate the performance of the proposed algorithms in terms of runtime, memory consumption, scalability, missing cost, and privacy factor. Results show that the proposed algorithms can efficiently sanitize the input datasets, and that they outperform the compared algorithms on all metrics. (C) 2021 Elsevier B.V. All rights reserved.

Keywords: high-utility sequential pattern mining; high-utility sequential pattern hiding; Parallel hiding; Privacy preserving utility mining

P-FCloHUS: A Parallel Approach for Mining Frequent Closed High-Utility Sequences on Multi-core Processors

14th Asian Conference on Intelligent Information and Database Systems (ACIIDS)

Authors: Hong-Phat Nguyen; Bac Le. Affiliations: Univ Sci, Fac Informat Technol, Ho Chi Minh City, Vietnam; Vietnam Natl Univ, Ho Chi Minh City, Vietnam

ISBN: (print) 9789811982330; 9789811982347

Frequent closed high-utility (FCHU) sequences are preferable to frequent closed sequences. Not only does their utility-based nature contribute considerably to decisive business actions, but FCHU sequences also preserve the information needed to reconstruct frequent high-utility sequences. Despite their vital role, mining FCHU sequences is time-consuming on large-scale datasets, especially when the input thresholds are relatively small. To address these difficulties, this paper proposes a parallel algorithm named P-FCloHUS for fast mining of FCHU sequences by making good use of multi-core processors. By relying on a novel single-scan synchronization strategy, facilitated by an efficiently partitioned result space structure, P-FCloHUS alleviates the communication cost between mining tasks and hence speeds up the parallel mining process. Experiments on both dense and sparse datasets show that P-FCloHUS outperforms the state-of-the-art FMaxCloHUSM in terms of runtime performance.

Keywords: Data mining; high-utility sequential pattern mining; Frequent closed sequence mining; Multi-core processors; Parallel mining

An efficient algorithm for hiding high utility sequential patterns

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2018, Vol. 95, pp. 77-92

Authors: Bac Le; Duy-Tai Dinh; Van-Nam Huynh; Quang-Minh Nguyen; Fournier-Viger, Philippe. Affiliations: Univ Sci, VNU HCMC, Ho Chi Minh City, Vietnam; Japan Adv Inst Sci & Technol, Nomi, Japan; Acad Cryptog Tech, Ho Chi Minh City, Vietnam; Harbin Inst Technol, Sch Humanities & Social Sci, Shenzhen, Peoples R China

High utility sequential patterns (HUSP) are a type of pattern found in data collected in many domains such as business, marketing and retail. Two critical topics related to HUSP are HUSP mining (HUSPM) and HUSP hiding (HUSPH). HUSPM algorithms are designed to discover all sequential patterns that have a utility greater than or equal to a minimum utility threshold in a sequence database. HUSPH algorithms, by contrast, conceal all HUSP so that competitors cannot find them in shared databases. This paper focuses on HUSPH. It proposes an algorithm named HUS-Hiding to efficiently hide all HUSP. An extensive experimental evaluation is conducted on six real-life datasets to evaluate the performance of the proposed algorithm. According to the experimental results, the designed algorithm is more effective than three state-of-the-art algorithms in terms of runtime, memory usage and hiding accuracy. (C) 2018 Elsevier Inc. All rights reserved.

Keywords: Data mining; Privacy preserving data mining; high-utility sequential pattern mining; high-utility sequential pattern hiding
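
Note: the abstract above defines HUSPM as finding every sequential pattern whose utility is greater than or equal to a minimum utility threshold. The sketch below illustrates that threshold test on a toy quantitative sequence database. It is not code from any of the listed papers. The function names, the data layout (each sequence is a list of itemsets, each itemset a dict of item to quantity), the `profit` table, and the common convention of taking the maximum utility over all occurrences of a pattern within one sequence are all assumptions made for illustration.

```python
def pattern_utility_in_sequence(pattern, qseq, profit):
    """Return the maximum utility over all occurrences of `pattern` in one
    quantitative sequence, or None if the pattern does not occur.
    (Taking the maximum over occurrences is an assumed convention.)"""
    # best[j] = best utility found so far for matching the first j pattern itemsets
    best = [0] + [None] * len(pattern)
    for itemset in qseq:
        # Go through j in descending order so one itemset of the sequence
        # is matched against at most one itemset of the pattern.
        for j in range(len(pattern), 0, -1):
            if best[j - 1] is not None and pattern[j - 1].issubset(itemset):
                u = best[j - 1] + sum(itemset[i] * profit[i] for i in pattern[j - 1])
                if best[j] is None or u > best[j]:
                    best[j] = u
    return best[len(pattern)]


def pattern_utility(pattern, database, profit):
    """Sum the per-sequence utilities over every sequence containing the pattern."""
    total = 0
    for qseq in database:
        u = pattern_utility_in_sequence(pattern, qseq, profit)
        if u is not None:
            total += u
    return total


def is_high_utility(pattern, database, profit, min_util):
    """A pattern is high-utility if its utility is >= the minimum threshold."""
    return pattern_utility(pattern, database, profit) >= min_util


# Hypothetical toy data: unit profit per item, and two quantitative sequences.
profit = {"a": 2, "b": 5, "c": 1}
database = [
    [{"a": 1, "b": 2}, {"c": 3}, {"a": 4}],
    [{"a": 3}, {"b": 1, "c": 2}],
]

if __name__ == "__main__":
    pattern = [{"a"}, {"c"}]  # "a" followed later by "c"
    print(pattern_utility(pattern, database, profit))
    print(is_high_utility(pattern, database, profit, min_util=10))
```

A hiding algorithm in the HUSPH sense would modify the database until `is_high_utility` returns false for every sensitive pattern. How the listed papers choose which quantities to change is not described in the abstracts and is not sketched here.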
