Search results - Inner Mongolia University Library

An improved hidden behavioral pattern mining approach to enhance the performance of recommendation system in a big data environment

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, vol. 34, no. 10, pp. 8390-8400

Authors: Sundari, P. Shanmuga; Subaji, M.; Vellore Inst Technol Sch Comp Sci & Engn Vellore 632014 India; Vellore Inst Tech IIIP Vellore 632014 India

The proposed work aims to solve the data sparsity problem in recommendation systems. It applies two-level pre-processing techniques to reduce the data size at the item level. Additional resources such as item genre, tag, and time are added to learn and analyse user preferences in depth. The advantage of the proposed method is that it recommends items based on the user's interest pattern and avoids recommending outdated items. User information is grouped based on similar item genre and tag features. This effectively handles the overlapping conditions that exist in an item's genre, as an item has more than one genre at the initial level. Further, based on time, it analyses the user's non-static interest. Overall, it reduces the dimensions, which is an initial way to prepare data for analysing hidden patterns. To enhance performance, the proposed method uses Apache Spark MLlib FP-Growth and an association rule mining approach in a distributed environment. To reduce the computation cost of constructing the tree in FP-Growth, the candidate data set is stored in matrix form. The experiments were conducted using the MovieLens data set. The observed results show that the proposed method achieves a 4% increase in accuracy when compared to earlier methods. (c) 2020 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).

Keywords: Hidden Behavioral analysis Big data Fp-Growth Association rule mining two-level clustering
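
The abstract above relies on frequent-itemset and association rule mining. As a rough illustration of what such rules look like, here is a minimal sketch in plain Python that counts itemsets by brute force rather than with FP-Growth or Spark MLlib. It is not the authors' implementation: the function names, the toy genre baskets, and the thresholds are all assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Brute-force enumeration of all subsets of each basket; fine for toy data,
    and exactly the cost that FP-Growth is designed to avoid on real data.
    """
    n = len(transactions)
    counts = {}
    for basket in transactions:
        items = sorted(set(basket))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(itemsets, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from frequent itemsets."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                # Every subset of a frequent itemset is itself frequent,
                # so the lookup below cannot fail.
                confidence = support / itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical baskets: the genres of the items each user interacted with.
transactions = [
    ["comedy", "romance"],
    ["comedy", "romance", "drama"],
    ["comedy", "drama"],
    ["action"],
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(itemsets, min_confidence=0.9)
```

With these toy baskets and a confidence threshold of 0.9, the only rules that survive are romance -> comedy and drama -> comedy.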

A positional keyword-based approach to inferring fine-grained message formats

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, vol. 102, pp. 369-381

Authors: Jiang, Jiaojiao; Versteeg, Steve; Han, Jun; Hossain, M. D. Arafat; Schneider, Jean-Guy; Swinburne Univ Technol Sch Software & Elect Engn Melbourne Vic 3122 Australia; RMIT Univ Sch Sci Comp Sci & Software Engn Melbourne Vic 3000 Australia; Deakin Univ Sch Informat Technol Burwood Vic 3125 Australia

Message format extraction, the process of revealing the message syntax without access to the protocol specification, is important for a variety of applications such as service virtualization and network security. In this paper, we propose P-token, which mines fine-grained message formats from network traces. The novelty of our approach is twofold: a 'positional keyword' identification technique and a two-level hierarchical clustering strategy. Positional keywords are based on the insight that keywords or reserved words usually occur at relatively fixed positions in the messages. By associating positions as meta-information with keywords, we can more accurately distinguish keywords from message payload data. After identification, the positional keywords are used as features to cluster the messages using density peaks clustering. We then perform another level of clustering to refine the clusters with low homogeneity. Finally, the message format of each cluster is extracted based on the observed ordering of keywords. P-token improves on the current state-of-the-art techniques by successfully addressing two challenges that commonly afflict existing keyword based format extraction methods: message keyword mis-identification and message format over-generalization. We have conducted experiments on services and applications using various protocols, including SOAP, LDAP, IMS and a RESTful service. Our experimental results show that P-token outperforms existing methods in extracting message formats. (C) 2019 Elsevier B.V. All rights reserved.

Keywords: Protocol message formats Positional keyword two-level clustering
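
The idea of a positional keyword, a token that recurs at a relatively fixed position across messages, can be illustrated with a small sketch. This is only an assumption-laden toy and not the P-token algorithm: whitespace tokenization, the `min_ratio` threshold, the function name, and the HTTP-like sample messages are all hypothetical choices made for this example.

```python
from collections import Counter


def positional_keywords(messages, min_ratio=0.6):
    """Return the (position, token) pairs seen in at least min_ratio of messages.

    A token counts as a positional keyword candidate only when it appears
    at the same token index in a large enough share of the trace.
    """
    counts = Counter()
    for message in messages:
        for position, token in enumerate(message.split()):
            counts[(position, token)] += 1
    threshold = min_ratio * len(messages)
    return {pair for pair, count in counts.items() if count >= threshold}


# Hypothetical trace: the method and version recur at fixed positions,
# while the path in the middle behaves like payload data.
messages = [
    "GET /users/1 HTTP/1.1",
    "GET /orders/7 HTTP/1.1",
    "GET /users/2 HTTP/1.1",
    "POST /users HTTP/1.1",
]
keywords = positional_keywords(messages)
```

Here `keywords` contains `(0, "GET")` and `(2, "HTTP/1.1")`; the varying paths at position 1 fall below the threshold and are treated as payload.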

A framework for evaluating aggressive driving behaviors based on in-vehicle driving records

TRANSPORTATION RESEARCH PART F-TRAFFIC PSYCHOLOGY AND BEHAVIOUR, 2019, vol. 65, pp. 610-619

Authors: Lee, Jooyoung; Jang, Kitae; Korea Adv Inst Sci & Technol Cho Chun Shik Grad Sch Green Transportat 261 Daehak Ro Daejeon South Korea

Driving behavior is how drivers respond to actual driving environments and is a major factor in road traffic safety. Recent advances in in-vehicle sensors facilitate continuous monitoring of driving behaviors; large-scale driving data have been accumulated. This study develops a framework to evaluate large-scale driving records and to establish clusters that can be used to identify potentially aggressive driving behaviors. The framework employs three steps of data analytic methods: abrupt change detection to extract meaningful driving events from raw data, feature extraction using an auto-encoder, and two-level clustering. This framework is applied to real driving data that were obtained from 43 taxis in Korean metropolitan cities. The application shows that the framework can characterize driving patterns from large-scale driving records and identify clusters with high potential for aggressive driving. The findings imply that the outcome clusters represent the norm of driving behavior and thus can be used as a reference in diagnosing other drivers' behavior. (C) 2017 Elsevier Ltd. All rights reserved.

Keywords: In-vehicle driving record Aggressive driving behavior Large-scale data two-level clustering
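
The first step named in the abstract above, abrupt change detection, can be sketched in its simplest form as a threshold on the difference between consecutive samples. The abstract does not say which detection method the study uses, so this first-difference rule, the function name, the threshold, and the sample speeds are all assumptions chosen for illustration.

```python
def abrupt_changes(samples, threshold):
    """Return the indices i where |samples[i] - samples[i - 1]| exceeds threshold."""
    return [
        i
        for i in range(1, len(samples))
        if abs(samples[i] - samples[i - 1]) > threshold
    ]


# Hypothetical speed readings in km/h, one per second.
speeds = [50, 51, 50, 38, 37, 37, 49]
events = abrupt_changes(speeds, threshold=5)
```

For these readings `events` is `[3, 6]`: a hard deceleration at index 3 and a hard acceleration at index 6. In a fuller pipeline, windows around such indices would be the driving events handed to feature extraction and clustering.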

Fast Emulation of Self-Organizing Maps for Large Datasets

6th International Conference on Ambient Systems, Networks and Technologies (ANT) / 5th International Conference on Sustainable Energy Information Technology (SEIT)

Authors: Cordel, Macario O., II; Azcarraga, Arnulfo P.; De La Salle Univ Manila 1004 Philippines

The self-organizing map (SOM) methodology does vector quantization and clustering on the dataset, and then projects the obtained clusters to a lower dimensional space, such as a 2D map, by positioning similar clusters in locations that are spatially closer in the lower dimension space. This makes the SOM methodology an effective tool for data visualization. However, in a world where mined information from big data has to be available immediately, SOM becomes an unattractive tool because of its time complexity. In this paper, we propose an alternative visualization methodology for large datasets that emulates SOM methodology without the speed constraints inherent to SOM. To demonstrate the efficiency and the potential of the proposed scheme as a fast visualization tool, the methodology is used to cluster and project the 3,823 image samples of handwritten digits of the Optical Recognition of Handwritten Digits dataset. Although the dataset is not, by any means, large, it is sufficient to demonstrate the speed-up that can be achieved by using this proposed SOM emulation procedure. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).

Keywords: Data visualization self-organizing map multidimensional scaling two-level clustering fast data analysis positions of clusters
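
For readers unfamiliar with the baseline that this paper emulates, the following is a minimal sketch of classic SOM training on a one-dimensional map, in plain Python. It shows the standard SOM update (find the best matching unit, then pull it and its map neighbours toward the sample) and is not the emulation procedure the paper proposes. The function names, the linear decay schedules, and the toy data are assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1D SOM and return its list of weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    radius0 = n_units / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1 - frac), 0.5)  # neighbourhood shrinks over time
        for x in data:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Units close to the BMU on the map move more than distant ones.
                influence = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * influence * (x[d] - w[d])
    return weights


# Hypothetical data: two small groups of 2D points.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
weights = train_som(data, n_units=4)
```

Every sample triggers a nearest-unit search over all units in every epoch, which is the per-sample cost behind the time-complexity concern raised in the abstract.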

Fast Emulation of Self-organizing Maps for Large Datasets

Procedia Computer Science, 2015, vol. 52, pp. 381-388

Authors: Macario O. Cordel; Arnulfo P. Azcarraga; De La Salle University 2401 Taft Avenue Manila 1004 Philippines

Keywords: Data visualization self-organizing map multidimensional scaling two-level clustering fast data analysis positions of clusters

Enriched topological learning for cluster detection and visualization

NEURAL NETWORKS, 2012, vol. 32, pp. 186-195

Authors: Cabanes, Guenael; Bennani, Younes; Fresneau, Dominique; LIPN CNRS UMR 7030 F-93430 Villetaneuse France; LEEC EA 4443 F-93430 Villetaneuse France

The exponential growth of data generates terabytes of very large databases. The growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools. Thus, it becomes crucial to have methods able to construct a condensed description of the properties and structure of data, as well as visualization tools capable of representing the data structure from these condensed descriptions. The purpose of our work described in this paper is to develop a method of describing data from enriched and segmented prototypes using a topological clustering algorithm. We then introduce a visualization tool that can enhance the structure within and between groups in data. We show, using some artificial and real databases, the relevance of the proposed approach. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Self-Organizing Map Prototype enrichment two-level clustering Coclustering Visualization
