Search Results - Inner Mongolia University Library

Coupling MDL and Markov chain Monte Carlo to sample diverse pattern sets

DATA & KNOWLEDGE ENGINEERING, 2025, Vol. 156

Authors: Camelin, Francois; Loudni, Samir; Pesant, Gilles; Truchet, Charlotte. Affiliations: IMT Atlantique 4 Rue Alfred Kastler F-44300 Nantes France Polytech Montreal 2500 Chem Polytech Montreal PQ H3T 0A3 Canada Sorbonne Univ Paris 75005 France

Exhaustive methods of pattern extraction in a database face real obstacles to speed and output control of patterns: a large number of patterns are extracted, many of which are redundant. Pattern extraction methods through sampling, which allow for controlling the size of the outputs while ensuring fast response times, provide a solution to these two problems. However, these methods do not provide high-quality patterns: they return patterns that are very infrequent in the database. Furthermore, they do not scale. To ensure more frequent and diversified patterns in the output, we propose integrating compression methods into sampling to select the most representative patterns from the sampled transactions. We demonstrate that our approach improves the state of the art in terms of diversity of produced patterns.

Keywords: Data mining mining methods and algorithms Pattern Sampling Diversity Compression CFTP


Using Subgroup Discovery to Relate Odor Pleasantness and Intensity to Peripheral Nervous System Reactions


IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, Vol. 14, No. 3, pp. 2005-2019

Authors: Moranges, Maelle; Plantevit, Marc; Bensafi, Moustafa. Affiliations: Lyon Neurosci Res Ctr F-69500 Bron France Univ Lyon LIRIS Lab Villeurbanne Lyon France Ecole Ingn Informat Paris Sch Engn & Comp Sci F-94270 Le Kremlin Bicetre France

Activation of the autonomic nervous system is a primary characteristic of human hedonic responses to sensory stimuli. For smells, general tendencies of physiological reactions have been described using classical statistics. However, these physiological variations are generally not quantified precisely; each psychophysiological parameter has very often been studied separately, and individual variability was not systematically considered. The current study presents an innovative approach based on data mining, whose goal is to extract knowledge from a dataset. This approach uses a subgroup discovery algorithm which allows extraction of rules that apply to as many olfactory stimuli and individuals as possible. These rules are described by intervals on a set of physiological attributes. Results allowed both quantifying how each physiological parameter relates to odor pleasantness and perceived intensity and describing the participation of each individual in these rules. This approach can be applied to other fields of affective sciences characterized by complex and heterogeneous datasets.

Keywords: Physiology Olfactory Data mining Task analysis Skin Sociology Classification algorithms mining methods and algorithms pattern analysis physiological measures


Change pattern relationships in event logs


DATA & KNOWLEDGE ENGINEERING, 2024, Vol. 154

Authors: Cremerius, Jonas; Patzlaff, Hendrik; Weske, Mathias. Affiliations: Univ Potsdam Hasso Plattner Inst Prof Dr Helmert Str 2-3 D-14482 Potsdam Brandenburg Germany

Process mining utilises process execution data to discover and analyse business processes. Event logs represent process executions, providing information about the activities executed. In addition to generic event attributes like activity name and timestamp, events might contain domain-specific attributes, such as a blood sugar measurement in a healthcare environment. Many of these values change quite frequently during a typical process. We refer to those as dynamic event attributes. Change patterns can be derived from dynamic event attributes, describing if the attribute values change from one activity to another. So far, change patterns can only be identified in an isolated manner, neglecting the chance of finding co-occurring change patterns. This paper provides an approach to identifying relationships between change patterns by utilising correlation methods from statistics. We applied the proposed technique to two event logs derived from the MIMIC-IV real-world dataset on hospitalisations in the US and evaluated the results with a medical expert. It turns out that relationships between change patterns can be detected within the same directly or eventually follows relation and even beyond that. Further, we identify unexpected relationships that occur only at certain parts of the process. Thus, the process perspective reveals novel insights on how dynamic event attributes change together during process execution. The approach is implemented in Python using the PM4Py framework.

Keywords: Business process mining methods and algorithms Process mining Change pattern relationships Correlation MIMIC-IV
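
As a rough illustration of the idea described in the abstract above (not the authors' PM4Py implementation), the sketch below computes, for each case, how two dynamic event attributes change between two activities and then correlates those changes. The activity names `A` and `B`, the attribute names `x` and `y`, the toy values, and the choice of the Pearson coefficient are all assumptions made for this example; the abstract only states that correlation methods from statistics are used.

```python
from math import sqrt


def attribute_changes(cases, attr, src, dst):
    """Per case, the change in `attr` between activity `src` and activity `dst`."""
    return [case[dst][attr] - case[src][attr] for case in cases]


def direction(delta):
    """Encode a change as +1 (increase), -1 (decrease), or 0 (no change)."""
    return (delta > 0) - (delta < 0)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical toy log: one dict per case, keyed by activity, then by attribute.
cases = [
    {"A": {"x": 10, "y": 1.0}, "B": {"x": 14, "y": 1.8}},
    {"A": {"x": 12, "y": 1.2}, "B": {"x": 11, "y": 1.1}},
    {"A": {"x": 9, "y": 0.9}, "B": {"x": 15, "y": 2.0}},
    {"A": {"x": 13, "y": 1.5}, "B": {"x": 10, "y": 1.0}},
]

dx = attribute_changes(cases, "x", "A", "B")
dy = attribute_changes(cases, "y", "A", "B")
x_directions = [direction(d) for d in dx]  # change pattern of x per case
r = pearson(dx, dy)  # close to 1 when x and y tend to change together
```

A coefficient near 1 or -1 would suggest that the two change patterns co-occur across cases; in practice the event log would be loaded with a process mining library rather than written out by hand.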


Robotic process automation using process mining - A systematic literature review


DATA & KNOWLEDGE ENGINEERING, 2023, Vol. 148

Authors: El-Gharib, Najah Mary; Amyot, Daniel. Affiliations: Univ Ottawa Sch Elect Engn & Comp Sci 800 King Edward St Ottawa ON K1N 6N5 Canada

Process mining (PM) aims to construct, from event logs, process maps that can help discover, automate, improve, and monitor organizational processes. Robotic process automation (RPA) uses software robots to perform some tasks usually executed by humans. It is usually difficult to determine what processes and steps to automate, especially with RPA. PM is seen as one way to address such difficulty. This paper aims to assess the applicability of process mining in accelerating and improving the implementation of RPA, along with the challenges encountered throughout the project lifecycle. A systematic literature review was conducted to examine the approaches where PM techniques were used to understand the as-is processes that can be automated with software robots. Seven databases were used to identify papers on this topic. A total of 32 papers, all published since 2018, were selected from 605 unique candidate papers and then analyzed. There is a steady increase in the number of publications in this domain, especially during the year 2022, which suggests a rising interest in the combined use of PM with RPA. The literature mainly focuses on the methods to record the events that occur at the level of user interactions with the application, and on the preprocessing methods that are needed to discover routines with the steps that can be automated. Important challenges are faced with preprocessing such event logs, and many lifecycle steps of automation projects are weakly supported by existing approaches, suggesting corresponding research areas in need of further attention.

Keywords: Business processes Intelligence automation mining methods and algorithms Process discovery Process mining Robotic process automation Task mining User interactions


Recognition algorithm for cross-texting in text chat conversations


DATA & KNOWLEDGE ENGINEERING, 2024, Vol. 150

Authors: Lee, Da-Young; Cho, Hwan-Gue. Affiliations: Pusan Natl Univ Dept Informat Convergence Engn Pusan South Korea

With the development of the Internet and IT, short-text-based communication has become more popular than voice-based communication. Chat-based communication enables rapid, short, and massive exchange of messages with many people, but it also creates new social problems. 'Cross-texting' is one of them. It refers to accidentally sending a text to an unintended person during concurrent conversations with multiple separate people. Cross-texting can be a serious problem in languages where respectful expressions are required. As text-based communication grows more popular, it is crucial to prevent cross-texting by detecting it in advance in languages with honorific expressions, such as Korean. In this paper, we propose two methods for detecting a cross-text using a deep learning model. The first is the formal feature vector, which models a dialog by explicitly defining politeness and completeness features. The second is the graph2vec-based ChatGram-net model, which models the dialog based on the syllable occurrence relationship. To evaluate detection performance, we suggest a method for generating cross-text datasets from an actual messenger corpus. In experiments, we show that both proposed models detect cross-texts effectively and exceed the performance of the baseline models.

Keywords: Text mining mining methods and algorithms Cross-texting Natural language processing Text graph embedding


A parameter-free KNN for rating prediction


DATA & KNOWLEDGE ENGINEERING, 2022, Vol. 142

Authors: Fopa, Medjeu; Gueye, Modou; Ndiaye, Samba; Naacke, Hubert. Affiliations: Univ Cheikh Anta Diop BP 5005 Dakar Senegal Sorbonne Univ LIP6 4 Pl Jussieu F-75005 Paris France

Among the most popular collaborative filtering algorithms are methods based on the K nearest neighbors (KNN). In their basic operation, KNN methods consider a fixed number of neighbors to make recommendations. However, it is not easy to choose an appropriate number of neighbors. Thus, it is generally fixed by calibration to avoid inappropriate values which would negatively affect the accuracy of the recommendations. In the literature, some authors have addressed the problem of dynamically finding an appropriate number of neighbors. But they use additional parameters which limit their proposals because these parameters also require calibration. In this paper, we propose a parameter-free KNN method for rating prediction. It is able to dynamically select an appropriate number of neighbors to use. The experiments that we did on four publicly available datasets demonstrate the efficiency of our proposal. It rivals those of the state of the art in their best configurations.

Keywords: mining methods and algorithms Recommendation systems Collaborative filtering K nearest neighbors
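
For context, the fixed-neighbor baseline that the abstract above contrasts itself with can be sketched as follows. This is a minimal user-based KNN rating predictor with a hand-picked `k`; it is not the parameter-free method proposed in the paper, whose neighbor-selection rule the abstract does not describe. The cosine similarity over co-rated items, the similarity-weighted average, and the toy `ratings` data are assumptions chosen for the example.

```python
from math import sqrt


def cosine_sim(a, b):
    """Cosine similarity over the items both users rated (0.0 if there are none)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings, user, item, k):
    """Predict `user`'s rating of `item` from the k most similar users who rated it."""
    candidates = [
        (cosine_sim(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    # Keep only the k nearest neighbors; k is the parameter that needs calibration.
    neighbors = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    weight = sum(sim for sim, _ in neighbors)
    if weight == 0:
        return None  # no usable neighbor
    return sum(sim * rating for sim, rating in neighbors) / weight


# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3, "c": 4},
    "u3": {"a": 1, "b": 5, "c": 2},
}

pred_k1 = predict_rating(ratings, "u1", "c", k=1)  # uses only u2
pred_k2 = predict_rating(ratings, "u1", "c", k=2)  # blends u2 and u3
```

The two predictions differ, which illustrates why the choice of `k` matters and why a method that selects the number of neighbors dynamically is attractive.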


PRESS: A personalised approach for mining top-k groups of objects with subspace similarity


DATA & KNOWLEDGE ENGINEERING, 2020, Vol. 128, pp. 101833-101833

Authors: Hashem, Tahrima; Rashidi, Lida; Kulik, Lars; Bailey, James. Affiliations: Univ Melbourne Sch Comp & Informat Syst Melbourne Vic Australia

Personalised analytics is a powerful technology that can be used to improve the career, lifestyle, and health of individuals by providing them with an in-depth analysis of their characteristics as compared to other people. Existing research has often focused on mining general patterns or clusters, but without the facility for customisation to an individual's needs. It is challenging to adapt such approaches to the personalised case, due to the high computational overhead they require for discovering patterns that are good across an entire dataset, rather than with respect to an individual. In this paper, we tackle the challenge of personalised pattern mining and propose a query-driven approach to mine objects with subspace similarity. Given a query object in a categorical dataset, our proposed algorithm, PRESS (Personalised Subspace Similarity), determines the top-k groups of objects, where each group has high similarity to the query for some particular subspace. We evaluate the efficiency and effectiveness of our approach on both synthetic and real datasets.

Keywords: Subspace mining Similarity search Association rules mining methods and algorithms Personalisation


Multiangle P2P Borrower Characterization Analytics by Attributes Partition Considering Business Process


IEEE INTELLIGENT SYSTEMS, 2020, Vol. 35, No. 3, pp. 96-105

Authors: Liu, Shuaiqi; Wu, Sen. Affiliations: Univ Sci & Technol Beijing Donlinks Sch Econ & Management Dept Management Sci & Engn Beijing Peoples R China

In research on P2P lending data, the study of borrower characteristics is of great value for establishing target customers and for risk management. Because of the high dimensionality, mixed attributes, differing importance, and differing generation times of the information, mining results on P2P lending data often fail to reflect the important borrower characteristics that affect the approval results and the approved loan amount. In this article, we are the first to propose an attributes partition of lending data that considers the business process to classify variables into different types. Furthermore, we propose a multiangle data mining method for lending data, based on this attributes partition, to discover the characteristics of P2P borrowers from multiple perspectives. Experimental results on a real dataset demonstrate that the method depicts the important characteristics of borrowers that affect the approval results and the loan amount, makes the research on P2P borrower characteristics more comprehensive and specific, and provides new ideas for research on high-dimensional lending data.

Keywords: Data mining Risk management Global Positioning System Intelligent systems Internet Loans and mortgages Internet mining methods and algorithms Feature evaluation and selection Data mining


Deep learning in the COVID-19 epidemic: A deep model for urban traffic revitalization index


DATA & KNOWLEDGE ENGINEERING, 2021, Vol. 135, pp. 101912-101912

Authors: Lv, Zhiqiang; Li, Jianbo; Dong, Chuanhao; Li, Haoran; Xu, Zhihao. Affiliations: Qingdao Univ Coll Comp Sci & Technol Qingdao 266071 Peoples R China Inst Ubiquitous Networks & Urban Comp Qingdao 266070 Peoples R China

Research on the traffic revitalization index can provide support for the formulation and adjustment of policies related to urban management, epidemic prevention and resumption of work and production. This paper proposes a deep model for the prediction of the urban Traffic Revitalization Index (DeepTRI). DeepTRI builds a model for the data of the COVID-19 epidemic and the traffic revitalization index for major cities in China. The location information of 29 cities forms the topological structure of the graph. The Spatial Convolution Layer proposed in this paper captures the spatial correlation features of the graph structure. The special Graph Data Fusion module distributes and fuses the two kinds of data according to different proportions to increase the trend of spatial correlation of the data. In order to reduce the complexity of the computational process, the Temporal Convolution Layer replaces the gated recursive mechanism of the traditional recurrent neural network with a multi-level residual structure. It uses dilated convolution whose dilation factor changes according to a convex function to control the dynamic change of the receptive field and uses causal convolution to fully mine the historical information of the data to optimize the ability of long-term prediction. The comparative experiments among DeepTRI and three baselines (traditional recurrent neural network, ordinary spatial-temporal model and graph spatial-temporal model) show the advantages of DeepTRI in the evaluation index and in resolving two under-fitting problems (under-fitting of edge values and under-fitting of local peaks).

Keywords: COVID-19 Traffic revitalization index Data mining Data models mining methods and algorithms
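
The dilated causal convolution mentioned in the abstract above can be illustrated with a small, generic sketch. It is not the DeepTRI Temporal Convolution Layer: the residual structure is omitted, and the kernel weights and the dilation schedule `[1, 2, 4]` are assumptions picked for the example (the abstract only says the dilation factor follows a convex function). The function names are hypothetical.

```python
def causal_dilated_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_j kernel[j] * x[t - j * dilation].

    Values of x before t = 0 are treated as zero, so each output depends only
    on the current and past inputs (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps seen by one output after stacking the given layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


series = [1, 2, 3, 4, 5]
y = causal_dilated_conv1d(series, kernel=[1.0, 1.0], dilation=2)
# y == [1.0, 2.0, 4.0, 6.0, 8.0]: each output adds the value two steps earlier.

field = receptive_field(kernel_size=2, dilations=[1, 2, 4])  # 8 time steps
```

Increasing the dilation from layer to layer widens the receptive field without adding weights, which is the property the abstract relies on for long-term prediction.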


Behavior Action mining


IEEE ACCESS, 2019, Vol. 7, pp. 19954-19964

Authors: Su, Peng; Zeng, Daniel; Zhao, Huimin. Affiliations: Dali Univ Sch Math & Comp Sci Dali 671003 Peoples R China Chinese Acad Sci Inst Automat State Key Lab Intelligent Control & Management Co Beijing 100190 Peoples R China Univ Wisconsin Sheldon B Lubar Sch Business Milwaukee WI 53201 USA

Actionable behavioral rules suggest specific actions that may influence certain behavior in the stakeholders' best interest. In mining such rules, it was previously assumed that all attributes are categorical and that numerical attributes have been discretized in advance. However, this assumption significantly reduces the solution space, and thus hinders the potential of mining algorithms, especially when numerical attributes are prevalent. As numerical data are ubiquitous in business applications, there is a crucial need for new mining methodologies that can better leverage such data. To meet this need, in this paper, we define a new data mining problem, named behavior action mining, as a problem of continuous variable optimization of expected utility for action. We then develop three approaches to solving this new problem, which use regression as a technical basis. The experimental results based on a marketing dataset demonstrate the validity and superiority of our proposed approaches.

Keywords: Business decision support knowledge and data engineering tools and techniques mining methods and algorithms
