Search results - Inner Mongolia University Library

Detection of broiler heat stress by using the generalised sequential pattern algorithm

BIOSYSTEMS ENGINEERING, 2020, vol. 199, pp. 121-126

Authors: Branco, Tatiane; Moura, Daniella J.; Naas, Irenilza A.; Oliveira, Stanley R. M. Affiliations: Univ Estadual Campinas, Coll Agr Engn, Campinas, SP, Brazil; Embrapa Agr Informat, Campinas, SP, Brazil

The sequence pattern mining method aims to identify frequent sequences that exceed a user-specified support threshold. The present study uses the same approach based on sequential standards to estimate the heat stress of broilers from a resulting behavioural pattern. Experimental data were recorded in a climate chamber where the behaviour of broilers was recorded under thermoneutral (comfort) conditions, set as standard, and when exposed to thermal stress (cold and heat). The Generalised Sequential Patterns (GSP) algorithm was used to evaluate the heat stress of broilers in the third and fourth week of growth. The results indicated that the mining of pattern sequences is a useful and straightforward technique to estimate the welfare of broiler chickens, allowing the identification of temporal relations between thermal stress and the consequent behaviour of the broiler. At temperatures 8 °C below the standard thermoneutral conditions, the broilers remained lying down most of the time, walking only to the drinker and feeder trough. Broilers exposed to temperatures 8 °C above the standard thermoneutral conditions tend to decrease locomotor activities, showing lower welfare status. (C) 2019 Published by Elsevier Ltd on behalf of IAgrE.

Keywords: Animal welfare; Behavioural pattern; Data mining algorithm; Detection of sequential frequency
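The support threshold mentioned in this abstract can be illustrated with a minimal sketch. It only shows how the support of one candidate sequence is counted, not the full GSP candidate generation; the behaviour labels, the `observations` data, and the names `is_subsequence` and `support` are hypothetical and do not come from the study.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    hits = sum(is_subsequence(pattern, seq) for seq in database)
    return hits / len(database)


# Hypothetical behaviour sequences, one per observed bird (not real data).
observations = [
    ["lying", "walking", "drinking", "lying"],
    ["lying", "drinking", "lying"],
    ["walking", "eating", "lying"],
    ["lying", "walking", "eating"],
]

min_support = 0.5  # user-specified threshold (assumed value)
candidate = ["lying", "drinking"]
candidate_support = support(candidate, observations)
is_frequent = candidate_support >= min_support
```

Whether a pattern that exactly meets the threshold counts as frequent is a convention; this sketch assumes it does.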


Monitoring Distillation Column Systems Using Improved Nonlinear Partial Least Squares-Based Strategies


IEEE SENSORS JOURNAL, 2019, vol. 19, no. 23, pp. 11697-11705

Authors: Madakyaru, Muddu; Harrou, Fouzi; Sun, Ying. Affiliations: Manipal Acad Higher Educ, Dept Chem Engn, Manipal Inst Technol, Manipal 576104, Karnataka, India; KAUST, Comp Elect & Math Sci & Engn CEMSE Div, Thuwal 239556900, Saudi Arabia

Fault detection in industrial systems plays a core role in improving their safety and productivity and in avoiding expensive maintenance. This paper proposes and verifies data-driven anomaly detection schemes based on a nonlinear latent variable model and statistical monitoring algorithms. Integrating the suitable characteristics of both partial least squares (PLS) and the adaptive neural network fuzzy inference systems (ANFIS) procedure, a PLS-ANFIS model is employed to allow for flexible modeling of multivariable nonlinear processes. Furthermore, PLS-ANFIS modeling is connected with k-nearest neighbors (kNN)-based data mining schemes and employed for nonlinear process monitoring. Specifically, residuals generated from the PLS-ANFIS model are used as the input to the kNN-based mechanism to uncover anomalies in the data. Moreover, kNN-based exponential smoothing with parametric and nonparametric thresholds is adopted to improve anomaly detection. The effectiveness of the proposed approach is evaluated using real measurements from an actual bubble cap distillation column.

Keywords: Anomaly detection; data mining algorithm; unsupervised monitoring; distillation column systems
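The monitoring step described in this abstract (kNN scores on model residuals, smoothed exponentially and compared with a threshold) can be sketched roughly as follows. This is a minimal illustration under assumptions: the residuals are taken as given rather than produced by a PLS-ANFIS model, the threshold is simply the largest leave-one-out score of the fault-free data, and all names and numbers are hypothetical.

```python
import math


def knn_score(x, reference, k):
    """Mean Euclidean distance from x to its k nearest reference points."""
    dists = sorted(math.dist(x, r) for r in reference)
    return sum(dists[:k]) / k


def ewma(values, alpha=0.3):
    """Exponentially weighted moving average, initialised at the first value."""
    smoothed, z = [], values[0]
    for v in values:
        z = alpha * v + (1 - alpha) * z
        smoothed.append(z)
    return smoothed


# Hypothetical fault-free residuals (stand-ins for model residuals).
reference = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1), (0.05, 0.05)]

# Leave-one-out scores of the fault-free data give a crude nonparametric
# threshold (assumption: the maximum is used instead of a fitted quantile).
train_scores = [
    knn_score(p, reference[:i] + reference[i + 1:], k=2)
    for i, p in enumerate(reference)
]
threshold = max(train_scores)

# Hypothetical new residuals: two normal samples, then two faulty ones.
new_residuals = [(0.0, 0.05), (0.05, 0.0), (1.0, 1.0), (1.2, 0.9)]
scores = ewma([knn_score(r, reference, k=2) for r in new_residuals])
alarms = [s > threshold for s in scores]
```

Smoothing trades detection delay for fewer false alarms; the value of `alpha` here is arbitrary.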


A Web Semantic Mining Method for Fake Cybersecurity Threat Intelligence in Open Source Communities


INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2024, vol. 20, no. 1

Authors: Li, Zhihua; Yu, Xinye; Zhao, Yukai. Affiliation: Jiangnan Univ, Wuxi, Peoples R China

To overcome the inadequate classification accuracy of existing fake cybersecurity threat intelligence mining methods and the lack of high-quality public datasets for training classification models, we propose a novel approach. We improved the attention mechanism and designed a generative adversarial network based on the improved attention mechanism to generate fake cybersecurity threat intelligence. Additionally, we refine text tokenization techniques and design a detection model to detect fake cybersecurity threat intelligence. Using our STIX-CTIs dataset, our method achieves an accuracy of 96.1%, outperforming current text classification models. Using our generated fake cybersecurity threat intelligence, we successfully mimic data poisoning attacks within open-source communities. When paired with our detection model, this research not only improves detection accuracy but also provides a tool for enhancing the security and integrity of open-source ecosystems.

Keywords: Cybersecurity Threat Intelligence; Fake Threat Intelligence Generation; data mining algorithm


Mining sequential patterns by pattern-growth: The PrefixSpan approach


IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, vol. 16, no. 11, pp. 1424-1440

Authors: Pei, J; Han, JW; Mortazavi-Asl, B; Wang, JY; Pinto, H; Chen, QM; Dayal, U; Hsu, MC. Affiliations: Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada; Univ Illinois, Dept Comp Sci, Urbana, IL 61801, USA; Univ Minnesota, Minneapolis, MN 55455, USA; Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada; Packetmot Inc, San Mateo, CA 94403, USA; Hewlett Packard Labs, Palo Alto, CA 94303, USA; Commerce One Inc, San Francisco, CA 94105, USA

Sequential pattern mining is an important data mining problem with broad applications. However, it is also a difficult problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Most of the previously developed sequential pattern mining methods, such as GSP, explore a candidate generation-and-test approach [1] to reduce the number of candidates to be examined. However, this approach may not be efficient in mining large sequence databases having numerous patterns and/or long patterns. In this paper, we propose a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns. In this approach, a sequence database is recursively projected into a set of smaller projected databases, and sequential patterns are grown in each projected database by exploring only locally frequent fragments. Based on an initial study of the pattern growth-based sequential pattern mining, FreeSpan [8], we propose a more efficient method, called PSP, which offers ordered growth and reduced projected databases. To further improve the performance, a pseudoprojection technique is developed in PrefixSpan. A comprehensive performance study shows that PrefixSpan, in most cases, outperforms the a priori-based algorithm GSP, FreeSpan, and SPADE [29] (a sequential pattern mining algorithm that adopts vertical data format), and PrefixSpan integrated with pseudoprojection is the fastest among all the tested algorithms. Furthermore, this mining methodology can be extended to mining sequential patterns with user-specified constraints. The high promise of the pattern-growth approach may lead to its further extension toward efficient mining of other kinds of frequent patterns, such as frequent substructures.

Keywords: data mining algorithm; sequential pattern; frequent pattern; transaction database; sequence database; scalability; performance analysis
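The projection idea in this abstract (recursively project the database on a prefix and grow patterns from locally frequent items) can be sketched in a few lines. This is a simplified illustration, assuming every sequence element is a single item and copying suffixes physically instead of using pseudoprojection; the function name `prefixspan` and the toy database are hypothetical, not the authors' implementation.

```python
from collections import defaultdict


def prefixspan(database, min_support):
    """Return {pattern tuple: support count} for sequences of single items."""
    results = {}

    def grow(prefix, projected):
        # Count each item at most once per projected sequence.
        counts = defaultdict(int)
        for suffix in projected:
            for item in set(suffix):
                counts[item] += 1
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue  # only locally frequent items extend the prefix
            pattern = prefix + (item,)
            results[pattern] = count
            # Project: keep what follows the first occurrence of the item.
            new_projected = [
                s[s.index(item) + 1:] for s in projected if item in s
            ]
            grow(pattern, new_projected)

    grow((), [list(s) for s in database])
    return results


# Toy sequence database (illustrative only).
toy_db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(toy_db, min_support=2)
```

With this toy input, every single item and every ordered pair reaches the threshold, while the length-3 pattern `("a", "b", "c")` occurs in only one sequence and is pruned.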


Supervised inductive learning with Lotka-Volterra derived models


KNOWLEDGE AND INFORMATION SYSTEMS, 2011, vol. 26, no. 2, pp. 195-223

Authors: Hovsepian, Karen; Anselmo, Peter; Mazumdar, Subhasish. Affiliations: O Wayne Rollins Res Ctr, Taylor Lab, Atlanta, GA 30322, USA; Emory Univ, Dept Biol, Atlanta, GA 30329, USA; New Mexico Inst Min & Technol, Dept Management, Socorro, NM 87801, USA; New Mexico Inst Min & Technol, Dept Comp Sci, Socorro, NM 87801, USA

We present a classification algorithm built on our adaptation of the Generalized Lotka-Volterra model, well-known in mathematical ecology. The training algorithm itself consists only of computing several scalars, per each training vector, using a single global user parameter and then solving a linear system of equations. Construction of the system matrix is driven by our model and based on kernel functions. The model allows an interesting point of view of kernels' role in the inductive learning process. We describe the model through axiomatic postulates. Finally, we present the results of the preliminary validation experiments.

Keywords: Supervised machine learning; Classification; Lotka-Volterra model; data mining algorithm; Generalization theory


An UpDown Directed Acyclic Graph Approach for Sequential Pattern Mining


IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, vol. 22, no. 7, pp. 913-928

Author: Chen, Jinlin. Affiliation: CUNY Queens Coll, Dept Comp Sci, Flushing, NY 11367, USA

Traditional pattern growth-based approaches for sequential pattern mining derive length-(k + 1) patterns based on the projected databases of length-k patterns recursively. At each level of recursion, they unidirectionally grow the length of detected patterns by one along the suffix of detected patterns, which needs k levels of recursion to find a length-k pattern. In this paper, a novel data structure, UpDown Directed Acyclic Graph (UDDAG), is invented for efficient sequential pattern mining. UDDAG allows bidirectional pattern growth along both ends of detected patterns. Thus, a length-k pattern can be detected in $\lfloor \log_2 k + 1 \rfloor$ levels of recursion at best, which results in fewer levels of recursion and faster pattern growth. When minSup is large such that the average pattern length is close to 1, UDDAG and PrefixSpan have similar performance because the problem degrades into frequent item counting problem. However, UDDAG scales up much better. It often outperforms PrefixSpan by almost one order of magnitude in scalability tests. UDDAG is also considerably faster than Spade and LapinSpam. Except for extreme cases, UDDAG uses comparable memory to that of PrefixSpan and less memory than Spade and LapinSpam. Additionally, the special feature of UDDAG enables its extension toward applications involving searching in large spaces.

Keywords: data mining algorithm; directed acyclic graph; performance analysis; sequential pattern; transaction database


SMOTEFUNA: Synthetic Minority Over-Sampling Technique Based on Furthest Neighbour Algorithm


IEEE ACCESS, 2020, vol. 8, pp. 59069-59082

Authors: Tarawneh, Ahmad S.; Hassanat, Ahmad B. A.; Almohammadi, Khalid; Chetverikov, Dmitry; Bellinger, Colin. Affiliations: Eotvos Lorand Univ, Dept Algorithms & Their Applicat, H-1117 Budapest, Hungary; Mutah Univ, Dept Comp Sci, Al Karak 61711, Jordan; Univ Tabuk, Community Coll, Dept Comp Sci, Tabuk 71491, Saudi Arabia; Univ Tabuk, Ind Innovat & Robot Ctr, Tabuk 71491, Saudi Arabia; Inst Comp Sci & Control, Budapest, Hungary; CNR, Ottawa, ON K1A 0R6, Canada

Class imbalance occurs in classification problems in which the "normal" cases, or instances, significantly outnumber the "abnormal" instances. Training a standard classifier on imbalanced data leads to predictive biases which cause poor performance on the class(es) with lower prior probabilities. The less frequent classes are often critically important events, such as system failure or the occurrence of a rare disease. As a result, the class imbalance problem has been considered to be of great importance for many years. In this paper, we propose a novel algorithm that utilizes the furthest neighbor of a candidate example to generate new synthetic samples. A key advantage of SMOTEFUNA over existing methods is that it does not have parameters to tune (such as K in SMOTE). Thus, it is significantly easier to utilize in real-world applications. We evaluate the benefit of resampling with SMOTEFUNA against state-of-the-art methods including SMOTE, ADASYN and SWIM using Naive Bayes and Support Vector Machine classifiers. Also, we provide a statistical analysis based on the Wilcoxon signed-rank test to validate the significance of the SMOTEFUNA results. The results indicate that the proposed method is an efficient alternative to the current methods. Specifically, SMOTEFUNA achieves better 5-fold cross-validated ROC and precision-recall space performance.

Keywords: Binary classification; data mining algorithm; furthest neighbor; imbalance problem; SMOTE
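The abstract states only that new samples are generated from a candidate example and its furthest neighbour, so the sketch below is an assumption-laden illustration rather than the published SMOTEFUNA procedure. It assumes the neighbour is sought within the minority class and that the synthetic point is a random interpolation between the two points (a rule borrowed from SMOTE); the names `furthest_neighbour` and `oversample` are hypothetical.

```python
import math
import random


def furthest_neighbour(point, others):
    """Return the element of others with the largest Euclidean distance."""
    return max(others, key=lambda q: math.dist(point, q))


def oversample(minority, n_new, seed=0):
    """Generate n_new synthetic points between a random minority sample
    and its furthest minority neighbour (assumed interpolation rule)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        q = furthest_neighbour(p, minority)
        t = rng.random()  # position along the segment from p to q
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return synthetic


# Hypothetical minority-class samples in two dimensions.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
new_points = oversample(minority, n_new=5)
```

Note that the sketch has no neighbourhood size to tune, which matches the parameter-free property the abstract highlights; everything else about it is illustrative.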


Knowledge discovery from spatial transactions


JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2007, vol. 28, no. 1, pp. 1-22

Authors: Rinzivillo, Salvatore; Turini, Franco. Affiliation: Univ Pisa, Dept Comp Sci, Pisa, Italy

We propose a general mechanism to represent the spatial transactions in a way that allows the use of the existing data mining methods. Our proposal allows the analyst to exploit the layered structure of geographical information systems in order to define the layers of interest and the relevant spatial relations among them. Given a reference object, it is possible to describe its neighborhood by considering the attribute of the object itself and the objects related by the chosen relations. The resulting spatial transactions may be either considered like "traditional" transactions, by considering only the qualitative spatial relations, or their spatial extension can be exploited during the data mining process. We explore both these cases. First we tackle the problem of classifying a spatial dataset, by taking into account the spatial component of the data to compute the statistical measure (i.e., the entropy) necessary to learn the model. Then, we consider the task of extracting spatial association rules, by focusing on the qualitative representation of the spatial relations. The feasibility of the process has been tested by implementing the proposed method on top of a GIS tool and by analyzing real world data.

Keywords: qualitative spatial relations; spatial dataset; spatial transactions; geographical information systems; data mining algorithm


Contiguous item sequential pattern mining using UpDown Tree


INTELLIGENT DATA ANALYSIS, 2008, vol. 12, no. 1, pp. 25-49

Author: Chen, Jinlin. Affiliation: CUNY Queens Coll, Dept Comp Sci, Flushing, NY 11367, USA

In this paper the problem of Contiguous Item Sequential Pattern (CISP) mining is presented as a sequential pattern mining problem under two constraints. First, each element in a sequence consists of only one item. Second, items appearing in the sequences that contain a pattern must be adjacent with respect to the underlying order as they appear in the pattern. Even though the problem of CISP mining can be solved by using previous approaches on sequential pattern mining under a general constraint description framework, this may lead to poor performance due to the large search space. To efficiently solve this problem, a new data structure, UpDown Tree, is proposed for CISP mining. The UpDown Tree based approach can greatly improve the efficiency of CISP mining in terms of both time and memory compared to previous approaches. An extensive experimental study has shown promising results with our approach.

Keywords: data mining algorithm; sequential pattern; contiguous sequential pattern; transaction database; sequence database; performance analysis
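The two CISP constraints in this abstract (single-item elements, and pattern items adjacent in the supporting sequence) mean that a pattern is supported by a sequence exactly when it appears as a contiguous run. A brute-force sketch of that definition follows; it enumerates every contiguous run and is not the UpDown Tree method, which the abstract proposes precisely to avoid this cost. The function name and toy data are hypothetical.

```python
from collections import Counter


def contiguous_patterns(database, min_support):
    """Return {pattern tuple: support count} for contiguous item patterns."""
    counts = Counter()
    for seq in database:
        # Collect each distinct contiguous run once per sequence.
        seen = set()
        for start in range(len(seq)):
            for end in range(start + 1, len(seq) + 1):
                seen.add(tuple(seq[start:end]))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}


# Toy sequence database of single items (illustrative only).
toy_sequences = [list("abcd"), list("bcd"), list("abd")]
cisp = contiguous_patterns(toy_sequences, min_support=2)
```

In this toy input, `("b", "d")` is adjacent only in the third sequence, so it is not frequent even though `b` precedes `d` in all three; that is the difference from ordinary sequential patterns.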


Innovative use of health informatics to augment contact tracing during the COVID-19 pandemic in an acute hospital


JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2020, vol. 27, no. 12, pp. 1964-1967

Authors: Venkataraman, Narayan; Poon, Beng Hoong; Siau, Chuin. Affiliations: Changi Gen Hosp, Data Management & Informat, Singapore, Singapore; Changi Gen Hosp, Resp & Crit Care Med, Ambulatory Care, Singapore, Singapore

This case report describes the innovative design and build of an algorithm that integrates available data from separate hospital-based informatics systems, which perform different daily functions to augment the contact-tracing process of COVID-19 patients by identifying exposed neighboring patients and healthcare workers and assessing their risk. Prior to the establishment of the algorithm, contact-tracing teams comprising 6 members would spend up to 10 hours each to complete contact tracing for 5 new COVID-19 patients. With the augmentation by the algorithm, we observed >= 60% savings in overall man-hours needed for contact tracing when there were 5 or more daily new cases through a time-motion study and Monte Carlo simulation. This improvement to the hospital's contact-tracing process supported more expeditious and comprehensive downstream contact-tracing activities as well as improved manpower utilization in contact tracing.

Keywords: COVID-19; contact tracing; informatics system; data mining algorithm; improved manpower utilization
