Search Results - Inner Mongolia University Library

A Three-Stage Based Ensemble Learning for Improved Software Fault Prediction: An Empirical Comparative Study

INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2018, Vol. 11, No. 1, pp. 1229-1247

Authors: Yohannese, Chubato Wondaferaw; Li, Tianrui; Bashir, Kamal (Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 611756, Sichuan, Peoples R China)

Software Fault Prediction (SFP) research has made an enormous effort to accurately predict the fault proneness of software modules, thereby maximizing precious software test resources, reducing maintenance cost, and contributing to quality software products. In this regard, machine learning (ML) has been successfully applied to solve classification problems for SFP. However, SFP faces many challenges caused by redundant and irrelevant features, class imbalance, and noise in software defect datasets. No single ML technique handles all of those challenges, and they may degrade performance depending on the predictor's sensitivity to data corruption. In the literature, it is widely claimed that building ensemble classifiers from preprocessed datasets and combining their predictions is an interesting method of overcoming the individual problems produced by each classifier. This claim is usually not supported by thorough empirical studies that consider the problems of a combined implementation resolving different types of challenges in defect datasets and, therefore, it must be carefully studied. Thus, the objective of this paper is to conduct large-scale, comprehensive experiments to study the effect of resolving those challenges in SFP in three stages in order to improve the practice and performance of SFP. In addition, the paper presents a thorough and statistically sound comparison of these techniques in each stage. Accordingly, a new three-stage ensemble learning framework that efficiently handles those challenges in a combined form is proposed. The experimental results confirm the robustness of the combined techniques in each stage. Particularly high performance was achieved using combined ensemble learning algorithms (ELA) on selected features of balanced data after removing noisy instances. Therefore, as shown in this study, ensemble techniques used for SFP must be carefully examined and c

Keywords: Software Fault Prediction; Software Testing; Ensemble Learning Algorithms; Feature Selection; Data Balancing; Noise Filtering

An Ensemble Learning based Wi-Fi Network Intrusion Detection System (WNIDS)

17th IEEE International Symposium on Network Computing and Applications (NCA)

Authors: Vaca, Francisco D.; Niyaz, Quamar (Purdue Univ Northwest, Coll Engn & Sci, Dept Elect & Comp Engn, Hammond, IN 46323, USA)

ISBN: (print) 9781538676592

As the use of Wi-Fi networks grows, so do security threats. Attackers continue to improve their attack methods, which creates the need for effective mechanisms to detect sophisticated attacks. In this work, we propose an implementation of an intrusion detection system (IDS) for Wi-Fi networks using an ensemble learning method. The AWID Wi-Fi intrusion dataset is used to discover the features needed for an efficient IDS implementation. We apply several ensemble learning methods to this dataset and select the best one for the proposed IDS implementation. The performance of the IDS is reported using well-known metrics including accuracy, precision, recall, and F-measure.

Keywords: Network security; Wi-Fi network intrusion detection system (WNIDS); Ensemble learning algorithms
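
The four metrics named in the abstract above can all be derived from the confusion counts of a binary classifier. The following is a minimal sketch, not the authors' evaluation code: the function name `classification_metrics` and the toy labels are assumptions made for illustration, and F-measure is taken here to be the balanced F1 score.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion counts with respect to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels invented for illustration: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced intrusion data, accuracy alone can look high even when most attacks are missed, which is one reason precision, recall, and F-measure are usually reported alongside it.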

Classification of nucleotide sequences for quality assessment using logistic regression and decision tree approaches

NEURAL COMPUTING & APPLICATIONS, 2018, Vol. 29, No. 8, pp. 251-262

Authors: Kurt, Serkan; Oz, Ersoy; Askin, Oykum Esra; Oz, Yeliz Yucel (Yildiz Tech Univ, Fac Elect & Elect Engn, Dept Elect & Commun Engn, Istanbul, Turkey; Yildiz Tech Univ, Fac Arts & Sci, Dept Stat, Istanbul, Turkey; Istanbul Tech Univ, Mol Biol Biotechnol, Istanbul, Turkey; Iontek AS, Istanbul, Turkey)

Knowledge of DNA sequences is indispensable for basic biological research. Many researchers use DNA sequencing for various purposes, including molecular biology research and sequence comparison for individual identification. Automated DNA sequencing devices use four-colored chromatograms or base-calling signals to indicate the strength of hybridization for each base channel. Typically, the relative strengths of peaks at each base location are used to quantify the quality and/or reliability of individual readings. However, assessing the overall quality of whole DNA trace files remains an open problem. Therefore, classifying raw DNA trace files as high or low quality is an important issue for the efficient utilization of resources. In this study, we used several supervised machine learning approaches, including logistic regression and ensemble decision trees, to identify high- or acceptable-quality chromatogram files and compared their prediction performance. To test and develop our ideas, we used a public DNA trace repository consisting of 1626 high- and 631 low-quality files marked by our expert molecular biologist. Our results indicate that, although all of the methods tried offer comparable and acceptable performance, the random forest decision tree algorithm with adaptive boosting ensemble learning shows slightly higher prediction accuracy with as few as four features.

Keywords: DNA sequencing; Decision tree; Ensemble learning algorithms; Logistic regression

Ensembles Based Combined Learning for Improved Software Fault Prediction: A Comparative Study

12th International Conference on Intelligent Systems and Knowledge Engineering (IEEE ISKE)

Authors: Yohannese, Chubato Wondaferaw; Li, Tianrui; Simfukwe, Macmillan; Khurshid, Faisal (Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 611756, Sichuan, Peoples R China)

ISBN: (print) 9781538618295

Software Fault Prediction (SFP) research has made an enormous effort to accurately predict the fault proneness of software modules in order to maximize precious software test resources, reduce maintenance cost, help deliver software products on time, and satisfy customers, all of which ultimately contribute to quality software products. In this regard, machine learning (ML) has been successfully applied to solve classification problems for SFP. Moreover, within ML, ensemble learning algorithms (ELA) are known to improve on the performance of single learning algorithms. However, no ELA alone handles the challenges created by redundant and irrelevant features and by class imbalance in software defect datasets. Therefore, the objective of this paper is to independently examine and compare prominent ELA and to improve their performance by combining them with Feature Selection (FS) and Data Balancing (DB) techniques, in order to identify more efficient ELA that better predict the fault proneness of software modules. Accordingly, a new framework that efficiently handles those challenges in a combined form is proposed. The experimental results confirm the robustness of the combined techniques. In particular, the framework performs well when using combined bagging ELA with DB on selected features. Therefore, as shown in this study, ensemble techniques used for SFP must be carefully examined and combined with both FS and DB in order to obtain robust performance.

Keywords: Software Fault Prediction; Ensemble Learning Algorithms; Feature Selection; Data Balancing

Exploiting label dependencies for improved sample complexity

MACHINE LEARNING, 2013, Vol. 91, No. 1, pp. 1-42

Authors: Chekina, Lena; Gutfreund, Dan; Kontorovich, Aryeh; Rokach, Lior; Shapira, Bracha (Ben Gurion Univ Negev, Dept Informat Syst Engn, IL-84105 Beer Sheva, Israel; Ben Gurion Univ Negev, Telekom Innovat Labs, IL-84105 Beer Sheva, Israel; IBM Res, Haifa, Israel; Ben Gurion Univ Negev, Dept Comp Sci, IL-84105 Beer Sheva, Israel)

Multi-label classification exhibits several challenges not present in the binary case. The labels may be interdependent, so that the presence of a certain label affects the probability of other labels' presence. Thus, exploiting dependencies among the labels could be beneficial for the classifier's predictive performance. Surprisingly, only a few of the existing algorithms address this issue directly by identifying dependent labels explicitly from the dataset. In this paper we propose new approaches for identifying and modeling existing dependencies between labels. One principal contribution of this work is a theoretical confirmation of the reduction in sample complexity that is gained from unconditional dependence. Additionally, we develop methods for identifying conditionally and unconditionally dependent label pairs; clustering them into several mutually exclusive subsets; and finally, performing multi-label classification incorporating the discovered dependencies. We compare these two notions of label dependence (conditional and unconditional) and evaluate their performance on various benchmark and artificial datasets. We also compare and analyze labels identified as dependent by each of the methods. Moreover, we define an ensemble framework for the new methods and compare it to existing ensemble methods. An empirical comparison of the new approaches to existing baseline and state-of-the-art methods on 12 benchmark datasets demonstrates that in many cases the proposed single-classifier and ensemble methods outperform many multi-label classification algorithms. Perhaps surprisingly, we discover that the weaker notion of unconditional dependence plays the decisive role.

Keywords: Multi-label classification; Conditional and unconditional label dependence; Generalization bounds; Multi-label evaluation measures; Ensemble learning algorithms; Ensemble models diversity; Empirical experiment; Artificial datasets
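
As a rough illustration of what "identifying unconditionally dependent label pairs" can look like in practice, the sketch below flags label pairs whose co-occurrence deviates from independence. The abstract does not say which statistical test the authors use; this sketch assumes a Pearson chi-square test of independence on each 2x2 co-occurrence table with a 0.05 significance level (critical value 3.841 at one degree of freedom). The function names and the toy label columns are hypothetical.

```python
from itertools import combinations

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_1DOF_005 = 3.841


def chi_square_2x2(label_a, label_b):
    """Pearson chi-square statistic for two binary (0/1) label columns."""
    n = len(label_a)
    # Observed co-occurrence counts for each (a, b) combination.
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for a, b in zip(label_a, label_b):
        counts[(a, b)] += 1
    row = {i: counts[(i, 0)] + counts[(i, 1)] for i in (0, 1)}
    col = {j: counts[(0, j)] + counts[(1, j)] for j in (0, 1)}
    stat = 0.0
    for (i, j), observed in counts.items():
        # Expected count if the two labels were independent.
        expected = row[i] * col[j] / n
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


def dependent_pairs(labels, threshold=CHI2_CRITICAL_1DOF_005):
    """Return the label-name pairs whose chi-square statistic exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(labels, 2)
        if chi_square_2x2(labels[a], labels[b]) > threshold
    ]


# Toy label columns invented for illustration (8 instances, 3 labels).
labels = {
    "beach": [1, 1, 1, 1, 0, 0, 0, 0],
    "sea": [1, 1, 1, 1, 0, 0, 0, 0],
    "urban": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(dependent_pairs(labels))
```

The flagged pairs could then be grouped into mutually exclusive subsets, as the abstract describes, before training a classifier per subset; that clustering step is not shown here.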
