Imbalanced problems are pervasive in many real-world applications. In imbalanced distributions, one or more classes, called minority class(es), are under-represented compared to the other classes. This skewness in the underlying data distribution causes many difficulties for typical machine learning algorithms, and the problem becomes even harder when algorithms must combat multi-class imbalance. Existing solutions for tackling imbalanced distributions generally fall into two main categories: data-oriented methods and model-based algorithms. Focusing on the latter, this paper suggests a blend of the boosting and over-sampling paradigms, called MDOBoost, to bring considerable benefits to the learning of multi-class imbalanced data sets. The over-sampling technique introduced and adopted in this paper, the Mahalanobis distance-based over-sampling technique (MDO for short), is incorporated into the boosting algorithm. The minority classes are over-sampled via MDO in such a way that they largely preserve the original minority class characteristics. Compared with the popular method in this field, SMOTE, MDO generates minority class examples that are more similar to the original class samples. Moreover, MDO provides a broader representation of minority class examples, which in turn leads the classifier to build larger decision regions. MDOBoost increases the generalization ability of a classifier, since it yields better results with the pruned version of the C4.5 classifier, unlike other over-sampling/boosting procedures, which have difficulties with pruned C4.5. MDOBoost is applied to real-world multi-class imbalanced benchmarks and its performance is compared with several data-level and model-based algorithms. The empirical results and theoretical analyses reveal that MDOBoost offers superior advantages compared to popular class de…
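The core MDO idea, synthesizing minority samples that keep each seed sample's Mahalanobis distance to the class mean, can be sketched in a few lines. This is a simplified reading (a diagonal covariance is assumed for brevity, and the function name is ours), not the paper's exact procedure:

```python
import math
import random

def mdo_oversample(minority, n_new, rng=None):
    """Sketch of Mahalanobis-distance-based over-sampling (MDO).

    Simplifying assumption: the class covariance is treated as diagonal,
    so each synthetic sample keeps the seed sample's Mahalanobis distance
    to the class mean but points in a random direction.
    """
    rng = rng or random.Random(0)
    d = len(minority[0])
    n = len(minority)
    mean = [sum(x[j] for x in minority) / n for j in range(d)]
    var = [max(sum((x[j] - mean[j]) ** 2 for x in minority) / n, 1e-12)
           for j in range(d)]
    std = [math.sqrt(v) for v in var]
    synthetic = []
    for _ in range(n_new):
        seed = rng.choice(minority)
        # Mahalanobis distance of the seed (norm in standardized space)
        r = math.sqrt(sum(((seed[j] - mean[j]) / std[j]) ** 2 for j in range(d)))
        # random direction, rescaled to the same standardized norm
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        gn = math.sqrt(sum(c * c for c in g)) or 1.0
        synthetic.append([mean[j] + r * g[j] / gn * std[j] for j in range(d)])
    return synthetic
```

Because every synthetic point inherits a real seed's standardized distance, the new points spread along the class's own density contours rather than only along segments between neighbors, which is the contrast with SMOTE drawn above.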
We describe a new boosting algorithm that is the first to be both smooth and adaptive. These two features enable performance improvements for many learning tasks whose solutions use a boosting technique. The boosting approach was originally suggested for the standard PAC model; we analyze possible applications of boosting in the context of agnostic learning, which is more realistic than the PAC model. We derive a lower bound for the final error achievable by boosting in the agnostic model and show that our algorithm actually achieves that accuracy (within a constant factor). We note that the idea of applying boosting in the agnostic model was first suggested by Ben-David, Long and Mansour (2001); the solution they give is improved in the present paper. The accuracy we achieve is exponentially better with respect to the standard agnostic accuracy parameter beta. We also describe the construction of a boosting "tandem" whose asymptotic number of iterations is the lowest possible (in both gamma and epsilon) and whose smoothness is optimal in terms of O(·). This allows adaptively solving problems whose solutions are based on smooth boosting (such as noise-tolerant boosting and DNF membership learning), while preserving the original (non-adaptive) solution's complexity.
Image classification is of great importance for digital photograph management. In this paper we propose a general statistical learning method based on a boosting algorithm to perform image classification for photograph annotation and management. The proposed method employs both features extracted from image content (i.e., color moment and edge direction histogram) and features from the EXIF metadata recorded by digital cameras. To fully exploit potential feature correlations and improve classification accuracy, feature combination is needed. We incorporate the linear discriminant analysis (LDA) algorithm to form linear combinations of selected features and generate new combined features. The combined features are used along with the original features in the boosting algorithm to improve classification performance. To make the proposed learning algorithm more efficient, we present two heuristics for selective feature combination, which significantly reduce training computation without losing performance. The proposed image classification method has several advantages: small model size, computational efficiency, and improved classification performance based on LDA feature combination. (c) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
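The LDA step described above, which produces a combined feature as a linear projection, can be illustrated for the two-class, two-feature case. The helper below is a generic Fisher-discriminant sketch (the function name and data are ours, not the paper's implementation):

```python
def fisher_direction(X0, X1):
    """Two-class Fisher/LDA direction for 2-D features.

    Returns w such that w . x is the combined feature separating
    class 0 from class 1 (a sketch of LDA-based feature combination).
    """
    def mean(X):
        return [sum(x[0] for x in X) / len(X), sum(x[1] for x in X) / len(X)]
    m0, m1 = mean(X0), mean(X1)
    # pooled within-class scatter matrix (2x2)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for X, m in ((X0, m0), (X1, m1)):
        for x in X:
            d = [x[0] - m[0], x[1] - m[1]]
            s[0][0] += d[0] * d[0]; s[0][1] += d[0] * d[1]
            s[1][0] += d[1] * d[0]; s[1][1] += d[1] * d[1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^{-1} (m1 - m0), via the closed-form 2x2 inverse
    return [(s[1][1] * dm[0] - s[0][1] * dm[1]) / det,
            (-s[1][0] * dm[0] + s[0][0] * dm[1]) / det]
```

The projection w . x then joins the original features in the boosting round, so the weak learners can threshold the combined feature directly.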
ISBN (Print): 9781467317139
Traditional classification algorithms have difficulty dealing with imbalanced data. This paper proposes a classification algorithm called CascadeBoost, which combines the advantages of the boosting algorithm and the cascade model to learn from imbalanced data. The cascade model balances the pre-training data by gradually reducing the number of majority-class samples; the most informative samples are then gradually selected by the boosting algorithm based on the weight distribution. The experimental results show that the proposed method obtains better performance compared to other methods.
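The cascade side of this design can be caricatured as follows: each stage trains on all minority samples plus a shrinking majority subset. The sketch below uses random under-sampling in place of the paper's boosting-weight-based selection, and all names are ours:

```python
import random

def cascade_stages(majority, minority, n_stages=3, rng=None):
    """Sketch of the cascade idea: every stage keeps all minority
    samples and a majority subset that halves stage by stage, so the
    training data becomes progressively more balanced.

    (Selecting the most informative samples via boosting weights, as
    the paper does, is replaced here by random sampling.)
    """
    rng = rng or random.Random(0)
    stages = []
    for s in range(n_stages):
        k = max(len(minority), len(majority) // (2 ** s))
        k = min(k, len(majority))
        stages.append((rng.sample(majority, k), list(minority)))
    return stages
```

Each stage's classifier thus sees an increasingly balanced class ratio, which is the mechanism the abstract credits for handling imbalance.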
Rising Carbon Dioxide (CO2) levels from human activities are driving climate change. Carbon capture and storage (CCS) during enhanced oil recovery (EOR) in underground reservoirs offers both environmental and economic benefits. This method boosts oil production, cuts greenhouse gas emissions, and supports sustainable energy. Precise well placement in CO2-EOR is a crucial task for effective oil displacement, but traditional reservoir simulators are costly. This study explores and compares boosting algorithms, as fast surrogate models, to achieve accurate well placement during CO2-EOR in light oil carbonate reservoirs. The research considers various reservoir scenarios with different geological heterogeneity levels (i.e., homogeneous, moderately heterogeneous, and highly heterogeneous reservoirs). Various parameters, such as injection and production well locations, the distance between production and injection wells in an inverted five-spot pattern, pattern angle, and injection and production rates, are explored using a compositional reservoir simulator to assess their impact on the well placement problem. A comprehensive analysis of various boosting algorithms, including AdaBoost, CatBoost, Gradient boosting, LightGBM, and XGBoost, is performed using the simulated dataset to assess their efficacy. The results demonstrate that LightGBM outperformed the other algorithms with the lowest Mean Absolute Error and Root Mean Square Error of 115.3 × 10^6 $ and 188.2 × 10^6 $, respectively. Additionally, it demonstrates exceptional speed, averaging 3 to 8 times faster than the other boosting algorithms across the three reservoir scenarios. This superior performance, coupled with its efficient runtime, makes LightGBM the ideal choice for the study objectives. Moreover, the mass balance approach highlights the significant CO2 storage efficiency, emphasizing the effectiveness of CO2-EOR in storing CO2 in underground heterogeneous reservoirs.
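The boosting regressors compared above share one core loop: repeatedly fit a weak learner to the current residuals and add it with a shrinkage factor. Below is a dependency-free gradient-boosting sketch with depth-1 stumps, assuming 1-D inputs and squared error; it illustrates the shared mechanism, not any specific library's implementation:

```python
def fit_stump(x, r):
    """Best single-split regression stump on 1-D inputs (squared error)."""
    best = None
    order = sorted(range(len(x)), key=lambda i: x[i])
    for cut in range(1, len(x)):
        t = (x[order[cut - 1]] + x[order[cut]]) / 2
        left = [r[i] for i in order[:cut]]
        right = [r[i] for i in order[cut:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((v - lm) ** 2 for v in left)
               + sum((v - rm) ** 2 for v in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda v: lm if v <= t else rm

def gradient_boost(x, y, rounds=100, lr=0.2):
    """Fit stumps to residuals, shrinking each contribution by lr."""
    base = sum(y) / len(y)
    pred = [base] * len(x)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        h = fit_stump(x, resid)
        stumps.append(h)
        pred = [p + lr * h(v) for p, v in zip(pred, x)]
    return lambda v: base + lr * sum(h(v) for h in stumps)
```

LightGBM, XGBoost, and CatBoost differ mainly in how they grow deeper trees, handle histograms and categorical features, and parallelize this loop, which is where the speed differences reported above come from.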
This paper presents a strategy to improve the AdaBoost algorithm with a quadratic combination of base classifiers. We observe that learning this combination is necessary to get better performance and is possible by constructing an intermediate learner operating on the combined linear and quadratic terms. This is not trivial, as the parameters of the base classifiers are not under direct control, obstructing the application of direct optimization. We propose a new method realizing iterative optimization indirectly. First we train a classifier by randomizing the labels of training examples. Subsequently, the input learner is called repeatedly with a systematic update of the labels of the training examples in each round. We show that the quadratic boosting algorithm converges under the condition that the given base learner minimizes the empirical error. We also give an upper bound on the VC-dimension of the new classifier. Our experimental results on 23 standard problems show that quadratic boosting compares favorably with AdaBoost on large data sets at the cost of training speed. The classification time of the two algorithms, however, is equivalent. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
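For reference, the discrete AdaBoost baseline that the quadratic variant extends can be written compactly. This is the textbook algorithm with weak learners drawn from a user-supplied pool; it is not the paper's quadratic extension:

```python
import math

def adaboost(X, y, pool, rounds=20):
    """Discrete AdaBoost sketch.

    X: inputs; y: labels in {-1, +1}; pool: candidate weak learners,
    each a function h(x) -> -1 or +1. Returns the sign of the
    alpha-weighted vote of the selected weak learners.
    """
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # pick the weak learner with the lowest weighted error
        errs = [(sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi), h)
                for h in pool]
        err, h = min(errs, key=lambda t: t[0])
        err = min(max(err, 1e-12), 1 - 1e-12)  # clip to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # re-weight: misclassified points gain weight
        w = [wi * math.exp(-alpha * yi * h(xi)) for wi, xi, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```

The paper's contribution sits on top of this loop: rather than the purely linear vote above, it learns quadratic cross-terms of the base classifiers via an intermediate learner with systematically relabeled training data.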
In the high-dimensional setting, componentwise L2 boosting has been used to construct sparse models that perform well, but it tends to select many ineffective variables. Several sparse boosting methods, such as Sparse L2 boosting and Twin boosting, have been proposed to improve the variable selection of the L2 boosting algorithm. In this article, we propose a new general sparse boosting method (GSboosting). Relations are established between GSboosting and other well-known regularized variable selection methods in the orthogonal linear model, such as the adaptive Lasso and hard thresholding. Simulation results show that GSboosting performs well in both prediction and variable selection.
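Componentwise L2 boosting, the baseline these sparse variants refine, greedily picks one predictor per step and applies shrinkage. A minimal sketch (variable names ours; plain per-coordinate least squares):

```python
def l2_boost(X, y, steps=200, nu=0.1):
    """Componentwise L2 boosting sketch.

    Each step fits the single predictor that best reduces the squared
    residuals and moves its coefficient by a shrunken amount nu * b.
    X: list of rows; y: targets. Returns the coefficient vector.
    """
    p = len(X[0])
    beta = [0.0] * p
    resid = list(y)
    for _ in range(steps):
        best_j, best_b, best_err = 0, 0.0, float("inf")
        for j in range(p):
            xj = [row[j] for row in X]
            sxx = sum(v * v for v in xj) or 1e-12
            b = sum(v * r for v, r in zip(xj, resid)) / sxx
            err = sum((r - b * v) ** 2 for r, v in zip(resid, xj))
            if err < best_err:
                best_j, best_b, best_err = j, b, err
        beta[best_j] += nu * best_b
        resid = [r - nu * best_b * row[best_j] for r, row in zip(resid, X)]
    return beta
```

Because only the best-fitting coordinate moves per step, irrelevant coefficients can stay exactly zero, which is the sparsity property the variants above try to strengthen.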
A revised support vector regression (SVR) ensemble model based on the boosting algorithm (SVR-boosting) is presented in this paper for electricity price forecasting in the electric power market. In light of the characteristics of the electricity price sequence, a new triangular-shaped loss function is constructed in the training of the forecasting model to inhibit learning from abnormal data in the electricity price sequence. Results on actual data indicate that, compared with a single support vector regression model, the proposed SVR-boosting ensemble model remarkably enhances the stability of the model output, achieves higher predictive accuracy, and possesses comparatively satisfactory generalization capability.
ISBN (Print): 9781509006229
Support vector machines (SVM) have been widely applied in flood forecasting models and have achieved good results. However, they have been plagued by two problems. One is over-reliance on the number and quality of the raw input data; the other is that a single model cannot describe the complex relationships hidden in flood evolution processes. To tackle these two problems, this paper presents an SVM flood forecasting model based on kernel principal component analysis (KPCA) and a boosting algorithm. Nonlinear KPCA is applied to extract useful information from historical flood data. To eliminate the interference caused by redundant information, a boosting learning algorithm with multiple SVM models exploits the various distribution characteristics of the historical flood data. Each sub-model focuses on learning a certain type of samples. Finally, the prediction is obtained by combining the multiple models. Application to flood forecasting at the Wangjiaba station on the Huaihe River shows that the proposed SVM ensemble model based on KPCA and boosting learning can effectively improve flood forecasting accuracy.
ISBN (Print): 9781538617342
The main goal of a telecom operator is to build end-user loyalty towards the offered services. Computing the perceived quality, known as Quality of Experience (QoE), has become a crucial research topic. Machine learning algorithms provide a way to tease out the complex relationships between several influencing factors and QoE. This paper proposes a novel QoE estimation model for video services, namely a boosting Support Vector Regression (BSVR) based QoE model. The purpose of this model is to investigate the effectiveness of combining multiple learners, instead of a classical individual learner, to improve the prediction accuracy of QoE. BSVR is based on a combination of two principal techniques: a boosting algorithm and Support Vector Regression (SVR). More precisely, multiple SVR models are trained in an iterative boosting algorithm to create a powerful predictive model. The use of SVRs as weak learners has several advantages. First, SVR is based on a convex optimization problem, where the globally optimal solution exploits a limited number of support vectors, which improves prediction accuracy while maintaining low computational complexity. Second, each SVR uses a flexible Radial Basis Function (RBF) kernel to model QoE data efficiently. A comparative evaluation of the proposed BSVR-based QoE model shows its superiority over relevant ensemble learning methods and single-learner regression models in terms of prediction accuracy and computational complexity.