Author affiliation: UCL, Department of Chemical Engineering, Centre for Process Systems Engineering, Torrington Place, London WC1E 7JE, England
Publication: Expert Systems with Applications
Year/Volume/Issue: 2019, Vol. 121
Pages: 362-372
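Subject classification: 1201 [Management: Management Science and Engineering (degrees in Management or Engineering may be conferred)]; 0808 [Engineering: Electrical Engineering]; 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in Engineering or Science may be conferred)]
Funding: UK Leverhulme Trust [RPG-2015-240]
Subjects: Mathematical programming; Regression analysis; Optimisation; Information criterion; Machine learning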
Abstract: Regression is a predictive analysis tool that examines the relationship between independent and dependent variables. The goal of this analysis is to fit a mathematical function that describes how the value of the response changes when the values of the predictors vary. The simplest form of regression is linear regression, which, in the case of multiple regression, tries to explain the data by fitting a hyperplane that minimises the absolute error of the fit. Piecewise regression analysis partitions the data into multiple regions and fits a regression function to each one. One such approach is the OPLRA (Optimal Piecewise Linear Regression Analysis) model (Yang, Liu, Tsoka, & Papage, 2016), a mathematical programming approach that optimally partitions the data into multiple regions and fits a linear regression function to each, minimising the Mean Absolute Error between prediction and truth. However, using many regions to describe the data can lead to overfitting and poor predictive performance. In this work, an extension of the OPLRA model is proposed that addresses both the selection of the optimal number of regions and overfitting. To this end, information criteria such as the Akaike and Bayesian information criteria are used, which reward predictive accuracy and penalise model complexity. © 2018 Published by Elsevier Ltd.
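The trade-off described in the abstract, where more regions fit better but cost more parameters, can be illustrated with a small sketch. This is not the OPLRA model: for brevity it swaps in ordinary least squares in place of the Mean Absolute Error objective, splits a single predictor into equal-count regions instead of optimising the partition with mathematical programming, and uses the common least-squares forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), which may differ from the exact expressions in the paper. All function names and the synthetic data are hypothetical.

```python
import math
import random


def line_rss(xs, ys):
    """Residual sum of squares of a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def piecewise_rss(xs, ys, n_regions):
    """Total RSS when one line is fitted per region.

    Assumption: regions hold equal numbers of points after sorting by x.
    OPLRA instead chooses the break-points by optimisation.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = 0.0
    for r in range(n_regions):
        segment = pairs[r * n // n_regions:(r + 1) * n // n_regions]
        total += line_rss([p[0] for p in segment], [p[1] for p in segment])
    return total


def criterion(rss, n, k, kind="bic"):
    """Least-squares form of the AIC or BIC for k fitted parameters."""
    penalty = 2 * k if kind == "aic" else k * math.log(n)
    return n * math.log(rss / n) + penalty


def select_regions(xs, ys, max_regions, kind="bic"):
    """Return the number of regions with the lowest criterion value."""
    n = len(xs)
    scores = {}
    for r in range(1, max_regions + 1):
        k = 2 * r  # one slope and one intercept per region
        scores[r] = criterion(piecewise_rss(xs, ys, r), n, k, kind)
    return min(scores, key=scores.get)


# Synthetic data with a single kink at x = 0.5 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 199 for i in range(200)]
ys = [(1 + 2 * x if x < 0.5 else 4 - 4 * x) + rng.gauss(0, 0.05) for x in xs]

print(select_regions(xs, ys, max_regions=5, kind="bic"))
```

On data like this, generated from two linear pieces, the penalty term should steer the criterion towards two regions even though four or more regions would lower the raw fitting error slightly.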