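Author affiliation: Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
Publication: ANALYTICA CHIMICA ACTA (Anal. Chim. Acta)
Year/volume/issue: 2025, vol. 1341
Page: 343655
Core indexing:
Subject classification: 081704 [Engineering - Applied Chemistry]; 07 [Science]; 08 [Engineering]; 0817 [Engineering - Chemical Engineering and Technology]; 070302 [Science - Analytical Chemistry]; 0703 [Science - Chemistry]
Funding: National Natural Science Foundation of China
Subjects: Spectral analysis; Variable selection; Multi-objective optimization; Evolutionary algorithm; Roulette probability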
Abstract: Background: In spectral analysis, selecting the right spectral variables is crucial for effective modeling. It reduces data dimensionality, removes irrelevant wavelength points, and improves both the generalization ability and computational efficiency of the model. However, the number of available samples often falls short of the total possible combinations of wavelengths, making variable selection a non-deterministic polynomial-time (NP) hard optimization problem. The current dedicated variable selection and heuristic algorithms fail to balance the effectiveness and speed of variable selection. Therefore, there is a great need for a more advanced approach to address this problem. Results: In this paper, we adopt a different perspective by considering variable selection as a large-scale sparse multi-objective optimization problem, modeled with fewer variables to achieve lower prediction errors. Then a novel interval sparse evolutionary algorithm (ISEA) was proposed, merging the benefits of dedicated variable selection algorithms and evolutionary algorithms. It incorporates a roulette probability mechanism and enhances the selection probability of key informative variables through a sparse population initialization strategy (SPIS) and a regional sparse evolution strategy (RSES). Specifically, the SPIS prioritizes variable regions through interval partial least squares (iPLS) and initializes the sparse population based on regional roulette probability, thereby enhancing the likelihood of selection of important regional variables in the initial sparse population. The RSES further focuses on more important regions, ensuring the variables in more important regions have a higher survival probability in subsequent generations. Significance: Applied to datasets of corn oil, soil, and diesel fuels, ISEA outperforms nine state-of-the-art methods by maintaining both the effectiveness of variable selection and running speed.
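The roulette-based sparse initialization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the inverse-error weighting, the equal-width intervals, and the names `interval_probabilities` and `init_sparse_population` are all assumptions made for illustration. The per-interval errors stand in for scores that an iPLS run might produce and are supplied here as plain numbers.

```python
import random


def interval_probabilities(interval_errors):
    """Map per-interval errors (lower is better) to roulette probabilities.

    Assumption: weight = 1 / error, normalized to sum to 1.
    """
    weights = [1.0 / e for e in interval_errors]
    total = sum(weights)
    return [w / total for w in weights]


def init_sparse_population(n_vars, n_intervals, interval_errors,
                           pop_size, n_selected, seed=0):
    """Build binary masks with exactly n_selected variables switched on.

    Each variable is drawn by first spinning a roulette wheel over the
    intervals, then picking a wavelength index uniformly inside the chosen
    interval, so better-scoring regions are sampled more often.
    """
    rng = random.Random(seed)
    probs = interval_probabilities(interval_errors)
    width = n_vars // n_intervals
    # Equal-width intervals; the last one absorbs any remainder.
    bounds = [(i * width, (i + 1) * width if i < n_intervals - 1 else n_vars)
              for i in range(n_intervals)]
    population = []
    for _ in range(pop_size):
        mask = [0] * n_vars
        chosen = 0
        while chosen < n_selected:
            k = rng.choices(range(n_intervals), weights=probs)[0]
            lo, hi = bounds[k]
            j = rng.randrange(lo, hi)
            if mask[j] == 0:  # skip variables already selected
                mask[j] = 1
                chosen += 1
        population.append(mask)
    return population


if __name__ == "__main__":
    # Hypothetical errors for three intervals; the first is the most informative.
    pop = init_sparse_population(n_vars=30, n_intervals=3,
                                 interval_errors=[0.5, 1.0, 2.0],
                                 pop_size=4, n_selected=5, seed=1)
    for mask in pop:
        print("".join(map(str, mask)))
```

Each printed row is one individual; on average, more of its selected variables should fall in the first interval than in the last, which is the bias the abstract attributes to SPIS.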
Additionally, unlike dedicated variable se