Band selection is an important step in the efficient processing of hyperspectral images (HSIs), and can be seen as the combination of a powerful band search technique and an effective evaluation criterion. Existing deep-learning-based methods make the network parameters sparse to search the spectral bands using threshold-based functions or regularization terms. These methods may lead to an intractable optimization problem. Furthermore, they need to repeatedly train deep networks to evaluate candidate band subsets. In this article, we formalize hyperspectral band selection as a reinforcement learning (RL) problem. Band search is regarded as a sequential decision-making process, where each state in the search space is a feasible band subset. To evaluate each state, a semisupervised convolutional neural network (CNN), called EvaluateNet, is constructed by adding an intraclass compactness constraint over both the limited labeled and the sufficient unlabeled samples. A simple stochastic band sampling method is designed to train EvaluateNet, making efficient evaluation possible without any fine-tuning. In RL, new reward functions are defined by taking both the EvaluateNet score and a penalty for repeated selection into account. Finally, advantage actor-critic algorithms are designed to explore the state space and select the band subset according to the expected accumulated reward. The experimental results on HSI data sets demonstrate the effectiveness and efficiency of the proposed algorithms for hyperspectral band selection.
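The abstract does not include code, but the reward structure it describes can be sketched roughly as follows. This is a minimal illustration in which a generic scoring callable stands in for EvaluateNet; the function name `step_reward`, the penalty weight `beta`, and the toy `dummy_net` are hypothetical, not the authors' implementation.

```python
# Minimal sketch of a band-selection reward combining an evaluation score
# with a penalty for repeated selection, as the abstract describes.
import torch

def step_reward(evaluate_net, selected, new_band, beta=1.0):
    """Reward for appending `new_band` to the set of already selected bands."""
    if new_band in selected:
        return -beta  # penalize repeated selection of the same band
    subset = sorted(selected | {new_band})
    with torch.no_grad():  # the evaluator is fixed; no fine-tuning per subset
        score = evaluate_net(torch.tensor(subset, dtype=torch.float32))
    return float(score)

# Toy stand-in for EvaluateNet: favors larger subsets slightly.
dummy_net = lambda bands: bands.numel() * 0.1
print(step_reward(dummy_net, {3, 17}, 42, beta=0.5))
```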
This paper develops a decentralized reinforcement learning (RL) scheme for multi-intersection adaptive traffic signal control (TSC), called "CVLight", that leverages data collected from connected vehicles (CVs). The state and reward design facilitates coordination among agents and accounts for travel delays collected by CVs. A novel algorithm, Asymmetric Advantage Actor-Critic (Asym-A2C), is proposed, in which both CV and non-CV information is used to train the critic network, while only CV information is used to execute optimal signal timing. Comprehensive experiments show the superiority of CVLight over state-of-the-art algorithms on a 2-by-2 synthetic road network under various traffic demand patterns and penetration rates. The learned policy is then visualized to further demonstrate the advantage of Asym-A2C. A pre-training technique is applied to improve the scalability of CVLight; it significantly shortens the training time and yields a performance advantage on a 5-by-5 road network. A case study is performed on a 2-by-2 road network located in State College, Pennsylvania, USA, to further demonstrate the effectiveness of the proposed algorithm under real-world scenarios. Compared to other baseline models, the trained CVLight agent can efficiently control multiple intersections solely based on CV data and achieves the best performance, especially under low CV penetration rates.
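A minimal sketch of the asymmetric update described above is given below, assuming a discrete phase-selection action space; the network sizes, feature dimensions, and function signature are illustrative assumptions rather than the CVLight implementation.

```python
# Sketch of an asymmetric actor-critic step: the critic is trained on the
# full state (CV + non-CV data), while the actor only ever sees CV data.
import torch
import torch.nn as nn
from torch.distributions import Categorical

obs_dim, full_dim, n_phases = 8, 16, 4  # CV observation, full state, signal phases
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_phases))
critic = nn.Sequential(nn.Linear(full_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)

def update(cv_obs, full_state, action, reward, next_full_state, gamma=0.99):
    value = critic(full_state)                       # privileged value estimate
    with torch.no_grad():
        target = reward + gamma * critic(next_full_state)
    advantage = (target - value).detach()
    log_prob = Categorical(logits=actor(cv_obs)).log_prob(action)
    loss = (-log_prob * advantage + (target - value).pow(2)).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

update(torch.randn(obs_dim), torch.randn(full_dim), torch.tensor(2),
       1.0, torch.randn(full_dim))
```

At execution time only `actor` is queried, which mirrors the paper's claim that signal timing can be executed from CV information alone.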
Task assignment for vehicles plays an important role in urban transportation systems and is key to cost reduction and efficiency improvement. The development of information technology and the emergence of the "sharing economy" create a more convenient transportation mode, but also pose a greater challenge to the efficient operation of urban transportation systems. On the one hand, given the complex and dynamic environment of urban transportation, an efficient method for assigning transportation tasks to idle vehicles is desired. On the other hand, to meet users' expectations of an immediate vehicle response, the task assignment problem with dynamically arriving tasks remains to be resolved. In this study, we propose a dynamic task assignment method for vehicles in urban transportation systems based on multi-agent reinforcement learning (RL). The transportation task assignment problem is transformed into a stochastic game from the vehicles' perspective, and an extended actor-critic (AC) algorithm is then employed to obtain the optimal strategy. With the proposed method, vehicles can independently make decisions in real time, eliminating substantial communication cost. Compared with methods based on the first-come-first-served (FCFS) rule and the classic contract net algorithm (CNA), the results show that the proposed method achieves higher acceptance and profit rates over the service cycle.
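As a rough illustration of such decentralized decision-making, the sketch below scores candidate tasks independently per vehicle, assuming a fixed-size task menu with a validity mask; the feature layout and network are invented for illustration, not taken from the paper.

```python
# Sketch: each vehicle runs its own small actor to pick among currently
# available tasks, so no inter-vehicle communication is needed at run time.
import torch
import torch.nn as nn
from torch.distributions import Categorical

task_feat, max_tasks = 6, 10  # features per candidate task, menu size
actor = nn.Sequential(nn.Linear(task_feat, 32), nn.ReLU(), nn.Linear(32, 1))

def choose_task(task_features, available_mask):
    """Score each candidate task independently and sample one index."""
    logits = actor(task_features).squeeze(-1)           # shape (max_tasks,)
    logits = logits.masked_fill(~available_mask, -1e9)  # hide unavailable slots
    return Categorical(logits=logits).sample()

tasks = torch.randn(max_tasks, task_feat)
mask = torch.tensor([True] * 4 + [False] * 6)           # only 4 tasks open
print(choose_task(tasks, mask))
```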
Exploration is one of the key issues in deep reinforcement learning, especially in environments with sparse or deceptive rewards. Exploration based on intrinsic rewards can handle these environments. However, existing methods cannot take both global interaction dynamics and local environment changes into account simultaneously. In this paper, we propose a novel intrinsic reward for off-policy learning, which not only encourages the agent to take actions not yet fully learned from a global perspective, but also instructs the agent to trigger remarkable changes in the environment from a local perspective. We further propose a double-actors-double-critics framework to combine intrinsic rewards with extrinsic rewards, avoiding the inappropriate combination of the two used in previous methods. This framework can be applied to off-policy learning algorithms based on the actor-critic method. We provide a comprehensive evaluation of our approach on the MuJoCo benchmark environments. The results demonstrate that our method performs effective exploration in environments with dense, deceptive, and sparse rewards. In addition, we conduct extensive ablation and quantitative analyses of the intrinsic rewards. Furthermore, we verify the superiority and rationality of our double-actors-double-critics framework through comparative experiments.
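The double-critics part of the framework can be sketched as follows: instead of summing intrinsic and extrinsic rewards into one signal, each reward stream gets its own value estimate and only the resulting advantages are mixed. The mixing weight `eta`, the network shapes, and the one-step advantage form are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of separate critics for extrinsic and intrinsic rewards; each
# critic is regressed on its own return, and only the advantages are mixed.
import torch
import torch.nn as nn

state_dim = 4
critic_ext = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
critic_int = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def mixed_advantage(s, r_ext, r_int, s_next, gamma=0.99, eta=0.5):
    with torch.no_grad():
        adv_ext = r_ext + gamma * critic_ext(s_next) - critic_ext(s)
        adv_int = r_int + gamma * critic_int(s_next) - critic_int(s)
    return adv_ext + eta * adv_int  # drives the policy update

s, s_next = torch.randn(state_dim), torch.randn(state_dim)
print(mixed_advantage(s, r_ext=1.0, r_int=0.3, s_next=s_next))
```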
Reinforcement learning (RL) applications require a huge effort to become established in real-world environments, due to the injury and breakdown risks during interactions between the RL agent and the environment in the online training process. In addition, the RL platform tools required to reduce these real-world challenges (e.g., Python OpenAI's Gym, Unity ML-Agents, PyBullet, DART, MuJoCo, RaiSim, Isaac, and AirSim) suffer from drawbacks such as a limited number of examples and applications, and difficulties in implementing RL algorithms arising from the programming language. This paper presents an integrated RL framework, based on Python-Unity interaction, to demonstrate the ability to create a new RL platform tool built on stable user datagram protocol (UDP) communication between the RL agent algorithm (developed in the Python programming language as a server) and the simulation environment (created in the Unity simulation software as a client). This Python-Unity integration increases the advantages of the overall RL platform (i.e., flexibility, scalability, and robustness) and allows different environment specifications to be created. It also addresses the challenge of implementing and developing RL algorithms. The proposed framework is validated by applying two popular deep RL algorithms, Vanilla Policy Gradient (VPG) and advantage actor-critic (A2C), to an elevation control challenge for a quadcopter drone. The validation results demonstrate that the proposed framework is suitable for RL applications: both implemented algorithms achieve high stability, converging to the required performance through the semi-online training process.
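The Python side of such a UDP loop might look like the sketch below; the port, the JSON message layout, and the `policy` placeholder are illustrative assumptions, not the paper's actual protocol.

```python
# Sketch of the Python (server) half of a Python-Unity UDP exchange:
# receive the simulated drone state from Unity, reply with an action.
import json
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 5005))  # Unity (client) sends state packets here

def policy(state):
    # Placeholder for VPG/A2C action selection on the elevation task.
    return {"thrust": 0.5}

while True:
    data, addr = sock.recvfrom(4096)          # blocking receive from Unity
    state = json.loads(data.decode("utf-8"))  # e.g. {"altitude": ..., "velocity": ...}
    action = policy(state)
    sock.sendto(json.dumps(action).encode("utf-8"), addr)
```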
Searching for the optimal injection molding settings for a new product usually requires much time and money. This article proposes a new method that uses reinforcement learning with prior knowledge to optimize these settings. The method uses an actor-critic algorithm to optimize the filling phase and the holding phase. For five different injection molded products, the filling phase and holding phase were adjusted with this method. The learning algorithm optimized the settings for one product (pre-learning) and used the acquired knowledge (prior knowledge) to optimize the injection molding settings for a new product (post-learning). This research shows that the method is able to optimize the injection molding parameters in a reasonable time even when the prior knowledge is derived from a product with a different material, gate design, or even geometry. On average, fewer than 16 injection molding cycles were needed for the algorithm to optimize the filling phase and fewer than 10 cycles to optimize the holding phase. The presented method can greatly facilitate the development of self-adjusting injection molding machines.
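The pre-learning/post-learning transfer can be sketched as a plain checkpoint handoff, assuming PyTorch-based actor-critic networks; the file name, network shapes, and setting dimension are hypothetical stand-ins, not the article's implementation.

```python
# Sketch: save the actor-critic networks optimized on product A and reload
# them as the starting point (prior knowledge) for a new product B.
import torch
import torch.nn as nn

def make_nets(setting_dim=5):  # e.g. pressures, speeds, times per phase
    actor = nn.Sequential(nn.Linear(setting_dim, 32), nn.Tanh(),
                          nn.Linear(32, setting_dim))
    critic = nn.Sequential(nn.Linear(setting_dim, 32), nn.Tanh(),
                           nn.Linear(32, 1))
    return actor, critic

# Pre-learning on product A, then persist the acquired knowledge.
actor_a, critic_a = make_nets()
torch.save({"actor": actor_a.state_dict(), "critic": critic_a.state_dict()},
           "product_a.pt")

# Post-learning on product B starts from product A's parameters.
actor_b, critic_b = make_nets()
ckpt = torch.load("product_a.pt")
actor_b.load_state_dict(ckpt["actor"])
critic_b.load_state_dict(ckpt["critic"])
```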
ISBN (Print): 9781509021949
In the recent past, there has been an exponential increase in data-intensive services over communication networks. This trend will be sustained in future communication networks as well, especially in Wi-Fi networks. It can be attributed to the rapid growth of business and institutional entities and the need for cellular data offloading, for which localized Wi-Fi networks are preferred due to their higher offered data rates. In such networks, a major portion of energy consumption occurs at the access network entities, making energy-efficient operation of Wi-Fi access points (APs) extremely crucial. In this paper, an actor-critic (AC) reinforcement learning (RL) framework is designed to enable traffic-based ON/OFF switching of APs in a Wi-Fi network. Furthermore, previously estimated traffic statistics are exploited in future scenarios, which speeds up the learning process and provides additional improvement in energy saving. An important feature of the present study is the validation of the proposed framework on real data collected from an institute's Wi-Fi network. Simulation results for 20 APs of a Wi-Fi network show that the proposed framework can lead to around 75% savings in energy consumption compared to the case when AP switching is not considered.
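A minimal sketch of the actor side of such a switching policy is shown below, assuming per-AP traffic features and independent ON/OFF decisions; the feature count and network are invented for illustration and do not reflect the paper's state or action design.

```python
# Sketch: map per-AP traffic statistics to independent ON/OFF probabilities;
# switching off lightly loaded APs is what produces the energy savings.
import torch
import torch.nn as nn
from torch.distributions import Bernoulli

n_aps, feat = 20, 3  # 20 APs, a few traffic features per AP
actor = nn.Sequential(nn.Linear(feat, 16), nn.ReLU(), nn.Linear(16, 1))

def switch_decision(traffic):  # traffic: (n_aps, feat)
    p_on = torch.sigmoid(actor(traffic)).squeeze(-1)
    return Bernoulli(probs=p_on).sample()  # 1 = keep AP on, 0 = switch off

print(switch_decision(torch.rand(n_aps, feat)))
```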