The three-dimensional (3D) alignment design of the underground logistics system (ULS) is a key factor in determining the rationality of its underground infrastructure layout. However, existing research mostly focuses on two-dimensional horizontal alignment optimization while neglecting alignment design at the vertical spatial scale, which makes it difficult for research results to effectively support project implementation. Furthermore, traditional operations research methods struggle to handle the massive underground space data processing tasks required for the detailed design of ULS alignment. Therefore, this study considers the dual attributes of ULS as both underground infrastructure and logistics infrastructure, proposing an innovative deep reinforcement learning (DRL) method to achieve 3D alignment planning. First, a DRL model was developed considering the key design factors of underground infrastructure alignment, such as construction cost, space suitability, and underground space resources. Second, given the large and complex optimization search space of the problem, a curriculum learning-based proximal policy optimization (CL-PPO) algorithm was proposed to solve the model efficiently. Finally, based on the Suzhou ULS case, simulations of alignment optimization results under different planning orientations were conducted to demonstrate the effectiveness of the model and algorithm. Results show that CL-PPO has significant advantages over PPO in computational efficiency and global optimization capability. Additionally, planning orientations have a significant impact on the ULS alignment layout and project construction cost. This optimization method not only enriches ULS infrastructure planning theory but also provides spatial layout guidance for the utilization of urban underground space.
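The curriculum learning idea behind CL-PPO can be illustrated with a minimal sketch: training starts on an easier version of the search problem and is promoted to harder stages once recent performance is good enough. The stage definitions, promotion threshold, and window size below are illustrative assumptions, not the paper's exact scheme.

```python
# Hypothetical curriculum schedule: begin the alignment search on a small,
# coarse candidate space and widen it as the agent's success rate improves.
class CurriculumSchedule:
    def __init__(self, stages, promote_at=0.8, window=100):
        self.stages = stages          # task difficulties, easiest first
        self.promote_at = promote_at  # success rate required to advance
        self.window = window          # episodes per evaluation window
        self.stage = 0
        self.results = []

    def record(self, success):
        """Log one episode outcome; promote when the window's rate is high enough."""
        self.results.append(1.0 if success else 0.0)
        if len(self.results) >= self.window:
            rate = sum(self.results) / len(self.results)
            if rate >= self.promote_at and self.stage < len(self.stages) - 1:
                self.stage += 1
            self.results = []

    def current_task(self):
        return self.stages[self.stage]
```

A PPO trainer would call `current_task()` each episode and `record()` afterward, so the curriculum advances only when the policy has mastered the current stage.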
ISBN (electronic): 9789887581581
ISBN (print): 9798350366907
Door opening, one of the most common actions in daily life, has become an important direction for robotic arm applications. Different door handles open in different ways; to enable the robotic arm to complete the corresponding door-opening operation according to the handle category, the proximal policy optimization (PPO) algorithm is used for door opening. Opening a door comprises a multi-segment process: approaching the handle, operating the handle, and pushing the door open. A sparse reward that focuses only on the final result of opening the door prolongs the robotic arm's training time or even prevents convergence. To address this problem, this paper proposes a segmented adaptive reward. First, considering the segmented nature of the door-opening task, a segmented reward is designed and segmented training rules are formulated to gradually guide the robotic arm and improve the overall training effect. At the same time, the reward incorporates an adaptive weight adjustment mechanism, which adjusts the weights according to the attention the current stage pays to different sub-tasks and then matches the segmented training to accelerate training. In a simulation environment, experimental results show that the door-opening success rate of our algorithm is 61.04% higher than that of the original PPO algorithm, and it can accomplish the round-handle opening task that the original algorithm cannot solve.
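The segmented adaptive reward described above can be sketched as follows. The phase names and the weight-update rule are assumptions for illustration; the paper's exact formulation is not given in the abstract.

```python
# Minimal sketch of a segmented reward with adaptive phase weights for the
# three door-opening phases named in the abstract.
PHASES = ("approach", "operate", "push")

def segmented_reward(phase, progress, weights):
    """Reward only the currently active phase, scaled by its adaptive weight."""
    return weights[phase] * progress

def adapt_weights(weights, phase, lr=0.1):
    """Shift weight toward the phase currently under training, then renormalize."""
    w = dict(weights)
    w[phase] += lr
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}
```

During training, the active phase's weight grows while the others shrink proportionally, so the dense per-phase signal replaces the sparse open/closed outcome reward.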
Authors: Wu, Yuejia; Zhou, Jiantao (Inner Mongolia Univ, Coll Comp Sci; Minist Educ Natl & Local Joint Engn Lab; Inner Mongolia Engn Lab Cloud Comp & Serv Software; Inner Mongolia Key Lab Social Comp & Data Proc; Hohhot, Peoples R China)
ISBN (print): 9783030821364; 9783030821357
Knowledge Graphs (KGs) are often incomplete and sparse. Knowledge graph reasoning aims to complete the KG by predicting missing paths between entities. Reinforcement learning (RL) based methods are among the state-of-the-art approaches to this task. However, existing RL-based methods have problems such as unstable training and poorly designed reward functions. Although the DIVINE framework, a novel plug-and-play framework based on generative adversarial imitation learning, improved existing RL-based algorithms without extra reward engineering, its rate of policy update is slow. This paper proposes the EN-DIVINE framework, which uses proximal policy optimization to perform gradient descent when the discriminator parameters take policy steps, improving the framework's training speed. Experimental results show that our work provides a measurable improvement over the DIVINE framework.
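The core of the generative adversarial imitation learning setup that DIVINE builds on can be sketched in one function: the discriminator's score on a reasoning path is converted into a reward for the policy, which a PPO-style update then maximizes. The exact reward shape used by DIVINE/EN-DIVINE is not stated in the abstract, so the standard GAIL-style form below is an assumption.

```python
import math

# GAIL-style imitation reward (assumed form): the reward grows as the
# discriminator becomes more convinced the sampled path is expert-like.
def imitation_reward(disc_score, eps=1e-8):
    """disc_score in (0, 1): discriminator's probability that the path is expert-like."""
    return -math.log(max(1.0 - disc_score, eps))
```

The policy never sees a hand-engineered reward; it only sees `imitation_reward`, which is why no extra reward engineering is needed.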
In recent years, imperfect information games have become an important touchstone for testing the level of artificial intelligence. There are many imperfect information game scenarios in the real world, such as economic transactions, military games, and automatic driving. Therefore, the study of imperfect information game problems has very important practical significance. Guandan is a type of imperfect information card game with four players divided into two teams. The massive hidden information in the Guandan game leads to a high-dimensional game state space. Reinforcement learning algorithms search strategies efficiently in computer games, but they cannot converge under the conditions of imperfect information and the high-dimensional state space caused by Guandan. To address these problems, this paper introduces the proximal policy optimization (PPO) algorithm, based on deep reinforcement learning, to handle imperfect information, the high-dimensional state space, and the action space. It enables the agent to perceive high-dimensional information and make decisions according to the acquired information. The experimental results show that the decision model based on the proximal policy optimization algorithm exceeds the intelligence level of the policy gradient algorithm and the A2C algorithm, which proves that the system has a self-learning ability to improve its Guandan playing level.
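Since all four abstracts rely on PPO, its defining ingredient is worth stating concretely: the clipped surrogate objective, which limits how far a single update can move the policy. The sketch below is a minimal numpy version under standard assumptions (log-probabilities as inputs, clip range 0.2); it is not any one paper's implementation.

```python
import numpy as np

# PPO's clipped surrogate objective: the probability ratio between the new and
# old policies is clipped to [1 - eps, 1 + eps], and the pessimistic (minimum)
# term is kept, so large destabilizing policy steps earn no extra credit.
def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Return the negative clipped surrogate; minimizing it improves the policy safely."""
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new policies the ratio is 1 and the loss reduces to the negative mean advantage; once the ratio drifts past 1 ± eps, the gradient through the clipped term vanishes, which is what stabilizes training in high-dimensional settings like Guandan.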