To reduce the construction risk caused by human operating errors and to improve the geological adaptability of the shield machine, this study proposes an autonomous intelligent control method for shield machines within an interaction-judgment-decision framework, based on deep deterministic policy gradient (DDPG) deep reinforcement learning. Because the shield machine's tunneling parameters are strongly nonlinearly related, a deep reinforcement learning environment is built from a mechanism model of the sealed cabin pressure. A DDPG agent model is established to stand in for the shield machine, interacting with and training on the geological environment. By minimizing the difference between the target pressure setpoint and the sealed cabin pressure, the agent achieves a dynamic balance between the sealed cabin pressure and the pressure on the excavation surface and obtains the optimal policy. Through real-time interaction with the geological environment, the method dynamically adjusts the tunneling parameters, accurately controls the sealed cabin pressure, and adapts well to changing geology. By making the choice of tunneling parameters an intelligent decision, it improves the shield machine system's capacity for independent decision-making, reduces the inaccuracy of human operation, and supports the efficient and safe operation of the shield machine. This study applies deep reinforcement learning to the control of earth pressure balance shield machines and offers a new approach for developing AI-based construction technology in engineering.
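The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: a reward that grows as the sealed cabin pressure approaches the target setpoint, and the soft target-network update that is standard in DDPG. The function names, the absolute-error form of the reward, and the value of `tau` are illustrative assumptions, not the authors' implementation.

```python
def pressure_reward(target_pressure: float, chamber_pressure: float) -> float:
    """Hypothetical reward: negative absolute tracking error.

    The reward is highest (zero) when the sealed cabin pressure equals the
    target setpoint, so maximizing it minimizes the pressure difference.
    """
    return -abs(target_pressure - chamber_pressure)


def soft_update(target_params, online_params, tau=0.005):
    """Standard DDPG soft target update: theta' <- tau*theta + (1 - tau)*theta'.

    Parameters are plain lists of floats here to keep the sketch
    dependency-free; a real agent would apply this to network weights.
    """
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

In this sketch a pressure gap of 0.5 (in whatever unit the environment uses) yields a reward of -0.5, and a small `tau` makes the target parameters trail the online parameters slowly, which is the usual way DDPG stabilizes training.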
This paper proposes a method for finding the shortest path of a mobile robot using deep reinforcement learning, based on the proximal policy optimization (PPO) algorithm enhanced with curriculum learning. By modelling the environment in 3D space using the Webots simulator, we extend the PPO algorithm to handle continuous states from 8 IR sensors and to control the velocities of the two motors of the E-puck robot. Our study integrates curriculum learning into the PPO framework, aiming to improve adaptability and training efficiency in complex environments. A comparative analysis is conducted between the modified PPO, the original PPO, and the deep deterministic policy gradient algorithm, highlighting the strengths of our approach. The results demonstrate that our curriculum-augmented PPO algorithm not only accelerates training but also adapts and generalizes better in new environments. This work underscores the potential of curriculum learning to enhance the performance of deep reinforcement learning algorithms for robust and efficient robotic navigation.
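The abstract does not say how the curriculum is scheduled. As a hedged illustration of the general idea, the sketch below moves training to a harder stage once the recent success rate passes a threshold. The class name `CurriculumScheduler`, the window size, and the threshold are all hypothetical and are not taken from the paper.

```python
from collections import deque


class CurriculumScheduler:
    """Hypothetical success-rate curriculum: stages are ordered easy to hard."""

    def __init__(self, num_stages, window=20, threshold=0.8):
        self.num_stages = num_stages
        self.threshold = threshold
        self.stage = 0                      # index of the current difficulty stage
        self.recent = deque(maxlen=window)  # outcomes of the latest episodes

    def record(self, success: bool) -> int:
        """Log one episode outcome and return the stage for the next episode."""
        self.recent.append(1.0 if success else 0.0)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and sum(self.recent) / len(self.recent) >= self.threshold:
            if self.stage < self.num_stages - 1:
                self.stage += 1
                self.recent.clear()         # start fresh statistics on the new stage
        return self.stage
```

A training loop would call `record` after each episode and configure the simulated environment (for example, obstacle count or goal distance) from the returned stage index, while the PPO update itself stays unchanged.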