Surface roughness is a key indicator of product quality. Traditional measurement methods generally rely on manual trials and inspections, which are time-consuming and costly. Therefore, developing an accurate and reliable prediction model is vital for automatic grinding, as it enables more informed decision-making, reduces waste, and improves efficiency. However, current model- and data-driven methods suffer from either low accuracy or poor interpretability. To address these challenges, this article proposes an enhanced hybrid framework that integrates a mechanistic model-based predictor with a knowledge-based fuzzy broad learning system (KBFBLS) for error correction. The mechanistic model offers a physically interpretable baseline estimate of surface roughness, while KBFBLS enhances prediction accuracy by learning the mapping from process parameters to the residual errors of the mechanistic model. Built upon the fuzzy broad learning system (FBLS)—an emerging neuro-fuzzy model for efficient nonlinear modeling—KBFBLS integrates expert knowledge-guided fuzzy partition and variance-determined Gaussian membership function (MF) widths, two novel strategies that further improve the system’s expressiveness and adaptability, making it a powerful error corrector. Experimental results on real-world robotic disk grinding tasks show that the proposed framework outperforms representative model-driven, data-driven, and hybrid methods. Furthermore, its adaptability to other machining processes is validated using the wheel grinding and milling datasets.
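The residual-correction pattern described above can be sketched generically. In the snippet below, `mechanistic_baseline` is a made-up physics formula and a linear least-squares model stands in for the KBFBLS error corrector; only the structure (physically grounded baseline plus learned residual) follows the abstract, and all names and values are illustrative.

```python
import numpy as np

def mechanistic_baseline(params):
    """Hypothetical physics-based roughness estimate (stand-in formula)."""
    force, speed = params[:, 0], params[:, 1]
    return 0.8 * force / (speed + 1.0)

def fit_residual_corrector(params, measured):
    """Stand-in for KBFBLS: a least-squares model of the baseline's
    residual errors as a function of the process parameters."""
    residual = measured - mechanistic_baseline(params)
    X = np.column_stack([params, np.ones(len(params))])
    w, *_ = np.linalg.lstsq(X, residual, rcond=None)
    return w

def hybrid_predict(params, w):
    X = np.column_stack([params, np.ones(len(params))])
    return mechanistic_baseline(params) + X @ w

# Synthetic demo: ground truth = baseline + a linear residual term
rng = np.random.default_rng(0)
P = rng.uniform(1.0, 5.0, size=(200, 2))
truth = mechanistic_baseline(P) + 0.3 * P[:, 0] - 0.1 * P[:, 1] + 0.05
w = fit_residual_corrector(P, truth)
err = np.abs(hybrid_predict(P, w) - truth).max()
print(err < 1e-8)
```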
In this paper, a policy iteration-based Q-learning algorithm is proposed to solve infinite-horizon linear nonzero-sum quadratic differential games with completely unknown dynamics. The Q-learning algorithm, which employs off-policy reinforcement learning (RL), can learn the Nash equilibrium and the corresponding value functions online, using the data sets generated by behavior policies. First, we prove equivalence between the proposed off-policy Q-learning algorithm and an offline PI algorithm by selecting specific initially admissible policies that can be learned online. Then, the convergence of the off-policy Q-learning algorithm is proved under a mild rank condition that can be easily met by injecting appropriate probing noises into behavior policies. The generated data sets can be reused throughout the learning process, which is computationally efficient. The simulation results demonstrate the effectiveness of the proposed Q-learning algorithm.
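For context, the offline policy iteration that the off-policy Q-learning algorithm is proven equivalent to reduces, in the single-player case, to Kleinman's iteration for the continuous-time LQR. A minimal sketch with known dynamics follows; the paper's setting is model-free and multi-player, and the system matrices here are purely illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Toy system (hypothetical values); PI needs an initially admissible gain K0.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # Hurwitz, so K0 = 0 is admissible
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K = np.zeros((1, 2))
for _ in range(20):
    Ak = A - B @ K
    # Policy evaluation: solve the Lyapunov equation Ak' P + P Ak = -(Q + K' R K)
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement: K = R^{-1} B' P
    K = np.linalg.solve(R, B.T @ P)

# Kleinman iteration converges to the Riccati solution
P_are = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_are, atol=1e-8))
```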
This paper considers coordinated welding control based on deep multi-agent reinforcement learning. Discrete-time states and actions with local observation for the welding robots and the characteristics of the weld lines (e.g., the starting and ending points of the weld lines) are defined, which makes the problem suitable for the monotonic value function factorisation for deep multi-agent reinforcement learning (QMIX) algorithm. A novel reward composed of trajectory optimization, collision avoidance, and task completion is designed, and its effectiveness is verified by simulation in a grid-world environment.
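The composite reward can be sketched as follows; the weights and the distance-based progress term are illustrative assumptions, not values from the paper:

```python
def welding_reward(dist_to_weld_line, prev_dist, collided, task_done,
                   w_traj=1.0, w_coll=5.0, w_done=10.0):
    """Composite reward sketch (weights are hypothetical):
    - trajectory term: positive when the robot moves closer to the weld line
    - collision term: fixed penalty on collision
    - completion term: bonus when the weld task is finished
    """
    r = w_traj * (prev_dist - dist_to_weld_line)   # progress toward the weld line
    if collided:
        r -= w_coll
    if task_done:
        r += w_done
    return r

print(welding_reward(0.5, 1.0, collided=False, task_done=False))  # progress only
print(welding_reward(0.5, 1.0, collided=True, task_done=True))
```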
A nonlinear robust trajectory tracking strategy for a gliding hypersonic vehicle with an aileron stuck at an unknown position is presented in this paper. First, the components of the translational motion dynamics perpendicular to the velocity are derived, and a guidance law based on a time-varying sliding mode method is used to realize trajectory tracking. Furthermore, the rotational equations of motion are separated into an actuated subsystem and an unactuated subsystem, and an adaptive time-varying sliding mode attitude controller is designed on the actuated subsystem to track the commanded attitude, thereby enhancing tracking performance and robustness. The proposed guidance law and attitude controller enable the hypersonic vehicle to fly along the reference trajectory even when the aileron is stuck at an unknown angle. Finally, a hypersonic benchmark platform is used to demonstrate the effectiveness of the proposed strategy.
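A time-varying sliding mode law of the kind used in the guidance loop can be illustrated on a double integrator. The surface slope schedule c(t), the gains, and the tanh smoothing of the sign function below are all illustrative choices, not the paper's design.

```python
import numpy as np

# Time-varying sliding surface s = e_dot + c(t) * e; c(t) grows during an
# initial phase, stiffening the surface (an illustrative schedule).
def c(t):
    return 1.0 + 2.0 * min(t, 1.0)

dt, T = 1e-3, 8.0
x, xd = 2.0, 0.0          # plant x_ddot = u; reference trajectory is x_ref = 0
k = 5.0                   # reaching gain (hypothetical)
for i in range(int(T / dt)):
    t = i * dt
    e, ed = x, xd         # tracking errors w.r.t. the zero reference
    s = ed + c(t) * e
    u = -c(t) * ed - k * np.tanh(s / 0.05)   # tanh smooths sign() to avoid chatter
    xd += u * dt          # forward-Euler integration of the double integrator
    x += xd * dt

print(abs(x) < 1e-2 and abs(xd) < 1e-2)
```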
ISBN (print): 9781665426480
This paper proposes a self-attention based temporal intrinsic reward model for reinforcement learning (RL), to synthesize the control policy for an agent constrained by sparse rewards in partially observable environments. This approach alleviates the temporal credit assignment problem to some extent and addresses the low efficiency of exploration. We first introduce a sequence-based self-attention mechanism to generate temporal features, which effectively capture the temporal properties of the agent's task. During training, the temporal features of each sampled episode are used to compute intrinsic rewards, which are combined with the extrinsic reward to help the agent learn a feasible policy. We then use meta-gradients to update the intrinsic reward model so that the agent achieves better performance. Experiments demonstrate the superiority of the proposed method.
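The two ingredients, self-attention over an episode's feature sequence and mixing intrinsic with extrinsic rewards, can be sketched in NumPy. The identity projections, the novelty-style intrinsic reward, and the weight beta are placeholders; the paper's model is learned and updated by meta-gradients.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over an episode's feature sequence.
    X: (T, d) observation features. Query/key/value projections are the
    identity here for brevity; a real model would learn them."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)             # row-wise softmax
    return A @ X                                  # temporal features

rng = np.random.default_rng(0)
episode = rng.normal(size=(10, 4))                # 10 steps, 4-dim features
feat = self_attention(episode)

# Intrinsic reward sketch: novelty of each step's attended feature,
# mixed with a sparse extrinsic reward (beta is a hypothetical weight).
beta = 0.1
r_int = np.linalg.norm(feat - feat.mean(axis=0), axis=1)
r_ext = np.zeros(10)
r_ext[-1] = 1.0                                   # sparse: only at episode end
r_total = r_ext + beta * r_int
print(r_total.shape)
```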
To address the problem that current object detection models are too large to be deployed on robot controllers, this paper proposes improvements to YOLOv5 for real-time detection. The YOLOv5s model is pruned at the Batch Normalization (BN) layers and further optimized through quantization with the OpenVINO toolkit. The results demonstrate that these improvements increase the model's inference speed while maintaining its accuracy. Compared with the original YOLOv5 model, the pruned model achieves the same accuracy without degradation, with a 36.6% reduction in parameter count and a 35% reduction in weight file size. Furthermore, after optimization with OpenVINO, the final model reaches 56 FPS.
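BN-layer pruning of this kind typically ranks channels by the magnitude of their BN scale factors (gamma) and keeps the largest. A sketch of that selection step follows; the ratio and gamma values are illustrative, not the paper's exact procedure.

```python
import numpy as np

def select_channels(bn_gamma, prune_ratio=0.366):
    """Network-slimming-style channel selection sketch: rank channels by the
    magnitude of their BN scale factors and keep the top (1 - prune_ratio).
    The ratio here merely echoes the reported ~36.6% parameter reduction."""
    gamma = np.abs(np.asarray(bn_gamma))
    n_keep = max(1, int(round(len(gamma) * (1.0 - prune_ratio))))
    keep = np.argsort(gamma)[::-1][:n_keep]       # indices of largest |gamma|
    return np.sort(keep)

# Hypothetical BN scales for a 10-channel layer
gamma = [0.02, 0.9, 0.4, 0.01, 0.7, 0.05, 0.3, 0.6, 0.08, 0.5]
kept = select_channels(gamma)
print(kept)  # → [1 2 4 6 7 9]
```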
Peak load management is very important for the electric power system. This paper analyzes the impact of residential swimming pool pumps (RSPPs) on the peak load. First, this paper analyzes the challenges of non-intrus...
This paper presents an improved deep deterministic policy gradient (DDPG) algorithm for a six-DOF (six degrees of freedom) arm robot. First, we build a robot model based on the DH (Denavit-Hartenberg) parameters of the UR5 arm robot. Then, we improve the experience pool of the traditional DDPG algorithm by adding a success experience pool and a collision experience pool. Next, the reward function is improved to increase the reward for success and the penalty for collision. Finally, the training is divided into stages: the first three axes are trained first, and then all six. Simulation results in ROS (Robot Operating System) show that the improved DDPG algorithm can effectively solve the problem of the six-DOF arm robot moving too far in configuration space. The trained model can reach the target area in five steps. Compared with the traditional DDPG algorithm, the improved DDPG algorithm requires fewer training episodes yet achieves better results.
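The split experience pool can be sketched as a replay buffer with three sub-pools and ratio-based sampling; the capacities and sampling ratios below are illustrative assumptions, not values from the paper.

```python
import random
from collections import deque

class MultiPoolReplay:
    """Sketch of a split experience pool: ordinary transitions plus dedicated
    success and collision pools, sampled at fixed ratios (hypothetical)."""
    def __init__(self, capacity=10000):
        self.normal = deque(maxlen=capacity)
        self.success = deque(maxlen=capacity)
        self.collision = deque(maxlen=capacity)

    def add(self, transition, success=False, collision=False):
        if success:
            self.success.append(transition)
        elif collision:
            self.collision.append(transition)
        else:
            self.normal.append(transition)

    def sample(self, batch_size, p_success=0.25, p_collision=0.25):
        n_s = min(int(batch_size * p_success), len(self.success))
        n_c = min(int(batch_size * p_collision), len(self.collision))
        n_n = min(batch_size - n_s - n_c, len(self.normal))
        return (random.sample(self.success, n_s)
                + random.sample(self.collision, n_c)
                + random.sample(self.normal, n_n))

buf = MultiPoolReplay()
for i in range(100):
    buf.add(("s", "a", -0.1, "s2"), success=(i % 10 == 0), collision=(i % 7 == 0))
batch = buf.sample(16)
print(len(batch))
```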
As an interdisciplinary combination of fuzzy theory and clustering, Fuzzy C-Means (FCM) is widely applied to identify categories in unlabeled data. However, when it is applied to data that are hard to visualize, it is difficult for users to determine the input parameters, especially the number of clusters. In this paper, a fuzzy clustering algorithm with self-regulated parameters, named Density-Based Fuzzy C-Means (DBFCM), is proposed by integrating the idea of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) into FCM. Its advantage is that it uses the inherent density characteristics of the input data to self-determine the parameters of fuzzy clustering. The experimental results demonstrate that the proposed DBFCM can not only self-determine the proper parameters but also accelerate the convergence process compared with the original FCM.
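The idea of letting density statistics fix FCM's parameters can be sketched as follows: a DBSCAN-flavoured pass estimates the number of clusters, which is then fed to a plain FCM. The eps/min_pts values and the random seeding are illustrative; DBFCM's actual parameter rules may differ.

```python
import numpy as np

def estimate_c(X, eps=1.0, min_pts=5):
    """DBSCAN-flavoured sketch: count connected groups of core points
    to self-determine the number of clusters (eps/min_pts illustrative)."""
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    core = (d < eps).sum(axis=1) >= min_pts
    labels = -np.ones(n, dtype=int)
    c = 0
    for i in np.where(core)[0]:
        if labels[i] == -1:
            c += 1
            stack = [i]
            while stack:                      # flood-fill over core points
                j = stack.pop()
                if labels[j] == -1:
                    labels[j] = c
                    stack.extend(np.where(core & (d[j] < eps))[0])
    return max(c, 1)

def fcm(X, c, m=2.0, iters=100):
    """Plain fuzzy C-means; DBFCM would additionally seed it from densities."""
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), c, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2) + 1e-12
        p = 2.0 / (m - 1.0)
        u = 1.0 / ((d ** p) * (d ** -p).sum(axis=1, keepdims=True))
        centers = (u ** m).T @ X / (u ** m).sum(axis=0)[:, None]
    return centers, u

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.25, (40, 2)), rng.normal(4, 0.25, (40, 2))])
c = estimate_c(X)          # two well-separated blobs → c = 2
centers, u = fcm(X, c)
print(c)
```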
Analytic Hierarchy Process (AHP) is a multi-criteria decision-making method that can describe and transform qualitative problems quantitatively and then obtain quantitative analysis results in accordance with the causal relationships between decision factors. In this paper, a granular Analytic Hierarchy Process, which introduces a granularity mechanism, is proposed to solve the portfolio selection problem under the mean-risk framework. In the proposed method, the scale values of the scheme layer are no longer limited to the nine positive integers from 1 to 9, which gives granularity attributes to the comparison of advantages and disadvantages between different schemes under a specific criterion. The proposed method reflects small differences between alternative schemes through the granularity attribute, so it can provide rich decision information for decision makers. Numeric examples from a real-world financial market (the Shanghai Stock Exchange) are provided to illustrate the essence of the proposed method.
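The classical AHP computation that the granular method generalizes can be sketched with the geometric-mean priority method and Saaty's consistency check; the pairwise comparison matrix below is illustrative.

```python
import numpy as np

def ahp_weights(M):
    """Classical AHP priority vector via the geometric-mean (row) method.
    The granular variant would allow entries on finer-grained scales
    instead of only the integers 1..9 and their reciprocals."""
    g = np.prod(M, axis=1) ** (1.0 / M.shape[0])
    return g / g.sum()

def consistency_ratio(M, w):
    n = M.shape[0]
    lam = np.mean((M @ w) / w)           # principal eigenvalue estimate
    ci = (lam - n) / (n - 1)             # consistency index
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]  # Saaty's random index
    return ci / ri                       # acceptable when below ~0.1

# Illustrative 3x3 reciprocal pairwise comparison matrix (scale 1..9)
M = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w = ahp_weights(M)
print(w.round(3), consistency_ratio(M, w) < 0.1)
```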