Asynchronous advantage actor-critic (A3C) is a commonly used policy optimization algorithm in reinforcement learning, in which "asynchronous" refers to parallel interactive sampling and training, and "advantage" refers to a sampled multi-step reward estimation method for computing *** In order to address the low efficiency and insufficient convergence caused by the traditional heuristic exploration of the A3C algorithm, an improved A3C algorithm is proposed in this paper. In this algorithm, a noise network function, which updates the noise tensor in an explicit way, is constructed to train the *** Generalized advantage estimation (GAE) is also adopted to describe the dominance ***, a new mean gradient parallelisation method is designed to update the parameters in both the primary and secondary networks by summing and averaging the gradients passed from all the sub-processes to the main *** experiments were conducted in a gym environment using the PyTorch Agent Net (PTAN) advanced reinforcement learning library, and the results show that the method enables the agent to complete training faster and that its convergence during training is *** The improved A3C algorithm performs better than the original algorithm, which can provide new ideas for subsequent research on reinforcement learning algorithms.
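To make two ingredients named in this abstract concrete, the following is a minimal sketch of the standard generalized advantage estimation recursion and of element-wise gradient averaging across workers. It is not the paper's implementation: the function names `gae_advantages` and `mean_gradients`, the default values of `gamma` and `lam`, the per-step `dones` flags, and the flat-list gradient representation are all assumptions made for illustration.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float,
    dones: Sequence[bool],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE smoothing parameter (assumed default)
) -> List[float]:
    """Standard GAE: A_t = delta_t + gamma * lam * A_{t+1},
    with delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value  # bootstrap value for the state after the rollout
    running = 0.0            # holds A_{t+1} while iterating backwards
    for t in reversed(range(len(rewards))):
        # Stop bootstrapping and accumulation at episode boundaries.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def mean_gradients(worker_grads: Sequence[Sequence[float]]) -> List[float]:
    """Sum and average flat gradient vectors sent by all workers (hypothetical layout)."""
    n = len(worker_grads)
    return [sum(component) / n for component in zip(*worker_grads)]
```

In this sketch each worker would compute advantages on its own rollout, derive a gradient vector, and send it to the main process, which applies the averaged vector to the shared parameters. How the paper actually structures this exchange is not stated in the abstract.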
In this paper, a fault detection method based on improved support vector data description is proposed for wastewater treatment plants. First, an improved Multi-Kernel Support Vector Data Description (MKSVDD) method is...
The majority of causes that contribute to metro railway traffic accidents can be attributed to human errors. The widespread application of railway perception systems is expected to reduce the occurrence rate of railwa...
Fatigue driving is a major contributor to traffic accidents, as it reduces alertness and can even be fatal. To investigate alertness changes during prolonged driving, we first built a simulated driving experiment plat...
To enhance the performance of automatic landing systems for commercial aircraft, a control law of localizer pre-capture was developed based on the maximum pitch angle limit of the approach, the characteristics of the ...
Artificial intelligence aims for machines to perform tasks as naturally as a human being. A useful approach is behavior cloning (BC), which makes the machine learn human behavior. This paper addresses the online lea...
Isolated power converters are widely used for their safety and flexible adjustment between input and output voltage ranges. EMI arises from the switching process. Parasitic parameters in transformers make EMI prediction in isolated DC-DC power converters more complicated, since they are the crucial propagation paths for common-mode (CM) noise conduction. To improve the accuracy of EMI prediction, this paper develops a wide-band frequency transformer model based on the two-capacitance model, further considering the effects of both capacitive and inductive coupling on conducted CM noise. Furthermore, the influence of the frequency-dependent permeability of Mn-Zn ferrite on CM transmission is investigated in detail. Two-port measurement is used to validate the proposed high-frequency model. The experimental results demonstrate that the proposed wide-band frequency transformer model predicts the CM noise behavior well in the frequency range of 100 kHz to 100 MHz.
Emotions are an essential part of human physiological performance. Therefore, the study of emotion recognition is extremely relevant in both practical applications and theoretical research. Based on the electroencepha...
In this paper, aiming at the nonlinear control problem of the hybrid energy source system, encompassing both the battery and the supercapacitor, an Interconnection and Damping Assignment Passivity-Based control (IDA-P...
In order to improve the stability of the robot at work, this paper adds an insertion point between the end of the robot and the next target point, and changes the original linear trajectory to a new trajectory planned...