Learning-based trajectory planning methods for autonomous vehicles have shown excellent performance in a variety of complex traffic scenarios. However, existing imitation learning (IL) and reinforcement learning (RL) algorithms still have limitations: IL suffers from poor safety and generalizability, and RL from low data efficiency. To leverage their respective advantages and mitigate these limitations, this paper proposes a novel hybrid RL algorithm for autonomous vehicle planning in which IL is embedded to guide exploration with expert knowledge. Unlike existing approaches, we use multi-step trajectory prediction rather than behavior cloning as the IL method integrated with online RL. This design takes a further step in understanding how expert demonstrations can benefit RL. Moreover, we conduct parallel training and testing of the algorithm on real-world driving data. Experimental results demonstrate that the proposed approach outperforms standalone IL and RL methods, as well as RL methods enhanced by behavior cloning.
ISBN: (digital) 9781665410205
ISBN: (print) 9781665410212
In this paper we extend our previous research on the coherent observer-based pole placement approach to study the synthesis of robust decoherence-free (DF) modes for linear quantum passive systems, with the aim of preserving quantum information. In particular, DF modes can be generated by placing poles on the imaginary axis via a coherent feedback design scheme, and these modes can simultaneously be made robust against perturbations of the system parameters by minimizing the condition number associated with the imaginary poles. We develop explicit algebraic conditions for the existence of such a coherent quantum controller and provide the corresponding design procedure. Examples are given to illustrate the process of tuning the DF modes towards perfect robustness via the proposed pole placement technique.
A new iterative spiking adaptive dynamic programming (SADP) algorithm based on the Poisson process for optimal impulsive control problems is investigated with convergence discussion of the iterative process. For a fix...
Operating in narrow spaces is an important challenge in the development of robots. Redundant manipulators are one way to solve this problem, but their mechanism design and control methods still have much room for improvement. In this paper, we propose a coiled cable-conduit-driven hyper-redundant manipulator (C-CDHRM) with great slenderness and flexibility. The mechanism design considers both compactness and operability. By imitating the structure and behavior of a constricting snake, the manipulator can be uncoiled sequentially from a coiled storage state, led by the head. For control, we propose a multi-layer control system that makes remote operation more accurate and reliable. On the one hand, guiding, segmenting, and following the path overcome the planning ambiguity caused by redundancy. On the other hand, conduit transmission modeling and cable length correction overcome the nonlinear mapping of cable-driven joints; both were verified in experiments. In tests, the mobile integrated system built around the C-CDHRM shows excellent operation precision and accuracy, ensuring safety and accessibility in narrow spaces. Finally, in field experiments, the inspection and cleaning of various types of electrical equipment were successfully completed, showing excellent application prospects.
Multi-view representation learning aims to derive robust representations that are both view-consistent and view-specific from diverse data sources. This paper presents an in-depth analysis of existing approaches in th...
This paper proposes a hand position tracking algorithm based on optimized consistent extended Kalman filter (CEKF). By introducing the previous work of the authors and analyzing the parameter of the original CEKF algorithm, the key parameter that is negatively correlated with the degree of the estimation of uncertain dynamics is ***, two metaheuristic methods, the Genetic Algorithm (GA) and the Particle Swarm Optimization (PSO), are used to optimize the original CEKF *** quantify the performance of the algorithms, the root-mean-square error (RMSE) is employed as the performance ***, the numerical simulation and practical experiment of the hand position tracking are carried out, and the optimized algorithm achieves 9.52% and 10.94% improvements of the performance, respectively.
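The RMSE metric named in the abstract can be illustrated with a minimal sketch. The function names `rmse` and `relative_improvement` are hypothetical, and reading the reported percentages as relative RMSE reductions is an assumption; the abstract does not state how the improvement is computed.

```python
import math


def rmse(estimates, truths):
    """Root-mean-square error between two equal-length sequences."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, optimized_rmse):
    """Percentage reduction of the RMSE relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_rmse - optimized_rmse) / baseline_rmse
```

Under this assumed definition, an optimizer such as GA or PSO would use `rmse` as the fitness to minimize over the filter parameter, and `relative_improvement` would compare the tuned filter against the original one.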
Aiming at a large number of ambiguous, imprecise and incomplete data in the real world, fuzzy time series has come into being and developed into an effective forecasting *** the process of modeling and forecasting of fuzzy time series, the prediction performance of fuzzy time series can be effectively improved by partitioning the universe of discourse into different *** this paper, a forecasting approach for fuzzy time series, which introduces the granularity mechanism into interval division and employs differential data for incremental forecasting, is proposed to solve the problem of time series forecasting with high forecasting *** the proposed approach, in order to describe the fuzzy logic relationship and fuzzy trend of historical data, we first do differential processing on the historical ***, Fuzzy C-means (FCM) clustering algorithm is used to generate several partition intervals *** the sequel, we use the principle of justifiable granularity to constantly adjust the width of all the intervals, so that these information granules associated with corresponding intervals become the most "informative" information ***, the boundary of information granules is used as the basis of interval division to complete the forecasting *** illustrative example is provided to demonstrate the essence of the proposed *** comparative experiment with other representative approaches shows that the proposed approach can significantly improve the prediction accuracy of time series.
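The differential processing and incremental forecasting steps mentioned in the abstract can be sketched as follows. This is only an illustration of first-order differencing, assuming that "differential data" means consecutive differences; the helper names are hypothetical, and the FCM clustering and granularity steps of the paper are not reproduced here.

```python
def difference(series):
    """First-order differences: d[t] = x[t+1] - x[t]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def incremental_forecast(last_value, predicted_increment):
    """Rebuild a level forecast from the last observation and a predicted increment."""
    return last_value + predicted_increment


history = [10, 12, 11, 15]
increments = difference(history)  # the data that would be partitioned into intervals
```

In such a scheme the interval partition is built over `increments` rather than over the raw series, and the forecast for the next period is the last observed value plus the increment predicted from the fuzzy relationships.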
This paper investigates the high-precision path tracking control of tracked paver combined with global satellite navigation *** the paver is performing paving operations, it requires high path tracking accuracy and good vehicle ***, considering the influence of road curvature on path tracking accuracy and vehicle stability, and the situation that the vehicle cannot move quickly to the expected path when the lateral position of the vehicle deviates from the expected path, this paper proposes a lateral path tracking control method based on improved Pure Pursuit *** control method is verified through *** experimental results show that the maximum lateral tracking error of the improved algorithm is 0.04 m, which is 55.56% lower than that of the original algorithm, and the average lateral tracking error is 0.02 m, which is 60% lower than that of the original *** purpose of high-precision path tracking of the paver is realized.
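For orientation, the classical Pure Pursuit law that the paper sets out to improve can be sketched as below. This is the textbook geometric form, not the improved method of the paper; the function names, the vehicle-frame convention (x forward, y left) and the differential-drive mapping to track speeds are assumptions made for illustration.

```python
import math


def pure_pursuit_curvature(pose, target):
    """Curvature of the arc from the vehicle pose to a look-ahead point.

    pose is (x, y, heading in radians); target is (x, y) in the same world frame.
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    # Express the look-ahead point in the vehicle frame (x forward, y left).
    local_x = math.cos(heading) * dx + math.sin(heading) * dy
    local_y = -math.sin(heading) * dx + math.cos(heading) * dy
    lookahead_sq = local_x ** 2 + local_y ** 2
    if lookahead_sq == 0.0:
        return 0.0  # already at the target, no steering needed
    # Classical pure pursuit: kappa = 2 * lateral offset / lookahead distance^2
    return 2.0 * local_y / lookahead_sq


def track_speeds(speed, curvature, track_width):
    """Left and right track speeds for a differential-drive (tracked) vehicle."""
    yaw_rate = speed * curvature
    return (speed - yaw_rate * track_width / 2.0,
            speed + yaw_rate * track_width / 2.0)
```

As a check on the reported figures, a 55.56% reduction to 0.04 m and a 60% reduction to 0.02 m imply baseline errors of about 0.09 m and 0.05 m for the original algorithm.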
To address the problem of distributing goods reasonably and scheduling route-planning resources under dynamic conditions, a phased refresh method is proposed that handles the requirements in sections. In addition, a clustering method is introduced to generate the initial population, and the crossover and mutation rates are determined by a step-by-step reduction method. This addresses the tendency of the traditional genetic algorithm to fall into local optima because of the uneven distribution of searchable solutions. Finally, the effectiveness of the improved method is verified by simulation experiments.
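One way to read a step-by-step reduction of the crossover and mutation rates is a piecewise-constant schedule over the generations. The sketch below is an assumption about the general idea, not the schedule used in the paper; the function name, the equal stage lengths and the linear spacing of rate levels are all hypothetical.

```python
def stepwise_rate(generation, total_generations, start, end, steps):
    """Piecewise-constant rate that moves from start to end in equal stages.

    The run is split into `steps` stages of equal length; the rate is held
    constant within a stage and changes by an equal amount between stages.
    """
    if steps < 2 or total_generations <= 0:
        return start  # nothing to schedule
    stage_length = total_generations / steps
    stage = min(int(generation / stage_length), steps - 1)
    return start + (end - start) * stage / (steps - 1)
```

For example, `stepwise_rate(g, 100, 0.9, 0.5, 5)` would hold the crossover rate at one level for 20 generations at a time, favoring exploration early and exploitation late.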
Adaptive dynamic programming (ADP) is a kind of intelligent control method, and it is a non-model-based method that can directly approximate the optimal control policy via online *** gradient algorithm is usually used to update weights of action networks and critic networks, however it is clear that gradient descent-based learning methods are generally very slow due to improper learning steps or may easily converge to local *** this paper, in order to overcome those disadvantages of gradient descent-based learning methods, a novel ADP algorithm based on initial-training-free online extreme learning machine (ITF-OELM), in which the critic network link weights of hidden nodes to output nodes can be obtained by least squares instead of gradient algorithm, is ***, the ADP algorithm based on ITF-OELM is tested on a discrete time torsional pendulum system, and simulation results indicate that this algorithm makes the system converge in a shorter time compared with the ADP based on gradient algorithm.
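The least-squares step can be written out for the standard batch extreme learning machine. This is the generic form, given only as a sketch; the online ITF-OELM update used in the paper is not specified in the abstract. The symbols are assumptions: $H$ is the hidden-layer output matrix for $N$ samples, $T$ the target matrix, and $\beta$ the hidden-to-output weights.

```latex
% Hidden-layer weights are fixed at random; only \beta is solved for.
\hat{\beta} \;=\; \arg\min_{\beta}\, \lVert H\beta - T \rVert_2^2 \;=\; H^{\dagger} T,
\qquad
H^{\dagger} = (H^{\top} H)^{-1} H^{\top} \quad \text{if } H \text{ has full column rank.}
```

Because $\beta$ enters linearly, the minimizer is obtained in closed form, which avoids the step-size tuning and local minima that the abstract attributes to gradient descent.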