In this paper, some on-line TD(λ) learning algorithms for average reward stochastic dynamic programming problems are presented. Instead of the cost-to-go function used in previous algorithms, the relative value function is the principal object to be learned in the proposed algorithms. This work extends and generalizes previous TD(λ) methods and the R-learning algorithm. A robot experiment called Wall-Following is presented to test and illustrate the new algorithms with various parameter settings. The results show that the new methods work well in approximating the relative value function.
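To make the idea of learning a relative value function concrete, the following is a minimal tabular sketch of a textbook-style average-reward TD(λ) update. It is not the paper's algorithm: the function name `average_reward_td_lambda`, the accumulating traces, the constant step size `alpha`, and the sample-average estimate of the average reward `rho` are all assumptions made for illustration.

```python
def average_reward_td_lambda(transitions, n_states, alpha=0.1, lam=0.5):
    """Illustrative tabular average-reward TD(lambda) (assumed form, not the paper's).

    transitions: iterable of (state, reward, next_state) tuples.
    Returns (h, rho): relative value estimates and the average-reward estimate.
    """
    h = [0.0] * n_states      # relative value estimates
    trace = [0.0] * n_states  # accumulating eligibility traces
    rho = 0.0                 # running estimate of the average reward
    for t, (s, r, s_next) in enumerate(transitions, start=1):
        # TD error uses the reward relative to the current average-reward estimate.
        delta = r - rho + h[s_next] - h[s]
        # Decay all traces by lambda (no discount factor), then bump the visited state.
        for i in range(n_states):
            trace[i] *= lam
        trace[s] += 1.0
        # Move every state's estimate in proportion to its eligibility.
        for i in range(n_states):
            h[i] += alpha * delta * trace[i]
        # Sample-average update of the average reward.
        rho += (r - rho) / t
    return h, rho


# Usage: a deterministic two-state cycle with rewards 2 and 0.
cycle = [(0, 2.0, 1), (1, 0.0, 0)] * 2500
h, rho = average_reward_td_lambda(cycle, n_states=2)
```

On this cycle the average reward is 1, and the relative values are only determined up to an additive constant, so the meaningful quantity is the difference `h[0] - h[1]`, which should settle near 1.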
Parameter estimation with the nonparametric impulse response model gives unbiased results, but suffers from the large number of parameters. By simple structuring of the measurement vector, the number of parameters can be considerably reduced, yielding an identification tool that combines the advantages of parametric and nonparametric methods: it is appropriate for on-line identification, requires no a priori knowledge of the order or dead time of the process, and is insensitive to noise. Computationally efficient algorithms are given for the reconstruction of the continuous-time impulse response from the reduced impulse response sequence and for on-line computation of the parameters of the continuous-time transfer function.
ISBN: 9781479925667 (print)
In multiple-input multiple-output radar, independent waveforms are transmitted from different antennas, and the target parameters are estimated from the linearly independent echoes of different targets. Several adaptive approaches, including Capon and APES (amplitude and phase estimation), are applied directly to target angle and target amplitude estimation. The CCA (canonical correlation analysis) approach, which has high peak amplitudes, is first proposed to estimate target locations; a gradient-based algorithm is then presented to improve the target angle estimation accuracy based on the Capon approach, which has a high resolution. Starting from an initial angle, the angle sequence is iteratively updated with adaptive steps and converges to the local peaks that indicate the target locations. Simulations show that the target angle accuracy is improved and that the common DOA (direction-of-arrival) problem is avoided.
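The iterative refinement described above, an initial angle updated with adaptive steps until it settles on a local peak, can be sketched generically as follows. This is an assumption-laden illustration rather than the paper's method: `refine_peak` is a hypothetical helper, the spectrum is any user-supplied function of angle (no Capon spectrum is computed here), the gradient is taken by central differences, and the step rule (grow on success, halve on failure) is one simple choice among many.

```python
def refine_peak(spectrum, theta0, step=1.0, tol=1e-6, max_iter=200, h=1e-4):
    """Climb to a local peak of `spectrum` starting from the angle `theta0`.

    Hypothetical helper: moves in the direction of the numerical gradient,
    enlarging the step after an improvement and halving it otherwise.
    """
    theta = theta0
    value = spectrum(theta)
    for _ in range(max_iter):
        # Central-difference estimate of the spectrum's slope at theta.
        grad = (spectrum(theta + h) - spectrum(theta - h)) / (2.0 * h)
        if grad == 0.0:
            break  # flat to numerical precision: treat as a peak
        direction = 1.0 if grad > 0.0 else -1.0
        candidate = theta + direction * step
        candidate_value = spectrum(candidate)
        if candidate_value > value:
            # Accept the move and try a longer step next time.
            theta, value = candidate, candidate_value
            step *= 1.5
        else:
            # Overshot the peak: keep theta and shorten the step.
            step *= 0.5
        if step < tol:
            break
    return theta


# Usage with a stand-in spectrum that has a single peak at 20 degrees.
def toy_spectrum(theta):
    return 1.0 / (1.0 + (theta - 20.0) ** 2)

estimate = refine_peak(toy_spectrum, theta0=18.5)
```

Because each move is accepted only when the spectrum value increases, the iterate cannot leave the basin of the peak it started near, which is why a coarse initial estimate is still needed to pick the right peak.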
Author: Rizvi, Farheen
Guidance and Control Systems Engineering Group, Guidance and Control Section, Autonomous Systems Division, Jet Propulsion Laboratory, California Institute of Technology, Mail Stop 230-104, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
ISBN: 9781624102240 (digital)
ISBN: 9781624102240 (print)
The effect of altering a gain parameter in the Cassini reaction control system (RCS) delta-V controller on the maneuver execution errors during orbit trim maneuvers (OTMs) is explored. Cassini has two reaction control thruster branches (A and B), each with eight thrusters. Currently, the B-branch is operational while the A-branch serves as a back-up. The four Z-thrusters control the X and Y axes, while the four Y-thrusters control the Z axis. During an OTM, the Z-thrusters fire to keep the X-axis and Y-axis pointing within an attitude control dead-zone (-10 to 10 milliradians). The attitude errors do not remain at zero because of pointing error sources such as the offset of the spacecraft center of mass from the geometric center of the Z-facing thrusters, and variability in the thruster forces due to thruster hardware differences. The RCS delta-V controller ensures that the attitude error remains within this dead-zone. Gain parameters within the RCS delta-V controller affect the maneuver execution errors, and different parameter values are used to explore the effect on these errors. It is found that the pointing error decreases and the magnitude error increases rapidly for gain parameters 10 times greater than the current values used in the flight software.
The paper presents a model of the air stream ratio in the dry grinding and classification circuit with the electromagnetic mill. The concept of the grinding system is described along with indirect measurement methodology. Model structure and identification problems are discussed and a supervisory control algorithm based on the model is derived.
The recently developed technique for computation of piecewise quadratic Lyapunov functions is specialized to Lyapunov functions that are piecewise linear. This establishes a unified framework for computation of quadratic, piecewise quadratic, piecewise linear and polytopic Lyapunov functions. The search for a piecewise linear Lyapunov function is formulated as a linear programming problem, and duality is used to address the non-trivial issue of partition refinements.
A parity space approach to fault detection by optimal measurement selection is proposed for networked control systems (NCS). A distributed process, decomposed into p sub-processes with different sampling times, is modeled as a linear time-invariant discrete-time system by means of the lifting technique. In order to reduce the network load and data transmission cost, an optimal scheme, which manages and schedules the data transmission through the networks from the local subsystems to the central fault detection system, is proposed. The scheme ensures that a minimum communication load or measurement cost for the data transmission of the NCS can be achieved while the fault detection performance remains optimal.
Uncertainty representation and management are important in prognostics and health management (PHM). This paper first introduces the concept of uncertainty and its importance in PHM. The processing of uncertainty is then summarized, covering uncertainty representation, estimation and management. The main approaches to uncertainty processing, such as probability theory, fuzzy set theory, evidence theory and rough set theory, are analyzed systematically and in detail.