ISBN: (print) 9781713871088
While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analyses assume that the representation function is either already known to the agent or drawn from a linear function class, because analyzing representations from a general function class encounters non-trivial technical obstacles, such as generalization guarantees and the formulation of confidence bounds in an abstract function space. However, linear-case analysis relies heavily on the particular structure of the linear function class, while real-world practice usually adopts general non-linear representation functions such as neural networks, which significantly limits the applicability of that analysis. In this work, we extend the analysis to general function class representations. Specifically, we consider an agent playing M contextual bandits (or MDPs) concurrently and extracting a shared representation function ϕ from a specific function class Φ using our proposed Generalized Functional Upper Confidence Bound algorithm (GFUCB). We theoretically validate the benefit of multitask representation learning within a general function class for bandits and linear MDPs for the first time. Lastly, we conduct experiments to demonstrate the effectiveness of our algorithm with a neural network representation.
A multi-functional full-space metasurface based on frequency and polarization multiplexing is *** metasurface unit consists of metallic patterns printed on the two faces of a single-layered dielectric *** unit cell can control electromagnetic wavefronts to achieve a broadband transmission with amplitudes greater than 0.4 from 4.4 to 10.4 ***, at 11.7 GHz and 15.4 GHz, four high-efficiency reflection channels with a reflection amplitude greater than 0.8 are also *** illuminated by linearly polarized waves, five different functions can be realized at five different frequencies, which are demonstrated by theoretical calculations, full-wave simulations, and experimental measurements.
In this brief paper, a data-based method for fault diagnosis in aero-engine transmission systems is ***, during the operation of splines, we acquire the acceleration vibration signal. We process the signal into time-frequency images using the short-time Fourier transform. Then, an improved convolutional neural network with channel attention is trained on time-frequency images of multiple fault signals. Finally, we implement fault prediction for spline wear and misalignment faults in aero-engine transmission systems. The results show that fault diagnosis using the proposed method achieves a high level of accuracy.
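As an illustration of the signal-to-image step described above, the following is a minimal sketch of a short-time Fourier transform magnitude computation using only the Python standard library. It is not the authors' pipeline: the function name `stft_magnitude`, the Hann window, the frame length, the hop size, and the synthetic 50 Hz test tone are all assumptions chosen for the example.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    # Hann window (an assumed choice) reduces leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            # Naive DFT of one windowed frame (O(N^2); fine for a sketch).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic stand-in for a vibration signal: a 50 Hz tone sampled at 400 Hz.
SAMPLE_RATE = 400
FRAME_LEN = 64
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(256)]

# Rows are time frames, columns are frequency bins: a time-frequency "image".
image = stft_magnitude(signal, frame_len=FRAME_LEN, hop=32)

# Strongest bin in the first frame, converted back to Hz.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

In practice, the resulting frame-by-bin array would typically be log-scaled and resized before being fed to a convolutional network; those steps are omitted here.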
This research explored the application of pulsed vacuum technology on the drying (PVD) of pineapple *** of drying temperature and pulsed vacuum ratio (PVR) on drying characteristics and pineapple quality (color, rehydration characteristics, microstructure, and texture) were *** expected, increasing the drying temperature resulted in a higher drying rate and effective moisture *** optimal PVR of 5:5 was beneficial in accelerating the drying rate of pineapple slices and the corresponding effective moisture diffusion coefficient (8.9601×10^(-10)) was higher than other PVR conditions based on material center *** material temperature increased during the normal pressure period and decreased rapidly when the pressure dropped to the vacuum condition, which indirectly reflected the moisture transfer that occurred during the vacuum holding period, while moisture diffusion happened during the atmospheric pressure holding *** optimal pulsed vacuum drying process (PVR of 5:5) could expand air and water vapor and create a looser structure so as to obtain better rehydration performance (rehydration ratio (RR) was 5.43). High drying temperature led to the decrease of L* value, the increase of ΔE value, and even the formation of surface scorch at 80 ℃. At the same drying temperature, the color quality depended on the drying time, and the color difference increased with the extension of the drying *** chewiness and hardness of pineapple slices dried by PVD were significantly higher than those of fresh samples, which was conducive to the chewing taste.
Fault-tolerant syndrome extraction is a key ingredient in implementing fault-tolerant quantum computations. While conventional methods use a number of extra qubits linear in the weight of the syndrome, several improve...
Recently, quantum federated learning (QFL) is advocated to leverage the robust computing power of quantum edge computing devices (QECDs) within unmanned aerial vehicle (UAV)-assisted wireless networks, to enhance the ...
Protecting human rights is one of the most important problems of the modern world. In this paper, we construct a Twitter dataset covering one of the most significant human rights controversies of recent years, one that affected the whole world: the George Floyd incident. We propose a labeled dataset for topic detection that contains about 17 million tweets. The tweets were collected from 25 May 2020 to 21 August 2020, covering about 90 days from the start of the incident. We labeled the dataset by monitoring the most trending news topics in global and local newspapers, and used TF-IDF and LDA as baselines. We evaluated the results of these two methods at three different values of k in terms of precision, recall, and F1-score.
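To make the TF-IDF baseline mentioned above concrete, here is a minimal sketch of the textbook scoring rule, term frequency times log(N / document frequency), in plain Python. The abstract does not say which TF-IDF variant or preprocessing the authors used, so the formula, the function name `tf_idf`, and the toy token lists are all assumptions for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score each term in each tokenized document with tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy, made-up token lists; not taken from the dataset described above.
docs = [
    ["protest", "city", "march"],
    ["protest", "police", "reform"],
    ["protest", "court", "trial"],
]
scores = tf_idf(docs)
```

A term that occurs in every document, such as "protest" here, scores zero, while terms unique to one document score highest, which is why TF-IDF is a common way to surface topic-specific vocabulary.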
This paper presents a novel fault-tolerant controller for a class of uncertain high-order fully actuated systems (FASs) with actuator faults. Recently, FAS approaches have developed rapidly in different fields. To address actuator fault tolerance within the FAS framework, a multiplicative fault gain is introduced into the basic uncertain high-order system. Based on Lyapunov stability theory, a full-state feedback strategy is presented to ensure the global uniform asymptotic (GUA) stability of the considered FAS. Additionally, in the absence of actuator faults, the proposed fault-tolerant controller can robustly stabilize the FASs. A numerical experiment illustrates the effectiveness of the proposed method.
High-dimensional expensive problems are often encountered in the design and optimization of complex robotic and automated systems and distributed computing systems, and they suffer from a time-consuming fitness evalua...
This paper addresses local path re-planning for n-dimensional systems by introducing an informed sampling scheme and cost function to achieve collision avoidance with minimum deviation from an (optimal) nominal path. ...