ISBN: 9781713899921 (print)
Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world. Disentangled representations can improve robustness, but existing disentanglement techniques that minimise mutual information between features require independent features, thus they cannot disentangle correlated features. We propose an auxiliary task for RL algorithms that learns a disentangled representation of high-dimensional observations with correlated features by minimising the conditional mutual information between features in the representation. We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as improving the training performance of RL algorithms in the presence of correlated features.
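The distinction between mutual information and conditional mutual information in the abstract above can be illustrated with a small discrete example. This is only a sketch under stated assumptions: the variable `Z` (a hypothetical common cause of two features `X` and `Y`), the `agree` probability, and all function names are illustrative and are not taken from the paper, which does not specify its conditioning variable in this abstract.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in nats for a joint probability table p_xy[x, y]."""
    p_x = p_xy.sum(axis=1, keepdims=True)   # shape (nx, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # shape (1, ny)
    mask = p_xy > 0                          # skip zero cells (0 * log 0 = 0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

def conditional_mutual_information(p_xyz):
    """I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z) for a joint table p_xyz[x, y, z]."""
    total = 0.0
    for z in range(p_xyz.shape[2]):
        p_z = p_xyz[:, :, z].sum()
        if p_z > 0:
            total += p_z * mutual_information(p_xyz[:, :, z] / p_z)
    return total

def build_joint(agree=0.9):
    """Hypothetical toy model: Z is a fair coin; given Z, X and Y are
    independent and each equals Z with probability `agree`."""
    p = np.zeros((2, 2, 2))
    for z in range(2):
        p_feature = np.array([agree, 1 - agree]) if z == 0 else np.array([1 - agree, agree])
        p[:, :, z] = 0.5 * np.outer(p_feature, p_feature)
    return p

p_xyz = build_joint()
mi = mutual_information(p_xyz.sum(axis=2))   # marginalise out Z
cmi = conditional_mutual_information(p_xyz)
print(f"I(X;Y) = {mi:.4f} nats, I(X;Y|Z) = {cmi:.2e} nats")
```

In this toy model the two features are correlated, so their mutual information is strictly positive and cannot be driven to zero without discarding information, whereas the conditional mutual information given the assumed common cause is zero up to floating-point error.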
ISBN: 9781510670617; 9781510670600 (print)
In 1985 I joined Texas Instruments' (TI's) Deformable Mirror Device (DMD) group to develop applications of the cantilever device in coherent optical signal processing. At that time I witnessed the "aha discovery" that led to the invention of the DLP. It is interesting to consider the many years of effort that led Larry Hornbeck to this commercially successful implementation: not just the technology, but also the efforts to sustain the project through sponsored R&D. While TI viewed the only sustainable market as (incoherent) display applications, the DMD group sustained the effort with DoD funding for coherent and incoherent optical signal processing systems, including matched filter correlators, digital optical switches, optical crossbar switches and related neural network processors. For coherent signal processing, the need for a 2π phase-only (piston-motion pixel) spatial light modulator (SLM) was readily apparent to the sponsors. While TI saw little commercial justification for the phase-only device, this need inspired me around 1991 to develop a new class of real-time computer-generated holography algorithms referred to as pseudorandom encoding, in which each phase-only pixel is encoded with a desired magnitude and phase. The optical Fourier transforms of the modulation enabled my development of multi-spot object targeting and laser tweezer systems. Around 2005 I began using Digital Light Processing (DLP) developer kits in place of scanners to time-share images with a small number of detectors. One system, using a single high-sensitivity detector together with well-chosen DLP frames, quickly forms a "partial image" of point-like scene objects, arguably an early version of compressive sensing. This paper concludes with recommendations on optimizing the performance and applications of, and potential markets for, TI's recently demonstrated phase-only DLP.
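The idea of encoding a desired magnitude and phase on a phase-only pixel can be sketched numerically. The abstract does not give the algorithm, so the following is a minimal illustration under an assumption: each pixel's phase is the desired phase plus a random offset drawn uniformly from an interval whose width is chosen so that the expected pixel value equals the desired complex value. The uniform distribution, the bisection solver, and all names here are illustrative choices, not the author's stated method.

```python
import numpy as np

def spread_for_amplitude(a, iters=60):
    """Solve sin(v/2)/(v/2) = a for v in [0, 2*pi] by bisection.

    For a phase offset uniform on [-v/2, v/2], E[exp(i*offset)] = sin(v/2)/(v/2),
    which decreases monotonically from 1 to 0 as v goes from 0 to 2*pi.
    """
    a = np.asarray(a, dtype=float)
    lo = np.zeros_like(a)
    hi = np.full_like(a, 2 * np.pi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # np.sinc(x) = sin(pi*x)/(pi*x), so this equals sin(mid/2)/(mid/2)
        too_large = np.sinc(mid / (2 * np.pi)) > a
        lo = np.where(too_large, mid, lo)   # amplitude above target: widen spread
        hi = np.where(too_large, hi, mid)   # amplitude at or below target: narrow it
    return 0.5 * (lo + hi)

def pseudorandom_encode(desired, rng):
    """Return unit-magnitude values whose expectation equals `desired` (|desired| <= 1)."""
    spread = spread_for_amplitude(np.abs(desired))
    offset = spread * (rng.random(desired.shape) - 0.5)
    return np.exp(1j * (np.angle(desired) + offset))

rng = np.random.default_rng(0)
target = 0.6 * np.exp(1j * 0.8)            # hypothetical desired complex value
samples = pseudorandom_encode(np.full(200_000, target), rng)
mean_value = samples.mean()                # should lie close to `target`
print(abs(mean_value - target))
```

Every encoded value has unit magnitude, as a phase-only modulator requires, while the average over many pixels approaches the desired complex value; the residual randomness would appear as background noise in the Fourier plane.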
Prod is a seminal algorithm in full-information online learning, which has been conjectured to be fundamentally sub-optimal for multi-armed bandits. By leveraging the interpretation of Prod as a first-order OMD approx...
Direction of arrival (DOA) estimation on a uniform linear array with a single snapshot has always been a hot topic in radar signal processing. The traditional subspace class estimation methods, such as multiple signal cla...
In urban environments, communication signals encompass non-line-of-sight (NLOS) paths that contain environmental information. This paper investigates utilizing NLOS component signals to estimate the environment paramete...
ISBN: 9798350377583 (print)
We demonstrate a photonic rectified linear unit (ReLU) function accomplished through frequency-coded neural signals. We show operation of an optical neuron with weighted sum and ReLU activation to perform with a 1% penalty in accuracy. (c) 2024 The Author(s)
In this paper we discuss methods for measuring the optical signal-to-noise ratio (OSNR) in high-speed coherent channels of optical transmission systems. The following OSNR measurement methods are presented in this art...
The paper presents the results of research on the possibilities of automating the processing of measurement data obtained from the Brillouin optical reflectometer. Analyzing the parameters of the Mandelstam-Brillouin ...
Invoice processing is a time-consuming and tedious task that can be automated using optical character recognition (OCR) technology. Tesseract is a popular open-source OCR engine that can be used to extract text from s...
The feasibility of surface modifications of a gas dynamic seal made of silicon carbide ceramic by microstructuring to reduce the friction coefficient for use in stationary gas-turbine equipment has been determined. La...