Addressing decision-making problems using sequence modeling to predict future trajectories has shown promising results in recent years. In this paper, we take a step further to leverage the sequence predictive method in w...
ISBN: 9798331314385 (print)
Algorithmic fairness has been actively explored from various perspectives, such as counterfactual fairness (CF) and group fairness (GF). However, the exact relationship between CF and GF remains unclear, especially in image classification tasks, because we often cannot collect counterfactual samples regarding a sensitive attribute, which are essential for evaluating CF, from existing images (e.g., a photo of the same person but with different secondary sex characteristics). In this paper, we construct new image datasets for evaluating CF by using a high-quality image editing method and careful labeling by human annotators. Our datasets, CelebA-CF and LFW-CF, build upon the popular image GF benchmarks; hence, we can evaluate CF and GF simultaneously. We empirically observe that CF does not imply GF in image classification, whereas previous studies on tabular datasets observed the opposite. We theoretically show that this could be due to the existence of a latent attribute G that is correlated with, but not caused by, the sensitive attribute (e.g., secondary sex characteristics are highly correlated with hair length). From this observation, we propose a simple baseline, Counterfactual Knowledge Distillation (CKD), to mitigate such correlation with the sensitive attributes. Extensive experimental results on CelebA-CF and LFW-CF demonstrate that CF-achieving models satisfy GF if we successfully reduce the reliance on G (e.g., using CKD).
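The claim that CF does not imply GF can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: samples are reduced to a binary sensitive attribute `a` and a binary latent attribute `g`, counterfactuals flip `a` while leaving `g` untouched, and demographic parity stands in for group fairness. A classifier that relies only on `g` is perfectly counterfactually consistent, yet its positive rates differ across groups because `g` is correlated with `a`.

```python
from typing import Callable, List, Tuple

# Hypothetical toy sample: (sensitive attribute a, latent attribute g), both binary.
Sample = Tuple[int, int]


def counterfactual_consistency(predict: Callable[[Sample], int],
                               samples: List[Sample]) -> float:
    """Fraction of samples whose prediction is unchanged when only `a` is flipped."""
    same = sum(predict((a, g)) == predict((1 - a, g)) for a, g in samples)
    return same / len(samples)


def demographic_parity_gap(predict: Callable[[Sample], int],
                           samples: List[Sample]) -> float:
    """Absolute difference in positive-prediction rate between the two groups."""
    rates = []
    for group in (0, 1):
        preds = [predict(s) for s in samples if s[0] == group]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def g_only_classifier(sample: Sample) -> int:
    """Ignores the sensitive attribute and predicts from the latent attribute alone."""
    _, g = sample
    return g


# Toy data: g agrees with a in 4 of 5 samples per group (correlated, not caused).
samples = [(0, 0)] * 4 + [(0, 1)] + [(1, 1)] * 4 + [(1, 0)]

cf_score = counterfactual_consistency(g_only_classifier, samples)
dp_gap = demographic_parity_gap(g_only_classifier, samples)
print(f"counterfactual consistency = {cf_score:.2f}, demographic parity gap = {dp_gap:.2f}")
```

In this toy setting the gap closes only if the classifier stops relying on `g`, which mirrors the role the abstract assigns to reducing reliance on G.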
This paper describes our submission to the PragTag task, which aims to categorize each sentence from peer reviews into one of the six distinct pragmatic tags. The task consists of three conditions: full, low, and zero...
In this paper, we perform an object rearrangement task for target retrieval in an environment with a confined space and limited observation directions. The agent must create a collision-free path to bring out the target object by relocating the surrounding objects using a prehensile action, i.e., pick-and-place. Object rearrangement in a confined space is a non-monotone problem, and finding a valid plan within a reasonable time is challenging. We propose a novel algorithm that divides the target retrieval task, which requires a long sequence of actions, into sequential sub-problems and searches for a solution to each through Monte Carlo tree search (MCTS). In experiments, we verify that the proposed algorithm finds safe rearrangement plans for various objects more efficiently than existing planning methods. Furthermore, we show that the proposed method can be transferred to a real robot experiment without additional training.
Cooperative multi-agent scenarios are prevalent in real-world applications. Optimal coordination of agents requires appropriate task allocation, considering each task's complexity and each agent's capability. This becomes challenging under decentralization and partial observability, as agents must self-allocate tasks using limited state information. We introduce a novel multi-agent environment in which effective sub-task assignment is crucial for high-scoring performance. In addition, we propose a new multi-agent reinforcement learning framework named attention-based randomized ensemble multi-agent Q-learning, or AREQ for short. This approach integrates a unique network structure that uses a multi-head attention mechanism to efficiently extract task-related information from observations. AREQ also incorporates a randomized ensemble method to enhance sample efficiency. We explore the impact of the attention-based structure and the randomized ensemble method through an ablation study and show AREQ's superiority over existing MARL methods within our proposed environment.
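The abstract does not say how the randomized ensemble is used. One common reading, assumed here purely for illustration, is a pessimistic target computed from a random subset of ensemble members: take the element-wise minimum of their next-state Q-values, then act greedily. The function name, arguments, and numbers below are hypothetical and are not AREQ's actual update rule.

```python
import random
from typing import List


def randomized_ensemble_target(reward: float,
                               next_q_values: List[List[float]],
                               gamma: float,
                               subset_size: int,
                               rng: random.Random) -> float:
    """TD target using the minimum over a random subset of ensemble members.

    next_q_values[i][a] is member i's Q-value estimate for action a in the next state.
    """
    chosen = rng.sample(range(len(next_q_values)), subset_size)
    n_actions = len(next_q_values[0])
    # Element-wise minimum across the sampled members curbs overestimation.
    pessimistic = [min(next_q_values[i][a] for i in chosen) for a in range(n_actions)]
    # Greedy value over actions, discounted and added to the immediate reward.
    return reward + gamma * max(pessimistic)


rng = random.Random(0)
ensemble_next_q = [[1.0, 2.0], [0.5, 3.0], [2.0, 1.5]]  # 3 members, 2 actions

full_target = randomized_ensemble_target(1.0, ensemble_next_q, 0.9, 3, rng)
subset_target = randomized_ensemble_target(1.0, ensemble_next_q, 0.9, 2, rng)
print(full_target, subset_target)
```

Sampling fewer members than the full ensemble makes the target less pessimistic than the full minimum while still varying from update to update.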
In this paper, we propose a signed distance field (SDF)-based deep Q-learning framework for multi-object rearrangement. Our method learns to rearrange objects with non-prehensile manipulation, e.g., pushing, in unstructured environments. To reliably estimate Q-values in various scenes, we train the Q-network using an SDF-based scene graph as the state-goal representation. To this end, we introduce SDFGCN, a scalable Q-network structure that uses graph convolutional networks to estimate Q-values from a set of SDF images while satisfying permutation invariance. In contrast to grasping-based rearrangement methods, which rely on the performance of grasp prediction models for perception and movement, our approach enables rearrangement of unseen objects, including hard-to-grasp objects. Moreover, our method does not require any expert demonstrations. We observe that SDFGCN is capable of rearranging unseen objects in challenging configurations, both in simulation and in the real world.
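For readers unfamiliar with signed distance fields, a minimal sketch follows. It assumes a disk-shaped object with an analytic distance function and a tiny pixel grid; the paper's SDF images would instead be computed from observed object shapes, and the helper names here are hypothetical.

```python
import math
from typing import List


def circle_sdf(x: float, y: float, cx: float, cy: float, radius: float) -> float:
    """Signed distance to a disk: negative inside, zero on the boundary, positive outside."""
    return math.hypot(x - cx, y - cy) - radius


def sdf_image(cx: float, cy: float, radius: float, size: int = 5) -> List[List[float]]:
    """Sample the SDF on a size x size pixel grid (row index = y, column index = x)."""
    return [[circle_sdf(col, row, cx, cy, radius) for col in range(size)]
            for row in range(size)]


image = sdf_image(cx=2, cy=2, radius=1.0)
for row in image:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Unlike a binary mask, every pixel carries distance information, so the image stays informative far from the object boundary.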
This paper investigates a missing feature imputation problem for graph learning tasks. Several methods have previously addressed learning tasks on graphs with missing features. However, in cases of high rates of missi...
We propose a novel type of map for visual navigation, a renderable neural radiance map (RNR-Map), which is designed to contain the overall visual information of a 3D environment. The RNR-Map has a grid form and consis...
ISBN: 9798350377705 (digital)
ISBN: 9798350377712 (print)
In the realm of autonomous agents, ensuring safety and reliability in complex and dynamic environments remains a paramount challenge. Safe reinforcement learning addresses these concerns by introducing safety constraints, but still faces challenges in navigating intricate environments such as complex driving situations. To overcome these challenges, we present the safe constraint reward (Safe CoR) framework, a novel method that utilizes two types of expert demonstrations: reward expert demonstrations, which focus on performance optimization, and safe expert demonstrations, which prioritize safety. By exploiting a constraint reward (CoR), our framework guides the agent to balance the performance goal of maximizing the reward sum against the safety constraints. We test the proposed framework in diverse environments, including Safety Gym, MetaDrive, and the real-world Jackal platform. Our proposed framework improves algorithm performance by 39% and reduces constraint violations by 88% on the real-world Jackal platform, highlighting its effectiveness. Through this approach, we expect significant advancements in real-world performance for safe and reliable autonomous agents.
Data-driven control using Gaussian process regression has recently gained much attention. In such approaches, system identification by Gaussian process regression is typically followed by model-based controller design. However, the outcomes of Gaussian process regression are often too complicated for conventional control designs to be applied, so numerical designs such as model predictive control are employed in many cases. To overcome this restriction, our idea is to perform Gaussian process regression on the inverse of the plant, using the same input/output data as conventional regression. With the inverse, one can design a model reference controller without resorting to numerical control methods. This paper considers single-input single-output (SISO) discrete-time nonlinear systems of minimum phase with relative degree one. We highlight that the model reference Gaussian process regression (MR-GPR) controller is designed directly from precollected input/output data without identification of the system itself.
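The idea of regressing the plant inverse can be sketched as follows. All specifics are assumptions for illustration and are not the paper's setup: the toy plant `y[k+1] = 0.5*sin(y[k]) + u[k]`, the RBF kernel with unit length scale, the small noise term, and the helper names. The regression maps `(y[k], y[k+1])` to `u[k]` from the same input/output data one would use to identify the plant, and the control is then read off by substituting the reference output for `y[k+1]`.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rbf(p: Sequence[float], q: Sequence[float], length: float = 1.0) -> float:
    """Squared-exponential kernel."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2.0 * length ** 2))


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_gp(X: List[Tuple[float, float]], targets: List[float],
           noise: float = 1e-6) -> Callable[[Tuple[float, float]], float]:
    """Return the posterior-mean predictor of a zero-mean GP with an RBF kernel."""
    K = [[rbf(p, q) + (noise if i == j else 0.0) for j, q in enumerate(X)]
         for i, p in enumerate(X)]
    alpha = solve(K, targets)
    return lambda x: sum(a * rbf(x, p) for a, p in zip(alpha, X))


def plant(y: float, u: float) -> float:
    """Hypothetical relative-degree-one plant, unknown to the controller."""
    return 0.5 * math.sin(y) + u


# Precollected input/output data: (y[k], u[k], y[k+1]).
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
data = [(y, u, plant(y, u)) for y in grid for u in grid]

# Regress the inverse: inputs (y[k], y[k+1]), target u[k].
inverse_model = fit_gp([(y, y_next) for y, _, y_next in data],
                       [u for _, u, _ in data])

# Model-reference step: ask for the input that moves y_now to the reference output.
y_now, y_ref_next = 0.3, 0.4
u_cmd = inverse_model((y_now, y_ref_next))
tracking_error = abs(plant(y_now, u_cmd) - y_ref_next)
print(f"u = {u_cmd:.4f}, one-step tracking error = {tracking_error:.2e}")
```

No forward model of the plant is fitted at any point; `plant` appears only to generate data and to check the result.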