Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential...
详细信息
Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential decision-making problems based on observed human demonstrations and feedback. Most prior work in reward learning has relied on prior knowledge or assumptions about decision or preference models, potentially leading to robustness issues. In response, this paper introduces a novel linear programming (LP) framework tailored for offline reward learning. Utilizing pre-collected trajectories without online exploration, this framework estimates a feasible reward set from the primal-dual optimality conditions of a suitably designed LP, and offers an optimality guarantee with provable sample efficiency. Our LP framework also enables aligning the reward functions with human feedback, such as pairwise trajectory comparison data, while maintaining computational tractability and sample efficiency. We demonstrate that our framework potentially achieves better performance compared to the conventional maximum likelihood estimation (MLE) approach through analytical examples and numerical experiments. Copyright 2024 by the author(s)
Accurate and timely diagnosis of pulmonary diseases is critical in the field of medical imaging. While deep learning models have shown promise in this regard, the current methods for developing such models often requi...
详细信息
Accurate and timely diagnosis of pulmonary diseases is critical in the field of medical imaging. While deep learning models have shown promise in this regard, the current methods for developing such models often require extensive computing resources and complex procedures, rendering them impractical. This study focuses on the development of a lightweight deep-learning model for the detection of pulmonary diseases. Leveraging the benefits of knowledge distillation (KD) and the integration of the ConvMixer block, we propose a novel lightweight student model based on the MobileNet architecture. The methodology begins with training multiple teacher model candidates to identify the most suitable teacher model. Subsequently, KD is employed, utilizing the insights of this robust teacher model to enhance the performance of the student model. The objective is to reduce the student model's parameter size and computational complexity while preserving its diagnostic accuracy. We perform an in-depth analysis of our proposed model's performance compared to various well-established pre-trained student models, including MobileNetV2, ResNet50, InceptionV3, Xception, and NasNetMobile. Through extensive experimentation and evaluation across diverse datasets, including chest X-rays of different pulmonary diseases such as pneumonia, COVID-19, tuberculosis, and pneumothorax, we demonstrate the robustness and effectiveness of our proposed model in diagnosing various chest infections. Our model showcases superior performance, achieving an impressive classification accuracy of 97.92%. We emphasize the significant reduction in model complexity, with 0.63 million parameters, allowing for efficient inference and rapid prediction times, rendering it ideal for resource-constrained environments. Outperforming various pre-trained student models in terms of overall performance and computation cost, our findings underscore the effectiveness of the proposed KD strategy and the integration of the Conv
Using Buck converter as the final stage for 48V-To1V conversion currently is still one of the main solutions for CPU and GPU power delivery due to its extremely mature monolithic IC industry, controller development an...
详细信息
The integrated circuit (IC) industry has experienced exponential growth in the complexity and scale of hardware designs. To sustain this growth, faster development cycles and cost-effective solutions have been the foc...
详细信息
Hydrogen fuel cells offer a clean and efficient energy source for various applications, including transportation. However, the nonlinear output voltage and current characteristics of fuel cells present challenges in t...
详细信息
Sensors, electronic devices, and smart systems have invaded the market and our daily lives. As a result, their utility in smart home contexts to improve the quality of life, especially for the elderly and people with ...
详细信息
Most power electronics equipment like solar inverters or solid-state circuit breakers (SSCBs) require short-time overload current requirements, which subsequently necessitate the surge current capability of power semi...
详细信息
We present a novel trajectory traversability estimation and planning algorithm for robot navigation in complex outdoor environments. We incorporate multimodal sensory inputs from an RGB camera, 3D LiDAR, and the robot...
详细信息
This paper studies the power density limits of propulsion motor for electric aircraft considering thermal aspects and breakdown voltage reduction of insulation. The study em-ploys multi-objective optimization (MOO) to...
详细信息
The research discusses a multimodal framework for analyzing and detecting fake news and understanding its impact on society. This framework employs diverse strategies, including linguistic analysis, social network mon...
详细信息
暂无评论