Crowds can lead to disastrous consequences, including fatalities. Videos obtained through public cameras or captured by drones flying overhead can be processed with artificial intelligence-based crowd ana...
In this paper, we study offline-to-online Imitation Learning (IL) that pretrains an imitation policy from static demonstration data, followed by fast finetuning with minimal environmental interaction. We find the naïve combination of existing offline IL and online IL methods tends to behave poorly in this context, because the initial discriminator (often used in online IL) operates randomly and discordantly against the policy initialization, leading to misguided policy optimization and unlearning of pretraining knowledge. To overcome this challenge, we propose a principled offline-to-online IL method, named OLLIE, that simultaneously learns a near-expert policy initialization along with an aligned discriminator initialization, which can be seamlessly integrated into online IL, achieving smooth and fast finetuning. Empirically, OLLIE consistently and significantly outperforms the baseline methods in 20 challenging tasks, from continuous control to vision-based domains, in terms of performance, demonstration efficiency, and convergence speed. This work may serve as a foundation for further exploration of pretraining and finetuning in the context of IL. Copyright 2024 by the author(s)
In this paper, we introduce an innovative approach to weakly supervised medical image segmentation with box annotations. Unlike previous methods, which simply utilize a single conventional network with the ...
Our paper introduces a novel video dataset specifically for Temporal Intention Localization (TIL), aimed at identifying hidden abnormal intentions in densely populated and complex environments. Traditional Temporal Act...
Speech is a fundamental means of human interaction. Speaker Identification (SI) plays a crucial role in various applications, such as authentication systems, forensic investigation, and personal voice assistance. However, achieving robust and secure SI in both open and closed environments remains challenging. To address this issue, researchers have explored new techniques that enable computers to better understand and interact with humans. Smart systems leverage Artificial Neural Networks (ANNs) to mimic the human brain in identifying speakers. However, speech signals often suffer from interference, leading to signal degradation. The performance of a Speaker Identification System (SIS) is influenced by various environmental factors, such as noise and reverberation in open and closed environments, respectively. This paper investigates SI using Mel-Frequency Cepstral Coefficients (MFCCs) and polynomial coefficients, with an ANN serving as the classifier. To tackle the challenges posed by environmental interference, we propose a novel approach that relies on symmetric comb filters for modeling. In closed environments, we study the effect of reverberation, which arises from multiple reflections, on speech signals, and we model this effect with comb filters. We explore different domains, including time, Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), and Discrete Sine Transform (DST) domains for feature extraction to determine the best combination for SI in reverberant environments. Simulation results reveal that DWT outperforms other transforms, leading to a recognition rate of 93.75% at a Signal-to-Noise Ratio (SNR) of 15 dB. Additionally, we investigate the concept of cancelable SI to ensure user privacy, while maintaining high recognition rates. Our simulation results show a recognition rate of 97.5% at 0 dB using features extracted from speech signals and their DCTs. Fo
Teachers take attendance by having pupils sign in or check in for classes and transportation. Student absences often result from individual mistakes. This article examines a technology that records data from classroom pho...
Context: Android games have gained wide attention from users in recent years. However, the existing literature reports alarming statistics about banning popular and top-trending Android apps. The popular gaming apps h...
During the past few years, especially after the emergence of the COVID-19 pandemic, researchers have devoted their efforts to improving the global health sector by supporting it with the latest technologies. Among the...
Non-Intrusive Load Monitoring (NILM) within a Federated Learning (FL) framework has become a growing area of study aimed at providing a secure energy disaggregation system in smart homes. This study aims at deploying an ...
This article proposes a novel approach to traffic signal control that combines phase re-service with reinforcement learning (RL). The RL agent directly determines the duration of the next phase in a pre-defined sequen...