Object detection is a critical component in autonomous driving systems, requiring robust performance across diverse lighting conditions, including nighttime scenarios where RGB cameras underperform due to low visibili...
ISBN:
(Print) 9781665417143
The perception of the environment plays a decisive role in the safe and secure operation of autonomous vehicles. Perceiving the surroundings is similar to human vision: the human brain perceives the environment through different sensory channels and develops a view-invariant representation model. In this context, different exteroceptive sensors, such as cameras and Lidar, are deployed on the autonomous vehicle to perceive the environment. These sensors have demonstrated their benefit in the visible spectrum domain, yet they struggle under adverse conditions; for instance, they have limited operational capability at night, which can lead to fatal accidents. This work explores thermal object detection to model a view-invariant representation by employing a self-supervised contrastive learning approach. We propose a deep neural network, the Self-Supervised Thermal Network (SSTN), for learning feature embeddings that maximize the information between the visible and infrared spectrum domains via contrastive learning. These learned feature representations are then employed for thermal object detection using a multi-scale encoder-decoder transformer network. The proposed method is extensively evaluated on two publicly available datasets: the FLIR-ADAS dataset and the KAIST Multi-Spectral dataset. The experimental results illustrate the efficacy of the proposed method.
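The core of the SSTN approach described above is a contrastive objective that maximizes the mutual information between paired visible and infrared embeddings. The following is a minimal, hypothetical sketch of such a cross-spectral InfoNCE loss in PyTorch; it is not the authors' released code, and the function name, batch layout, and temperature value are illustrative assumptions.

```python
# Illustrative sketch (not the authors' SSTN code): an InfoNCE-style contrastive
# loss that pulls paired visible/thermal embeddings together, in the spirit of
# learning a view-invariant representation across spectra.
import torch
import torch.nn.functional as F

def cross_spectral_infonce(z_vis, z_thermal, temperature=0.07):
    """z_vis, z_thermal: (N, D) embeddings of paired RGB/thermal crops."""
    z_vis = F.normalize(z_vis, dim=1)
    z_thermal = F.normalize(z_thermal, dim=1)
    logits = z_vis @ z_thermal.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(z_vis.size(0), device=z_vis.device)
    # Symmetric loss: each visible embedding should match its thermal pair and vice versa.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Example usage with random features standing in for encoder outputs.
loss = cross_spectral_infonce(torch.randn(8, 128), torch.randn(8, 128))
```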
Authors:
Yuan, Chunyu; Agaian, Sos S.
CUNY Grad Ctr, Dept Comp Sci, New York, NY 10016, USA
CUNY Coll Staten Isl, Dept Comp Sci, New York, NY 10021, USA
ISBN:
(Print) 9781510650770; 9781510650763
As traditional RGB cameras cannot perform well under weak light in darkness and in poor weather conditions, thermal cameras have become an essential component of edge systems. This paper proposes a lightweight binarized Faster R-CNN network (a state-of-the-art instance segmentation model), called BithermalNet, for thermal object detection with high detection capability and lower memory usage. It designs a new Region Proposal Network (RPN) structure with a binary neural network (BNN) that lowers model size by 16% while achieving higher accuracy. BithermalNet adds newly designed residual gates to maximize information entropy and introduces channel-wise weights and biases to reduce binarization error. Extensive experiments on different thermal datasets (such as the Dogs&People thermal dataset and UNIRI-TID) confirm that BithermalNet outperforms the traditional Faster R-CNN by sizable margins with a smaller model size. Moreover, a comparative analysis of the proposed methods on thermal images is also presented.
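BithermalNet's key ingredients, as described above, are binarized RPN convolutions plus channel-wise weights and biases that compensate for binarization error. The sketch below illustrates a generic binary convolution layer with a straight-through estimator and learnable channel-wise scale/bias terms; it is an assumption-laden stand-in, not the BithermalNet implementation, and the residual gates are omitted.

```python
# Hedged sketch of a binarized convolution with channel-wise compensation.
# Weights are binarized with a straight-through estimator (STE); learnable
# channel-wise scale and bias reduce the error introduced by binarization.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarySignSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Straight-through estimator: pass gradients only where |w| <= 1.
        return grad_out * (w.abs() <= 1).float()

class BinaryConv2d(nn.Module):
    def __init__(self, c_in, c_out, k=3, padding=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(c_out, c_in, k, k) * 0.01)
        self.scale = nn.Parameter(torch.ones(c_out, 1, 1))   # channel-wise weight
        self.bias = nn.Parameter(torch.zeros(c_out, 1, 1))   # channel-wise bias
        self.padding = padding

    def forward(self, x):
        w_bin = BinarySignSTE.apply(self.weight)
        y = F.conv2d(x, w_bin, padding=self.padding)
        return y * self.scale + self.bias

# Example: run a thermal feature map through the binarized layer.
feat = torch.randn(1, 64, 32, 32)
out = BinaryConv2d(64, 128)(feat)
```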
ISBN:
(Print) 9798350353013; 9798350353006
Domain adaptation for object detection typically entails transferring knowledge from one visible domain to another visible domain. However, there are limited studies on adapting from the visible to the thermal domain, because the domain gap between the visible and thermal domains is much larger than expected, and traditional domain adaptation cannot successfully facilitate learning in this situation. To overcome this challenge, we propose a Distinctive Dual-Domain Teacher (D3T) framework that employs distinct training paradigms for each domain. Specifically, we segregate the source and target training sets to build dual teachers and successively apply the exponential moving average of the student model to the individual teacher of each domain. The framework further incorporates a zigzag learning method between the dual teachers, facilitating a gradual transition from the visible to the thermal domain during training. We validate the superiority of our method through newly designed experimental protocols with well-known thermal datasets, i.e., FLIR and KAIST. Source code is available at https://***/EdwardDo69/D3T.
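The dual-teacher mechanism described above maintains one teacher per domain and refreshes each from the shared student with an exponential moving average, alternating between domains in a zigzag schedule. The snippet below is a minimal sketch of that update loop under assumed names and momentum; it is not the released D3T training code, and the pseudo-labeling step is elided.

```python
# Minimal sketch: EMA updates of two domain-specific teachers from one student,
# alternated ("zigzag") across iterations. Models are stand-ins for detectors.
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, momentum: float = 0.999):
    """Blend student weights into the teacher: theta_t <- m*theta_t + (1-m)*theta_s."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

student = nn.Linear(16, 4)                 # stand-in for a detector
teacher_visible = copy.deepcopy(student)
teacher_thermal = copy.deepcopy(student)

for step in range(10):
    # ... train `student` here on pseudo-labels produced by one of the teachers ...
    if step % 2 == 0:                      # zigzag: alternate which teacher is refreshed
        ema_update(teacher_visible, student)
    else:
        ema_update(teacher_thermal, student)
```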
The Segment Anything Model (SAM) is drastically accelerating the speed and accuracy of automatically segmenting and labeling large Red-Green-Blue (RGB) imagery datasets. However, SAM is unable to segment and label images outside of the visible light spectrum, for example, multispectral or hyperspectral imagery. Therefore, this paper outlines a method we call the Multispectral Automated Transfer Technique (MATT). By transposing SAM segmentation masks from RGB images, we can automatically segment and label multispectral imagery with high precision and efficiency. For example, the results demonstrate that segmenting and labeling a 2,400-image dataset utilizing MATT achieves a time reduction of 87.8% in developing a trained model, reducing roughly 20 hours of manual labeling to only 2.4 hours. This efficiency gain is associated with only a 6.7% decrease in overall mean average precision (mAP) when training multispectral models via MATT, compared to a manually labeled dataset. We consider this an acceptable level of precision loss given the time saved during training, especially for rapidly prototyping experimental modeling methods. This research greatly contributes to the study of multispectral object detection by providing a novel and open-source method to rapidly segment, label, and train multispectral object detection models with minimal human interaction. Future research needs to focus on applying these methods to 1) space-based multispectral imagery and 2) drone-based hyperspectral imagery.
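MATT's central step, as summarized above, is transposing SAM masks produced on RGB frames onto co-registered multispectral imagery so the labels carry over without manual annotation. The sketch below shows one plausible version of that mask transfer, assuming the RGB and multispectral images are spatially aligned; the function name and array shapes are illustrative assumptions, not the project's released tooling.

```python
# Hedged sketch of the mask-transfer idea: segmentation masks produced on an RGB
# frame are resized and reapplied to a co-registered multispectral cube of the
# same scene, so every band inherits the RGB labels.
import numpy as np
import cv2

def transfer_masks(rgb_masks, ms_cube):
    """rgb_masks: list of HxW boolean masks from the RGB image.
    ms_cube: H'xW'xB multispectral array co-registered with the RGB frame."""
    h, w = ms_cube.shape[:2]
    transferred = []
    for m in rgb_masks:
        resized = cv2.resize(m.astype(np.uint8), (w, h),
                             interpolation=cv2.INTER_NEAREST)
        transferred.append(resized.astype(bool))
    return transferred  # masks now label every band of the multispectral cube

# Example with synthetic data: one mask on a 512x512 RGB frame, a 256x256x5 cube.
masks = [np.zeros((512, 512), dtype=bool)]
masks[0][100:200, 100:200] = True
ms_masks = transfer_masks(masks, np.random.rand(256, 256, 5))
```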
Pedestrian detection is an important task in computer vision and an important part of intelligent transportation systems. For privacy protection, thermal images are widely used in pedestrian detection problems. However, thermal pedestrian detection is challenging because temperature variation significantly affects the illumination of images and fine-grained illumination annotations are difficult to acquire. Existing methods have attempted to exploit coarse-grained day/night labels, which, however, can even hamper model performance. In this work, we introduce a novel idea of regressing the conditional thermal-visible feature distribution, dubbed Illumination Distribution-Aware adaptation (IDA). The key idea is to predict the conditional visible feature distribution given a thermal image, subject to their pre-computed joint distribution. Specifically, we first estimate the thermal-visible feature joint distribution by constructing feature co-occurrence matrices, which provide a conditional probability distribution for any given thermal image. With this pairing information, we then form a conditional probability distribution regression task for model optimization. Critically, as a model-agnostic strategy, this allows the visible feature knowledge to be transferred to the thermal counterpart implicitly, yielding a more discriminating feature representation. Experimental results show that our method outperforms prior state-of-the-art methods that use extra illumination annotations. Besides, as a plug-in, our method reduces MR by about 2% on average on the KAIST dataset and improves mAP by about 1% on the FLIR-aligned and Autonomous Vehicles datasets, without extra computation at test time. Code is available at https://***/HaMeow-lst1/IDA.
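IDA's first step, per the abstract above, is estimating the thermal-visible feature joint distribution via co-occurrence matrices and deriving a conditional distribution that the detector then learns to regress. The code below is a toy, scalar-feature sketch of that estimation under assumed binning choices; the actual method operates on deep feature maps and is not reproduced here.

```python
# Rough sketch (not the authors' IDA code): quantize paired thermal and visible
# feature responses into bins, count joint occurrences, and normalize rows to
# obtain P(visible bin | thermal bin), i.e., the conditional distribution a
# model could be trained to regress.
import numpy as np

def conditional_distribution(thermal_feats, visible_feats, n_bins=16):
    """thermal_feats, visible_feats: 1-D arrays of paired scalar feature responses."""
    t_edges = np.linspace(thermal_feats.min(), thermal_feats.max(), n_bins)
    v_edges = np.linspace(visible_feats.min(), visible_feats.max(), n_bins)
    t_bins = np.digitize(thermal_feats, t_edges)
    v_bins = np.digitize(visible_feats, v_edges)
    joint = np.zeros((n_bins + 1, n_bins + 1))
    for t, v in zip(t_bins, v_bins):
        joint[t, v] += 1                        # co-occurrence counts
    row_sums = joint.sum(axis=1, keepdims=True)
    return joint / np.clip(row_sums, 1, None)   # P(visible bin | thermal bin)

# Example with synthetic, loosely correlated responses.
t = np.random.rand(1000)
v = 0.7 * t + 0.3 * np.random.rand(1000)
cond = conditional_distribution(t, v)
```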