Camouflaged object detection (COD) has been increasingly studied, and detection performance has improved greatly in recent years thanks to deep learning models. However, existing COD methods do not use context and boundary information efficiently at the same time, leading to inferior detection of large camouflaged objects, occluded objects, multiple and small objects, and objects with rich boundaries. Therefore, to effectively enhance COD performance, we propose a novel camouflaged object detection model, i.e., context-aware and boundary refinement (CABR). Specifically, CABR mainly consists of three modules: the global context information enhanced module (GCIEM), the attention-inducing neighbor fusion module (AINFM), and the boundary refinement module (BRM). GCIEM is designed to fully capture long-range dependencies and obtain rich global context information so as to completely detect large and occluded objects. AINFM adaptively fuses adjacent layers to attend to global and local context information simultaneously, effectively improving the detection of multiple and small camouflaged objects. BRM refines the boundaries by utilizing the spatial information in low-level features and suppressing non-camouflage factors, so as to effectively detect camouflaged objects with rich boundaries. Quantitative and qualitative experiments on four benchmark datasets demonstrate the effectiveness of our CABR, which achieves performance competitive with existing state-of-the-art methods on most evaluation metrics.
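The long-range dependency capture that a global-context module like GCIEM relies on can be illustrated with a minimal non-local-attention sketch. This is a generic NumPy illustration under assumed shapes; the function name and layout are not the paper's actual module.

```python
import numpy as np

def global_context_attention(x):
    """Non-local-style global context (hypothetical sketch, not GCIEM itself).

    x: feature map of shape (C, H, W). Every position attends to every
    other position, so the response at one pixel aggregates information
    from the whole map (long-range dependencies).
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                       # (C, N)
    # Pairwise affinity between all spatial positions, then row softmax.
    affinity = flat.T @ flat                         # (N, N)
    affinity -= affinity.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each position becomes a weighted mix of all positions' features.
    out = flat @ weights.T                           # (C, N)
    return x + out.reshape(c, h, w)                  # residual connection

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
enhanced = global_context_attention(feat)
print(enhanced.shape)  # (8, 4, 4)
```

Because every output position mixes features from the entire map, a detector built on such a block can relate distant parts of a large or partially occluded object.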
Camouflaged object detection (COD) aims to identify objects that conceal themselves in natural scenes. Accurate COD suffers from a number of challenges associated with low boundary contrast and large variation of object appearances, e.g., object size and shape. To address these challenges, we propose a novel Context-aware Cross-level Fusion Network (C2F-Net), which fuses context-aware cross-level features for accurately identifying camouflaged objects. Specifically, we compute informative attention coefficients from multi-level features with our Attention-induced Cross-level Fusion Module (ACFM), which further integrates the features under the guidance of the attention coefficients. We then propose a Dual-branch Global Context Module (DGCM) to refine the fused features into informative feature representations by exploiting rich global context information. Multiple ACFMs and DGCMs are integrated in a cascaded manner to generate a coarse prediction from high-level features. The coarse prediction acts as an attention map to refine the low-level features before passing them to our Camouflage Inference Module (CIM), which generates the final prediction. We perform extensive experiments on three widely used benchmark datasets and compare C2F-Net with state-of-the-art (SOTA) models. The results show that C2F-Net is an effective COD model that outperforms SOTA models remarkably. Further, an evaluation on polyp segmentation datasets demonstrates the promising potential of C2F-Net in COD downstream applications. Our code is publicly available at: https://***/Ben57882/C2FNet-TSCVT
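The idea of deriving attention coefficients from multi-level features and fusing the levels under their guidance can be sketched roughly as follows. This is a toy NumPy illustration under assumed shapes and a deliberately simple pooled-score attention; the real ACFM is certainly richer.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_induced_fusion(levels):
    """Hypothetical attention-guided cross-level fusion.

    levels: list of (C, H, W) feature maps at the same resolution.
    A scalar attention coefficient is derived per level from its
    global-average-pooled response, and the maps are fused as an
    attention-weighted sum.
    """
    scores = np.array([f.mean() for f in levels])   # one descriptor per level
    coeffs = softmax(scores)                        # attention coefficients
    fused = sum(w * f for w, f in zip(coeffs, levels))
    return fused, coeffs

rng = np.random.default_rng(1)
feats = [rng.standard_normal((16, 8, 8)) for _ in range(3)]
fused, coeffs = attention_induced_fusion(feats)
print(fused.shape)  # (16, 8, 8)
```

The softmax keeps the fusion a convex combination, so no single level can dominate unboundedly; a learned module would replace the pooled score with a small network.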
ISBN:
(Print) 9781665468916
The task of camouflaged object detection (COD) aims to accurately segment camouflaged objects that are integrated into the environment, which is more challenging than ordinary detection because the texture of the target and the background is visually indistinguishable. In this paper, we propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the COD task. Specifically, we use a CNN and a Transformer to encode multi-scale images in parallel. To better exploit the advantages of the two encoders, we design a cross-attention-based Feature Grafting Module to graft features extracted from the Transformer branch onto the CNN branch, after which the features are aggregated in the Feature Fusion Module. A Distractor Aware Module is designed to explicitly model the two possible kinds of distractors in the COD task to refine the coarse camouflage map. We also propose the largest artificial camouflaged object dataset, named ACOD2K, which contains 2000 images with annotations. We conducted extensive experiments on four widely used benchmark datasets and the ACOD2K dataset. The results show that our method significantly outperforms other state-of-the-art methods. The code and ACOD2K will be available at https://***/syxvision/FDNet.
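The grafting direction described here, with Transformer context injected into the CNN branch, can be sketched as single-head cross-attention over flattened tokens. This is a hypothetical token-level illustration, not FDNet's actual layers.

```python
import numpy as np

def cross_attention_graft(cnn_feat, trans_feat):
    """Minimal cross-attention sketch of the grafting idea.

    Queries come from the CNN branch, keys/values from the Transformer
    branch, so the CNN features absorb globally modelled Transformer
    context. cnn_feat, trans_feat: (N, D) token matrices.
    """
    d = cnn_feat.shape[1]
    q, k, v = cnn_feat, trans_feat, trans_feat
    logits = q @ k.T / np.sqrt(d)                 # (N, N) scaled dot-product
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)
    grafted = attn @ v                            # (N, D)
    return cnn_feat + grafted                     # residual graft

rng = np.random.default_rng(2)
cnn_tokens = rng.standard_normal((64, 32))   # e.g. an 8x8 map flattened
vit_tokens = rng.standard_normal((64, 32))
out = cross_attention_graft(cnn_tokens, vit_tokens)
print(out.shape)  # (64, 32)
```

A learned module would add projection matrices for Q, K, and V; the residual keeps the CNN branch's local detail intact while adding global context.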
ISBN:
(Print) 9798350349405; 9798350349399
Infrared (IR) images can be seen as complementary to visible light (RGB) images, as they can capture accurate targets in low-visibility conditions. However, camouflaged object detection (COD) based on RGB and IR images is expensive. To this end, we propose to exploit a style transfer-based data augmentation method to generate pseudo-IR images by absorbing the style information of IR images into RGB images, and to perform COD based on RGB and the pseudo-IR images. For RGB and IR-based COD, we propose a novel Edge-guided Uncertainty-aware Fusion Network (EUFNet) to make better use of the complementarity between the two kinds of images. Specifically, an uncertainty-aware fusion module is first proposed to aggregate RGB and IR features by estimating their uncertainties. Then, an edge enhancement module is proposed to extract and enhance the edge information in multiple stages. Lastly, a hierarchical integration module is designed to integrate RGB and IR features with edge cues. Extensive experiments demonstrate the effectiveness of the generated pseudo-IR images as well as the proposed EUFNet. The code is available at https://***/csdahunzi/COD.
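A minimal sketch of uncertainty-aware fusion of two modalities, assuming uncertainty is measured as the binary entropy of each modality's prediction and that the more certain modality gets the larger weight. The abstract does not specify EUFNet's actual estimator, so this is an illustrative stand-in.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def uncertainty_aware_fusion(rgb_logits, ir_logits):
    """Hypothetical uncertainty-weighted fusion of per-pixel logits.

    Per-pixel uncertainty is the binary entropy of each modality's
    prediction; fusion weights are the normalised inverse uncertainties.
    """
    def entropy(logits):
        p = np.clip(sigmoid(logits), 1e-6, 1 - 1e-6)
        return -(p * np.log(p) + (1 - p) * np.log(1 - p))

    u_rgb, u_ir = entropy(rgb_logits), entropy(ir_logits)
    w_rgb = 1.0 / (u_rgb + 1e-6)        # certain pixels get large weight
    w_ir = 1.0 / (u_ir + 1e-6)
    total = w_rgb + w_ir
    return (w_rgb * rgb_logits + w_ir * ir_logits) / total

rng = np.random.default_rng(3)
rgb = rng.standard_normal((32, 32))
ir = rng.standard_normal((32, 32))
fused = uncertainty_aware_fusion(rgb, ir)
print(fused.shape)  # (32, 32)
```

The design choice is that fusion happens per pixel, so a modality can dominate in one image region (e.g., IR at night) while deferring elsewhere.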
ISBN:
(Print) 9798350359329; 9798350359312
Camouflaged object detection (COD) aims to segment objects that closely resemble their surroundings. Accurately recognizing camouflaged objects in such complex environments is challenging due to factors such as low illumination, object occlusion, small object size, and similar backgrounds. To this end, we propose a novel network for camouflaged object detection, the Multi-Level Feature Cross-Fusion Network (MFCF-Net). This framework learns and utilizes background features at different scales through cross-fusion, thereby improving detection accuracy. The core of our approach is a modified version of the Pyramid Vision Transformer (PVTv2) as the backbone network to effectively capture contextual information at different scales. We then design a Multi-scale Feature Enhancement (MFE) module to optimize the features at each scale. In addition, to enhance the model's ability to recognize camouflaged objects in complex contexts, we cross-fuse these enhanced features. Finally, we design the Balanced Multi-level Feature Cross-Fusion (BMFCF) module, which improves detection accuracy by deeply learning and effectively utilizing contextual feature information and cross-fusing the multi-scale features. Extensive experimental results show that our MFCF-Net significantly outperforms 18 leading methods on four widely used standard datasets.
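Cross-fusing features from two adjacent scales can be illustrated as follows: the coarser map is upsampled to the finer resolution and each branch is gated by the other before summation. This is a deliberately simplified stand-in; the gating choice and shapes are assumptions, not the BMFCF module itself.

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return f.repeat(2, axis=1).repeat(2, axis=2)

def cross_fuse(fine, coarse):
    """Illustrative cross-fusion of two adjacent feature levels.

    The coarse map is upsampled to the fine resolution, then each
    branch is modulated by a sigmoid gate computed from the other,
    so information flows in both directions before the sum.
    """
    coarse_up = upsample2x(coarse)
    gate_f = 1.0 / (1.0 + np.exp(-coarse_up))   # coarse gates fine
    gate_c = 1.0 / (1.0 + np.exp(-fine))        # fine gates coarse
    return fine * gate_f + coarse_up * gate_c

rng = np.random.default_rng(5)
fine = rng.standard_normal((4, 8, 8))     # higher-resolution level
coarse = rng.standard_normal((4, 4, 4))   # lower-resolution level
fused_map = cross_fuse(fine, coarse)
print(fused_map.shape)  # (4, 8, 8)
```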
ISBN:
(Print) 9781510650695; 9781510650688
Camouflage is the art of deception, often used in the animal world. It is also used on the battlefield to hide military assets. Camouflaged objects hide within their environments by taking on colors and textures similar to their surroundings. In this work, we explore the classification and localization of camouflaged enemy assets, including soldiers. We address two major challenges: a) how to overcome the paucity of domain-specific labeled data, and b) how to perform camouflaged object detection using edge devices. To address the first challenge, we develop a deep neural style transfer model that blends content images of objects such as soldiers, tanks, and mines/improvised explosive devices with style images depicting deserts, jungles, and snow-covered regions. To address the second challenge, we develop combined depth-guided deep neural network models that combine image features with depth features. Previous research suggests that depth features not only contain local information about object geometry but also provide information on position and shape for camouflaged object identification and localization. In this work, we use a precomputed monocular method to generate the depth maps. The novel fusion-based architecture provides an efficient representation learning space for object detection. In addition, we perform ablation studies to measure the performance of depth versus RGB in detecting camouflaged objects. We also demonstrate how such a model can be deployed on edge devices for real-time object identification and localization.
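One standard way to blend content with style statistics is adaptive instance normalisation (AdaIN): the channel-wise mean and standard deviation of the content features are replaced by those of the style features. It is shown here as a stand-in for the style-transfer step; the paper's actual blending network may differ.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalisation on (C, H, W) feature maps.

    The content features are normalised per channel, then rescaled
    and shifted to match the style's per-channel statistics (e.g.
    desert, jungle, or snow style features).
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean

rng = np.random.default_rng(4)
soldier_feat = rng.standard_normal((8, 16, 16))              # content features
desert_feat = 2.0 * rng.standard_normal((8, 16, 16)) + 1.0   # style features
stylised = adain(soldier_feat, desert_feat)
print(stylised.shape)  # (8, 16, 16)
```

After the transform, each channel of the output carries the style's statistics while keeping the content's spatial arrangement, which is what makes such blended images usable as augmented training data.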
To meet the challenge of camouflaged object detection (COD), which involves a high degree of intrinsic similarity between object and background, this paper proposes a double-branch fusion network (DBFN) with a parallel attention selection mechanism (PASM). In detail, a schismatic receptive field block (SRF) combined with an attention mechanism for low-level information is used to learn texture features in one branch, and an integration of the SRF, a hybrid attention mechanism (HAM), and a depth feature polymerization module (DFPM) is employed for high-level information to extract detection features in the other branch. Then, both texture features and detection features are input into the PASM to acquire selective expression. Finally, the final result is obtained after further selective matrix optimization with atrous spatial pyramid pooling (ASPP) and a residual channel attention block (RCAB). Experimental results on three public datasets verify that our method outperforms state-of-the-art methods in terms of four evaluation metrics, i.e., mean absolute error (MAE), weighted F-measure (Fβω), structural measure (Sα), and E-measure (Eφ).
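The ASPP component mentioned above applies parallel dilated (atrous) convolutions at several rates so that branches with different receptive fields see the same map. A single-channel toy version with a fixed averaging kernel, purely for illustration (a real ASPP learns its weights and concatenates branches):

```python
import numpy as np

def dilated_conv3x3(x, kernel, rate):
    """Single-channel 3x3 dilated convolution with zero padding."""
    h, w = x.shape
    pad = rate
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += kernel[di + 1, dj + 1] * xp[
                pad + di * rate: pad + di * rate + h,
                pad + dj * rate: pad + dj * rate + w,
            ]
    return out

def aspp(x, rates=(1, 2, 4)):
    """Toy ASPP: parallel 3x3 convolutions with growing dilation rates
    capture context at several scales; branch outputs are averaged."""
    kernel = np.full((3, 3), 1.0 / 9.0)
    return sum(dilated_conv3x3(x, kernel, r) for r in rates) / len(rates)

rng = np.random.default_rng(6)
fmap = rng.standard_normal((16, 16))
ctx = aspp(fmap)
print(ctx.shape)  # (16, 16)
```

Higher rates widen the receptive field without adding parameters or reducing resolution, which is why ASPP is a common context-aggregation choice.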
Camouflaged object detection (COD) is a challenging visual task due to complex contours, diverse scales, and high similarity to the background. Existing COD methods encounter two predicaments: one is that they are prone to falling into local perception, resulting in inaccurate object localization; the other is the difficulty of achieving precise object segmentation due to a lack of detailed information. In addition, most COD methods require larger parameter counts and higher computational complexity in pursuit of better performance. To this end, we propose a global localization perception and local guidance refinement network (PRNet) that simultaneously addresses performance and computational cost. Through effective aggregation and use of semantic and detail information, PRNet achieves accurate localization and refined segmentation of camouflaged objects. Specifically, with the help of the designed Cascaded Attention Perceptron (CAP), we effectively integrate and perceive multi-scale information to localize camouflaged objects. We also design a Guided Refinement Decoder (GRD) that works in a top-down manner to extract context information and aggregate details, further refining the camouflage prediction results. Extensive experimental results demonstrate that our PRNet outperforms 12 state-of-the-art models on 4 challenging datasets. Meanwhile, PRNet has a small number of parameters (12.74M), low computational complexity (10.24G), and real-time inference speed (105 FPS). Source code is available at https://***/hu-xh/PRNet.
The key factor hindering the performance of current camouflaged object detection (COD) models is the lack of discriminability of features at fine granularity. We address this problem from two complementary perspectives. First, complex scenes cause the discriminative feature representations of camouflaged objects to appear at different scales and semantic abstraction levels, so a mechanism is needed to increase the diversity of features and integrate more information potentially beneficial for COD. Second, appearance similarity between objects and environments inevitably leads to similarity in features, and enhancing feature diversity alone is not enough; the model must also be given semantic perception capabilities that expand the subtle discrepancies between objects and environments in the feature embedding. Motivated by the first point, we propose a cross-scale interaction module (CSIM) that utilizes cross-attention between different scales to enhance the diversity of feature representations. Regarding the second point, semantic guided feature learning (SGFL) is proposed to push the model to expand feature discrepancies through explicit supervision. Experiments on four popular COD datasets show that our method outperforms recent SOTA methods. In addition, polyp segmentation experiments show that it is also effective for other COD-like tasks.
Recently, camouflaged object detection (COD), which suffers from numerous challenges such as low contrast between camouflaged objects and the background and large variations in camouflaged object appearance, has received more and more attention. However, the performance of existing camouflaged object detection methods is still unsatisfactory, especially when dealing with complex scenes. Therefore, in this article, we propose a novel Decoupling and Integration Network (DINet) to detect camouflaged objects. Here, the depiction of camouflaged objects can be regarded as the iterative decoupling and integration of body features and detail features, where the former focus on the center of camouflaged objects and the latter contain pixels around the edges. Concretely, we first deploy two complementary decoder branches, a detail branch and a body branch, to learn the decoupled features, namely body decoder features and detail decoder features. Each decoder block of the two branches incorporates features from three sources: the previous interactive feature fusion (IFF) module, adjacent encoder layers, and the corresponding encoder layer. Besides, to further strengthen the body decoder features, the body blocks also introduce global contextual information, obtained by combining all body encoder features via the global context (GC) unit, to provide coarse object location information. Second, to integrate the two kinds of decoupled decoder features, we deploy the interactive feature fusion (IFF) module based on interactive combination and channel attention. In this way, we can progressively build a complete and accurate representation of camouflaged objects. Extensive experiments on three public challenging datasets, including CAMO, COD10K, and NC4K, show that our DINet presents competitive performance compared with state-of-the-art models.
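The body/detail split targeted by the two decoder branches can be illustrated on a ground-truth mask: erode the object to get a body map concentrated at the center, and take the remaining rim as the detail map. This is a toy erosion-based decoupling for intuition only; DINet learns such features rather than computing them this way.

```python
import numpy as np

def decouple_mask(mask, iterations=2):
    """Split a binary object mask into a body map (object centre) and a
    detail map (pixels around the edge) via simple 4-neighbour erosion."""
    body = mask.astype(bool)
    for _ in range(iterations):
        # A pixel survives erosion only if all 4 neighbours are object.
        shifted = (
            np.roll(body, 1, 0) & np.roll(body, -1, 0)
            & np.roll(body, 1, 1) & np.roll(body, -1, 1)
        )
        body = body & shifted
    detail = mask.astype(bool) & ~body   # ring of edge pixels
    return body, detail

mask = np.zeros((9, 9), dtype=int)
mask[2:7, 2:7] = 1                       # a 5x5 square object
body, detail = decouple_mask(mask)
print(body.sum(), detail.sum())  # 1 24
```

The two maps are disjoint and together cover the object, mirroring how the body branch supervises the center and the detail branch the edge region.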