Existing camouflaged object detection methods have made impressive progress; however, interference from highly similar backgrounds and indistinguishable object boundaries still hinders detection accuracy. In this paper, we propose a three-stage bilateral decoupling complementarity learning network (BDCL-Net) to explore how to exploit the specific advantages of multi-level encoded features for high-quality inference. Specifically, all side-output features are decoupled into two branches to generate three complementary features. Unlike previous methods that focus on obtaining the camouflaged object and body boundary, our body modeling stage, which includes a global positioning flow (GPF) module and a multi-scale body warping (MBW) module, is deployed to obtain a global contextual feature that provides coarse localization of potential camouflaged objects and a body feature that emphasizes learning the central areas of camouflaged objects. The detail preservation stage is designed to generate a detail feature that attends to the regions around the boundary. Consequently, the body prediction can avoid disturbances from highly similar backgrounds, while the detail prediction can reduce errors caused by imbalanced boundary pixels. The complementary feature integration (CFI) module in the feature aggregation stage is designed to fuse these complementary features in an interactive learning manner. We conduct extensive experiments on four public datasets to demonstrate the effectiveness and superiority of the proposed network. The code is available at http://***/iuueong/BDCLNet.
Camouflaged object detection (COD) aims to segment target objects that have similar colors, textures, or shapes to their background environment. Because of their limited ability to distinguish highly similar patterns, existing COD methods usually produce inaccurate predictions in complex scenes, especially around boundary areas. This paper proposes a Progressive Region-to-Boundary Exploration Network (PRBE-Net) to accurately detect camouflaged objects. PRBE-Net follows an encoder-decoder framework and includes three key modules. First, high-level and low-level encoder features are integrated by a region and boundary exploration module, which exploits their complementary information to extract the object's coarse region and fine boundary cues simultaneously. Second, taking the region cues as guidance, a Region Enhancement (RE) module adaptively localizes and enhances the region information at each layer of the encoder. Third, considering that camouflaged objects usually have blurry boundaries, a Boundary Refinement (BR) decoder is applied after the RE module to better detect the boundary areas with the assistance of boundary cues. Through top-down deep supervision, PRBE-Net progressively refines the prediction. Extensive experiments on four datasets indicate that PRBE-Net achieves superior results over 21 state-of-the-art COD methods. It also shows good results on polyp segmentation, a COD-related task in the medical field.
Authors: Ren, Peng; Bai, Tian; Sun, Fuming
Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
Dalian Minzu Univ, Sch Informat & Commun Engn, Dalian 116600, Peoples R China
Although most existing camouflaged object detection methods have achieved significant progress, they still have two limitations. First, they usually ignore the internal topological structure of objects. Second, they fail to balance performance and efficiency. To address these issues, we propose an Efficient Skeleton-guided Network (ESNet), which uses skeleton information to focus on the geometric structure and extension properties of objects. Specifically, to improve the network's ability to represent skeleton information, we design a lightweight skeleton encoder and create skeleton labels to supervise the skeleton features it generates. To enhance the network's ability to perceive the internal structure of objects, we design a Skeleton Guidance Fusion Module (SGFM), which introduces skeleton information into semantic features. Furthermore, we introduce a Part-object Relational Mapping (PORM) component to further improve the micro integrity of camouflaged objects. Extensive experimental results show that ESNet achieves excellent performance with only 10.8M parameters and real-time inference speed (140.3 FPS).
Compared with traditional object detection or segmentation tasks, camouflaged object detection (COD) poses greater challenges, as humans are often perplexed or deceived by the inherent similarities between foreground objects and their background surroundings. Polarization information is a valuable asset for discerning the attributes of objects with varied characteristics and surface textures. Taking inspiration from the polarization vision systems observed in animals, this study presents the High-Resolution Intensity & Polarization Fusion (HIPF) Net, a high-efficiency cross-modal fusion network that leverages trichromatic intensity and linear orthogonal polarization cues to produce a scene representation rich in texture and edge details. Specifically, the Early Adaptive Stokes Fusion (EASF) module maximizes the use of information from linear orthogonal polarization images. Subsequently, the Mix-Attention Feature Interaction Module (MAI) facilitates complementary interaction among low-level features. Additionally, the Attentional Receptive Field Block (ARFB) enables the model to uncover concealed cues effectively, capturing objects of various sizes. Finally, the Weighted Cross-Level Decoder (WCFD) dynamically fuses and assigns weights to cross-level contextual information for robust detection. The model is trained and extensively validated on a polarization-based dataset as well as non-polarization-based datasets, with experimental results demonstrating that HIPFNet consistently outperforms state-of-the-art methods. Source code is available at https://***/CVhfut/HIPFNet.
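As background for the "linear orthogonal polarization" cues mentioned in this abstract, the sketch below computes the first two Stokes parameters from a pair of images taken behind 0° and 90° linear polarizers, using the standard definitions S0 = I0 + I90 and S1 = I0 - I90. It is only an illustration of that textbook relationship: the abstract does not describe how the EASF module is built, and the function name `stokes_from_orthogonal`, the `eps` guard, and the normalized contrast output are assumptions made for this example.

```python
import numpy as np

def stokes_from_orthogonal(i0, i90, eps=1e-8):
    """Return (S0, S1, S1/S0) from 0-degree and 90-degree polarizer images.

    Hypothetical helper for illustration; not taken from the HIPFNet code.
    """
    i0 = np.asarray(i0, dtype=np.float64)
    i90 = np.asarray(i90, dtype=np.float64)
    s0 = i0 + i90                 # total intensity
    s1 = i0 - i90                 # horizontal-vs-vertical polarization difference
    s1_norm = s1 / (s0 + eps)     # normalized contrast in [-1, 1]; eps avoids 0/0
    return s0, s1, s1_norm

# Example: the first pixel is fully polarized along 0 degrees,
# the second pixel is unpolarized along this axis pair.
i0 = np.array([[1.0, 0.5]])
i90 = np.array([[0.0, 0.5]])
s0, s1, s1_norm = stokes_from_orthogonal(i0, i90)
```

With only two orthogonal channels, S2 (the 45°/135° difference) is not available, so the full degree of linear polarization cannot be recovered from this pair alone.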
Camouflaged object detection (COD) aims to detect objects that blend in with their surroundings and is a challenging task in computer vision. High-level semantic information and low-level spatial information play important roles in localizing camouflaged objects and reinforcing spatial cues. However, current COD methods directly connect high-level features with low-level features, ignoring the distinct importance of each. In this paper, we design a Semantic-spatial guided Context Propagation Network (SCPNet) to efficiently mine semantic and spatial features while enhancing their representations. First, we design a twin positioning module (TPM) that explores semantic cues to accurately locate camouflaged objects. Next, we introduce a spatial awareness module (SAM) to thoroughly mine spatial cues in shallow features. Finally, we develop a context propagation module (CPM) to assign semantic and spatial cues to multi-level features and enhance their representations. Experimental results show that SCPNet outperforms state-of-the-art methods on three challenging datasets. Code will be made available at https://***/RJC0608/SCPNet.
Camouflaged object detection (COD) aims to accurately detect objects concealed within the surrounding environment and plays a crucial role in various vision applications. Existing COD methods frequently yield unsatisfactory results when confronted with large objects and multiple objects within complex scenes, primarily because of their limited capability to capture and analyze both the holistic scene context and the fine-grained details within object regions. In this paper, we propose a camouflaged object detection method called G2LNet, which aims to achieve accurate camouflaged object detection through global-to-local information communication. The method includes three key components: a Global Awareness Module (GAM), a Local Adaptive Module (LAM), and an Information Communication Module (ICM). First, the GAM generates rough predictions by capturing global contextual information. The LAM then complements local textural detail with boundary cues. Finally, the ICM transmits information from global to local scales, ensuring effective communication and collaboration among the modules. With these components integrated, G2LNet accurately segments camouflaged objects in challenging scenes. Extensive results show that G2LNet significantly outperforms 19 state-of-the-art methods on four widely used benchmark datasets. Our method also achieves excellent performance in a variety of COD-related visual tasks, including polyp segmentation and salient object detection. Our code is available at https://***/wangnayi/G2LNet.
Camouflaged objects are typically assimilated into their backgrounds and exhibit fuzzy boundaries. The complex environmental conditions and the high intrinsic similarity between camouflaged targets and their surroundings pose significant challenges in accurately locating and segmenting these objects in their entirety. While existing methods have demonstrated remarkable performance in various real-world scenarios, they still face limitations in difficult cases, such as small targets, thin structures, and indistinct boundaries. Drawing inspiration from human visual perception when observing images containing camouflaged objects, we propose a three-stage model that enables coarse-to-fine segmentation in a single iteration. Specifically, our model employs three decoders to sequentially process subsampled features, cropped features, and high-resolution original features. This approach not only reduces computational overhead but also mitigates interference caused by background noise. Furthermore, considering the significance of multi-scale information, we design a multi-scale feature enhancement module that enlarges the receptive field while preserving detailed structural cues. Additionally, a boundary enhancement module is developed to improve performance by leveraging boundary information. Subsequently, a mask-guided fusion module is proposed to generate fine-grained results by integrating coarse prediction maps with high-resolution feature maps. Our network shows superior performance without introducing unnecessary complexity. Upon acceptance of the paper, the source code will be made publicly available at https://***/clelouch/TSNet.
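The step from "subsampled features" to "cropped features" in this abstract suggests that a coarse prediction determines where the next decoder looks. The sketch below shows one plausible way to derive such a crop window: threshold the coarse map, take the bounding box of the foreground, and expand it by a margin. The abstract does not specify how the crop is chosen, so the function name `crop_box_from_coarse`, the threshold, and the margin rule are all assumptions made for illustration.

```python
import numpy as np

def crop_box_from_coarse(pred, thresh=0.5, margin=0.1):
    """Return (y0, x0, y1, x1), a half-open crop box around the coarse foreground.

    Hypothetical helper: `thresh` and `margin` are illustrative values only.
    Falls back to the full image when no pixel reaches the threshold.
    """
    pred = np.asarray(pred)
    h, w = pred.shape
    ys, xs = np.nonzero(pred >= thresh)
    if ys.size == 0:
        return 0, 0, h, w
    y0, y1 = int(ys.min()), int(ys.max()) + 1
    x0, x1 = int(xs.min()), int(xs.max()) + 1
    # Expand the box by a fraction of its own size, clamped to the image.
    pad_y = int(round(margin * (y1 - y0)))
    pad_x = int(round(margin * (x1 - x0)))
    return max(0, y0 - pad_y), max(0, x0 - pad_x), min(h, y1 + pad_y), min(w, x1 + pad_x)

# Example: a 2x4 foreground blob inside a 10x10 coarse prediction map.
coarse = np.zeros((10, 10))
coarse[4:6, 3:7] = 1.0
box = crop_box_from_coarse(coarse, margin=0.5)   # (3, 1, 7, 9)
```

Cropping before the second decoder is one way the described pipeline could both cut computation and exclude distant background; a learned or multi-object crop policy would need a different rule.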
Camouflaged object detection (COD) aims to segment camouflaged objects that exhibit patterns very similar to the surrounding environment. Recent research has shown that enhancing the feature representation with frequency information can greatly alleviate the ambiguity between foreground objects and the background. With the emergence of vision foundation models such as InternImage and the Segment Anything Model, adapting a pretrained model to COD tasks with a lightweight adapter module has become a novel and promising research direction. Existing adapter modules mainly focus on feature adaptation in the spatial domain. In this paper, we propose a novel frequency-guided spatial adaptation method for the COD task. Specifically, we transform the input features of the adapter into the frequency domain. By grouping and interacting with frequency components located within non-overlapping circles in the spectrogram, different frequency components are dynamically enhanced or weakened, so that the intensity of image details and contour features is adaptively adjusted. At the same time, the features that help distinguish object from background are highlighted, indirectly indicating the position and shape of the camouflaged object. We conduct extensive experiments on four widely adopted benchmark datasets, and the proposed method outperforms 26 state-of-the-art methods by large margins. Code will be released.
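The frequency grouping described in this abstract can be illustrated with a small NumPy sketch: transform a 2-D feature map with the FFT, partition the centered spectrum into non-overlapping concentric bands by radial frequency, scale each band by its own gain, and transform back. This is a minimal sketch under stated assumptions, not the authors' adapter: the equal-width bands, the function names `ring_masks` and `reweight_frequencies`, and the hand-picked gains (standing in for values that would presumably be learned or predicted) are all hypothetical.

```python
import numpy as np

def ring_masks(h, w, num_rings):
    """Partition an fftshift-ed (h, w) spectrum into concentric, disjoint bands."""
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    # Equal-width radial bins; the outermost bin absorbs the maximum radius.
    idx = np.minimum((radius / radius.max() * num_rings).astype(int), num_rings - 1)
    return [idx == k for k in range(num_rings)]

def reweight_frequencies(feature, weights):
    """Scale each frequency band of a real 2-D map by the matching weight."""
    h, w = feature.shape
    spectrum = np.fft.fftshift(np.fft.fft2(feature))
    gain = np.zeros((h, w))
    for mask, wgt in zip(ring_masks(h, w, len(weights)), weights):
        gain[mask] = wgt
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * gain))
    # The gain depends only on |frequency|, so the result is real up to rounding.
    return filtered.real

# Example: keep low frequencies, damp the middle band, boost fine detail.
feature = np.arange(48, dtype=float).reshape(6, 8)
adjusted = reweight_frequencies(feature, weights=[1.0, 0.5, 1.5])
```

Setting every weight to 1 reproduces the input, which makes the per-band gains easy to interpret as enhancement (above 1) or suppression (below 1) of a given level of detail.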
With the continuous development of camouflaged object detection (COD), effectively separating objects that are highly similar to the background has become a focal point of research. Because camouflaged objects closely resemble their backgrounds, traditional models with a single visual branch often perform poorly in such scenarios. To address this issue, we propose a multi-view learning detection network based on the Pyramid Vision Transformer, named Multi-view Learning for camouflaged object detection with PVTv2 (MVLNet). By using information from both RGB and noise views, our method provides a more comprehensive description of the relationship between objects and backgrounds, improving the accuracy and robustness of COD. Inspired by human visual attention during observation, we design a Global Context Aggregation Module that uses a U-shaped structure and progressively increasing dilation rates to simulate the human behavior of zooming in and out. Extensive experiments demonstrate that the proposed MVLNet outperforms 23 other representative models on three public datasets.
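To make the effect of "progressively increasing dilation rates" concrete, the sketch below applies the standard receptive-field arithmetic for stride-1 dilated convolutions: a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, and each stacked layer adds (k - 1)·d to the receptive field. The abstract does not state the kernel size or the rates used in MVLNet, so the 3-wide kernel and the rates 1, 2, 4, 8 below are hypothetical values chosen for illustration.

```python
def effective_kernel(kernel_size, dilation):
    """Span, in input positions, of one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Hypothetical schedule: 3-wide kernels with rates 1, 2, 4, 8.
rates = [1, 2, 4, 8]
spans = [effective_kernel(3, d) for d in rates]   # [3, 5, 9, 17]
total = receptive_field(3, rates)                 # 31
```

Under these assumed values, four layers cover 31 positions where four undilated 3-wide layers would cover 9, which is the sense in which growing rates imitate "zooming out" without adding parameters.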
Camouflaged object detection (COD) aims to segment camouflaged objects hidden within their environment. Existing COD models, aside from image features, mostly focus on a single coarse-grained spatial structure, such as depth, texture, or edge information. However, in complex scenes where the target and background textures are similar and overlapping, or under noise interference, this design often leads to insufficient detection accuracy and robustness. To address these issues, we propose a strategy of multiple spatial explorations and design the Spatial Bi-Exploration Network (SPNet). SPNet conducts a comprehensive analysis of complex camouflage scenarios by jointly exploring depth spatial, contour spatial, and image feature information, thereby enhancing detection performance and maintaining robustness. Unlike existing methods, SPNet leverages dual exploration of depth and contour spaces to mitigate the vulnerability of coarse structures to noise. Depth spatial information helps the model recognize the deep relationships between objects and the background, reducing the impact of noise on object boundaries, while contour spatial information improves edge detection accuracy. This dual approach significantly enhances robustness, especially against adversarial attacks. Extensive experiments on benchmark datasets demonstrate that our model not only outperforms existing methods in detection performance but also exhibits superior robustness against adversarial attacks.