ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Camouflaged objects closely resemble their surroundings, which makes them hard for both humans and machines to detect when concealed within the environment. Existing methods for camouflaged object detection (COD) struggle to accurately segment the overall structure of camouflaged objects. To address this issue, we propose a novel boundary-guided fusion of multi-level features network (BGFM-Net) for COD. In contrast to existing boundary-guided methods, we pay more attention to the significant imbalance in pixel counts between boundary and background features, allowing for a more comprehensive representation of boundary features. BGFM-Net primarily consists of a multi-scale aggregation module (MSAM), a boundary-guided feature module (BFM), and a cross-level fusion module (CLFM). MSAM integrates contextual semantics at different scales, achieving a powerful and efficient feature representation. BFM combines edge features while constraining interference from background features, guiding the learning of camouflaged object boundary representations. CLFM integrates multi-level features for predicting camouflaged objects while adaptively adjusting channel weights to emphasize important channels and diminish the impact of channels less relevant to the task. Extensive experiments on three benchmark camouflage datasets demonstrate that our BGFM-Net outperforms other state-of-the-art COD models.
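The abstract does not say how BGFM-Net compensates for the boundary/background pixel imbalance. As a generic illustration only, one common way to handle such imbalance is a class-balanced binary cross-entropy, sketched below. The function name `balanced_bce` and the weighting scheme are assumptions for illustration, not the authors' method.

```python
import math

def balanced_bce(pred, target, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    pred:   predicted boundary probabilities in (0, 1)
    target: ground-truth labels, 1 for boundary and 0 for background
    Hypothetical helper: a generic technique, not taken from the paper.
    """
    n = len(target)
    n_pos = sum(target)
    n_neg = n - n_pos
    # Weight each class by the share of the other class,
    # so the rare boundary pixels count more per pixel.
    w_pos = n_neg / n
    w_neg = n_pos / n
    loss = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if t == 1:
            loss -= w_pos * math.log(p)
        else:
            loss -= w_neg * math.log(1.0 - p)
    return loss / n

# One boundary pixel among nine background pixels.
target = [1] + [0] * 9
misses_boundary = [0.1] * 10          # predicts background everywhere
finds_boundary = [0.9] + [0.1] * 9    # predicts the boundary pixel correctly
```

With this weighting, a prediction that misses the single boundary pixel is penalized far more than one that finds it, even though both agree on the nine background pixels.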
Memristors are frequently used to construct synapses, and the memristor bridge synapse is a typical example of such a synapse. Unlike the traditional memristor bridge synapse, the forgetting memristor bridge synapse can expr...
Image inpainting is a technique that uses information from the known regions of an image to repair its lost or damaged regions. Feature extraction is the core of image inpainting. Sufficient spatial information and a large receptive field are both very important for high-precision image inpainting. However, during feature extraction it is difficult to meet both requirements at the same time. To obtain more spatial information and a larger receptive field simultaneously, we propose an image inpainting method based on a spatial path and a context path. In the spatial path, we stack three convolution layers to produce a feature map at 1/8 of the input size, which retains rich spatial information. In the context path, we use a global average pooling layer, whose receptive field is the largest in the backbone network, so that the pooling module provides global context information with the maximum receptive field. To better integrate the features extracted from the spatial and context paths, we study a fusion module for the two paths. The fusion module first concatenates the outputs of the spatial path and the context path, then applies batch normalization to balance the scales of the features, and finally pools the concatenated features into a feature vector and computes a weight vector. To extract context information from images, we add attention refinement modules to the context path. These attention modules weight the image features along the channel dimension and the spatial dimension, respectively, to obtain more effective features. Experiments show that our method outperforms existing techniques both qualitatively and quantitatively, and we further extend our network to other inpainting networks to achieve consistent performance improvements.
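The fusion steps described in this abstract (combine the two path outputs, normalize to balance feature scales, pool to a feature vector, compute a weight vector) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: each feature map is a list of channels with the spatial positions flattened, per-channel standardization stands in for learned batch normalization, and the ReLU, the sigmoid and the final reweighting are illustrative choices the abstract does not specify. All function names are hypothetical.

```python
import math

def normalize_channel(channel, eps=1e-5):
    """Standardize one channel to zero mean and unit variance.
    Assumption: a parameter-free stand-in for batch normalization."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return [(v - mean) / math.sqrt(var + eps) for v in channel]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fuse(spatial_feats, context_feats):
    """Fuse two feature maps given as lists of flattened channels."""
    # 1. Concatenate the two path outputs along the channel axis.
    fused = spatial_feats + context_feats
    # 2. Normalize each channel so both paths share a common scale,
    #    then apply ReLU (assumed non-linearity).
    fused = [[max(v, 0.0) for v in normalize_channel(c)] for c in fused]
    # 3. Global average pooling: one scalar per channel (the feature vector).
    pooled = [sum(c) / len(c) for c in fused]
    # 4. Weight vector: one weight in (0, 1) per channel (assumed sigmoid).
    weights = [sigmoid(p) for p in pooled]
    # 5. Reweight the channels (assumed use of the weight vector).
    out = [[w * v for v in c] for w, c in zip(weights, fused)]
    return out, weights

# Two single-channel inputs whose raw scales differ by a factor of 100.
spatial = [[1.0, 2.0, 3.0, 4.0]]
context = [[100.0, 200.0, 300.0, 400.0]]
fused_out, fused_weights = fuse(spatial, context)
```

Because normalization removes the scale difference, the two channels in this example receive nearly identical weights, which is the balancing effect the abstract attributes to the normalization step.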
Recently, multi-label deep cross-modal hashing (MDCH), which incorporates deep neural networks, hashing and multi-label learning for cross-modal retrieval tasks, has achieved excellent cross-modal retrieval results an...
Exploiting the information shared among tasks to significantly improve sparse reconstruction performance is the essence of multi-task compressive sensing. In this paper, a novel generative model of multi-task co...
Currently, deep learning-based speech enhancement methods generally focus on target speech extraction while neglecting to model the other sound sources in the mixture. As a result, these methods still cannot reliably distinguish the target speech from the interference. In this paper, we present a monaural speech enhancement network via Modeling the Noise (MN-Net), which includes a shared Encoder and three separate Decoders that model in parallel the magnitude spectrogram and phase spectrogram of the target speech and the complex spectrogram of the noise. Specifically, we propose a Multi-Branch Feature Extractor (MBFE) module to capture richer contextual information in the mixture, and a Spatial Reconstruction Unit (SRU) to remove redundancy from the extracted features. We compared our proposed MN-Net with 18 classical speech enhancement methods on the VoiceBank+DEMAND dataset, with 9 on the DNS-Challenge dataset for the denoising task, and with 7 on the WHAMR! dataset for the simultaneous denoising and de-reverberation task. Our proposed MBFE module was also applied to two classical speech enhancement methods, DB-AIAT and CMGAN, replacing their DenseBlocks module. The results demonstrate that applying the MBFE module can boost their performance while keeping a smaller model size. A series of visualization analyses intuitively verifies that modeling the noise enables the network to distinguish the target speech from noise and other interference more accurately.
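To make the three decoder outputs concrete, the sketch below shows how a magnitude estimate and a phase estimate combine into a complex speech spectrogram, and how a complex noise estimate can be checked against the mixture. It assumes a purely additive model (mixture = speech + noise) in the time-frequency domain, which does not cover reverberation, and the function names are hypothetical; it is not the authors' implementation.

```python
import cmath
import math

def combine(magnitude, phase):
    """Build complex time-frequency bins from magnitude and phase estimates."""
    return [cmath.rect(m, p) for m, p in zip(magnitude, phase)]

def additive_residual(mixture, speech, noise):
    """Per-bin error of the assumed additive model: |Y - (S + N)|."""
    return [abs(y - (s + n)) for y, s, n in zip(mixture, speech, noise)]

# Toy example with two time-frequency bins.
speech = combine([1.0, 2.0], [0.0, math.pi / 2])   # magnitude and phase branches
noise = [0.5 + 0.5j, -0.25j]                       # complex noise branch
mixture = [s + n for s, n in zip(speech, noise)]   # additive mixture (assumption)
residual = additive_residual(mixture, speech, noise)
```

When the speech and noise estimates are both exact, the residual is zero in every bin, so the noise branch acts as a consistency check on the speech branches under this assumed model.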
Gaze estimation is pivotal in human scene comprehension tasks, particularly in medical diagnostic analysis. Eye-tracking technology facilitates the recording of physicians’ ocular movements during image interpretatio...
Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data and has been extensively studied and used in a variety of practical tasks. However, m...
The metaverse, constructed through digital technology, serves as a virtual realm intertwining with reality. Within this context, the challenge of evaluating data from diverse sources arises, and the application of lar...
Neural Radiance Fields (NeRF) have been gaining attention as a significant form of 3D content representation. With the proliferation of NeRF-based creations, the need for copyright protection has emerged as a critical...