Search results - Inner Mongolia University Library

9th International Conference on Signal and Image Processing (ICSIP)

Authors: Xu, Changhan; Li, Xuemei; Wu, Qinming (Chengdu Univ Technol, Chengdu, Peoples R China)

ISBN: (print) 9798350350920

Current deep learning-based target detection algorithms suffer from a limited receptive field, poor adaptation to scale changes, feature mismatch during feature fusion, and small datasets. To address these problems in infrared target detection, this paper proposes a global infrared image detection method based on a graph convolutional neural network. A global feature interaction module and a feature pyramid module are designed. A graph-based knowledge distillation method for model compression is also proposed to support hardware deployment. The proposed algorithm is evaluated on a classical infrared small target dataset and compared with mainstream infrared small target detection algorithms; the comparison verifies that it improves performance on infrared small targets. Ablation experiments are designed to verify the performance of the individual modules and the fusion modules[2], demonstrating the effectiveness of each module. Finally, visualization analysis supports subjective evaluation by the human eye and further demonstrates the strength of the proposed algorithm.

Keywords: Graph Convolutional Neural Networks; Global Interaction; Knowledge Distillation; Infrared Small Objects
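The abstract above mentions knowledge distillation for model compression but gives no formula. As a rough illustration only, the sketch below shows the generic temperature-softened distillation loss (KL divergence between teacher and student class distributions). It is not the paper's graph-based variant; the function names and the temperature value are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss; a hypothetical helper, not the paper's method.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    # A student that matches the teacher incurs no loss.
    print(distillation_loss([2.0, 0.5, -1.0], teacher))
    # A student with uniform logits is penalized.
    print(distillation_loss([0.0, 0.0, 0.0], teacher))
```

A higher temperature flattens both distributions, so the student is also pushed to match the teacher's relative ranking of unlikely classes.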

FBI-Net: Frequency Band Integration Network for Infrared Small Target Segmentation

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Xin, Biqiao; Li, Qiang; Mao, Qianchen; Wang, Jinbao; Wang, Bingshu (School of Software, Northwestern Polytechnical University, Xi'an, China; National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China)

ISBN: (print) 9798350368741

Small targets in infrared imagery are challenging to segment because they carry minimal semantic information and the distribution between targets and background is extremely imbalanced. In this paper, we propose a frequency band integration network to extract salient features of infrared small targets in both the spatial and frequency domains. To extract the high-frequency features of the small targets, we propose a frequency decoupling-fusion module. To reduce the semantic loss that occurs in deep networks, we propose a semantic injection mechanism that helps retain critical information from shallow layers. Experimental results show that our proposed method reaches higher prediction accuracy and robustness in the infrared small target segmentation task than other state-of-the-art approaches. © 2025 IEEE.

Keywords: Feature Fusion; Infrared Small Target Segmentation; Semantic Injection; Wavelet Transformation

IR-MPE: A Long-Term Optical Flow-Based Motion Pattern Extractor for Infrared Small Dim Targets

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2025, vol. 74

Authors: Liu, Xuyang; Zhu, Wenming; Yan, Pei; Tan, Yihua (Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Multispectral Informat Intelligent Pr, Wuhan 430074, Peoples R China)

In complex scenarios, temporal motion information can improve the detection performance for infrared small and dim targets. However, existing multiframe methods only consider short-term motion information at each moment, which makes it difficult to capture reliable motion information for small and dim targets. In addition, existing multiframe data-driven methods generally use complex network structures that have longer inference times than single-frame models for infrared target detection. This limits the applicability of the existing multiframe methods. In this article, we propose a long-term optical flow (OF)-based infrared small target motion pattern extractor (IR-MPE) to generate long-term OF energy (OFE) maps, which reflect the motion patterns of targets at the current moment. First, we design a long-term OF adaptive accumulation module (OFAAM) to adaptively control the update of current motion information and the retention of previous motion information. Second, we design an offset correction module (OCM), which is embedded in the OFAAM to rectify the OFE from the previous frame. Meanwhile, the OCM also corrects the output of the previous frame to assist in the detection of the current frame. Embedding our IR-MPE module into existing single-frame methods easily extends them into multiframe methods. The only modification is adding an extra input channel to the first layer, whose input is set to the concatenation of our OFE and the original infrared image. Such a simple structure can significantly improve detection accuracy while maintaining fast inference speed. Extensive experiments on various public datasets show that our approach outperforms the state-of-the-art methods in challenging scenarios.

Keywords: Data mining; Optical flow; Object detection; Feature extraction; Deep learning; Fuses; Detectors; Dams; Accuracy; Signal to noise ratio; Infrared small dim target detection; long-term optical flow (OF); motion pattern extractor
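The abstract above describes accumulating optical flow energy over time and feeding it to a single-frame detector as an extra input channel. The sketch below is one possible reading of that idea, under strong simplifications: the blending weight `retain` is fixed here, whereas the abstract describes an adaptive update, and the offset correction step is omitted. All names are hypothetical.

```python
def accumulate_energy(prev_energy, flow_magnitude, retain=0.8):
    """Blend the previous energy map with the current flow magnitude.

    `retain` is a fixed weight chosen for illustration only; the adaptive
    control described in the abstract is not reproduced here.
    """
    return [
        [retain * e + (1.0 - retain) * m for e, m in zip(e_row, m_row)]
        for e_row, m_row in zip(prev_energy, flow_magnitude)
    ]


def stack_channels(image, energy):
    """Return a 2-channel input [image, energy] in channel-first layout."""
    return [image, energy]


if __name__ == "__main__":
    energy = [[0.0, 0.0]]
    # The left pixel moves in every frame; the right pixel never moves.
    for _ in range(3):
        energy = accumulate_energy(energy, [[1.0, 0.0]])
    network_input = stack_channels([[5.0, 6.0]], energy)
    print(network_input)
```

A pixel that keeps moving builds up energy over several frames, while a static pixel stays at zero, which is the kind of long-term cue a single frame of optical flow cannot provide.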

LEARNING CONTEXTUALIZED REPRESENTATION ON DISCRETE SPACE VIA HIERARCHICAL PRODUCT QUANTIZATION

49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Kim, Hyung Yong; Kim, Byeong-Yeol; Lim, Yunkyu; Park, Jihwan; Park, Jinseok; Lim, Youshin; Yu, Seung Woo; Lee, Hanbin (42Dot Inc, Seoul, South Korea)

ISBN: (print) 9798350344868; 9798350344851

Self-supervised learning has recently demonstrated significant success in various speech processing applications. Recent studies report that pre-training with contextualized continuous targets plays a crucial role in fine-tuning for better performance on downstream speech tasks. However, unlike continuous targets, contextualized targets are challenging to produce on discrete space because training is unstable. To address this issue, we introduce a new hierarchical product quantizer that enables the full utilization of multi-layer features by reducing the number of possible cases of quantized targets and by preventing mode collapse through a diversity loss applied to all codebooks. Our ablation study confirms the effectiveness of the proposed quantizer and contextualized discrete targets. For supervised ASR, the proposed model outperforms wav2vec2 and shows results comparable to data2vec. In addition, for unsupervised ASR, the proposed method surpasses two baselines.

Keywords: self-supervised learning; automatic speech recognition; contextualized targets; contrastive loss
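For readers unfamiliar with the terms in the abstract above, the following sketch shows plain (non-hierarchical) product quantization plus an entropy measure of codebook usage, the kind of quantity a diversity loss could encourage. It is a generic illustration with hand-written codebooks, not the quantizer proposed in the paper; every name in it is an assumption.

```python
import math


def nearest_codeword(sub_vector, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sq_dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(sub_vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def product_quantize(vector, codebooks):
    """Split `vector` into equal parts and quantize each with its own codebook."""
    part = len(vector) // len(codebooks)
    return [
        nearest_codeword(vector[k * part:(k + 1) * part], codebook)
        for k, codebook in enumerate(codebooks)
    ]


def usage_entropy(indices, codebook_size):
    """Entropy (in nats) of codeword usage; higher means more diverse usage."""
    counts = [0] * codebook_size
    for i in indices:
        counts[i] += 1
    total = len(indices)
    return -sum((c / total) * math.log(c / total) for c in counts if c)


if __name__ == "__main__":
    codebooks = [
        [[0.0, 0.0], [1.0, 1.0]],  # codebook for the first half
        [[0.0, 0.0], [5.0, 5.0]],  # codebook for the second half
    ]
    print(product_quantize([0.9, 1.2, 4.0, 4.5], codebooks))
    print(usage_entropy([0, 1, 0, 1], codebook_size=2))
```

If every input maps to the same codeword, the usage entropy drops to zero, which is the mode collapse that the abstract says the diversity loss is meant to prevent.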

A Small Target Detection Method in Sea Clutter Based on Feature Manifold

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Guan, Jian; Jiang, Xingyu; Chen, Baoxin; Ding, Hao; Dong, Yunlong; Liu, Ningbo (Naval Aviation University, Information Fusion Institute, Yantai, China)

ISBN: (print) 9798331515669

Target detection methods based on multidimensional features are often applied to the detection of small targets in sea clutter. However, existing algorithms do not fully utilize the correlation information between features, which to some extent limits their performance. To address this issue, this paper proposes a small target detection method in sea clutter using feature manifolds. By establishing a high-dimensional manifold representation of radar echo feature data, the radar target detection problem is transformed into a discrimination problem between two geometric points on a feature matrix manifold. This method integrates the use of multidimensional features and their correlation information, thereby effectively improving target detection performance. Experimental results on the SDRDSP dataset demonstrate that the proposed detection approach outperforms existing feature-based methods in terms of detection accuracy. © 2024 IEEE.

Keywords: Clutter (information theory)

Learning DCT Subband using Kolmogorov-Arnold Network for Infrared Small Target Detection

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Zhang, Zekai; Zhao, Yingrui; Zhou, Shichao; Yang, Zihui (Beijing Information Science and Technology University, School of Information and Communication Engineering, Beijing, China)

ISBN: (print) 9798331515669

Deep learning-based infrared small target detection (IRSTD) methods typically exploit spatial domain cues to infer dim and weak infrared targets. However, relying solely on spatial domain information is sub-optimal because the target lacks structure and texture details. As an alternative, we learn frequency priors to enhance targets in the frequency domain, and propose a Frequency Domain Enhanced Network (FDENet). It contains a frequency domain discrete cosine transform enhancement (FDE) module with a KAN-inspired DCT learning mechanism. Specifically, the FDE module uses multiple univariate nonlinear functions to combine continuous multivariate functions and directly map frequency-aware cues without spatial features. It then aligns the frequency-aware cues with the spatial features to capture dim and weak targets. Generalization experiments and comparative studies on two real-world infrared single-frame image datasets demonstrate the effectiveness of our method. © 2024 IEEE.

Keywords: Frequency domain analysis
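To make the frequency-domain idea in the abstract above more concrete, the sketch below computes an orthonormal 2-D DCT-II of a small patch and splits the coefficient energy into a low and a high band. It illustrates only the standard transform; the learned, KAN-inspired part of the FDE module is not modeled, and the band split rule (`u + v < cutoff`) is an assumption made for this example.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(signal)
        )
        out.append(scale * total)
    return out


def dct_2d(patch):
    """Separable 2-D DCT-II: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in patch]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def band_energy(coeffs, cutoff):
    """Split squared coefficient energy into low (u + v < cutoff) and high bands."""
    low = high = 0.0
    for u, row in enumerate(coeffs):
        for v, c in enumerate(row):
            if u + v < cutoff:
                low += c * c
            else:
                high += c * c
    return low, high


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]           # smooth background
    spot = [[0.0] * 4 for _ in range(4)]
    spot[1][2] = 1.0                               # single bright pixel
    print(band_energy(dct_2d(flat), cutoff=1))
    print(band_energy(dct_2d(spot), cutoff=1))
```

A flat background puts all of its energy in the DC coefficient, whereas a single bright pixel spreads most of its energy across higher-frequency coefficients, which is one reason frequency cues can help separate small targets from smooth backgrounds.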

DCUNet: Deformable Convolutional UNet for Multi-frame Infrared Small Target Super-resolution

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Huang, Yuanxin; Zhi, Xiyang; Chen, Wenbin; Liang, Xiaoyang; Wang, Zhipeng; Zhang, Wei (Harbin Institute of Technology, Research Center for Space Optical Engineering, Harbin, China; China Academy of Space Technology, Institute of Remote Sensing Satellite, Beijing, China)

ISBN: (print) 9798331515669

The objective of infrared multi-frame super-resolution for small targets is to enhance the target's resolution by leveraging complementary information from multiple frames. However, the presence of motion variations and scale changes in infrared small targets introduces discontinuities in information between frames, posing challenges for super-resolution. To address the above challenges, we propose a multi-frame infrared small target super-resolution network based on deformable UNet. Specifically, the network takes multi-frame sequential infrared images as input and divides the input into reference frames and current frames. It utilizes deformable convolutions to align the frames to enhance the target features of the current frame. Additionally, the encoding structure of the UNet network is employed to extract multiscale features from the sequential images. By leveraging skip connections and a decoding structure, the network achieves the enhancement of target detail information and the fusion of multiscale features, ultimately outputting the super-resolved image. We conducted experiments on the IRDST and NUDT-MIRSDT datasets, and the results validate the practicality of our designed network. © 2024 IEEE.

Keywords: Optical resolving power

FANM: Fuzzy-Aware Nested Mamba for Infrared Dense Small Target Detection

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Sui, Yi; Zhi, Xiyang; Huang, Yuanxin; Liang, Xiaoyang; Lu, Zheng; Zhang, Wei (Harbin Institute of Technology, Research Center for Space Optical Engineering, Harbin, China; China Academy of Space Technology, Institute of Remote Sensing Satellite, Beijing, China)

ISBN: (print) 9798331515669

The task of infrared dense small target detection aims to accurately locate densely distributed thermal radiation targets in complex scenes. However, in complex scenes, dense small targets often suffer from occlusion, shadows, and blurriness, leading to unclear detection bounding boxes and inaccurate localization. To address the aforementioned issues, this paper introduces an innovative approach for infrared dense small target detection, termed the Fuzzy-Aware Nested Mamba (FANM) network. Specifically, the network takes a single-frame infrared image as its input and leverages shallow, middle, and deep Mamba networks to fully exploit and integrate local and global multi-scale features. The extracted features are subsequently passed to a feature pyramid network (FPN) to balance the spatial resolution and semantic information of each feature map layer, ensuring that feature maps of different scales can be effectively utilized. Then, the output features at each level are fed into the detection head, which consists of two branches: one for regressing the positions of arbitrarily distributed bounding boxes, and the other for predicting the quality of the bounding boxes. The proposed fuzzy perceptual loss function is used to optimize these branches, ultimately outputting images with accurately detected target bounding boxes. Experiments were performed on the DMIST dataset, which includes images with varying numbers of targets, and the results validate the effectiveness of our method. © 2024 IEEE.

Keywords: Infrared imaging

Small object detection in UAV imagery based on channel-spatial fusion cross attention

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, vol. 19, no. 4, pp. 1-14

Authors: Li, Jianlong; Zheng, Chunhou; Chen, Peng; Zhang, Jun; Wang, Bing (Anhui Univ Inst Phys Sci & Informat Technol Natl Engn Res Ctr Agroecol Big Data Anal & Applica Sch InternetInformat Mat & Intelligent Sensing La, Hefei 230601, Peoples R China; Anhui Univ Finance & Econ, Sch Management Sci & Engn, Bengbu 233030, Peoples R China; Anhui Rocvis Intelligent Technol Co Ltd, Hefei 230001, Peoples R China)

Object detection in unmanned aerial vehicle (UAV) images has become an important research area in computer vision due to its unique value and challenges. UAV images are characterized by densely distributed small targets, significant changes in target scale, and background noise, all of which affect the accuracy and reliability of detection. To address these issues, we propose a small target detection network based on Enhanced Scale Sequence Fusion and a channel space fusion cross-attention mechanism, called *** tackle the high proportion of small targets and scale variation in UAV images, we employ Enhanced Scale Sequence Fusion, integrating fine-grained information from shallow feature maps and semantic information from deep feature maps. Additionally, we incorporate a tiny target detection head to enhance the network's ability to extract fine-grained features for small targets. To address the issue of background noise, we propose a channel space fusion cross-attention mechanism, which first performs attention calculation on local patch block feature maps and then performs attention calculation on global patch blocks. This captures both long-range dependencies and detailed information. The method for calculating attention combines spatial description information and channel description *** experiments were conducted to validate the effectiveness of the model on the VisDrone benchmark dataset, the UAVDT dataset, and our self-made UAV power inspection dataset PIDrone. Compared with the YOLOv8s model, CSFCANet improves mAP by 7% on PIDrone, 2.4% on VisDrone, and 3.6% on UAVDT.

Keywords: Small object detection; Attention mechanism; Multi-scale feature fusion; UAV Aerial Imagery

Cross-YOLO: an object detection algorithm for UAV based on improved YOLOv8 model

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, vol. 19, no. 6, pp. 1-10

Authors: Dong, Ying; Guo, Jiahao; Xu, Fucheng (Dalian Minzu Univ, Coll Sci, 31 Jinshi Rd, Dalian 116000, Liaoning, Peoples R China; Dalian Minzu Univ, Sch Preuniv, 31 Jinshi Rd, Dalian 116000, Liaoning, Peoples R China)

In this study, Cross-YOLO, an enhanced version of the YOLOv8 model, is specifically designed to address the challenge of detecting small objects in UAV target detection scenarios. The model refines the original YOLOv8 through several improvements. First, to improve the detection accuracy of small targets, we propose Cross-FPN to strengthen the original FPN. Second, we redesign a lightweight detection head, DELDH, to solve the problem of network bloat caused by the introduction of small object detection heads. Third, a new attention mechanism, CMCA, is designed that unifies the Coordinate attention mechanism with the Multi-scale convolutional attention mechanism to further enhance the feature extraction of small targets. Finally, the WIoU loss function is introduced to improve the accuracy of bounding box regression and the overall detection performance. Experimental data on the VisDrone dataset indicate that, for the selected model size n, Cross-YOLO achieves a substantial reduction of 35.4% in parameter count compared to YOLOv8n, with only a marginal increase of 7.8% in computational load, and a significant improvement of 5.3% in $mAP_{0.5}$. Furthermore, its strong performance on the DOTA v1.5 and TinyPerson datasets confirms the model's generalization capabilities and practical applicability.

Keywords: YOLOv8; Small object; Object detection; UAV; Attention
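The abstract above names the WIoU loss for bounding box regression without defining it. The sketch below computes only plain intersection over union (IoU), the overlap measure that IoU-based losses build on; it is not WIoU. The box format `(x1, y1, x2, y2)` and the function name are assumptions made for this example.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes produce an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


if __name__ == "__main__":
    # The same 2-pixel localization error on a small and a large box.
    print(iou((0, 0, 4, 4), (2, 0, 6, 4)))      # 4x4 box
    print(iou((0, 0, 40, 40), (2, 0, 42, 40)))  # 40x40 box
```

The same 2-pixel shift costs a 4x4 box two thirds of its IoU but barely affects a 40x40 box, which illustrates why box regression quality matters so much for small objects.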
