Infrared small target detection (IRSTD) is the challenging task of identifying small targets with low signal-to-noise ratios against complex backgrounds. In such backgrounds, traditional methods produce large numbers of false alarms and missed detections. Although CNN-based methods have made progress in IRSTD, how to extract more effective information and fully exploit interlayer information remains an open problem. This article therefore proposes a dual-encoder multistage feature fusion network (DMFNet). Specifically, a dual encoder with different inputs captures more effective small-target feature information, and a receptive field expansion attention module (REAM) incorporates nonlocal contextual information. In the decoding phase, the triple cross-layer fusion module (TCFM) exchanges low-level spatial details and high-level semantic information so that more small-target information is preserved in deeper layers. Finally, multiscale features from various decoder layers are concatenated to generate more discriminative feature maps that clearly describe the infrared small targets. Experimental results on the NUDT-SIRST, NUAA-SIRST, and IRSTD-1k datasets show that DMFNet outperforms several other state-of-the-art methods. The code is available at https://***/BJZHOU2000/DMFNet.
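The final step of the abstract above, concatenating multiscale decoder features, can be illustrated with a minimal sketch. It is not the DMFNet implementation: the function names, the nearest-neighbour upsampling, and the integer scale factors are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
# Illustrative only: names, shapes and the upsampling rule are assumptions,
# not taken from the DMFNet code.

def upsample_nearest(fmap, factor):
    """Upsample one 2-D feature map (a list of rows) by an integer factor."""
    out = []
    for row in fmap:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def fuse_multiscale(features):
    """Bring every decoder stage to the finest resolution and stack channels.

    `features` is a list of stages; each stage is a list of 2-D channel maps.
    The first stage is assumed to be the finest, and each coarser stage is
    assumed to be smaller by an integer factor.
    """
    target_height = len(features[0][0])
    fused = []
    for stage in features:
        factor = target_height // len(stage[0])
        for channel in stage:
            fused.append(upsample_nearest(channel, factor))
    return fused

# One 4x4 channel from a shallow stage, two 2x2 channels from a deeper stage.
fine = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
coarse = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
fused = fuse_multiscale([fine, coarse])  # three 4x4 channels
```

In a real network the fused channels would typically pass through a further convolution before prediction; the abstract does not say how DMFNet handles that step.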
A knowledge graph is a tool for representing relationships between data. Because the graph is constructed from existing knowledge, incomplete knowledge results in an incomplete graph. Resolving this requires determining the existence and type of edges between nodes. This paper addresses the challenge by introducing a model that converts graph nodes to each other under an edge called a "relation", while simultaneously considering both the global and local structures of the graph. The model is bidirectional, with two distinct inputs: a head and a tail. Features are extracted along each path and combined with those from the opposite path; this combination is performed within a designated block known as a "mirror". The paths are kept separate so that all features can be extracted from each input. In addition, a layer based on Laplace computations of the graph, which reflect the geometric structure of the graph, is included to incorporate the graph direction as a feature. The model was evaluated using four parameters that represent its quality and accuracy, and the results show that it achieved approximately 95% accuracy on all four.
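If the "Laplace computations" mentioned above refer to a graph Laplacian (an assumption; the abstract does not define them), one simple way a Laplacian can carry edge direction is the out-degree form L = D_out - A of a directed graph. The sketch below uses a hypothetical helper name and plain Python lists, and is not the paper's layer.

```python
# Illustrative only: the out-degree Laplacian is one common choice for
# directed graphs, not necessarily the one used in the paper.

def out_degree_laplacian(num_nodes, edges):
    """Return L = D_out - A for a directed graph given as (head, tail) pairs."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for head, tail in edges:
        adj[head][tail] = 1
    # Start from -A, then add each node's out-degree on the diagonal.
    lap = [[-adj[i][j] for j in range(num_nodes)] for i in range(num_nodes)]
    for i in range(num_nodes):
        lap[i][i] += sum(adj[i])
    return lap

# Three entities with directed relations 0 -> 1, 1 -> 2 and 0 -> 2.
laplacian = out_degree_laplacian(3, [(0, 1), (1, 2), (0, 2)])
```

Because the adjacency matrix of a directed graph is not symmetric, neither is this Laplacian, so reversing an edge changes the matrix. That asymmetry is what lets direction act as a feature.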
ISBN: (print) 9798350372557
This work develops a novel deep learning encoder-decoder algorithm for real-time removal of noise and blur from video frames received from a UAV. It improves on existing algorithms by providing a more flexible blind deblurring solution than kernel-based methods. The proposed method can both enhance the drone operator's capabilities and improve the performance of autonomous image processing tasks such as object identification and visual navigation. The paper presents the different types of blur and possible types of noise, and considers the problem of frame alignment under object movement and the associated noise. Existing deblurring and image restoration methods, including the state of the art, are reviewed and their limitations highlighted. To address these limitations, a method based on a fully convolutional encoder-decoder network with residual connections is presented, and dataset generation and training procedures are discussed. The approach is then compared with existing state-of-the-art deep learning methods: it enables up to 9 times faster blind image restoration with comparable quality.