An image compression-encryption algorithm based on 2-D compressive sensing is proposed, which can accomplish encryption and compression simultaneously. The measurements are performed in two directions and the measure...
Detecting blade tip point light sources based on airborne computer vision is a critical step in measuring blade tip distance for coaxial unmanned helicopters. However, detecting blade tip point light sources quickly a...
A new method for recognizing partially occluded objects under affine distortion is presented. The method is designed for objects with smooth curved boundaries. It divides an object into affine-invariant parts based on feature points, and a new approach for matching each part is presented. A robust Hausdorff distance (RHD) is introduced to measure the similarity between the feature point set of the model and that of the target. Using the new RHD, the optimal affine transform can be estimated, and the sub-curve match pairs are then computed from this optimal affine transformation. Experimental results show that the proposed algorithm is capable of coping with partial occlusion and affine transformation.
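The abstract does not define the RHD itself. As a rough illustration of why a robust variant helps under partial occlusion, the sketch below uses the rank-based partial Hausdorff distance, a well-known robust alternative and not necessarily the authors' formulation. The function names and the `fraction` parameter are assumptions made for this example.

```python
import math


def directed_partial_hausdorff(a, b, fraction=0.75):
    """Return the K-th smallest nearest-neighbour distance from a to b.

    K = ceil(fraction * len(a)). With fraction=1.0 this is the classical
    directed Hausdorff distance; smaller fractions ignore the worst matches.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Distance from each point in a to its nearest neighbour in b, sorted.
    nearest = sorted(min(math.dist(p, q) for q in b) for p in a)
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]


def partial_hausdorff(a, b, fraction=0.75):
    """Symmetric version: the larger of the two directed distances."""
    return max(
        directed_partial_hausdorff(a, b, fraction),
        directed_partial_hausdorff(b, a, fraction),
    )


# Hypothetical feature points: three agree, one differs (e.g. an occluded part).
model = [(0, 0), (1, 0), (0, 1), (5, 5)]
target = [(0, 0), (1, 0), (0, 1), (40, 40)]
robust = partial_hausdorff(model, target, fraction=0.75)
classical = partial_hausdorff(model, target, fraction=1.0)
```

In this toy case the classical distance is dominated by the single mismatched point, while the rank-based variant ignores it. A similarity measure of this kind could then serve as the objective when searching for an affine transform.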
Under the effect of solar variation, atmospheric attenuation and thermal radiation distribution, the grey value of the interference source is close to or equal to the target grey value. With the distance between the imagi...
Detecting human keypoints from a single image is challenging because of occlusion, blurring, illumination changes, and scale changes. This paper addresses the problem by designing an effective network structure. Since global and local information both play an important role in reasoning about human body structure and invisible keypoints, a Multi-level Attention Network (MAN) is proposed. First, compared with traditional multi-resolution networks, it produces multi-resolution feature maps with greater information variance by generating them directly from the highest-resolution feature map, which in turn enriches the feature information after the final fusion. Second, it integrates global and local information across feature maps of different resolutions through the Feature Alignment Attention Block (FAAB) and strengthens them in a targeted manner. On the COCO dataset, with HRNet (Sun K. et al. [1]) as the baseline network, HRNet with MAN inserted improves by 1.1-2.3 AP points over the baseline.
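To make the phrase "generating them directly from the highest-resolution feature map" concrete, the following minimal sketch derives every resolution level from the same full-resolution map instead of chaining one level into the next. Average pooling is used here only as a stand-in. The abstract does not say which down-sampling operator MAN uses, and the function names are hypothetical.

```python
def avg_pool(fmap, factor):
    """Down-sample a 2-D feature map (list of rows) by non-overlapping averaging."""
    h, w = len(fmap), len(fmap[0])
    if h % factor or w % factor:
        raise ValueError("map size must be divisible by the pooling factor")
    area = factor * factor
    return [
        [
            sum(
                fmap[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / area
            for x in range(w // factor)
        ]
        for y in range(h // factor)
    ]


def multi_resolution(fmap, factors=(1, 2, 4)):
    """Build every level directly from the highest-resolution map (assumed scheme)."""
    return [avg_pool(fmap, f) for f in factors]


# Toy 4x4 single-channel map.
fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
levels = multi_resolution(fmap)
```

Each entry of `levels` is computed from `fmap` itself, so no level inherits information loss from an intermediate level.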
At present, in the field of pixel-level image fusion, researchers tend to treat each pixel independently, which destroys the relationship between the images to be fused. In view of this defect, this paper aims to pro...
Monocular depth estimation is a fundamental task in computer vision and has drawn increasing attention. Recently, attention-based models and encoder-decoder architectures have led to great improvements in monocular depth estimation. Most previous methods use repeated simple up-sampling operations during decoding, which may not make full use of the features extracted by the encoder and leads to inaccurate predictions at edges and in the regions of maximum depth. We propose an attention-based feature fusion module for the encoder and decoder. We treat monocular depth estimation as a pixel-level optimization problem, in which the coarsest encoder feature initializes the optimization and the result is then refined to higher resolution by the proposed attentional feature fusion (AFF). We formulate the prediction problem as ordinal regression over the bin centers that discretize the continuous depth range. The model predicts a different distribution of bins for each input image, and the bins are predicted at the coarsest level using global pooling and MLP layers. On the NYUV2 dataset, the proposed architecture improves on the original model by 2.5% and 1.1% in terms of Log10 and absolute relative error, respectively.
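The bin-center formulation and the two reported metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes that predicted bin widths are normalized to partition a depth range `[d_min, d_max]`, and that the per-pixel depth is the probability-weighted sum of the bin centers. The abstract confirms neither detail. The error metrics follow their standard definitions.

```python
import math


def bin_centers(widths, d_min, d_max):
    """Turn relative bin widths into bin centers covering [d_min, d_max]."""
    total = sum(widths)
    centers, edge = [], d_min
    for w in widths:
        step = (d_max - d_min) * w / total  # absolute width of this bin
        centers.append(edge + step / 2)
        edge += step
    return centers


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_depth(logits, centers):
    """Depth of one pixel as the probability-weighted sum of bin centers."""
    return sum(p * c for p, c in zip(softmax(logits), centers))


def abs_rel(pred, gt):
    """Absolute relative error: mean of |pred - gt| / gt."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def log10_error(pred, gt):
    """Log10 error: mean of |log10(pred) - log10(gt)|."""
    return sum(abs(math.log10(p) - math.log10(g)) for p, g in zip(pred, gt)) / len(gt)


# Hypothetical values: three bins over a depth range of 0 to 8.
centers = bin_centers([1, 1, 2], d_min=0.0, d_max=8.0)
depth = expected_depth([0.0, 0.0, 0.0], centers)
```

Because the widths are predicted per image, the same per-pixel logits can map to different depths in different scenes, which is the adaptivity the abstract describes.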
A reduced biquaternion neural network (RQNN) is a new type of neural network framework that has achieved significant success in machine learning. However, as the reduced biquaternion algebra system contains infinite z...
Instance segmentation is a comprehensive computer vision task that involves a wide range of other tasks. Recently, the study of real-time instance segmentation methods has received more attention for the development of autonomous driving. Although existing real-time instance segmentation methods are fast, their accuracy does not meet practical needs. Most methods perform segmentation on top of object detection, so their effectiveness depends heavily on the quality of the detection. This paper proposes a new attention-based multiscale information fusion method based on Cheng, T. et al. [1]. First, the PPM module of the baseline network is replaced with the Multiscale Context Attention (MSCA) module designed in this paper, which uses atrous convolution with different rates to obtain information at four scales and then applies non-local attention to enhance the features. This effectively suppresses the interference of redundant information on the instance segmentation results. Second, a new feature fusion approach is designed that replaces bilinear interpolation with sub-pixel up-sampling combined with attention. Experiments with this module on the COCO dataset demonstrate its effectiveness, with a 0.5% improvement over the baseline network.
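Sub-pixel up-sampling rearranges channels into spatial positions instead of interpolating between pixels. The sketch below shows only that rearrangement on nested lists, using the usual channel ordering for this operation. The attention component of the paper's fusion approach is not modelled, and the function name is chosen for this example.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) tensor into (C, H*r, W*r).

    Output pixel (c, h*r + i, w*r + j) is taken from input channel
    c*r*r + i*r + j at position (h, w).
    """
    channels_in = len(x)
    if channels_in % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(x[0]), len(x[0][0])
    c_out = channels_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 channel.
up = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], r=2)
```

Unlike bilinear interpolation, every output pixel here is a value produced by the preceding layer, so the up-sampling filter is effectively learned.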
Siamese-based trackers are currently the dominant tracking paradigm because of their balance between speed and performance. However, they are prone to drift and tracking failure when the environment is complex and similar objects interfere. When Siamese-based trackers perform the correlation operation, the responses of the target object and the background appear in different channels, i.e., the feature spaces of the target object and the background are partly orthogonal. Under background clutter and interference from similar objects, this orthogonality weakens, and the incorrect classification contributions of the object and the background reduce the stability of the learned similarity function, leading to many misclassified pixels in the heatmaps. In this work, we propose SiamORPN to address these issues. It operates at two levels: an Orthogonal Region Proposal Network (ORPN) and an Adaptive Pixel-wise Aggregation (APA) module. In ORPN, the orthogonality between the object and the background maximizes the inter-class inertia, and an orthogonal module is introduced to enhance this orthogonality. APA introduces two lightweight networks that predict the weights of all pixels in the different heatmaps and the weights of all pixels in the different regression offsets. Experiments on challenging benchmarks, including OTB2015, VOT2016, VOT2018, the GOT-10k test set, UAV123, LaSOT, and TrackingNet, demonstrate that the proposed SiamORPN outperforms many SOTA trackers and achieves leading performance. The inference speed on a GTX1080Ti reaches about 32 FPS, meeting real-time requirements.
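The pixel-wise aggregation idea can be illustrated with a small sketch that fuses several heatmaps using a separate weight for every pixel. It assumes the predicted weights are normalized with a softmax across heatmaps, which the abstract does not state. The lightweight networks that would produce `weight_logits` are not modelled, and all names are hypothetical.

```python
import math


def aggregate_heatmaps(heatmaps, weight_logits):
    """Fuse K heatmaps of shape (H, W) with per-pixel softmax weights.

    weight_logits has the same (K, H, W) layout as heatmaps; in the paper the
    weights come from a small network, here they are simply given.
    """
    k = len(heatmaps)
    h, w = len(heatmaps[0]), len(heatmaps[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            logits = [weight_logits[i][y][x] for i in range(k)]
            m = max(logits)  # subtract the maximum for numerical stability
            exps = [math.exp(v - m) for v in logits]
            total = sum(exps)
            fused[y][x] = sum(
                e / total * heatmaps[i][y][x] for i, e in enumerate(exps)
            )
    return fused


# Two 1x2 heatmaps with equal weights: the result is their per-pixel mean.
heatmaps = [[[1.0, 0.0]], [[3.0, 4.0]]]
fused = aggregate_heatmaps(heatmaps, [[[0.0, 0.0]], [[0.0, 0.0]]])
```

With per-pixel weights, a heatmap that is unreliable in one region can be down-weighted there without being discarded everywhere else.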