Optical lenses installed in most imaging devices suffer from a limited depth of field, so objects are imaged with varying sharpness and detail and essential information is lost. To cope with the proble...
ISBN (digital): 9798350388961
ISBN (print): 9798350388978
Self-supervised representation learning has shown promising results in recent years. However, most of the proposed methods are pre-trained on object-centric datasets with image-level pretext tasks. In this study, we follow DenseCL, which is pre-trained on pixel-level scene-centric datasets with contrastive learning. Our goal is to alleviate the false negative pairing problem in contrastive learning by consistency regularization. Our method outperforms DenseCL and PixContrast models in most of the scenarios. In PASCAL VOC object detection, we see 0.2% AP50 and 0.3% AP improvements. In COCO object detection, we get 0.3% AP and 0.7% AP boosts. We also improve by 0.4% AP and 0.6% AP in COCO instance segmentation, and 0.1% mAP and 0.9% mAP in PASCAL VOC semantic segmentation. Moreover, attention map visualization and k-nearest neighbour retrieval indicate qualitative improvement from the proposed method.
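The false negative pairing problem mentioned above can be illustrated with a minimal sketch of the standard InfoNCE contrastive loss. This is not the authors' consistency regularization, whose details the abstract does not give. The feature vectors, the temperature value, and all function names are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE loss for one query feature, one positive, and several negatives."""
    q = normalize(query)
    pos_logit = dot(q, normalize(positive)) / temperature
    neg_logits = [dot(q, normalize(n)) / temperature for n in negatives]
    # Log-sum-exp over all logits, shifted by the maximum for numerical stability.
    all_logits = [pos_logit] + neg_logits
    peak = max(all_logits)
    log_denominator = peak + math.log(
        sum(math.exp(logit - peak) for logit in all_logits)
    )
    return log_denominator - pos_logit


# Toy 2-D features (hypothetical values, not taken from the paper).
query = [1.0, 0.0]
positive = [0.9, 0.1]
true_negatives = [[0.0, 1.0], [-1.0, 0.0]]
# A "false negative": a feature of the same object that was sampled as a negative.
with_false_negative = [[0.95, 0.05], [-1.0, 0.0]]

loss_clean = info_nce(query, positive, true_negatives)
loss_false_negative = info_nce(query, positive, with_false_negative)
print(loss_clean, loss_false_negative)
```

In this toy setting the loss is much larger when a feature close to the query is treated as a negative, so the optimizer is pushed to separate features that should stay similar. That is the effect a regularizer would aim to reduce.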
ISBN (print): 9781665475938
Learned image compression (LIC) has shown superior compression ability. Quantization is an inevitable stage that generates the quantized latent for entropy coding. To handle the non-differentiability of quantization during training, many differentiable approximations of quantization have been proposed. However, most previous methods set the derivative of the quantized latent with respect to the non-quantized latent to one. As a result, the quantization error between the non-quantized and quantized latents is not taken into account during gradient descent. To address this issue, we use a gradient scaling method to scale the gradient of the non-quantized latent during back-propagation. The experimental results show that our method outperforms recent LIC quantization methods.
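The contrast between a unit derivative and a scaled gradient can be sketched as follows. The abstract does not state the scaling rule, so the factor `1 + alpha * |y - round(y)|` and the parameter `alpha` are hypothetical stand-ins. They only show where the quantization error could enter the backward pass.

```python
import math


def quantize(latent):
    """Forward pass: round each latent value to the nearest integer (halves round up)."""
    return [math.floor(y + 0.5) for y in latent]


def backward_straight_through(upstream_grad):
    """Straight-through estimator: treat d(quantized)/d(latent) as exactly one."""
    return list(upstream_grad)


def backward_scaled(latent, upstream_grad, alpha=1.0):
    """Hypothetical gradient scaling that weights each gradient by the local error.

    The scale 1 + alpha * |y - round(y)| is an assumed form chosen for illustration,
    not the rule used in the paper.
    """
    quantized = quantize(latent)
    return [
        g * (1.0 + alpha * abs(y - q))
        for y, q, g in zip(latent, quantized, upstream_grad)
    ]


latent = [0.1, 1.4, -2.5, 3.0]
upstream = [1.0, 1.0, 1.0, 1.0]

quantized = quantize(latent)                      # [0, 1, -2, 3]
ste_grad = backward_straight_through(upstream)    # gradient passes through unchanged
scaled_grad = backward_scaled(latent, upstream)   # larger where the rounding error is larger
print(quantized, ste_grad, scaled_grad)
```

With the straight-through estimator every latent receives the same gradient regardless of how far it sits from its rounded value. In the scaled variant the latent at 3.0, which has zero quantization error, keeps its gradient, while the one at -2.5 receives the largest adjustment.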
ISBN (print): 9781665475938
This paper proposes an ASynchronous Autoregressive Prediction (ASAP) method for satellite anomaly detection. We empirically observe that a single classification model can hardly detect unknown anomalous situations and neglects the Markov nature of temporal satellite data. To address this, we adopt an autoregressive model to predict unknown anomalies in satellite data. We further propose a non-uniform temporal encoding method for asynchronous data and a median filtering method for more accurate detection. To reduce the effect of outliers, we employ an adaptive threshold selection method to achieve a more robust classification boundary. Experiments on real satellite data demonstrate that the proposed ASAP method outperforms the baseline classification method by 55.79%.
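The chain of prediction error, median filtering, and adaptive thresholding can be sketched on toy data. Everything here is an assumption made for illustration: a naive "next value equals last value" forecast stands in for the learned autoregressive model, the threshold rule is median plus `k` times the median absolute deviation, and the telemetry values are invented. The non-uniform temporal encoding is not covered.

```python
import statistics


def prediction_errors(series):
    """Absolute error of a naive one-step forecast (next value = last value)."""
    return [abs(curr - prev) for prev, curr in zip(series, series[1:])]


def median_filter(values, window=3):
    """Sliding median with an odd window; the edges use a truncated window."""
    half = window // 2
    return [
        statistics.median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def adaptive_threshold(scores, k=3.0):
    """Median plus k times the median absolute deviation (an assumed robust rule)."""
    center = statistics.median(scores)
    mad = statistics.median([abs(s - center) for s in scores])
    return center + k * mad


def detect(series, window=3, k=3.0):
    """Return smoothed errors, the data-driven threshold, and flagged error indices."""
    smoothed = median_filter(prediction_errors(series), window)
    threshold = adaptive_threshold(smoothed, k)
    flagged = [i for i, s in enumerate(smoothed) if s > threshold]
    return smoothed, threshold, flagged


# Invented telemetry: stable readings with an erratic segment in the middle.
series = [100, 102, 101, 104, 102, 103, 101, 140, 60, 140, 60, 100, 103, 101, 102, 100]
smoothed, threshold, flagged = detect(series)
print(threshold, flagged)  # 3.5 [6, 7, 8, 9, 10]
```

Because the threshold is derived from the median and the median absolute deviation of the scores, the large errors in the anomalous segment barely move it. A mean-and-standard-deviation rule would be pulled upward by those same errors.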
ISBN (print): 9781665475938
Web Real-Time Communications (WebRTC) is an open-source platform that supports developers in building voice- and video-communication solutions. The video compression engine is one infrastructure component of WebRTC. Because video streams account for more than 90% of the bandwidth requirement in real-time communication, the performance of the embedded video encoder in WebRTC determines the Quality of Experience of communication, including picture quality, communication latency, and playback smoothness. In our WebRTC, we implemented a high-performance HEVC software encoder as a substitute for the VP8 encoder. We significantly improve the compression efficiency of the HEVC encoder. Compared with the most advanced x265, our encoder achieves an average rate saving of 34.6%, especially in low-latency coding applications. Compared with the default WebRTC integrated with VP8, our optimizations save up to 90.2% of the bandwidth requirement.
Images acquired in poor illumination conditions are characterized by low brightness and considerable noise, which constrain the performance of computer vision systems. Image enhancement thus remains crucial for improvi...
ISBN (print): 9781665475938
With the rapid development of multi-sensor fusion technology in various industrial fields, many composite images closely related to human life have been produced. To meet the rapidly growing needs of various image-based applications, we have established the first multi-source composite image (MSCI) database for image quality assessment (IQA). Our MSCI database contains 80 reference images and 1600 distorted images, generated by four advanced compression standards with five distortion levels. In particular, these five distortion levels are determined based on the first five just noticeable difference (JND) levels. Moreover, we verify the IQA performance of some representative methods on our MSCI database. The experimental results show that the performance of the existing methods on the MSCI database needs to be further improved.
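The database size follows from the stated design: 80 reference images, four compression standards, and five distortion levels give 80 × 4 × 5 = 1600 distorted images. A short sketch of that enumeration is given here. The codec names are placeholders, since the abstract does not say which four standards were used.

```python
from itertools import product

NUM_REFERENCES = 80
# Placeholder names: the abstract does not identify the four compression standards.
CODECS = ["codec_a", "codec_b", "codec_c", "codec_d"]
# The five distortion levels correspond to the first five JND levels.
JND_LEVELS = [1, 2, 3, 4, 5]

# One distorted image per (reference, codec, level) combination.
distorted = list(product(range(NUM_REFERENCES), CODECS, JND_LEVELS))
per_reference = len(CODECS) * len(JND_LEVELS)

print(len(distorted), per_reference)  # 1600 20
```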
Image captioning is a technology that generates textual descriptions of images by integrating computer vision and natural language processing. This review aims to provide a comprehensive overview of current state-of-t...
Given the requirements for robust target classification and accurate target state estimation in visual tracking, SiamFC++ proposes a set of practical guidelines for designing high-performance general-purpose trackers by considering the special nature of visual tracking problems. Inspired by dynamic modules, we propose an empirical method for integrating a dynamic module into the image input, which is concatenated with the template module after feature maps are extracted by the backbone network. Since the position and shape of the object can change significantly within a video sequence, the added dynamic module can better focus on the target region of the feature map to obtain better similarity maps. Extensive experiments and comparisons demonstrate that our simple and effective method achieves reliable results on the LaSOT, TrackingNet, and GOT-10K benchmarks and provides a significant speed advantage in real-time tracking.
This paper proposes a two-stage lossless robust watermarking scheme based on a deep neural network. In the first stage, the encoder, noise layer and decoder network are combined to design an end-to-end network framework...