Owing to its invariance to image contrast and its ability to identify various types of features, the phase congruency (PC) model has been widely used in remote sensing applications. However, when PC is directly applied to optical and synthetic aperture radar (SAR) image registration, it fails to handle large radiometric and geometric differences. In this paper, we propose an automatic algorithm to solve this problem. First, evenly distributed keypoints are extracted from the optical images via the block-Harris method. Complementary grid points are selected in image regions with poor structure and texture information. Then a robust similarity metric based on the improved PC model is proposed. Since the two images show diverse properties, we utilize two different PC models, the traditional PC and the SAR-PC. The PC values of several directions are aggregated to construct the feature descriptors, from which a similarity metric using the normalized correlation coefficient (NCC) is obtained. We compare the proposed metric with two baselines (mutual information and NCC) and a state-of-the-art method (histogram of oriented phase congruency, HOPC) in various scenarios. The results show that our method outperforms the baselines and shows performance comparable to HOPC in regions with abundant structure information and better performance in untextured regions.
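As a rough illustration of the NCC step named in the abstract above, the following minimal sketch computes the normalized correlation coefficient between two descriptor vectors. The function name `ncc` and the flat-list descriptor layout are assumptions made for this example; the abstract does not specify how the PC-based descriptors are stored or compared in detail.

```python
import math


def ncc(a, b):
    """Normalized correlation coefficient of two equal-length vectors.

    Hypothetical helper: descriptors are assumed to be flat lists of floats.
    """
    if len(a) != len(b) or not a:
        raise ValueError("descriptors must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Subtract the means so the score ignores constant offsets.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A constant descriptor has no defined correlation; 0 is a convention here.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom
```

Because the means are subtracted and the result is divided by both norms, the score is unchanged when one descriptor is scaled by a positive gain or shifted by an offset, which is one reason NCC is a common choice for template matching.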
Segmentation of multiple anatomical structures is of great importance in medical image analysis. In this study, we proposed a W-net to simultaneously segment both the optic disc (OD) and the exudates in retinal images...
This study introduces an automatic method for change detection of multi-sensor remote-sensing images (e.g. optical and synthetic aperture radar (SAR) images). As object-based image analysis can effectively reduce the ...
Beamformer with magnitude response constraint can flexibly control the response region by specified beamwidth and response ripple, which has a significant performance against steering vector mismatch. However, a high ...
In this paper, we propose a novel deep architecture with multiple classifiers for continuous sign language recognition. Representing the sign video with a 3D convolutional residual network and a bidirectional LSTM, we formulate continuous sign language recognition as a grammatical-rule-based classification problem. We first split a text sentence of sign language into isolated words and n-grams, where an n-gram is a sequence of n consecutive words in a sentence. Then, we propose a word-independent classifiers (WIC) module and an n-gram classifier (NGC) module to identify the words and n-grams in a sentence, respectively. A greedy decoding algorithm is employed to integrate words and n-grams into the sentence based on the confidence scores provided by both modules. Our method is evaluated on a Chinese continuous sign language recognition benchmark, and the experimental results demonstrate its effectiveness and superiority.
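The sentence-splitting step described in the abstract above can be sketched as follows. This is only an illustration of the n-gram definition given there (n consecutive words); the function name `split_ngrams` and the tuple representation are assumptions, not the authors' implementation.

```python
def split_ngrams(words, n):
    """Return every run of n consecutive words in a sentence, in order.

    Hypothetical helper: `words` is a list of word tokens.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sentence of length L has L - n + 1 n-grams (none if n > L).
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = ["I", "want", "to", "drink", "water"]
isolated_words = split_ngrams(sentence, 1)  # each word on its own
bigrams = split_ngrams(sentence, 2)         # pairs of adjacent words
```

With n = 1 the same function yields the isolated words, so both label sets mentioned in the abstract come from one routine.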
ISBN (digital): 9781728123455
ISBN (print): 9781728123462
Video stitching remains a challenging problem in computer vision. In this paper, we propose a novel edge-guided method to stitch multiple videos that have small overlapped regions. Our algorithm consists of three steps: (1) spherical projection of the input video frames based on camera calibration, (2) edge detection and edge-guided feature matching for video registration, and (3) seam optimization to eliminate distortions and ghosts in the composited panoramic videos. The experimental results and user studies demonstrate that our method is robust to videos that have small overlapped regions and produces more visually pleasing panoramic videos than state-of-the-art techniques.
Objective quality assessment of stereoscopic panoramic images has become a challenging problem owing to the rapid growth of 360-degree content. Unlike traditional 2D image quality assessment (IQA), 3D omnidirectional IQA involves more complex aspects, especially the unlimited field of view (FoV) and extra depth perception, which make it difficult to evaluate the quality of experience (QoE) of 3D omnidirectional images. In this paper, we propose a multi-viewport based full-reference stereo 360 IQA model. Because viewports change freely when browsing in a head-mounted display, our proposed approach processes the image inside the FoV rather than a projected one such as the equirectangular projection (ERP). In addition, since overall QoE depends on both image quality and depth perception, we utilize features estimated from the difference map between the left and right views, which can reflect disparity. The depth perception features, along with binocular image qualities, are employed to further predict the overall QoE of 3D 360 images. The experimental results on our public Stereoscopic OmnidirectionaL Image quality assessment Database (SOLID) show that the proposed method achieves a significant improvement over some well-known IQA metrics and can accurately reflect the overall QoE of perceived images.
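To make the "difference map between left and right views" concrete, here is a minimal sketch that takes the per-pixel absolute difference of two grayscale viewport images. The abstract does not say how the map is computed or which features are extracted from it, so the absolute difference, the nested-list image layout, and the name `difference_map` are all assumptions for illustration.

```python
def difference_map(left, right):
    """Per-pixel absolute difference of two same-sized grayscale images.

    Hypothetical helper: images are lists of rows of numeric intensities.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("left and right views must have the same shape")
    return [
        [abs(l - r) for l, r in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]


left_view = [[10, 20], [30, 40]]
right_view = [[12, 20], [25, 40]]
diff = difference_map(left_view, right_view)
```

Pixels where the two views agree map to zero, while horizontally shifted content produces non-zero responses, which is why such a map can serve as a cheap cue for disparity.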
Rapidly growing intelligent applications require optimized bit allocation in image/video coding to support specific task-driven scenarios such as detection, classification, segmentation, etc. Some learning-based framewo...
Light field image (LFI) quality assessment is becoming more and more important, which helps to better guide the acquisition, processing and application of immersive media. However, due to the inherent high dimensional...
Photo-realistic point cloud capture and transmission are the fundamental enablers for immersive visual communication. The coding process of dynamic point clouds, especially video-based point cloud compression (V-PCC...