ASR can be improved by multi-task learning (MTL) with domain enhancing or domain adversarial training, which are two opposite objectives with the aim to increase/decrease domain variance towards domain-aware/agnostic ...
ISBN: (digital) 9798350353006
ISBN: (print) 9798350353013
Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing. This is challenging for perception systems operating on edge devices, because communication is power-inefficient and induces latency. Fueled by innovations in stacked image sensor fabrication, emerging sensor-processors offer programmability and processing capabilities directly on the sensor. We exploit these capabilities by developing an efficient recurrent neural network architecture, PixelRNN, that encodes spatio-temporal features on the sensor using purely binary operations. PixelRNN reduces the amount of data to be transmitted off the sensor by factors of up to 256 compared to the raw sensor data while offering competitive accuracy for hand gesture recognition and lip reading tasks. We experimentally validate PixelRNN using a prototype implementation on the SCAMP-5 sensor-processor platform.
Self-supervision has shown outstanding results for natural language processing and, more recently, for image recognition. Simultaneously, vision transformers and their variants have emerged as a promising and scalable a...
Bangladesh is one of the countries struggling to prevent road accidents, which are a global cause for concern. An early warning system that indicates road conditions can contribute to prevention. For this purp...
ISBN: (print) 9781665448994
Previous full-reference image quality assessment methods aim to evaluate the quality of images impaired by traditional distortions such as JPEG compression, white noise, and Gaussian blur. However, there is a lack of research measuring the quality of images generated by various image processing algorithms, including super-resolution, denoising, and restoration. Motivated by a previous model that predicts distortion sensitivity maps, we use DeepQA as a baseline model on a challenge database that includes various distortions. We further improve the baseline model by dividing it into three parts and modifying each: 1) the distortion encoding network, 2) the sensitivity generation network, and 3) score regression. Through rigorous experiments, the proposed model achieves better prediction accuracy on the challenge database than other methods. The proposed method also shows better visualization results than the baseline model. We submitted our model to the NTIRE 2021 Perceptual Image Quality Assessment Challenge and placed 12th on the main score.
ISBN: (print) 9781665448994
Burst image super-resolution is an ill-posed problem that aims to restore a high-resolution (HR) image from a sequence of low-resolution (LR) burst images. To restore a photo-realistic HR image using their abundant information, it is essential to align the burst frames, which contain random hand-held motion. Some kernel prediction networks (KPNs) that operate without external motion compensation, such as optical flow estimation, have been applied to burst image processing as implicit image alignment modules. However, existing methods do not consider the interdependencies among kernels of different sizes, which have a significant effect on each pixel. In this paper, we propose a novel weighted multi-kernel prediction network (WMKPN) that can learn the discriminative features at each pixel for burst image super-resolution. Our experimental results demonstrate that WMKPN improves the visual quality of super-resolved images. To the best of our knowledge, it outperforms the state of the art among kernel prediction and multi-frame super-resolution (MFSR) methods on both the Zurich RAW to RGB and BurstSR datasets.
Symbolic imagery signal decomposition is a common problem in digital signal processing. Its main purpose is to divide a symbolic imagery signal into different parts. However, in real-world applications, symbolic...
Deep learning (DL) has made extensive progress in many research areas. Computer vision is one of the most active of these fields, advancing through extensive research on DL models, mainly focusing on image patt...
The goal is to reproduce clear scenes in visible-light images captured in hazy weather and to effectively suppress the loss of contrast and clarity caused by haze. General defogging methods do not take into ac...
Image dehazing is an important topic in the field of computer vision. Traditional single-image dehazing algorithms are susceptible to severe halo artifacts and color distortion. To address this problem, an improved m...