ISBN (print): 9781728198354
Adaptive block partitioning is responsible for large gains in current image and video compression systems. This method is able to compress large stationary image areas with only a few symbols, while maintaining a high level of quality in more detailed areas. Current state-of-the-art neural-network-based image compression systems, however, use only one scale to transmit the latent space. In previous publications, we proposed RDONet, a scheme to transmit the latent space at multiple spatial resolutions. Following this principle, we extend a state-of-the-art compression network with a second hierarchical latent-space level to enable multi-scale processing, and we extend the existing rate variability capabilities of RDONet with a gain unit. With this, we outperform an equivalent traditional autoencoder, achieving 7% rate savings. Furthermore, we show that even though we add an additional latent space, the complexity increases only marginally, and the decoding time can potentially even be decreased.
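The two-level design can be sketched roughly as follows. This is a minimal PyTorch illustration of a hierarchical (coarse + fine) latent space with a per-channel gain unit for rate variability, not the authors' RDONet implementation; all layer sizes and the gain-unit formulation are assumptions.

```python
import torch
import torch.nn as nn

class GainUnit(nn.Module):
    """Per-channel gain/inverse-gain vectors that rescale the latent around
    quantization, giving several operating rates from a single model."""
    def __init__(self, channels, num_rate_points=4):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(num_rate_points, channels))
        self.inv_gain = nn.Parameter(torch.ones(num_rate_points, channels))

    def scale(self, y, rate_idx):
        return y * self.gain[rate_idx].view(1, -1, 1, 1)

    def unscale(self, y_hat, rate_idx):
        return y_hat * self.inv_gain[rate_idx].view(1, -1, 1, 1)

class TwoLevelAutoencoder(nn.Module):
    """Toy two-level latent space: a fine latent y1 plus a spatially coarser
    latent y2; both are quantized and would be transmitted."""
    def __init__(self, ch=128):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 5, 2, 2), nn.GELU(),
                                  nn.Conv2d(ch, ch, 5, 2, 2))
        self.enc2 = nn.Conv2d(ch, ch, 5, 2, 2)  # coarser, second latent level
        self.dec2 = nn.ConvTranspose2d(ch, ch, 5, 2, 2, output_padding=1)
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(2 * ch, ch, 5, 2, 2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(ch, 3, 5, 2, 2, output_padding=1))
        self.gain1, self.gain2 = GainUnit(ch), GainUnit(ch)

    def quantize(self, y):
        # straight-through rounding (training-time noise proxy omitted)
        return y + (torch.round(y) - y).detach()

    def forward(self, x, rate_idx=0):
        y1 = self.enc1(x)
        y2 = self.enc2(y1)
        y2_hat = self.gain2.unscale(self.quantize(self.gain2.scale(y2, rate_idx)), rate_idx)
        y1_hat = self.gain1.unscale(self.quantize(self.gain1.scale(y1, rate_idx)), rate_idx)
        ctx = self.dec2(y2_hat)                      # coarse context, upsampled
        x_hat = self.dec1(torch.cat([y1_hat, ctx], dim=1))
        return x_hat, (y1_hat, y2_hat)

x = torch.randn(1, 3, 256, 256)
x_hat, latents = TwoLevelAutoencoder()(x, rate_idx=2)
print(x_hat.shape, latents[0].shape, latents[1].shape)
```

The coarse latent carries the large stationary regions cheaply, while the fine latent refines detailed areas, which mirrors the role of adaptive block partitioning in classical codecs.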
A variety of information about real-time scenes is carried by images and videos. Processing these images and videos intelligently helps in many domains such as computer vision, object detection, deep learning and 3D reconstruction, with wide usage in applications such as autopilots, augmented reality, smart vehicles, etc. The quality of images and videos plays a vital role in real-time systems. One such scenario is where images are captured without sufficient illumination. Images captured by cameras without sufficient light are noisy and suffer from information loss. Dark images have two main aspects that make their study a difficult task: their low dynamic range and their high propensity for generating high noise levels. Hence, a deep-learning-based approach is adopted. For this purpose, a Generative Adversarial Network (GAN) based Extremely Dark Video Enhancement Network (GEVE) model is proposed. The main objective of GEVE is to train the model with low-/normal-light image pairs. Thus, the GAN network learns the translation between low-light images and images captured under normal illumination, and automatically translates original images taken under extremely low-light conditions into high-quality images. It is clearly observed that the proposed GEVE outperforms the known state-of-the-art techniques. We are of the view that the proposed system is an ideal candidate for handling dark image/video frames. (C) 2021 Published by Elsevier B.V.
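A minimal sketch of the kind of paired low-/normal-light GAN training described above, assuming a pix2pix-style setup in PyTorch; the generator, discriminator and loss weights below are placeholders, not the GEVE architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Toy encoder-decoder that maps a low-light image to an enhanced one."""
    def __init__(self, ch=64):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(3, ch, 4, 2, 1), nn.LeakyReLU(0.2),
                                  nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.up = nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.ReLU(),
                                nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        return self.up(self.down(x))

class Discriminator(nn.Module):
    """PatchGAN-style critic over (low-light input, candidate output) pairs."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(6, ch, 4, 2, 1), nn.LeakyReLU(0.2),
                                 nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2),
                                 nn.Conv2d(ch * 2, 1, 4, 1, 1))

    def forward(self, low, img):
        return self.net(torch.cat([low, img], dim=1))

def train_step(G, D, opt_g, opt_d, low, normal, l1_weight=100.0):
    # --- discriminator: real pairs vs. generated pairs ---
    fake = G(low).detach()
    real_pred, fake_pred = D(low, normal), D(low, fake)
    d_loss = 0.5 * (F.binary_cross_entropy_with_logits(real_pred, torch.ones_like(real_pred))
                    + F.binary_cross_entropy_with_logits(fake_pred, torch.zeros_like(fake_pred)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- generator: fool D while staying close to the ground-truth exposure ---
    fake = G(low)
    pred = D(low, fake)
    g_loss = (F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
              + l1_weight * F.l1_loss(fake, normal))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
low, normal = torch.rand(2, 3, 128, 128), torch.rand(2, 3, 128, 128)
print(train_step(G, D, opt_g, opt_d, low, normal))
```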
Synthetic Aperture Radar (SAR) images have a wide range of applications due to their all-weather and all-day working conditions. However, SAR images with different scenarios and imaging conditions are insufficient or ...
We address the problem of multi-object 3D pose control in image diffusion models. Instead of conditioning on a sequence of text tokens, we propose to use a set of per-object representations, Neural Assets, to control ...
ISBN (print): 9798350391558; 9798350379990
This work presents a novel approach to real-time criminal detection through the use of cutting-edge face recognition technology. Accuracy and reliability, scalability, environmental variability, camera quality, and resource constraints are the major challenges of this problem. In order to improve public safety and support law enforcement, the system uses the Multi-Task Cascade Neural Network (MTCNN) to reliably identify and recognize faces in difficult situations, such as low light or obscured views. Due to MTCNN's strong deep learning capabilities, people of interest may be quickly identified, and potentially prevented from committing crimes, in busy or dimly lit settings with high-accuracy identification. Even with few reference photos, the technology can match identified faces to a database of known criminals, guaranteeing flexibility in a range of scenarios. One of its primary features is its 90% accuracy in real-time analysis of live video feeds from security cameras, which facilitates quick reactions to potential threats and improves community safety. This technology, which combines face detection, identification, and real-time processing, is a major step forward for law enforcement in their fight against crime and in maintaining community security.
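A sketch of MTCNN-based detection followed by embedding matching against a gallery of known faces. It assumes the facenet-pytorch package and a cosine-similarity threshold, neither of which is specified in the abstract; file names in the usage comment are hypothetical.

```python
import torch
from facenet_pytorch import MTCNN, InceptionResnetV1  # assumed implementation stack
from PIL import Image

device = 'cuda' if torch.cuda.is_available() else 'cpu'
mtcnn = MTCNN(keep_all=True, device=device)                  # detect and align all faces
embedder = InceptionResnetV1(pretrained='vggface2').eval().to(device)

def embed_faces(image):
    """Return L2-normalised 512-d embeddings for every face found in a PIL image."""
    faces = mtcnn(image)                                     # tensor (n, 3, 160, 160) or None
    if faces is None:
        return None
    with torch.no_grad():
        emb = embedder(faces.to(device))
    return torch.nn.functional.normalize(emb, dim=1)

def match_against_gallery(image, gallery, threshold=0.6):
    """gallery: dict name -> reference embedding (even a single photo per person).
    Returns (name, similarity) per detected face, or 'unknown' below the threshold."""
    probes = embed_faces(image)
    if probes is None:
        return []
    names = list(gallery.keys())
    refs = torch.stack([gallery[n] for n in names]).to(device)   # (m, 512)
    sims = probes @ refs.T                                       # cosine similarity
    results = []
    for row in sims:
        score, idx = row.max(dim=0)
        results.append((names[idx] if score >= threshold else 'unknown', float(score)))
    return results

# usage (hypothetical files): build the gallery once, then scan camera frames
# gallery = {'suspect_01': embed_faces(Image.open('suspect_01.jpg'))[0]}
# print(match_against_gallery(Image.open('frame.jpg'), gallery))
```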
Real-time control, diversified functions, system integration and miniaturization are important development directions of video electronics systems. Embedded design based on FPGA can manage system resources more reaso...
In recent years, with the extensive application of deep learning methods, face recognition technology has been greatly developed. Aiming at the problem of video surveillance in power network, a video surveillance meth...
ISBN (digital): 9781665496209
ISBN (print): 9781665496209
Transformer has shown outstanding performance in time-series data processing, which can definitely facilitate quality assessment of video sequences. However, the quadratic time and memory complexities of Transformer potentially impede its application to long video sequences. In this work, we study a mechanism of sharing attention across video clips in the video quality assessment (VQA) scenario. Consequently, an efficient architecture based on integrating shared multi-head attention (MHA) into Transformer is proposed for VQA, which greatly eases the time and memory burden. A long video sequence is first divided into individual clips. The quality features derived by an image quality model on each frame in a clip are aggregated by a shared MHA layer. The aggregated features across all clips are then fed into a global Transformer encoder for quality prediction at the sequence level. The proposed model with a lightweight architecture demonstrates promising performance in no-reference VQA (NR-VQA) modelling on publicly available databases. The source code can be found at https://***/junyongyou/lagt_vqa.
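The shared-attention idea can be illustrated with a short PyTorch sketch: one multi-head attention module (shared weights) aggregates frame-level quality features within each clip, and a global Transformer encoder then attends only across the much shorter sequence of clip features. Dimensions, clip length and layer counts are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class SharedMHAClipVQA(nn.Module):
    """Sketch of attention sharing across clips for NR-VQA (not the authors' code).
    Per-frame quality features are assumed to come from an image-quality backbone."""
    def __init__(self, feat_dim=512, clip_len=16, n_heads=8, n_layers=2):
        super().__init__()
        self.clip_len = clip_len
        self.clip_query = nn.Parameter(torch.randn(1, 1, feat_dim))  # learnable aggregation token
        # ONE MHA module, reused (weights shared) for every clip
        self.shared_mha = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        enc_layer = nn.TransformerEncoderLayer(feat_dim, n_heads, dim_feedforward=1024,
                                               batch_first=True)
        self.global_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.head = nn.Linear(feat_dim, 1)                           # sequence-level quality score

    def forward(self, frame_feats):
        # frame_feats: (batch, n_frames, feat_dim); n_frames divisible by clip_len here
        b, t, d = frame_feats.shape
        clips = frame_feats.reshape(b * t // self.clip_len, self.clip_len, d)
        q = self.clip_query.expand(clips.size(0), -1, -1)
        clip_feat, _ = self.shared_mha(q, clips, clips)              # aggregate frames per clip
        clip_feat = clip_feat.reshape(b, -1, d)                      # (batch, n_clips, feat_dim)
        encoded = self.global_encoder(clip_feat)                     # attention across clips only
        return self.head(encoded.mean(dim=1)).squeeze(-1)

# toy input: 128 frames of 512-d frame-quality features for one video
feats = torch.randn(1, 128, 512)
print(SharedMHAClipVQA()(feats).shape)   # torch.Size([1])
```

Because full self-attention is never computed over all frames at once, the quadratic cost applies only within clips and across the small set of clip tokens.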
ISBN (print): 9798350395334; 9798350395327
This project introduces a transformative object detection system designed to enhance the navigational capabilities of visually impaired individuals through the application of advanced computer vision technologies. Utilizing the You Only Look Once (YOLO) model, paired with the Common Objects in Context (COCO) dataset, this system provides real-time, accurate object detection and classification. The core functionality of the application allows for the processing of both static images and live video feeds, enabling blind users to receive auditory announcements of nearby objects, thereby assisting with spatial awareness and environmental interaction. The system leverages a pre-trained YOLO model to ensure robust detection performance, achieving a peak detection accuracy of 99%. By delivering object labels and bounding box coordinates audibly, the application serves as a critical tool in improving the daily independence and quality of life for people with visual impairments. This project not only highlights the potential of deep learning in assistive technologies but also underscores the importance of adaptive solutions in inclusive technology development.
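A sketch of the detect-and-announce loop, assuming an off-the-shelf COCO-pretrained YOLO model (here via the ultralytics package) and pyttsx3 for offline speech output; the abstract does not name a specific implementation stack, so these library choices are assumptions.

```python
import cv2
import pyttsx3
from ultralytics import YOLO   # assumed YOLO implementation; the paper does not name one

model = YOLO("yolov8n.pt")     # COCO-pretrained weights (80 classes)
tts = pyttsx3.init()           # offline text-to-speech engine

def announce_objects(frame, conf_threshold=0.5):
    """Detect objects in a frame and speak their labels with a coarse position cue."""
    result = model(frame, verbose=False)[0]
    announcements = []
    for box in result.boxes:
        conf = float(box.conf[0])
        if conf < conf_threshold:
            continue
        label = model.names[int(box.cls[0])]
        x1, _, x2, _ = box.xyxy[0].tolist()
        # crude left/ahead/right cue from the horizontal box centre
        cx = (x1 + x2) / 2 / frame.shape[1]
        side = "left" if cx < 0.33 else "right" if cx > 0.66 else "ahead"
        announcements.append(f"{label} {side}")
    if announcements:
        tts.say(", ".join(announcements))
        tts.runAndWait()
    return announcements

# live video loop (webcam index 0); static images can be passed to announce_objects directly
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    announce_objects(frame)
cap.release()
```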
Hyperspectral imaging and artificial intelligence (AI) have transformed imaging and data processing through their ability to capture and analyze detailed spectral information. This paper explores the integration of hy...