The quality of images and videos plays a vital role in real-time systems. Images captured without sufficient illumination have a low dynamic range and a high propensity for noise. The...
ISBN (print): 9781510673199; 9781510673182
Super resolution (SR) is a technique for increasing the spatial resolution of an image from low resolution (LR) to high resolution (HR). SR technology is in considerable demand for recovering HR images in a wide variety of applications, such as medicine, engineering, computer vision, pattern recognition, and video production. In contrast to interpolation-based algorithms, which often introduce distortions or irregular borders, this study proposes an implementation that preserves the edges and fine details of the original image by computing its wavelet decomposition. Several Discrete Wavelet Transform (DWT) families were evaluated: Daubechies, Symlet, and Coiflet. The proposed system was implemented on a Raspberry Pi 4 Model B, an embedded device, to get around the mobility limitations of a PC, making it possible to create an inexpensive and energy-efficient SR system with reduced complexity for real-time applications. To investigate visual performance, the SR images were assessed subjectively by human observers, confirming good perceived quality for images of different kinds from three datasets: FullHD (DIV2K), medical (Raabin WBC), and remote sensing (Sentinel-1). The experimental results of the designed implementations appear to demonstrate good performance on commonly used objective criteria: an execution time of 0.742 s, an SSIM of 0.9164, and a PSNR of 38.72 dB for images with a super-resolved size of 1356 x 2040 pixels.
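PSNR, one of the objective criteria reported in this abstract, can be illustrated with a minimal sketch. It uses the standard definition, PSNR = 10 log10(MAX^2 / MSE), and is not the authors' evaluation code. The function name `psnr`, the list-of-rows image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Return the PSNR in dB between two equally sized 2-D images.

    Images are plain lists of rows of pixel intensities (an assumed,
    illustrative representation). max_value is the peak intensity,
    255 for 8-bit images.
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: no error at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2 x 2 example: every pixel differs by exactly 1, so MSE = 1.
hr = [[10, 20], [30, 40]]
sr = [[11, 19], [31, 39]]
print(round(psnr(hr, sr), 2))
```

Higher values indicate that the reconstructed image is numerically closer to the reference, and identical images give an infinite PSNR.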
ISBN (print): 9781450389075
The proceedings contain 39 papers. The topics discussed include: optimization method of loop detection based on shadow compensation; realtime lane detection model based on lightweight; research on image detection algorithm based on improved retinanet; a study of student learning status classification based on the detection of key objects within the visual field; an outlier detection method based on symmetry and curvature threshold; research on adaptive object detection method of kernel correlation filtering; attention enhanced multi-patch deformable network for image deblurring; recaptured image forensics based on image illumination and texture features; and using temporal convolutional networks to enable action recognition for construction equipment.
ISBN (digital): 9781510661714
ISBN (print): 9781510661707; 9781510661714
Monitoring the movement and actions of humans in video in real time is an important task. We present a deep-learning-based algorithm for human action recognition for both RGB and thermal cameras. It is able to detect and track humans and recognize four basic actions (standing, walking, running, lying) in real time on a notebook with an NVIDIA GPU. For this, it combines state-of-the-art components for object detection (Scaled-YoloV4), optical flow (RAFT), and pose estimation (EvoSkeleton). Qualitative experiments on a set of tunnel videos show that the proposed algorithm works robustly for both RGB and thermal video.
The deep integration of new-generation information technology and manufacturing is triggering far-reaching industrial changes. Machine vision inspection is widely used in large-scale repetitive industrial production p...
ISBN (print): 9783031510229; 9783031510236
Face recognition is used in numerous authentication applications; unfortunately, such systems are susceptible to spoofing attacks such as paper and screen attacks. In this paper, we propose a method that is able to recognise whether a face detected in a video is not real and, if so, the type of attack performed. We propose to learn temporal features using a 3D Convolutional Network, which is better suited to temporal information. Besides summarizing temporal information, the 3D ConvNet allows us to build a real-time method, since analysing clips is much more efficient than analysing single frames. The learned features are classified using a binary classifier, which distinguishes whether the person in the video clip is real (i.e. live) or not, and a multi-class classifier, which recognises whether the person is real or the type of attack (screen, paper, etc.). We performed our tests on 5 public datasets: Replay Attack, Replay Mobile, MSU-MSFD, Rose-Youtu, and RECOD-MPAD.
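The efficiency argument for clips over single frames can be made concrete with a small sketch of how a frame sequence might be grouped into fixed-length clips before being passed to a clip-level model. This is an illustrative assumption, not the authors' preprocessing: the function name `make_clips` and the `clip_len` and `stride` parameters are hypothetical, and the abstract does not state the clip length used.

```python
def make_clips(frames, clip_len, stride):
    """Group a frame sequence into fixed-length clips.

    frames   : any sequence of frames (frame indices stand in for images here)
    clip_len : number of consecutive frames per clip (hypothetical parameter)
    stride   : step between clip start positions (hypothetical parameter)

    Trailing frames that do not fill a whole clip are dropped.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    last_start = len(frames) - clip_len
    return [
        list(frames[start:start + clip_len])
        for start in range(0, last_start + 1, stride)
    ]


# Ten frames, non-overlapping clips of four frames each.
frames = list(range(10))
clips = make_clips(frames, clip_len=4, stride=4)
print(len(frames), "per-frame passes vs", len(clips), "clip-level passes")
```

With non-overlapping clips, the number of model invocations falls roughly by a factor of `clip_len`, although each invocation processes more data; the net saving depends on the network.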
With the development of science and technology and the renewal of media in the era of the cultural industry, electronic media is developing rapidly, and video image technology with digital realization as the carrier develops...
ISBN (print): 9781510639973
The proceedings contain 78 papers. The topics discussed include: R-MSSIM: image quality assessment while performing object detection; some efficient algorithms for morphological operations on hexagonal lattices and regular hexagonal domains; research on passenger flow statistics technology based on binocular stereo vision; optimal color selection for root-polynomial color correction; research on Beidou/GNSS wide area real-time positioning for automatic driving; research on perceptual fusion of audio and video based on deep learning; brief introduction of face image recognition method based on artificial intelligence; detecting GAN-synthesized faces based on deep alignment network; and multimodal medical image fusion based on saliency features detection in NSST domain.
ISBN (print): 9781450399449
To address the effect of haze on video images, which causes picture distortion, degraded image quality, and blurred definition, a defogging method for hazy video images based on an optical flow threshold is proposed to restore a real, natural color image. First, the frame at time t is extracted, its features are tracked from time t + 1 to time t + n, and the frame at time t + n is extracted. The optical flow values of frame t and frame t + n are then calculated, and their difference gives the optical flow threshold. This value is compared with a given threshold: if it is greater than or equal to the given threshold, the intermediate frame is taken, and the intermediate frame and frame t + n are processed by the Retinex algorithm. This operation is performed iteratively. Finally, the processed single frames are merged into a whole video sequence and output. The experiment shows that the processing speed of the algorithm is 0.07, much lower than that of other processing methods, which verifies the effectiveness and innovativeness of the proposed algorithm.
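The frame-selection step described in this abstract can be read in more than one way; the sketch below shows one possible interpretation and is not the authors' implementation. It assumes each frame has already been reduced to a single scalar optical flow value, that the "intermediate frame" is the one midway between t and t + n, and that the window then advances by n frames. The function name `select_frames` and all parameter names are hypothetical, and the Retinex enhancement itself is left out.

```python
def select_frames(flow_values, n, threshold):
    """Return indices of frames that would be passed to the enhancement step.

    flow_values : one scalar optical flow value per frame (assumed input)
    n           : gap between the two compared frames (hypothetical parameter)
    threshold   : the given threshold the flow difference is compared with
    """
    if n <= 0:
        raise ValueError("n must be positive")
    selected = []
    t = 0
    while t + n < len(flow_values):
        # Difference between the flow values of frame t and frame t + n.
        difference = abs(flow_values[t + n] - flow_values[t])
        if difference >= threshold:
            middle = t + n // 2  # assumed meaning of "intermediate frame"
            for index in (middle, t + n):
                if index not in selected:
                    selected.append(index)
        t += n  # assumed iteration scheme: slide the window by n frames
    return selected


# Only the window starting at frame 2 shows a large change in flow.
flows = [0.0, 0.1, 0.2, 1.5, 1.6, 1.7, 1.8]
print(select_frames(flows, n=2, threshold=1.0))
```

Under this reading, windows with little motion are skipped entirely, so only a subset of frames is enhanced.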
ISBN (print): 9798350391893; 9798350391886
Video classification has become one of the most widely studied topics in the computer vision field, encompassing many tasks in medical, security, industrial, and other applications. Although deep learning models have achieved great results in the video domain, such models are built to operate on sequences of RGB frames. They therefore require a prior step that decodes the video data, since the vast majority of video is stored in compressed formats. Decoding, however, requires large amounts of computational resources, especially in real time. Researchers have already tackled the task of building networks that work in the compressed domain, with promising results but with architectures still very close to those used for the RGB domain. We propose an approach that employs Neural Architecture Search to explore and find the most effective architectures for the compressed domain. Our approach was tested on the UCF101 and HMDB51 datasets, obtaining a computationally less complex architecture than similar methods.