ISBN (print): 9798350349122; 9798350349115
The escalating ubiquity of video surveillance systems in both public and private sectors underscores an urgent need for automated mechanisms capable of identifying anomalous behaviors. Traditional methods, largely heuristic in nature, are increasingly being supplanted by neural network-based approaches, which offer a more nuanced and effective means of anomaly detection in video data. While Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been extensively investigated, the incipient exploration of Transformer models in this domain represents a compelling research frontier. Transformer models, while beneficial, face challenges such as computational intensity and data requirements. This paper presents a comparative analysis of deep learning methods for abnormal activity detection. In our study, we assess the strengths and weaknesses of widely used methods, considering both their structural features and their practical effectiveness in experiments. Additionally, we examine emerging research avenues and potential future endeavors. Future directions should focus on optimizing neural architectures and Transformers for real-time analytics, addressing skewed data, and establishing ethical guidelines for automated surveillance. The paper also calls for interdisciplinary studies on the ethical implications of automated anomaly detection, setting the stage for enhancing Transformer models in real-time surveillance and evaluating their resilience against adversarial attacks.
ISBN (print): 9783031299698; 9783031299704
In this paper, we propose a real-time FPGA implementation of the Semi-Global Matching (SGM) stereo vision algorithm. The designed module supports a 4K/Ultra HD (3840 x 2160 pixels @ 30 frames per second) video stream in a 4 pixel per clock (ppc) format and a 64-pixel disparity range. The baseline SGM implementation had to be modified to process pixels in the 4ppc format and meet the timing constraints; however, our version provides results comparable to the original design. The solution has been positively evaluated on the Xilinx VC707 development board with a Virtex-7 FPGA device.
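To make the two quantities in this abstract concrete, the following is a minimal software sketch, not the authors' hardware design. It shows the textbook SGM cost aggregation recurrence along a single left-to-right path, and the minimum pixel clock implied by a 4K @ 30 fps stream at 4 ppc. The function names, the penalty values `p1` and `p2`, and the decision to ignore blanking intervals are assumptions made for illustration.

```python
def aggregate_left_to_right(cost, p1=10, p2=120):
    """Aggregate matching costs along one scanline, left to right.

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 and p2 are illustrative smoothness penalties (assumed values).
    """
    agg = [list(cost[0])]  # the first pixel has no predecessor on the path
    ndisp = len(cost[0])
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(ndisp):
            best = prev[d]                          # same disparity, no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)  # disparity change of 1
            if d < ndisp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)         # larger disparity jump
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


def required_clock_mhz(width, height, fps, ppc):
    """Minimum clock for active pixels only (blanking is ignored, by assumption)."""
    return width * height * fps / ppc / 1e6


if __name__ == "__main__":
    print(aggregate_left_to_right([[0, 5, 5], [5, 0, 5]], p1=1, p2=3))
    print(required_clock_mhz(3840, 2160, 30, 4))
```

A full SGM implementation sums such path costs over several directions and then picks the disparity with the lowest total per pixel; processing 4 pixels per clock lowers the clock frequency needed for a given pixel rate by a factor of four.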
This paper introduces a natural gas visualization and leak monitoring system based on spectral video technology. The system combines multiple modalities, including spectral video imaging, laser scanning, and visible l...
ISBN (print): 9781665491303
As the market drives demand for larger TVs and display devices, video compression codecs and video image processing algorithms are being developed for higher (4K to 8K) video resolutions. It is critical to implement an optimized display system to satisfy the increasing need for external memory bandwidth. In this paper, we propose an optimized hardware architecture for a tile-to-raster scan line buffer. The tile-to-raster scan line buffer converts video image data stored in a tiled format to the raster scan order required by display devices. By exploiting the pattern of the tile-to-raster scan order, the double buffering used by conventional systems is removed, and a simplified buffer access algorithm for real-time hardware implementation is proposed. The proposed method reduces buffer size by 50% compared to the conventional architecture.
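The data reordering that such a line buffer performs can be illustrated in software. The sketch below only shows the address mapping from tiled storage to raster order; it does not model the proposed buffer architecture or its 50% size reduction. The layout (tiles stored in row-major order, pixels stored row-major inside each tile, image dimensions that are multiples of the tile size) and the function name are assumptions.

```python
def tile_to_raster(tiled, width, height, tile_w, tile_h):
    """Reorder a flat tiled pixel list into raster scan order.

    Assumes width and height are multiples of tile_w and tile_h.
    """
    tiles_per_row = width // tile_w
    tile_size = tile_w * tile_h
    raster = []
    for y in range(height):
        for x in range(width):
            # Which tile holds pixel (x, y), and where inside that tile.
            tile_index = (y // tile_h) * tiles_per_row + (x // tile_w)
            offset = (y % tile_h) * tile_w + (x % tile_w)
            raster.append(tiled[tile_index * tile_size + offset])
    return raster


if __name__ == "__main__":
    # A 4x2 image split into two 2x2 tiles.
    tiled = [0, 1, 4, 5, 2, 3, 6, 7]
    print(tile_to_raster(tiled, width=4, height=2, tile_w=2, tile_h=2))
```

Because one raster line touches every tile in a tile row, a hardware implementation has to hold at least one full tile row before display output can begin, which is why the buffer organization matters for memory cost.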
ISBN (print): 9781728198354
High Efficiency Video Coding (HEVC) and Multi-access Edge Computing (MEC) technologies can make real-time streaming media services available to users with reasonable bandwidth, but the computational complexity of HEVC tends to increase energy consumption in these schemes. In this paper, we investigate the energy-saving opportunities of using a field-programmable gate array (FPGA) based HEVC encoder in edge media servers and devices. In practice, we analyze the energy impact of migrating our Kvazaar software HEVC intra encoder to Intel Arria 10 PCIe FPGA(s) on two platforms: 1) a Nokia Airframe Cloud Server with 2.4 GHz dual 14-core Intel Xeon processors and 2) an embedded Jetson AGX Orin board with a 2.2 GHz 12-core ARM processor. According to our experiments, FPGA encoding on these two platforms saved 76% and 86%, respectively, of the energy consumed by software-only encoding on Airframe. These results indicate the potential of FPGA-based video encoder acceleration in future green MEC architectures.
ISBN (print): 9781728198354
Real-time online video super-resolution (VSR) in resource-limited applications is a very challenging problem due to constraints on complexity, latency, and memory footprint. Recently, a series of fast online VSR methods have been proposed to tackle this issue. In particular, attention-based methods have made considerable progress by adaptively aligning or aggregating the information in preceding frames. However, the network design of these methods still limits how effectively and efficiently useful features are propagated in the temporal domain. In this work, we propose a new fast online VSR algorithm with a flow-guided deformable attention propagation module, which leverages correspondence priors provided by a fast optical flow network in the deformable attention computation and consequently helps propagate recurrent state information effectively and efficiently. The proposed algorithm achieves state-of-the-art results on widely used VSR benchmark datasets in terms of effectiveness and efficiency. Code can be found at https://***/IanYeung/FastOnlineVSR.
We propose an approach to creating a panorama viewport and detecting objects within it in real time, based on a set of videos from an assembly of cameras. The task of the panorama viewport generation is...
Lossy compression reduces the data amount in the video by sacrificing quality, which leads to severe distortion, especially when videos are overly compressed. Consequently, many restoration methods have been proposed...
ISBN (print): 9780784485514
Weather conditions significantly influence traffic safety, with adverse conditions often leading to hazardous driving environments. This study presents the development of a weather detection model based on convolutional neural networks (CNNs), specifically trained on in-vehicle dash camera videos. Using thousands of images sourced from the SHRP2 trajectories level dataset, seven distinct weather categories were identified: clear, light snow, heavy snow, light rain, heavy rain, distant fog, and near fog. For enhanced training efficiency, images were strategically segmented. An innovative approach was employed in which, on average, nine images were selected from every minute of video; these images were processed using various pre-trained CNN architectures, such as AlexNet, GoogleNet, and ResNet, among others. The predominant detection result then determined the overall weather condition for the given video segment. The model exhibited promising performance, underscoring its viability for real-time in-vehicle weather detection and consequent enhancement of road safety.
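The step in which the predominant per-image result determines the label of a video segment can be read as a majority vote. The sketch below is one plausible interpretation under that assumption, not the authors' code; the function name `segment_label` and the tie-breaking rule (first label encountered wins) are hypothetical.

```python
from collections import Counter


def segment_label(frame_predictions):
    """Return the most frequent per-frame label for one video segment.

    Ties are resolved in favor of the label that appears first
    (an assumed rule; the abstract does not specify one).
    """
    if not frame_predictions:
        raise ValueError("at least one frame prediction is required")
    return Counter(frame_predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Nine sampled frames from one minute of video (illustrative labels).
    frames = ["light rain"] * 5 + ["heavy rain"] * 3 + ["clear"]
    print(segment_label(frames))
```

Voting over several sampled frames makes the segment label less sensitive to a single misclassified image, for example one frame blurred by a wiper pass.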
ISBN (print): 9781510679344; 9781510679351
Reference Picture Resampling (RPR) is a powerful tool for improving the video coding efficiency of next-generation codecs such as Versatile Video Coding (VVC) or the Enhanced Compression Model (ECM). This feature is designed to support changing the frame resolution without inserting an instantaneous decoder refresh (IDR) or intra random access picture (IRAP). Video streaming and low-delay scenarios can take advantage of RPR to ensure smooth frame-based bit-rate adaptation, compared to traditional techniques that can generate bit-rate leaps. This paper proposes an encoder method that selects the picture resolution change parameters effectively depending on the video signal characteristics. The picture resolution change decision is based on a low-complexity neural network and is made before the encoding process without RD-score computations, making this approach suitable for real-time and low-delay implementation. Experiments under the Random Access (RA) and All Intra (AI) configurations of the VVC Test Model (VTM-21.1) show that the proposed method achieves luma BD-rate gains of 1.46% and 0.95%, respectively, compared to the VVC Test Model anchor.