The growing demand for real-time image processing on edge devices calls for novel approaches that balance computational efficiency with high performance. This paper introduces an integrated solution combining ShuffleN...
ISBN: (print) 9798350376043; 9798350376036
Video skimming involves generating a concise representation of a video that captures all its significant information. However, conventional skimming techniques often fail to capture the different shots in a video because they cannot detect scene changes or incorporate the hierarchical structure of video content. This work proposes an unsupervised hierarchical method for video skimming, called Hierarchical time-aware Skimming (HieTaSkim), in which video content is modeled as a graph and an adaptive strategy is employed to produce hierarchical graph cuts. Those cuts are used to identify the most relevant video segments, or keyshots, allowing the extraction of frame sequences that convey the video's central message and resulting in a more effective and accurate video summary. Experimental results demonstrate that the proposed approach outperforms other state-of-the-art unsupervised methods for video skimming, achieving an F-score of 39.9 on the SumMe dataset, an improvement of at least 10%.
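The F-score quoted above is commonly computed from the temporal overlap between the generated summary and an annotated reference summary. The sketch below assumes that common definition; the abstract does not spell out the evaluation protocol, and the function name `keyshot_f_score` and the example frame ranges are hypothetical.

```python
def keyshot_f_score(predicted, reference):
    """Return the F-score (in percent) between two sets of selected frame indices.

    Assumption: precision and recall are measured by frame overlap, as is
    common for keyshot summaries; this is not taken from the paper itself.
    """
    predicted, reference = set(predicted), set(reference)
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)   # share of selected frames that are relevant
    recall = overlap / len(reference)      # share of relevant frames that were selected
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical example: frames chosen by a method vs. frames chosen by one annotator.
predicted = range(0, 40)    # frames 0..39
reference = range(20, 60)   # frames 20..59
print(f"{keyshot_f_score(predicted, reference):.1f}")
```

With several annotators per video, a per-annotator score like this one is typically aggregated (for example, by taking the maximum or the mean); which aggregation the authors use is not stated in the abstract.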
Unmanned aerial vehicles (UAVs) offer simple operation, sensitive response, flexible flight, long battery life, and low cost, and have become a conventional means of power inspection. However, the video sign...
ISBN: (print) 9781510673854; 9781510673847
In underwater exploration, Autonomous Underwater Vehicles (AUVs) face challenges due to the adverse effects of the aquatic environment on optical sensors, resulting in sub-optimal data acquisition. To overcome this, we propose a novel solution utilizing a Generative Adversarial Network (GAN) model. Rooted in the U-Net architecture, our model processes the low-quality AUV camera feed and generates enhanced representations of the underwater scene. The discriminator focuses on evaluating current image patches, capturing high-frequency properties with fewer parameters and achieving a 15% improvement in model accuracy. This approach facilitates real-time preprocessing in visually guided underwater robot autonomy pipelines, overcoming challenges associated with underwater visibility.
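To illustrate why a patch-based discriminator can need fewer parameters, the sketch below counts the weights of a small convolutional stack and compares them with a single dense layer applied to the whole image. The layer widths, the 4x4 kernels, and the 256x256 RGB input are assumptions chosen for illustration; the abstract does not give the actual discriminator configuration.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of one 2-D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(inputs, outputs):
    """Weights plus biases of one fully connected layer."""
    return inputs * outputs + outputs


# Hypothetical patch discriminator: 4x4 convolutions, 3 -> 64 -> 128 -> 256 -> 1 channels.
channels = [3, 64, 128, 256, 1]
patch_total = sum(
    conv_params(4, c_in, c_out) for c_in, c_out in zip(channels, channels[1:])
)

# Hypothetical alternative: flatten a 256x256 RGB image into one 64-unit dense layer.
dense_total = dense_params(256 * 256 * 3, 64)

print(patch_total, dense_total)
```

Because convolution weights are shared across image positions, the count for the patch-based stack does not grow with image size, whereas the dense layer's count does.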
Drone usage is increasing significantly in our daily life, from military to delivery purposes. Although drones are also used to detect objects by using different techniques, they are limited to detecting flying small ...
Skin color plays an important role in color image processing and human-computer interaction. However, factors such as rapidly changing illumination, varied color styles, and camera characteristics make skin detection challenging, and practical applications add a real-time requirement. In this paper, face detection and alignment are applied to select facial reference points for modeling the skin color distribution. Moreover, we propose the concept of a skin color model updating unit (SCMUU), together with an approach for detecting it, based on the observation that the skin color distribution remains consistent over a range of frames. Using one model for all frames of an SCMUU avoids redundant frame-by-frame updating. When no reliable faces are detected, two strategies are introduced to compensate and to reduce the computational cost: if a similar previous SCMUU is found, its model parameters are used; otherwise, fixed thresholds are used and the interval between two consecutive face detections is increased. In addition, the time-consuming steps are accelerated on a graphics processing unit (GPU) with CUDA. Experimental results show that, compared with existing methods, the proposed method achieves good real-time performance and accuracy for skin detection in videos of various resolutions under different illumination conditions.
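The model-selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `SkinModel` fields, the chrominance-distance similarity test, the fixed-threshold values, and the interval doubling are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Color = Tuple[float, float]  # assumed chrominance pair, e.g. (Cb, Cr)


@dataclass
class SkinModel:
    mean: Color    # mean chrominance of the sampled facial reference points
    radius: float  # accepted distance from the mean


# Assumed fixed-threshold fallback; the values are illustrative, not from the paper.
FIXED_MODEL = SkinModel(mean=(102.0, 153.0), radius=20.0)


def distance(a: Color, b: Color) -> float:
    """Euclidean distance between two chrominance pairs."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def select_model(
    face_model: Optional[SkinModel],
    frame_mean: Color,
    previous_units: List[SkinModel],
    interval: int,
    similarity: float = 10.0,
) -> Tuple[SkinModel, int]:
    """Pick the skin model for the current unit and the next face-detection interval."""
    if face_model is not None:
        # A reliable face was found: use its model for the whole unit.
        return face_model, interval
    for old in previous_units:
        # Strategy 1: reuse the parameters of a similar earlier unit.
        if distance(old.mean, frame_mean) <= similarity:
            return old, interval
    # Strategy 2: fall back to fixed thresholds and detect faces less often
    # (doubling the interval is an assumption).
    return FIXED_MODEL, interval * 2
```

For example, `select_model(None, (106.0, 146.0), [SkinModel((105.0, 145.0), 12.0)], 5)` reuses the earlier unit's model and keeps the interval at 5, while a frame with no similar earlier unit falls back to `FIXED_MODEL` with the interval raised to 10.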
ISBN: (digital) 9783031585357
ISBN: (print) 9783031585340; 9783031585357
The detection of anomalies in video data is of great importance in various applications, such as surveillance and industrial monitoring. This paper introduces a novel approach, named MAAD-GAN, for video anomaly detection (VAD) utilizing Generative Adversarial Networks (GANs). The MAAD-GAN framework combines a Wide Residual Network (WRN) in the generator with a memory module to learn the normal patterns present in the training video dataset, enabling the generation of realistic samples. To address the challenge of detecting subtle anomalies and those with motion characteristics, we propose the integration of self-attention in the discriminator model. Our proposed model MAAD-GAN enhances the ability to distinguish between real and generated samples, ensuring that anomalous samples are distorted when reconstructed. Experimental evaluations show the effectiveness of MAAD-GAN as compared to traditional methods on UCSD (University of California, San Diego) Peds2, CUHK Avenue, and ShanghaiTech datasets.
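Reconstruction-based detectors of this kind typically turn the distortion of a reconstructed frame into a per-frame anomaly score. The sketch below assumes mean squared error followed by min-max normalization over the video, which is a common choice; the abstract does not state how MAAD-GAN scores frames, and the function names and toy pixel values are hypothetical.

```python
def mse(frame, reconstruction):
    """Mean squared error between a frame and its reconstruction (flat pixel lists)."""
    return sum((a - b) ** 2 for a, b in zip(frame, reconstruction)) / len(frame)


def anomaly_scores(frames, reconstructions):
    """Min-max normalize per-frame reconstruction errors to the range [0, 1]."""
    errors = [mse(f, r) for f, r in zip(frames, reconstructions)]
    lowest, highest = min(errors), max(errors)
    if highest == lowest:
        return [0.0] * len(errors)  # all frames reconstructed equally well
    return [(e - lowest) / (highest - lowest) for e in errors]


# Toy example: the third frame is reconstructed poorly, so it scores highest.
frames = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3], [0.9, 0.8, 0.7]]
reconstructions = [[0.1, 0.2, 0.3], [0.1, 0.25, 0.3], [0.3, 0.3, 0.3]]
scores = anomaly_scores(frames, reconstructions)
print(scores)
```

A frame is then flagged as anomalous when its score exceeds a threshold chosen on validation data.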
Understanding speech production both visually and kinematically can inform second language learning system designs, as well as the creation of speaking characters in video games and animations. In this work, we introd...
Visible light communication (VLC) operates on the principle of modulating light-emitting diodes (LEDs) for data transmission at frequencies imperceptible to the human eye. In vehicular communication, VLC leverages exi...
Real-time violence detection is essential for protecting the safety and security of people, especially on college campuses, which are dynamic and crowded. Manual surveillance systems are popular but inefficient as t...