ISBN (digital): 9781665496209
ISBN (print): 9781665496209
Video analytics systems designed for computer vision tasks use deep learning models that rely on high-quality input data to maximize performance. However, in a real-world system, these inputs are often compressed using video codecs such as HEVC. Video compression degrades the quality of the inputs, thereby degrading the performance of these models. Region-of-interest (ROI) coding enables bits to be allocated to improve performance; however, the method for selecting regions should be computationally simple, since it must run during or before the video is compressed and transmitted for further processing. In this paper, we propose a task-aware quad-tree (TA-QT) partitioning and quantization method to achieve ROI coding for HEVC and other video coding standards. TA-QT uses a lightweight edge-based model to guide task-aware video encoding, improving end-stage video analytics (ESVA) performance while reducing both bit-rate and encoding time. We demonstrate the effectiveness of our approach in terms of (a) the performance of the ESVA on compressed inputs, (b) transmission bit-rates, and (c) encoding time.
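The block-level idea behind task-aware quad-tree partitioning can be sketched as follows. The block sizes, QP values, and purity-based split rule here are illustrative assumptions, not the paper's exact design: blocks overlapping the task-relevance (ROI) mask are split further and given a lower QP, while background blocks stay coarse with a higher QP.

```python
# Sketch of task-aware quad-tree partitioning for one CTU (illustrative only).
# roi_mask is a 2-D list of 0/1 task-relevance values; returns leaf blocks
# as (x, y, size, qp) tuples.

def taqt_partition(roi_mask, x, y, size, min_size=16, qp_roi=22, qp_bg=37):
    """Recursively split mixed ROI/background blocks; assign per-leaf QPs."""
    block = [row[x:x + size] for row in roi_mask[y:y + size]]
    covered = sum(map(sum, block))
    if covered == 0:                       # pure background: coarse block, high QP
        return [(x, y, size, qp_bg)]
    if covered == size * size or size == min_size:
        return [(x, y, size, qp_roi)]      # pure ROI (or smallest block): low QP
    half = size // 2                       # mixed block: recurse into quadrants
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += taqt_partition(roi_mask, x + dx, y + dy, half,
                                     min_size, qp_roi, qp_bg)
    return leaves


# Example: a 64x64 CTU whose top-left 16x16 region is task-relevant.
mask = [[1 if xx < 16 and yy < 16 else 0 for xx in range(64)] for yy in range(64)]
leaves = taqt_partition(mask, 0, 0, 64)
```

The single ROI corner forces two levels of splitting, so the CTU ends up as one low-QP 16x16 leaf plus six high-QP background leaves.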
ISBN (print): 9798350359329; 9798350359312
The rapid increase in camera installations on offshore drilling platforms has intensified the challenge of high-concurrency video data processing. Traditional single-cloud-server video analysis is becoming inadequate, leading to heightened processing latency and bandwidth overuse. In response, we propose an edge-cloud collaborative video object detection architecture based on a GRU-Enhanced Double Deep Q-Network (GE-DDQN). Our architecture utilizes the YOLOv8 algorithm for object detection and incorporates a GE-DDQN model for efficient task offloading between edge and cloud computing. A token bucket mechanism is employed to regulate data offloading rates from edge devices, optimizing collaborative efficiency for high-performance detection. Comparative experiments on real offshore drilling platform video data underscore the superiority of our method in managing dynamic and complex video streams. The architecture demonstrates remarkable video analysis performance in this domain, achieving a precision of 90.2% and a processing speed of 25.45 FPS, marking a significant advancement in edge-cloud video analytics.
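The token-bucket regulation of offloading rates can be sketched as follows; the refill rate, bucket capacity, and one-token-per-frame cost are illustrative assumptions, not parameters from the paper.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter for edge-to-cloud frame offloading
    (illustrative sketch; rate/capacity values are assumptions)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; otherwise reject the frame."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An edge device would call `allow()` per frame before offloading; rejected frames are processed locally, which is how the bucket keeps the uplink within its budget while still permitting short bursts.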
The COVID pandemic has led to the wide adoption of online video calls in recent years. However, the increasing reliance on video calls provides opportunities for new impersonation attacks by fraudsters using the advan...
ISBN (print): 9798400703300
Silent speech is unaffected by ambient noise, increases accessibility, and enhances privacy and security. Yet current silent speech recognizers operate in a phrase-in/phrase-out manner, and are thus slow, error prone, and impractical for mobile devices. We present MELDER, a Mobile Lip Reader that operates in real time by splitting the input video into smaller temporal segments and processing them individually. An experiment revealed that this substantially improves computation time, making it suitable for mobile devices. We further optimize the model for everyday use by exploiting knowledge from a high-resource vocabulary using a transfer learning model. We then compare MELDER in both stationary and mobile settings with two state-of-the-art silent speech recognizers, where MELDER demonstrates superior overall performance. Finally, we compare two visual feedback methods of MELDER with the visual feedback method of Google Assistant. The outcomes shed light on how these proposed feedback methods influence users' perceptions of the model's performance.
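The core segmentation step can be sketched as follows; the segment length and stride are assumed values, not MELDER's actual parameters. Overlapping windows let each segment be decoded as soon as it is complete instead of waiting for the whole phrase.

```python
# Sketch of splitting a frame stream into overlapping temporal segments
# (seg_len and stride are illustrative assumptions).

def temporal_segments(frames, seg_len=20, stride=10):
    """Return overlapping segments of `seg_len` frames taken every `stride` frames."""
    segments = []
    for start in range(0, max(len(frames) - seg_len + 1, 1), stride):
        segments.append(frames[start:start + seg_len])
    return segments


# Example: a 50-frame clip yields four 20-frame segments starting at 0, 10, 20, 30.
segs = temporal_segments(list(range(50)))
```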
Deep learning-based action classification technology has been applied to various fields, such as social safety and medical services. Classifying an action on a practical level requires tracking multiple human bodies in an image in real time and simultaneously classifying their actions. There are various related studies on the real-time classification of actions in an image. However, existing deep learning-based action classification models have prolonged response speeds, so there is a limit to real-time analysis. In addition, they have low accuracy for the action of each object if multiple objects appear in the image. Moreover, they need to be improved since they incur a memory overhead in processing image frames. Deep learning-based action classification using one-shot object detection is proposed to overcome the limitations of multiframe-based analysis methods. The proposed method uses a one-shot object detection model and a multi-object tracking algorithm to detect and track multiple objects in the image. Then, a deep learning-based pattern classification model is used to classify the body action of each object in the image by reducing the data for each object to an action vector. Compared to the existing studies, the constructed model shows a higher accuracy of 74.95%, and in terms of speed, it offers better performance than current studies at 0.234 s per frame. The proposed model makes it possible to classify some actions only through action-vector learning, without additional image learning, because of the vector-learning feature of the posterior neural network. Therefore, it is expected to contribute significantly to commercializing realistic streaming-data analysis technologies, such as CCTV.
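The "reduce each object to an action vector" step can be sketched as below; the displacement-based feature and nearest-centroid classifier are illustrative stand-ins, not the paper's actual pattern classification model.

```python
# Sketch: compress a tracked object's per-frame keypoint values into a
# fixed-length action vector, then classify by nearest centroid
# (feature and classifier are illustrative assumptions).

def action_vector(keypoint_frames):
    """Mean frame-to-frame absolute displacement, one value per keypoint dimension."""
    n = len(keypoint_frames) - 1
    dims = len(keypoint_frames[0])
    return [sum(abs(keypoint_frames[t + 1][k] - keypoint_frames[t][k])
                for t in range(n)) / n
            for k in range(dims)]

def classify(vec, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(centroids,
               key=lambda lbl: sum((a - b) ** 2 for a, b in zip(vec, centroids[lbl])))


# Example: a keypoint moving steadily along dimension 0 maps to the 'walk' centroid.
vec = action_vector([[0, 0], [1, 0], [2, 0]])
label = classify(vec, {'walk': [1.0, 0.0], 'stand': [0.0, 0.0]})
```

The point of the vector representation is that the classifier operates on a few numbers per object instead of raw frames, which is what removes the per-frame image-model cost.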
ISBN (print): 9798350394283; 9798350394276
This paper presents a real-time semantic segmentation framework for camera-based environment perception of objects and infrastructure elements in autonomous scale cars. It is specifically targeted towards student competitions such as the Carolo Cup or the Bosch Future Mobility Challenge. To reduce pixel-wise manual annotation effort, our framework involves a mixture of both synthetic and real image data, carefully tuned towards the unique requirements of the given scenario. Real images are acquired from a 1:10-scale vehicle equipped with a single monocular camera and are manually annotated. Synthetic image data with automatic pixel-wise annotation is obtained via a custom Unity-based simulation pipeline. We evaluate various mixed real-synthetic data strategies to train different state-of-the-art deep neural networks, with a focus on both segmentation performance and real-time capability, using an NVIDIA Jetson AGX Xavier platform as the in-vehicle test bed. Our experimental results show a significant improvement in semantic segmentation performance of the mixed real-synthetic data approach at real-time speeds of approximately 60 FPS on the target platform.
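One common way to realize such a mixed real-synthetic strategy is fixed-ratio batch sampling, sketched below; the batch size and real-data fraction are illustrative assumptions, not values reported by the paper.

```python
import random

# Sketch: build each training batch from a fixed fraction of manually
# annotated real samples plus auto-annotated synthetic samples
# (batch_size and real_fraction are illustrative assumptions).

def mixed_batch(real_set, synth_set, batch_size=8, real_fraction=0.25, rng=random):
    """Draw a shuffled batch with `real_fraction` of samples from the real set."""
    n_real = round(batch_size * real_fraction)
    batch = rng.sample(real_set, n_real) + rng.sample(synth_set, batch_size - n_real)
    rng.shuffle(batch)
    return batch


# Example: 10 real and 100 synthetic samples; every batch of 8 contains 2 real ones.
real = [('real', i) for i in range(10)]
synth = [('synth', i) for i in range(100)]
batch = mixed_batch(real, synth)
```

Keeping the real fraction fixed per batch, rather than sampling from the pooled set, prevents the much larger synthetic set from crowding out the scarce real annotations.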
The high level of compression achieved by high efficiency video coding (HEVC) helps reduce network traffic loads and mitigate data rate requirements. However, HEVC is vulnerable to error-prone channels, where transmission errors can result in severe degradation of video quality. In this paper, a saliency-aware encoding scheme is proposed to improve the error robustness of HEVC streaming by reducing temporal error propagation in case of packet loss. The proposed scheme first introduces a saliency detection model in the compressed domain, based on two HEVC features derived from the depth splitting of the coding unit and the residual. Incorporating the saliency map, an improved reference frame selection strategy is then introduced to reduce the inter-prediction mismatch that occurs at the decoder after packet loss. Specifically, the reference frames are dynamically selected based on saliency-weighted Lagrangian optimisation, which not only reduces the number of prediction units (PUs) that depend on a single reference in saliency regions but also chooses the optimal coding mode for non-saliency regions. Finally, the most salient PUs are required to select a reference block in which most of the pixels are coded in intra mode, providing a more robust reference for saliency regions. Simulation results show that the proposed reference picture selection scheme outperforms other reference methods, with higher error robustness and a smaller loss in coding efficiency. Compared to the HEVC reference software, the proposed scheme improves the quality of recovered video after packet loss, achieving average PSNR gains of up to 1.92 dB.
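Saliency-weighted Lagrangian mode selection can be sketched as below, following the usual rate-distortion formulation J = D + λ·R; the specific weighting function (shrinking λ as saliency grows) is an assumption for illustration, not the paper's exact formula.

```python
# Sketch of saliency-weighted Lagrangian mode decision (weighting is an
# illustrative assumption). Candidates are (mode_name, distortion, rate).

def best_mode(candidates, saliency, lam):
    """Pick the candidate minimising J = D + (lambda / (1 + saliency)) * R.

    High saliency shrinks the effective lambda, so distortion dominates and
    salient regions get more robust (typically higher-rate, e.g. intra) modes.
    """
    eff_lam = lam / (1.0 + saliency)
    return min(candidates, key=lambda m: m[1] + eff_lam * m[2])


# Example: intra is costly in rate but low in distortion; it wins only
# where saliency pushes the effective lambda down.
modes = [('intra', 2.0, 10.0), ('inter', 4.0, 2.0)]
```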
Existing spectral imaging technologies based on compressed coding require tens of minutes or even hours to obtain high-quality spectral data, which rules out real dynamic scenarios and confines such methods to theoretical discussion. Therefore, we propose a non-iterative algorithm model based on an image reflection intensity-estimation aid (IRI-EA). The algorithm exploits the approximately proportional relationship between the reflection intensity of the RGB image and the corresponding spectral image, and reconstructs high-quality spectral data within about 20 s. By solving the difference map of the corresponding spectral scene, combining it with the spectral data of the IRI method, and introducing the total guidance (TG) filter, the reconstruction error can be significantly reduced and the spectral reconstruction quality improved. Numerous experimental results indicate the advantages of this method in reconstruction quality and efficiency over other advanced methods. Specifically, compared with existing advanced methods, the average efficiency of our method improves by at least 85%. Our reconstruction model opens up the possibility of processing real-time video and accelerating other methods. (c) 2024 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
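The proportionality assumption at the heart of the IRI idea can be illustrated in miniature; this toy function is an assumption-laden sketch, not the authors' pipeline, and omits the difference map and TG-filter refinement entirely.

```python
# Toy sketch of reflection-intensity-based spectrum estimation: if a pixel's
# spectrum is approximately proportional to its RGB reflection intensity,
# an unknown pixel's spectrum can be estimated by scaling a known reference
# spectrum by the intensity ratio (illustrative assumption only).

def estimate_spectrum(ref_spectrum, ref_intensity, pixel_intensity):
    """Scale the reference spectrum by the pixel/reference intensity ratio."""
    scale = pixel_intensity / ref_intensity
    return [band * scale for band in ref_spectrum]


# Example: a pixel at half the reference intensity gets half the spectrum.
est = estimate_spectrum([2.0, 4.0, 6.0], ref_intensity=1.0, pixel_intensity=0.5)
```

Because this estimate is a direct scaling rather than an iterative inversion, it is cheap; the paper's difference-map and TG-filter steps then correct the residual error.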
Nowadays, China’s digital media technology is relatively lagging behind, and its application in the field of teaching is the goal pursued by many scholars. A real-time teacher-student interaction environment has been...
ISBN (print): 9781728198354
Most current video codecs support only translational motion models. However, real motion is often complex and cannot be precisely estimated using translational models alone. To handle complex motions such as panning, zooming, scaling, shearing, and rotation, the AOMedia AV1 encoder provides two tools, Global and Local Warped Motion Compensation (LWMC). This paper presents two dedicated hardware designs for the AV1 LWMC interpolation filters. The presented hardware can process up to UHD 8K video at 60 fps. The architecture was synthesized for 40 nm TSMC standard cells, requiring 454.37K gates with a power dissipation of 189.35 mW. To the best of the authors' knowledge, this is the first work in the literature targeting a dedicated hardware design for the AV1 LWMC tool.
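The affine model behind warped motion compensation can be sketched as follows; the parameter names are illustrative, and the bilinear tap shown here is a simplification of the longer subpel interpolation filters that the hardware in the paper actually implements.

```python
# Sketch of affine warped motion compensation (illustrative parameter names):
# each output sample is fetched from a possibly fractional reference position
# given by an affine model, covering rotation/zoom/shear that a single
# translational motion vector cannot express.

def warp_position(x, y, a, b, c, d, tx, ty):
    """Map output sample (x, y) to a reference position via an affine model."""
    return (a * x + b * y + tx, c * x + d * y + ty)

def bilinear_sample(ref, fx, fy):
    """Bilinear interpolation at fractional (fx, fy); real AV1 encoders use
    longer subpel filter taps here instead."""
    x0, y0 = int(fx), int(fy)
    dx, dy = fx - x0, fy - y0
    p00, p01 = ref[y0][x0], ref[y0][x0 + 1]
    p10, p11 = ref[y0 + 1][x0], ref[y0 + 1][x0 + 1]
    return (p00 * (1 - dx) + p01 * dx) * (1 - dy) + \
           (p10 * (1 - dx) + p11 * dx) * dy


# Example: the identity warp (a=d=1, b=c=tx=ty=0) leaves positions unchanged;
# a half-pel position blends the four neighbouring reference samples.
ref = [[0.0, 1.0], [2.0, 3.0]]
pos = warp_position(3, 2, a=1, b=0, c=0, d=1, tx=0, ty=0)
val = bilinear_sample(ref, 0.5, 0.5)
```

It is exactly this per-sample fractional fetch that makes the interpolation filters the natural target for a dedicated hardware datapath.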