Search Results - Inner Mongolia University Library

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence

Authors: Chu, Ernie; Huang, Tzuhsuan; Lin, Shuo-Yen; Chen, Jun-Cheng. Affiliation: Acad Sinica, Res Ctr Informat Technol Innovat, 128 Acad Rd, Sect 2, Taipei, Taiwan

ISBN: (print) 1577358872

This study introduces an efficient and effective method, MeDM, that utilizes pre-trained image Diffusion Models for video-to-video translation with consistent temporal flow. The proposed framework can render videos from scene position information, such as a normal G-buffer, or perform text-guided editing on videos captured in real-world scenarios. We employ explicit optical flows to construct a practical coding that enforces physical constraints on generated frames and mediates independent frame-wise scores. By leveraging this coding, maintaining temporal consistency in the generated videos can be framed as an optimization problem with a closed-form solution. To ensure compatibility with Stable Diffusion, we also suggest a workaround for modifying observation-space scores in latent Diffusion Models. Notably, MeDM does not require fine-tuning or test-time optimization of the Diffusion Models. Through extensive qualitative, quantitative, and subjective experiments on various benchmarks, the study demonstrates the effectiveness and superiority of the proposed approach. Our project page can be found at https://***/.

Keywords: Diffusion

An Adaptive Image Windowing Method for Real-Time Object Detection on Board

7th International Conference on Electronics Technology, ICET 2024

Authors: Ling, Long; Lu, Zhijun; Shi, Manli; Wang, Jie; Li, Yuqing; Hu, Minghe. Affiliation: Beijing Institute of Space Mechanics & Electricity, China Academy of Space Technology, Beijing, China

ISBN: (print) 9798350363951

To meet the real-time detection and processing requirements for on-board targets in remote sensing image processing, this paper approaches the problem from the perspective of software optimization and proposes an on-board adaptive image windowing method based on a differential evolution algorithm, addressing the shortage of on-board hardware computing resources. Building on the adaptive differential evolution algorithm and the engineering application scenario, the method designs an individual objective function to evaluate the quality of the evolution results. By coordinating all target positions to be tracked in the image plane, the algorithm optimizes the number of windows while taking its application boundary into account, which effectively reduces the amount of data to be processed and greatly improves the timeliness of on-board real-time detection without losing calculation accuracy. © 2024 IEEE.
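For readers unfamiliar with differential evolution, the following is a minimal sketch of the classic DE/rand/1/bin loop in Python. It is not the authors' adaptive variant or their objective function: the toy objective `window_cost`, the `TARGETS` coordinates, the search bounds, and all parameter values are hypothetical and chosen only to illustrate the general technique.

```python
import random

def differential_evolution(cost, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Classic DE/rand/1/bin minimizer (generic textbook form)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [cost(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clamp to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = cost(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Hypothetical objective: place one window centre (x, y) so that the
# largest distance to a set of made-up target positions is minimal.
TARGETS = [(10.0, 12.0), (14.0, 9.0), (11.0, 15.0)]

def window_cost(p):
    return max(((p[0] - tx) ** 2 + (p[1] - ty) ** 2) ** 0.5 for tx, ty in TARGETS)

best_point, best_cost = differential_evolution(
    window_cost, [(0.0, 32.0), (0.0, 32.0)]
)
```

An adaptive variant would typically adjust `f` and `cr` during the run rather than keeping them fixed; how the paper does this is not stated in the abstract.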

Keywords: Adaptive algorithms

A New Benchmark and Baseline for Real-Time High-Resolution Image Inpainting on Edge Devices

2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

Authors: Sanchez, Marcelo; Triginer, Gil; Ballester, Coloma; Sarasua, Ignacio; Raad, Lara. Affiliations: Crisalix, United States; Upf, United States; Nvidia, United States; Iie, FIng, UdelaR, Uruguay

ISBN: (print) 9798331510831

Existing image inpainting methods have shown impressive completion results for low-resolution images. However, most of these algorithms fail at high resolutions and require powerful hardware, limiting their deployment on edge devices. Motivated by this, we propose the first baseline for real-time High-resolution image INpainting on Edge Devices (RETHINED) that is able to inpaint at ultra-high-resolution and can run in real-time ( © 2025 IEEE.

Keywords: Convolutional neural networks

Real-Time Video Processing in Fuzzy Posture-Based Ergonomic Analysis in a Disassembly Cell

International Conference on Intelligent and Fuzzy Systems (INFUS)

Authors: Amirnia, Ashkan; Ghorbani, Elham; Keivanpour, Samira. Affiliation: Polytech Montreal, Montreal, PQ H3T 1J4, Canada

ISBN: (print) 9783031671913; 9783031671920

Traditional ergonomic evaluations often overlook the dynamic and uncertain nature of human movements, leading to potential musculoskeletal disorders (MSDs) and impacting worker health, efficiency, and company costs. Disassembly cells, crucial for sustainability and circular economy efforts, pose unique challenges and opportunities for ergonomic optimization. This study introduces an approach for ergonomic risk assessment in the manufacturing industry, particularly within disassembly cells, by integrating real-time video processing and fuzzy logic. Our research fills a significant gap in ergonomic assessment by utilizing a multi-camera computer vision technique to capture and analyze worker motions in real time, allowing dynamic ergonomic risk assessment in a disassembly cell. The fuzzy logic inference enhances the system's ability to handle the variability and subjectivity of human posture, offering a more nuanced and accurate risk assessment than binary logic systems. Experimental validation in a laboratory setting confirms the feasibility of our approach, demonstrating its potential to improve worker safety and productivity by providing a more responsive and adaptable tool for ergonomic assessment in industrial environments. This work marks a significant advancement in the field, suggesting a path forward for the development of ergonomic interventions that are both more effective and applicable in diverse manufacturing settings.

Keywords: ergonomic risk; disassembly cell; fuzzy logic; computer vision

RVSRT: Real-time Video Super Resolution Transformer

14th International Conference on Graphics and Image Processing (ICGIP)

Authors: Ou, Linlin; Chen, Yuanping. Affiliations: Chinese Acad Sci, Comp Network Informat Ctr, Beijing, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China

ISBN: (print) 9781510666313; 9781510666320

Video super-resolution is the task of converting low-resolution video to high-resolution video. Existing methods with better visual results are mainly based on convolutional neural networks (CNNs), but their architectures are heavy, resulting in slow inference. To address this problem, this paper proposes a real-time video super-resolution Transformer (RVSRT) that can quickly complete the super-resolution task while preserving the visual fluency of video frame transitions. Unlike traditional CNN-based methods, this paper does not process video frames separately with different network modules in the temporal domain, but batches adjacent frames through a single end-to-end Transformer network with a UNet-style structure. Moreover, this paper sets up two-stage interpolation sampling before and after the end-to-end network to maximize the performance of the traditional CV algorithm. The experimental results show that, compared with SOTA TMNet [1], RVSRT has only 20% of the network size (2.3M vs 12.3M parameters) while ensuring comparable performance, and the speed is increased by 80% (26.2 fps vs 14.3 fps at a frame size of 720*576).
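The rounded percentages in the abstract can be checked against the raw figures it quotes. The short Python calculation below uses only those four numbers; the variable names are illustrative. It gives a size ratio of about 18.7% and a speed gain of about 83.2%, consistent with the rounded "20%" and "80%".

```python
# Figures quoted in the abstract.
rvsrt_params_m, tmnet_params_m = 2.3, 12.3   # parameters, in millions
rvsrt_fps, tmnet_fps = 26.2, 14.3            # frames per second at 720*576

size_ratio = rvsrt_params_m / tmnet_params_m  # RVSRT size relative to TMNet
speed_gain = rvsrt_fps / tmnet_fps - 1.0      # relative increase in fps

print(f"RVSRT size relative to TMNet: {size_ratio:.1%}")
print(f"RVSRT speed gain over TMNet:  {speed_gain:.1%}")
```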

Keywords: video super resolution; vision transformer; deep learning

Joint Video Transcoding and Representation Selection for Edge-Assisted Multi-party Video Conferencing

23rd International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Kong, Fanhao; Cao, Tuo; Qian, Zhuzhong; Wang, Xiaoliang; Zhao, Ming; Wang, Liming; Lin, Zhenjie. Affiliations: Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China; CSG China Southern Power Grid Digital Platform Te, Shenzhen, Peoples R China

ISBN: (print) 9789819708338; 9789819708345

Current cloud-based multi-party video conferencing suffers from heavy workloads on media servers caused by video transcoding. Emerging edge computing can assist in offloading transcoding tasks to edge nodes. However, the resource-limited nature of edge nodes poses new challenges. First, edge nodes can transcode a video in real time into only a subset of representations, raising the video transcoding problem: into which set of representations should each participant's video stream be transcoded? Second, since participants' downlink resources are limited, one needs to solve the representation selection problem: which representation should each participant select for receiving another participant's video? Third, these two problems are coupled and should be optimized simultaneously. Hence, this paper studies the joint video transcoding and representation selection problem for edge-assisted multi-party video conferencing, with the aim of maximizing the overall QoE under the resource and real-time video transcoding constraints. The problem is formulated as a non-linear integer program and is NP-hard. To solve it, we leverage the submodular optimization technique and propose a (1 - 1/e)-approximate algorithm with polynomial computational complexity. Finally, extensive trace-driven simulations are conducted to evaluate the proposed algorithm. The results show that it outperforms the alternatives by 1.5-2.5x on average in terms of overall QoE.
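The abstract does not spell out the algorithm, so the following is only a generic sketch of where a (1 - 1/e) guarantee commonly comes from: greedy selection for a monotone submodular objective under a cardinality constraint. The coverage objective, the `representations` data, and the budget `k` are hypothetical stand-ins, not the paper's QoE model or constraints.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k sets greedily by marginal coverage gain.

    For a monotone submodular objective under a cardinality constraint,
    this greedy rule is known to achieve a (1 - 1/e) approximation.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(k):
        # Marginal gain of each remaining candidate given current coverage.
        name, gain = max(
            ((n, len(s - covered)) for n, s in remaining.items()),
            key=lambda item: item[1],
            default=(None, 0),
        )
        if name is None or gain == 0:
            break  # nothing left that adds value
        chosen.append(name)
        covered |= remaining.pop(name)
    return chosen, covered

# Hypothetical toy instance: each representation "serves" a set of viewers.
representations = {
    "1080p": {"a", "b"},
    "720p": {"b", "c", "d"},
    "480p": {"d", "e"},
    "360p": {"e"},
}
chosen, covered = greedy_max_coverage(representations, k=2)
```

Coverage is used here because its diminishing-returns property is easy to see: a representation adds less once some of its viewers are already served.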

Keywords: Video Transcoding; Representation Selection; Multi-Party Video Conferencing; Edge Computing

Image Link Through Adaptive Encoding Data Base and Optimized GPU Algorithm for Real-Time Image Processing of Artificial Intelligence

Journal of Web Engineering, 2022, Vol. 21, No. 2, pp. 459-496

Authors: An, Byoungman; Kim, Youngseop. Affiliation: Dankook Univ, Elect & Elect Engn, 152 Jukjeon Ro, Yongin 16890, Gyeonggi Do, South Korea

This paper presents the latest Ethernet standardization of in-vehicle networks and the future trends of automotive Ethernet technology. The proposed system provides a design and optimization algorithm for in-vehicle networking technologies related to Ethernet Audio Video Bridge (AVB) technology. We present a design of an in-vehicle network system as well as the optimization of AVB for automotive use. The proposed Reduced Latency of Machine to Machine (RLMM) plays a significant role in reducing the latency between devices. Applying RLMM to realistic test cases indicated a latency reduction of about 30.41%. It is expected that the optimized settings for the actual automotive network environment can greatly shorten the development and design process. The latency results for each function are trustworthy since average values were obtained via repeated tests over several months. This would considerably benefit the industry, because analyzing the delay between functions in a short period of time is highly significant. In addition, through the proposed real-time camera and video streaming via optimized settings of the AVB system, it is expected that AI (Artificial Intelligence) algorithms in autonomous driving will be of great help in understanding and analyzing images in real time.

Keywords: image link in-vehicle GPU algorithm optimization audio video bridge AVB low latency automotive multimedia artificial intelligence data base

A Deep Learning-Based Automatic Data Acquisition System for Medical Monitors

14th International Conference on Information Science and Technology, ICIST 2024

Authors: Zou, Yizhi; Cao, Han; Cheng, Xu; Yang, Lu. Affiliations: University of Electronic Science and Technology of China, Department of Automation Engineering, Chengdu, China; West China Hospital of Sichuan University, Department of Anesthesiology, Chengdu, China

ISBN: (print) 9798350353334

In the cardiac operating room, several operators are essential to assist the surgeon, including the physician managing and monitoring the artificial heart-lung machine. The custodian must interpret the patient's vital signs from equipment data and make decisions, such as blood transfusion. However, the equipment lacks automated data acquisition and recording capabilities, posing significant challenges for documenting surgical information. This paper introduces a system for screen segmentation and text recognition based on visual methods. The system has the doctor operating the equipment wear a head-mounted camera that captures real-time video close to the doctor's perspective, and pulls the camera's video stream through RTSP (Real-Time Streaming Protocol) on the PC side. We process the video stream captured by the camera, using the YOLO (You Only Look Once) algorithm and OCR (Optical Character Recognition) technology as our primary tools for screen segmentation and text recognition. These technologies enable us to extract the information displayed on the medical equipment screens. By first employing YOLO to detect and segment the screen of interest, the process is approximately 127% faster than direct OCR processing of the entire video frame. Additionally, the OCR recognition accuracy for clear pictures can exceed 95% with our method. © 2024 IEEE.

Keywords: Video streaming

FPGA Design of an Image Dehazing System for Intelligent Driving Recorders (智能行车记录仪图像去雾系统的FPGA设计)

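Journal of Shanghai Jiao Tong University (上海交通大学学报), 2024, Vol. 58, No. 4, pp. 565-578

Authors: Huang He (黄鹤); Hu Kaiyi (胡凯益); Yang Lan (杨澜); Wang Hao (王浩); Gao Tao (高涛); Wang Huifeng (王会峰). Affiliations: Xi'an Key Laboratory of Intelligent Expressway Information Fusion and Control, Chang'an University, Xi'an 710064; School of Electronics and Control Engineering, Chang'an University, Xi'an 710064; School of Information Engineering, Chang'an University, Xi'an 710064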

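In hazy weather, low road visibility degrades captured video and blurs image information, and traditional systems also suffer from poor real-time performance. To address these problems, an image dehazing system is designed on the ZYNQ platform and applied to an intelligent driving recorder. First, because the traditional dark channel dehazing algorithm produces distortion in sky regions, a sky-region segmentation strategy is proposed to correct the image restoration parameters. Second, because computing the global atmospheric light requires sorting the pixels of the entire image, which consumes substantial resources, a frame-iteration method is proposed that exploits the parallel computing capability of the field-programmable gate array (FPGA) to optimize the estimation of the atmospheric light; the hardware design of the guided filter is also optimized. Finally, of the two High-Definition Multimedia Interface (HDMI) channels, one is used for video input and the other for processed video output, forming an experimental platform for real-time traffic video processing. Experimental results show that the system dehazes traffic video well in hazy weather and, in particular, resolves the distortion in sky regions. For traffic video at a resolution of 1280 × 720 pixels, the system reaches a processing speed of 30 frames/s, meeting the real-time requirement.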
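As background for the "traditional dark channel dehazing algorithm" mentioned above, the equations below give the standard dark channel prior formulation commonly attributed to He et al. They are not taken from this paper, and its sky-segmentation correction and frame-iteration estimate are not reproduced. The symbols are assumptions for illustration: I is the hazy image, J the scene radiance, t the transmission, A the global atmospheric light, Ω(x) a local patch, ω a haze-retention factor, and t_0 a lower bound on transmission.

```latex
% Atmospheric scattering model.
I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)

% Dark channel over a local patch \Omega(x) and colour channels c.
I^{\mathrm{dark}}(x) = \min_{y \in \Omega(x)} \min_{c \in \{r,g,b\}} I^{c}(y)

% Transmission estimate with haze-retention factor \omega \in (0, 1].
\tilde{t}(x) = 1 - \omega \min_{y \in \Omega(x)} \min_{c} \frac{I^{c}(y)}{A^{c}}

% Scene recovery, with a lower bound t_0 on the transmission.
J(x) = \frac{I(x) - A}{\max\bigl(\tilde{t}(x),\, t_0\bigr)} + A
```

In this formulation, A is conventionally taken from the brightest pixels of the dark channel, which requires ranking pixels across the whole frame; this is the sorting cost the abstract refers to. Sky pixels are bright in every channel, so the prior tends to underestimate their transmission, which is consistent with the sky-region distortion the abstract describes.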

Keywords: traffic video; image dehazing; ZYNQ platform; real-time processing

A Decentralized Fog Architecture for Video Preprocessing in Cloud-based Video Surveillance as a Service

1st IEEE International Conference on Cognitive Robotics and Intelligent Systems, ICC - ROBINS 2024

Authors: Divya, G.; Swetha, K.; Siva Shruthika, S.; Guru Santhosh, P.; Santhi, S.; Kalaiselvi, S. Affiliations: National Engineering College, Department of Information Technology, Kovilpatti, India; National Engineering College, Department of Computer Science and Engineering, Kovilpatti, India

ISBN: (digital) 9798350372748; (print) 9798350372748

With the increasing demand for Cloud-based Video Surveillance as a Service (VSaaS), the efficient processing of vast amounts of video data poses significant challenges. The proposed framework leverages fog computing at the network edge, enabling real-time video analytics, object detection, and event recognition to be performed closer to the data source. By offloading intensive computational tasks from the central cloud to edge fog nodes, the framework reduces bandwidth consumption and minimizes processing latency. By integrating state-of-the-art machine learning algorithms, automated video preprocessing with higher accuracy can be achieved. Moreover, the decentralized fog architecture enhances system responsiveness and ensures data privacy by processing sensitive information locally. The primary objective of this preprocessing is to reduce the volume of data transmitted to the cloud while preserving critical information for subsequent analysis. By leveraging fog computing, background subtraction, and intelligent video compression, a substantial reduction in data transmission is achieved, leading to more cost-effective and responsive video surveillance systems. © 2024 IEEE.

Keywords: Network architecture
