Search Results - Inner Mongolia University Library

Optimizing Real-Time Object Detection in a Multi-Neural Processing Unit System

SENSORS, 2025, Vol. 25, No. 5, p. 1376

Authors: Oh, Sehyeon; Kwon, Yongin; Lee, Jemin (Univ Sci & Technol, Dept Artificial Intelligence, Daejeon 34113, South Korea; Elect & Telecommun Res Inst, Daejeon 34129, South Korea)

Real-time object detection demands high throughput and low latency, necessitating the use of hardware accelerators. A neural processing unit (NPU) is specialized hardware designed to accelerate deep learning computation, providing better energy efficiency and parallel processing performance than conventional CPUs or GPUs. In particular, it plays an important role in reducing latency and improving processing speed in applications that require real-time processing. In this paper, we construct a real-time object detection system based on YOLOv3, utilizing Neubla's Antara NPU, and propose two approaches for performance optimization. First, we ensure the continuity of NPU inference by allowing the CPU to process data in advance through double buffering. Second, in a multi-NPU environment, we distribute tasks among NPUs through queue-based processing and analyze the performance limits using Amdahl's law. Experimental results demonstrate that, compared to a CPU-only environment, the NPU improved throughput by 2.13 times with single buffering, by 3.35 times with double buffering, and by 4.81 times in a multi-NPU environment. Latency decreased by 1.6 times in single and double buffering, and by 1.18 times in the multi-NPU environment. The accuracy remained consistent, with 31.4 mAP on the CPU and 31.8 mAP on the NPU.

Keywords: double buffering; queue-based processing; YOLOv3; neural processing unit; real-time object detection
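
The two optimizations named in the abstract (double buffering and queue-based distribution across several NPUs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the NPU is simulated with `time.sleep`, a two-slot bounded queue stands in for the two buffers, and the names `preprocess`, `npu_infer`, `run_pipeline`, and `amdahl_speedup` are hypothetical. The last function is the standard form of Amdahl's law, which the abstract says is used to analyze the performance limit.

```python
import queue
import threading
import time


def amdahl_speedup(parallel_fraction, n_units):
    """Amdahl's law: speedup bound when only part of the work scales with n_units."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_units)


def preprocess(frame_id):
    time.sleep(0.002)  # stand-in for CPU-side work (decode, resize, normalize)
    return frame_id


def npu_infer(frame):
    time.sleep(0.004)  # stand-in for inference on one (simulated) NPU
    return frame * 2   # dummy "detection result"


def run_pipeline(n_frames, n_npus, buffer_slots=2):
    """Feed frames to n_npus simulated NPU workers through one bounded queue."""
    # With buffer_slots=2 the CPU can prepare the next frame while the
    # current one is being inferred, which is the idea of double buffering.
    ready = queue.Queue(maxsize=buffer_slots)
    results = {}
    lock = threading.Lock()

    def producer():
        for i in range(n_frames):
            ready.put(preprocess(i))  # blocks while every slot is occupied
        for _ in range(n_npus):
            ready.put(None)           # one stop marker per worker

    def worker():
        while True:
            frame = ready.get()       # whichever worker is idle takes the next frame
            if frame is None:
                break
            out = npu_infer(frame)
            with lock:
                results[frame] = out

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=worker) for _ in range(n_npus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Workers may finish out of order, so results are re-sorted by frame id.
    return [results[i] for i in range(n_frames)]
```

Calling `run_pipeline(8, n_npus=2)` processes eight dummy frames on two simulated workers, and `amdahl_speedup(p, n)` gives the upper bound on speedup when a fraction `p` of the per-frame work can be spread over `n` units. The serial CPU stage is what keeps that bound below `n`.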

A queue-based block matching algorithm for video compression and motion segmentation

Conference on Visual Communications and Image Processing 2004

Authors: Chiew, TK; Chung-How, JTH; Bull, DR; Canagarajah, CN (Univ Bristol, Bristol BS8 1UB, Avon, England)

ISBN: (print) 0819452114

This paper addresses two issues related to motion estimation using block matching algorithms (BMA): (1) determining the reliability of the motion vector of each block, and (2) imposing a smoothness constraint on the motion vector field. We introduce a new robust reliability measure, derived from the cost function distribution, to represent the confidence level of the motion vector, and propose a novel algorithm that incorporates the smoothness constraint into the motion vector field evaluation by implementing a priority queue structure based on the reliability measure. In this framework, a smooth motion vector field is evaluated in a single pass, without the iterations typical of many existing optical flow estimation algorithms. Hence it is fast and can easily be incorporated into real-time applications for video compression as well as image segmentation.

Keywords: motion estimation; block matching algorithm; reliability measure; queue-based processing
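
The abstract does not give the reliability formula or the exact queue discipline, so the following Python sketch only illustrates the general idea under stated assumptions. Reliability is taken, hypothetically, as the gap between the two lowest matching costs of a block. Blocks are visited once, most reliable first, and each block picks the vector that minimizes its matching cost plus a penalty for disagreeing with neighbours that are already fixed. The names `reliability`, `smooth_motion_field`, and `smooth_weight` are illustrative and not taken from the paper.

```python
import heapq


def reliability(cost_by_mv):
    """Hypothetical confidence: gap between the two lowest matching costs.

    Assumes each block has at least two candidate motion vectors.
    """
    best, second = sorted(cost_by_mv.values())[:2]
    return second - best


def smooth_motion_field(costs, smooth_weight=1.0):
    """Single-pass motion field estimate driven by a priority queue.

    costs maps (row, col) -> {(dy, dx): matching cost}.
    Returns a dict mapping (row, col) -> chosen (dy, dx).
    """
    # heapq is a min-heap, so reliabilities are negated to pop the
    # most trustworthy block first.
    heap = [(-reliability(c), block) for block, c in costs.items()]
    heapq.heapify(heap)

    field = {}
    while heap:
        _, (r, c) = heapq.heappop(heap)
        # Vectors of the 4-connected neighbours that are already decided.
        neighbours = [
            field[n]
            for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if n in field
        ]

        def penalised(mv):
            # Matching cost plus L1 distance to each fixed neighbour's vector.
            smooth = sum(abs(mv[0] - n[0]) + abs(mv[1] - n[1]) for n in neighbours)
            return costs[(r, c)][mv] + smooth_weight * smooth

        field[(r, c)] = min(costs[(r, c)], key=penalised)
    return field
```

In this sketch the visiting order is fixed up front from the per-block reliabilities. Whether the original algorithm updates priorities as vectors are assigned is not stated in the abstract.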
