Video streaming is a subfield of signal processing that encompasses the pre-processing of video sequences, their contextual segmentation, application-specific feature extraction and selection, and the detection of dis...
In this paper, we introduce DLAPID, a novel decoupled parallel hardware-software co-design architecture for real-time video dehazing. From a software point of view, DLAPID isolates the atmospheric light operation from...
In recent years, deep learning (DL)-based automatic view classification of 2D transthoracic echocardiography (TTE) has demonstrated strong performance, but has not fully addressed key clinical requirements such as view coverage, classification accuracy, inference delay, and the need for thorough exploration of performance in real-world clinical settings. We proposed a clinical requirement-driven DL framework, TTESlowFast, for accurate and efficient video-level TTE view classification. This framework is based on the SlowFast architecture and incorporates both a sampling balance strategy and a data augmentation strategy to address class imbalance and the limited availability of labeled TTE videos, respectively. TTESlowFast achieved an overall accuracy of 0.9881, precision of 0.9870, recall of 0.9867, and F1 score of 0.9867 on the test set. After field deployment, the model's overall accuracy, precision, recall, and F1 score for view classification were 0.9607, 0.9586, 0.9499, and 0.9530, respectively. The inference time for processing a single TTE video was 105.0 +/- 50.1 ms on a desktop GPU (NVIDIA RTX 3060) and 186.0 +/- 5.2 ms on an edge computing device (Jetson Orin Nano), which basically meets the clinical demand for immediate processing following image acquisition. The TTESlowFast framework proposed in this study demonstrates effective performance in TTE view classification with low inference delay, making it well-suited for various medical scenarios and showing significant potential for practical application.
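The sampling balance strategy is not detailed in the abstract; one common way to counter class imbalance when drawing training clips is inverse-frequency weighting, sketched below. The function name and parameters are illustrative, not from the paper:

```python
import random
from collections import Counter

def balanced_sample(labels, k, seed=0):
    """Draw k training indices with per-class inverse-frequency
    weights, so rare view classes are sampled about as often as
    common ones (illustrative sketch, not the paper's exact scheme)."""
    counts = Counter(labels)
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=k)
```

With a 90/10 class split, the weighted draw yields roughly equal numbers of samples per class instead of a 9:1 ratio.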
ISBN:
(digital) 9781665496209
ISBN:
(print) 9781665496209
Learned image compression achieves state-of-the-art accuracy and compression ratios, but its relatively slow runtime performance limits its usage. While previous attempts at optimizing learned image codecs focused more on the neural model and entropy coding, we present an alternative method for improving the runtime performance of various learned image compression models. We introduce multi-threaded pipelining and an optimized memory model to enable asynchronous execution of GPU and CPU workloads, fully taking advantage of computational resources. Our architecture alone already produces excellent performance without any change to the neural model itself. We also demonstrate that combining our architecture with previous tweaks to the neural models can further improve runtime performance. We show that our implementations excel in throughput and latency compared to the baseline and demonstrate the performance of our implementations by creating a real-time video streaming encoder-decoder sample application, with the encoder running on an embedded device.
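The multi-threaded pipelining idea can be illustrated with a producer-consumer queue: one thread runs the first stage while the main thread runs the second, so the two workloads overlap instead of alternating serially. The sketch below uses plain Python threads as stand-ins for the CPU and GPU stages; all names are illustrative, not from the paper:

```python
import queue
import threading

def pipeline(frames, stage1, stage2, depth=4):
    """Two-stage pipeline: stage1 (e.g. CPU-side entropy coding) runs
    in a worker thread while stage2 (e.g. GPU inference) consumes its
    results, bounded by a queue of `depth` in-flight items."""
    q = queue.Queue(maxsize=depth)
    out = []

    def producer():
        for f in frames:
            q.put(stage1(f))
        q.put(None)  # sentinel: no more frames

    t = threading.Thread(target=producer)
    t.start()
    while (item := q.get()) is not None:
        out.append(stage2(item))
    t.join()
    return out
```

The bounded queue is the memory-model piece: it caps how many intermediate buffers exist at once while still letting both stages stay busy.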
ISBN:
(print) 9798400713880
As one of the most important data sources in machine vision systems, the camera faces increasingly strict requirements in video image acquisition, processing and transmission. An in-depth analysis of camera development trends makes clear that cameras mainly pursue three goals: improving video image processing performance, reducing costs and enhancing flexibility. However, most existing video processing platforms suffer from low data processing efficiency and a lack of real-time performance, which makes it difficult to meet the high performance requirements of modern cameras for video image processing. Because of their high real-time performance, powerful computing capability, high integration and flexibility, embedded systems have become a promising direction for realizing advanced video image processing algorithms. With an embedded platform, the camera system can achieve more efficient real-time data processing and higher quality video image output. In the current market, three high-performance embedded processors stand out: DSP, ASIC and FPGA. Among them, the FPGA has become the ideal hardware platform for camera video image processing thanks to its parallel computing capability, rich interface resources and field-programmable characteristics. Based on the Anlogic EG4S20 FPGA platform, this paper implements image processing functions, including Bayer-to-RGB format conversion, the gray world algorithm and the perfect reflection algorithm. This research makes full use of the powerful computing capability of FPGAs to demonstrate image processing and enhancement techniques that address key performance challenges in camera systems.
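Of the algorithms listed, the gray world algorithm is the simplest to sketch: it assumes the scene averages to gray and scales each color channel so its mean matches the global mean intensity. The snippet below is a software reference sketch only; the paper's implementation is in FPGA logic:

```python
import numpy as np

def gray_world(img):
    """Gray-world white balance. img: HxWx3 float array in [0, 1].
    Each channel is scaled so its mean equals the mean over all
    channels, removing a global color cast."""
    means = img.reshape(-1, 3).mean(axis=0)   # per-channel means
    gain = means.mean() / means               # per-channel gains
    return np.clip(img * gain, 0.0, 1.0)
```

For an image with channel means (0.2, 0.4, 0.6), the gains (2.0, 1.0, 0.67) pull every channel mean to 0.4.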
ISBN:
(print) 9781665474047
Image processing is essential for applications such as robot vision, remote sensing, computational photography, augmented reality, etc. In the design of dedicated hardware for such applications, IEEE Std 754 floating point (float) arithmetic units have been widely used. While float-based architectures have achieved favorable results, their hardware is complicated and requires a large silicon footprint. In this paper we propose a Posit-based image and video processor (PositIV), a completely pipelined, configurable image processor using posit arithmetic that guarantees lower power use and a smaller silicon footprint than floats. PositIV is able to effectively overlap computation with memory access and supports multidimensional addressing, virtual border handling, prefetching and buffering. It successfully integrates configurability, flexibility, and ease of development with real-time performance characteristics. The performance of PositIV is validated on several image processing algorithms for different configurations and compared against state-of-the-art implementations. Additionally, we empirically demonstrate the superiority of posits in processing images for several conventional algorithms, achieving at least a 35-40% improvement in image quality over standard floats.
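"Virtual border handling" generally means synthesizing out-of-frame pixels so a convolution kernel can be applied at the image boundary without special cases. Replicate-edge padding is one common instance, sketched below in software (this is an assumption about the general technique, not PositIV's hardware scheme):

```python
import numpy as np

def virtual_border(img, r):
    """Replicate-edge padding: extend the image by r pixels on each
    side by repeating the border values, so a (2r+1)x(2r+1) kernel
    can read 'virtual' pixels outside the original frame."""
    return np.pad(img, r, mode="edge")
```

A hardware pipeline achieves the same effect by clamping the address calculation at the frame edges instead of materializing the padded array.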
ISBN:
(print) 9798350358261; 9798350358278
IoT devices are enabled to capture and upload videos with increasing bitrates. Massive IIoT is eager for effective video processing techniques to satisfy the requirements of real-time video services. With the emergence of 5G-unlicensed (5G-U), ultra-low latency video applications become possible. However, existing encoding standards for video services in Web 2.0, such as H.265, are not naturally designed for IIoT video streaming, leading to bandwidth pressure where 5G-U coexists with various other wireless signals. To tackle this problem and to support low-latency video utilization by IIoT video sources, we propose an Adaptive Compression-Reconstruction framework named ACORN, which is based on compressed sensing and recent advances in deep learning. At end nodes, we compress multiple sequential video frames into a single frame to reduce video volume. We design a QoE-aware parameter selection mechanism to deal with volatile network environments during compression. With learnable gated convolution layers and channel-wise soft-thresholding operators, ACORN also builds a real-time reconstruction module. Experimental results reveal that video analytics can be conducted on compressed frames. The reconstruction algorithm in ACORN achieves 1-4 dB improvements. Moreover, both the encoding time cost and the encoded video volume are reduced by more than 4x under the ACORN framework.
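The channel-wise soft-thresholding operator mentioned above is the standard proximal operator of the L1 penalty used in compressed-sensing reconstruction: it shrinks each coefficient toward zero by a threshold tau. A minimal sketch (ACORN learns tau per channel; here it is a fixed scalar):

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft-thresholding: sign(x) * max(|x| - tau, 0).
    Coefficients with magnitude below tau are zeroed; the rest
    are shrunk by tau, promoting sparse reconstructions."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

Applied per channel of a feature map, this gives the denoising step of iterative-shrinkage reconstruction networks.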
Different from the recent popular super resolution system based on AI technology which needs normally massive training datasets, the micro-scanning super resolution system by integrating the high-precision mechanism a...
ISBN:
(print) 9788993215380
In this paper, we present a sequential training methodology aimed at improving the recognition of elevator buttons using the YOLOv5 object detection model. The methodology is structured into three distinct phases. In the first phase, we generate a synthetic dataset where elevator buttons, cropped from their original context, are placed on random image backgrounds. This phase is designed to help the model learn to identify buttons independently of their surroundings, ensuring a foundational understanding of button features without contextual distractions. In the second phase, we augment the cropped button dataset by applying various transformations such as random flips, rotations, and scaling. These augmentations increase the diversity and robustness of the training data, allowing the model to generalize better to variations in button appearances. The final phase involves training the model on images of full elevator panels. This step is crucial for helping the model understand the contextual placement and spatial relationships of the buttons within the panel, which is essential for accurate detection in real-world scenarios. Additionally, we enhance the real-time video input exposure to improve visibility under varying lighting conditions, addressing common challenges faced in practical applications. For post-processing, we integrate a Channel and Spatial Reliability Tracker (CSRT) to maintain button-tracking consistency in video sequences. This tracker helps ensure that once a button is detected, its position is reliably followed across frames, improving the overall accuracy and reliability of the system. This comprehensive approach, which combines the use of synthetic data, extensive data augmentation techniques, and contextual training on full panel images, aims to better simulate real-world scenarios. As a result, the proposed methodology significantly enhances the robustness and reliability of the YOLOv5 model in recognizing elevator buttons under diverse conditions.
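The second-phase geometric augmentations (random flips and rotations) can be sketched as below; the probabilities are illustrative and the paper also applies scaling, which is omitted here:

```python
import numpy as np

def augment(img, rng):
    """Apply a random horizontal flip (p=0.5) and a random
    quarter-turn rotation to one training image. Purely geometric:
    no pixel values are changed, only their positions."""
    if rng.random() < 0.5:
        img = np.fliplr(img)
    k = rng.integers(0, 4)  # 0-3 quarter turns
    return np.rot90(img, k)
```

Usage: pass a seeded `np.random.default_rng(...)` so augmentation is reproducible across training runs.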
ISBN:
(print) 9798350372557
This work is devoted to the development of a novel deep learning encoder-decoder algorithm for real-time noise and blur elimination in video frames received from a UAV. This work improves on existing algorithms by providing a more flexible blind deblurring solution than existing kernel-based methods. The proposed method can be applied both to improve the drone operator's capabilities and to improve the performance of autonomous image processing tasks, such as object identification and visual navigation systems. Different types of blur as well as possible types of noise are presented. A brief overview of existing methods is provided. The problem of frame alignment due to the object's movement and associated noise is considered. Existing deblurring and image restoration methods, including the state of the art, are reviewed and their limitations are highlighted. To address these limitations, a method based on a fully convolutional encoder-decoder network with residual connections is presented. Dataset generation and training procedures are discussed. The approach is then compared to existing state-of-the-art deep learning methods. The proposed method enables up to 9 times faster blind image restoration with quality comparable to existing state-of-the-art image restoration methods.