Search Results - Inner Mongolia University Library

Low Complexity Continuous Rain Density Estimation Using RISTA Technique for Single Image Rain Streak Removal

11th IEEE International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan) - Empower of Innovative Consumer Technology

Authors: Chao, Ko-Yang; Lee, Yu-Hsuan (Yuan Ze Univ, Dept Elect Engn, 135 Yuandong Rd, Taoyuan 320, Taiwan)

ISBN: (print) 9798350386851; 9798350386844

Image deraining is an image processing technique that restores a rainy image to a rain-free one. The Iterative Soft Thresholding Algorithm (ISTA) is an iterative procedure for estimating sparsity. ISTA demands a considerable computational burden in Floating Point Operations (FLOPs), which limits its use in real-time applications. The proposed Restored-Feature Iterative Soft Thresholding Algorithm (RISTA) provides a low-complexity approach that reduces FLOPs. Experimental results demonstrate that this work achieves a computation saving ratio of up to 66% while keeping the Peak Signal-to-Noise Ratio (PSNR) error as low as 6%.

Keywords: image reconstruction
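
The abstract above names ISTA as the baseline whose cost RISTA reduces. As a rough illustration only, the Python sketch below shows the textbook ISTA iteration for the L1-regularized least-squares problem min_x 0.5*||Ax - y||^2 + lam*||x||_1. It is not the RISTA method of the paper; the function names, the step size, and the toy data are assumptions made for this example.

```python
def soft_threshold(v, tau):
    """Shrink every entry of v toward zero by tau (proximal map of the L1 norm)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def ista(A, y, lam, step, n_iter=200):
    """Textbook ISTA: a gradient step on the data term, then soft thresholding.

    step is assumed to satisfy step <= 1 / L, where L is the largest
    eigenvalue of A^T A; otherwise the iteration may not converge.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = matvec(At, residual)
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, grad)], step * lam)
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(ista(identity, [3.0, 0.2, -1.0], lam=0.5, step=1.0, n_iter=50))
```

Each iteration costs two matrix-vector products, which is the kind of repeated floating-point work the abstract describes as a burden for real-time use.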

Quality-power configurable flexible coding order hardware design for real-time 3D-HEVC intra-frame prediction

Journal of Real-Time Image Processing, 2022, vol. 19, no. 5, pp. 969-984

Authors: Perleberg, Murilo R.; Afonso, Vladimir; Borges, Vinicius A.; Zatt, Bruno; Agostini, Luciano V.; Porto, Marcelo (Video Technol Res Grp ViTech, Pelotas, RS, Brazil; Fed Univ Pelotas UFPel, Pelotas, RS, Brazil; Sul Rio Grandense Fed Inst IFSul, Pelotas, RS, Brazil)

The emergence of 3D video-capable embedded mobile devices is expected due to the popularization of multimedia services and the demand for novel immersive video technologies. Such devices require efficient hardware-friendly heuristics to deal with strict processing requirements and a limited energy supply. To contribute to these requirements, this work presents a complete 3D-HEVC intra-frame prediction hardware design that supports a flexible coding order between texture and depth channels. The developed hardware employs hardware-friendly constraints and novel heuristics to exploit inter-channel redundancies and to reduce the computational effort through the novel inter-channel directional structure detector heuristic. The designed 3D-HEVC intra-frame prediction system dissipates 384.6 mW while processing three HD 1080p views (texture + depth) at 30 frames per second in real time. To the best of our knowledge, this is the first work to propose a complete 3D-HEVC intra-frame prediction system with support for a flexible coding order. In addition, this is the only hardware design to process the luminance and chrominance texture channels and the depth channel.

Keywords: Intra-frame prediction; 3D-HEVC; DMM; DIS; Flexible coding order

UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Xie, Yaofeng; Kong, Lingwei; Chen, Kai; Zheng, Ziqiang; Yu, Xiao; Yu, Zhibin; Zheng, Bing (Ocean Univ China, Coll Elect Engn, Qingdao, Peoples R China; Ocean Univ China, Sanya Oceanog Inst, Key Lab Ocean Observat & Informat Hainan Prov, Qingdao, Peoples R China; Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China)

ISBN: (print) 9798350353006

Learning-based underwater image enhancement (UIE) methods have made great progress. However, the lack of large-scale and high-quality paired training samples has become the main bottleneck hindering the development of UIE. The inter-frame information in underwater videos can accelerate or optimize the UIE process. Thus, we constructed the first large-scale high-resolution underwater video enhancement benchmark (UVEB) to promote the development of underwater vision. It contains 1,308 pairs of video sequences and more than 453,000 high-resolution frame pairs, 38% of which are Ultra-High-Definition (UHD) 4K. UVEB comes from multiple countries and contains various scenes and video degradation types to adapt to diverse and complex underwater environments. We also propose the first supervised underwater video enhancement method, UVE-Net. UVE-Net converts the current frame information into convolutional kernels and passes them to adjacent frames for efficient inter-frame information exchange. By fully utilizing the redundant degraded information of underwater videos, UVE-Net achieves better video enhancement. Experiments demonstrate the effectiveness of the network design and the good performance of UVE-Net.

Keywords: underwater video dataset; underwater video enhancement; underwater vision; video processing

ParaEyes: The Smart Eye Bot System for Paralyzed Patients

Journal of Circuits Systems and Computers, 2025

Authors: Al Jaziri, Maryam; Al Ali, Ghalia; Al Swailmeen, Hind; El-Moursy, Ali A.; Sibai, Fadi N. (Univ Sharjah, Dept Comp Engn, Sharjah, U Arab Emirates; Gulf Univ Sci & Technol, Dept Elect & Comp Engn, Mubarak Al Abdullah, Kuwait)

Locked-in Syndrome (LIS) poses significant challenges for individuals experiencing complete paralysis, who must rely solely on eye and head movements and hearing to communicate. The main issue arises from the urgent need for a customized communication solution due to the severe limitations LIS patients encounter in interacting with their surroundings. This difficulty extends beyond physical constraints, greatly impacting their overall quality of life and psychological well-being. To address this complex challenge, we have developed an innovative approach that integrates advanced eye-tracking technology, Natural Language Processing (NLP) and Artificial Intelligence (AI). This holistic solution not only restores communication capabilities but also provides crucial support for the mental health and psychological well-being of LIS patients, offering a ray of hope for a better future. Beyond addressing communication challenges, our proposal also focuses on improving the mental health of LIS patients through interactive communication with either surrounding people or AI bots. Our solution, named "ParaEyes", uses webcam-detected eye movements to navigate the communication interface, employing real-time video image processing in Python. Additionally, users can engage with various AI bots, including ChatGPT, YouTube Bot and ParaEyes Visual Bot, based on their preferences. ParaEyes achieves a 2X faster average setup time and on average 10% higher accuracy compared to the open-source alternatives. This approach enhances communication skills and mental health support while also being inspired by ChatGPT's robust measures to safeguard user data and ensure user privacy through tailored interactions and responsive functionalities.

Keywords: LIS; natural language processing; AI; ChatGPT; YouTube; Visual Bot

Real-Time Reconstruction of 3D Space using Holographic Video

15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024

Authors: Deshpande, Vivek; Anandha Silambarasan, D.; Kalra, Hitesh; Reena, R.; Gupta, Shivangi; Manjunath, C. (Vishwakarma Institute of Technology, Department of Computer Engineering, Maharashtra, Pune, India; Karpagam Academy of Higher Education, Department of Computer Science Engineering, Coimbatore 641021, India; Chitkara University, Centre of Research Impact and Outcome, Punjab, Rajpura 140417, India; Prince Shri Venkateshwara Padmavathy Engineering College, Department of Information Technology, Chennai 127, India; Quantum University Research Center, Quantum University, India; School of Engineering and Technology, Karnataka, Bangalore, India)

ISBN: (print) 9798350370249

Real-time reconstruction of 3D space using holographic video entails capturing and processing video that contains three-dimensional shape information. Capturing the three-dimensional shape involves using a camera able to capture and process the holographic image in a composite format, taking multiple images, and merging them into one hologram. The images are then processed in real time using computer vision algorithms for feature extraction, segmentation, and 3D object recognition. This application of computer vision allows the user to navigate and interact with the 3D representation as it is reconstructed in real time. The reconstructed 3D scene is then rendered as an interactive 360-degree holographic video. This technology has several applications in augmented and virtual reality gaming and in immersive 3D visualization of the captured environment. © 2024 IEEE.

Keywords: Holograms

A Media-Pipe Integrated Deep Learning Model for ISL (Alphabet) Recognition and Converting Text to Sound with Video Input

3rd International Conference on Applied Intelligence and Informatics (AII)

Authors: Mukundan, T. M. Vishnu; Gadhiya, Aryan; Nadar, Karthik; Gagrani, Rishita; Basha, Niha Kamal (Vellore Inst Technol, Vellore 632014, Tamil Nadu, India)

ISBN: (print) 9783031686382; 9783031686399

The present study showcases a novel deep learning-based vision application aimed at reducing the communication gap between sign language and non-sign language users. Speech and hearing impairments are a type of disability that restricts an individual's ability to communicate with others properly. Modern-day automation tools can be used to address this communication gap and allow people to communicate ubiquitously and in a variety of situations. The method defined in the paper involves loading a video file, extracting each frame, and detecting the hand landmarks in each frame using the Media-Pipe library. Then the frame is cropped, and the region of interest is pre-processed and stored in a new data directory for training purposes. The pre-processing involves the use of Gaussian blur, edge detection, morphological transformations, and signal processing functions. Data augmentation is then performed, and images are saved in a new directory. The images are then used to train a custom CNN model, which contains four convolutional layers along with two fully connected layers. The model is compiled using the categorical cross-entropy loss function, optimised using the RMSprop optimiser, and then evaluated using accuracy as the evaluation metric. The predicted sign language alphabet is displayed on the screen and is converted to speech using the Google Text-to-Speech library. The model achieves an overall accuracy of 93.96%. The findings indicate that the proposed approach can serve as a road map for developing a real-time system capable of sign language recognition and can direct future investigations in this domain.

Keywords: Augmentation; Sign Language to Text conversion; image processing; Gesture recognition; Deep Learning; Text to Audio conversion

Real-Time Algorithm for Light Gray Smoke Detection in Video Sequences

8th International Conference on Computing, Control and Industrial Engineering, CCIE 2024

Authors: Adamovskiy, Y.; Bohush, R. (Polotsk State University, Polotsk, Belarus)

ISBN: (print) 9789819769339

An algorithm for video-based early detection of outdoor light gray smoke has been developed based on a complex set of features. This algorithm provides real-time processing for high-resolution video. For this purpose, preliminary smoke regions of interest are extracted based on motion detection and color segmentation in HSV color space. Spatio-temporal analysis is applied to the identified areas of the video sequence: calculation of the parameters of high-frequency components and contrast. This approach allows us to identify areas where smoke hides background elements. The result of this step is a set of refined regions of interest. The final step is to estimate the direction of motion in these candidate regions using the optical flow method, taking into account the change of motion vectors over time. The results of experimental studies to evaluate the algorithm's accuracy and performance are presented. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Optical flows
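
The first stage described in the abstract above (motion detection combined with HSV color segmentation) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the saturation, value, and motion thresholds and the function names are hypothetical, and frames are represented as nested lists of 8-bit RGB tuples.

```python
import colorsys


def is_light_gray(rgb, max_saturation=0.15, min_value=0.6):
    """Treat a pixel as light gray when it is weakly saturated and bright.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    r, g, b = (c / 255.0 for c in rgb)
    _, s, v = colorsys.rgb_to_hsv(r, g, b)
    return s <= max_saturation and v >= min_value


def candidate_mask(prev_frame, frame, motion_threshold=20):
    """Mark pixels that changed between two frames and look light gray."""
    mask = []
    for prev_row, row in zip(prev_frame, frame):
        mask_row = []
        for p, q in zip(prev_row, row):
            # Simple frame differencing stands in for the motion detector.
            moved = max(abs(a - b) for a, b in zip(p, q)) >= motion_threshold
            mask_row.append(moved and is_light_gray(q))
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    previous = [[(10, 10, 10), (200, 30, 30)]]
    current = [[(200, 200, 205), (200, 30, 30)]]
    print(candidate_mask(previous, current))
```

The later stages named in the abstract (high-frequency and contrast analysis, optical flow) would refine the regions produced by such a mask and are not covered here.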

High-performance architecture for real-time high-definition short-wave infrared streaming video processing and its field programmable gate array prototype

Optical Engineering, 2024, vol. 63, no. 2, p. 023103

Authors: Zhou, Feng; Chen, Zhiqiang; Xie, Qingsheng; Kong, Fanzi; Chen, Yaohong; Wang, Huawei (Chinese Acad Sci, Xian Inst Opt & Precis Mech, Xian, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China; Xian Key Lab Spacecraft Opt Imaging & Measurement, Xian, Peoples R China)

Image detail enhancement is critical to the performance of short-wave infrared (SWIR) imaging systems. Recently, the requirement for real-time processing of high-definition (HD) SWIR video has grown rapidly. Nevertheless, research on field programmable gate array (FPGA) implementations of HD SWIR streaming video processing architectures remains relatively scarce. This work proposes a real-time FPGA architecture for SWIR video enhancement that combines the difference of Gaussian filter and plateau equalization. To accelerate the algorithm and reduce memory bandwidth, two efficient key architectures, namely the edge information extraction architecture and the equalization and remapping architecture, are proposed to sharpen edges and improve dynamic range. The experimental results demonstrated that the proposed architecture achieved real-time processing of 1280 x 1024@60Hz with 2.7K lookup tables, 2.5K Slice Reg, and about 350 kb of block RAM consumption, and their utilization reached 12.5%, 19.2%, and 12.5% for the XC7A200T FPGA board, respectively. Moreover, the proposed architecture is fully pipelined and synchronized to the pixel clock of the output video, meaning that it can be seamlessly integrated into diverse real-time video processing systems.

Keywords: high-definition; short-wave infrared; video processing; infrared image enhancement; field programmable gate arrays
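
Of the two techniques the abstract above combines, plateau equalization is easy to show in a few lines of software. The Python sketch below is a generic plateau histogram equalization, given as an assumption-laden illustration and not as the paper's FPGA architecture: the function name, the plateau value, and the flat list-of-integers image layout are choices made for this example.

```python
def plateau_equalize(pixels, plateau, in_levels=256, out_levels=256):
    """Histogram equalization with every histogram bin clipped at `plateau`.

    Clipping keeps a dominant background level from using up most of the
    output range, so sparsely populated gray levels keep more contrast.
    Assumes `pixels` is a non-empty flat list of ints in [0, in_levels)
    and that plateau >= 1.
    """
    hist = [0] * in_levels
    for p in pixels:
        hist[p] += 1

    clipped = [min(h, plateau) for h in hist]
    total = sum(clipped)

    # Cumulative sum of the clipped histogram, scaled to the output range.
    lut = []
    running = 0
    for h in clipped:
        running += h
        lut.append((out_levels - 1) * running // total)

    return [lut[p] for p in pixels]


if __name__ == "__main__":
    image = [0] * 10 + [1, 2]  # a dominant background level plus two details
    print(plateau_equalize(image, plateau=2, in_levels=4))
    print(plateau_equalize(image, plateau=10, in_levels=4))  # no effective clipping
```

With the plateau set to 2, the three populated levels map to 127, 191 and 255; without effective clipping they map to 212, 233 and 255, so the two detail levels sit closer to the background.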

Flow Control Solution to Avoid Bottlenecks in Edge Computing for Video Analytics

9th IEEE International Conference on Fog and Mobile Edge Computing (FMEC)

Authors: Vainio, Antero; Tarkoma, Sasu (Univ Helsinki, Helsinki, Finland)

ISBN: (print) 9798350366495; 9798350366488

In this article, we present a new approach to scaling edge-server-based real-time video analytics by utilizing a novel flow control mechanism that we call 'Pace Steering' (PS). In contrast to server-side scheduling, flow control enables the server to control the frame rate of the connected streams, thus extending its control to the network traffic as well. By exploiting the fact that video analytics applications have a constant frame rate for each client and a predictable inference time for the frame processing, Pace Steering is able to avoid server-side queueing and balance the network load in a shared wireless access link, leading to better utilization of both the compute and the network resources. We provide a mathematical analysis to show how to synchronize video streams with delays based on server state information. We then show how the queueing time of a request provides an ideal synchronization delay, and then extend this idea to consider batching for higher throughput. We evaluate our approach with benchmarks in a physical testbed using commodity Wi-Fi. The results show that PS enables up to 80 concurrent 10 frames per second (FPS) streams to be served without latency requirement violations for 95% of the sent frames, which is twice as many streams as without PS.

Keywords: Wi-Fi
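
The idea in the abstract above, that a request's queueing time can serve as a synchronization delay, can be illustrated with a toy model. The Python sketch below assumes a single FIFO server, a fixed inference time, and perfectly periodic streams. It is not the Pace Steering implementation; the function name, the time units, and the parameters are hypothetical.

```python
def queue_waits(offsets, period, service_time, n_frames=3):
    """Return the worst queueing wait per stream on one FIFO server.

    offsets[i] is the send time of stream i's first frame; later frames
    follow every `period` time units. Each frame takes `service_time`.
    """
    events = sorted(
        (off + k * period, i)
        for i, off in enumerate(offsets)
        for k in range(n_frames)
    )
    free_at = 0
    worst = {i: 0 for i in range(len(offsets))}
    for t, i in events:
        start = max(t, free_at)  # the frame waits while the server is busy
        worst[i] = max(worst[i], start - t)
        free_at = start + service_time
    return worst


if __name__ == "__main__":
    period, service = 100, 10  # milliseconds, illustrative values only
    synced = [0, 0, 0, 0]      # all clients send at the same instant
    waits = queue_waits(synced, period, service)
    print(waits)

    # Delay each stream by the wait it observed, then measure again.
    paced = [off + waits[i] for i, off in enumerate(synced)]
    print(paced, queue_waits(paced, period, service))
```

In this toy setting, shifting each stream by its observed wait spreads the frames across the period so that none of them queues. The paper's analysis additionally covers batching and the shared wireless link, which are not modeled here.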

Patient-Centric Hand-Gesture Driven Wheelchair Control Using Convolutional Neural Networks for Enhanced Healthcare Mobility

3rd International Conference on Disruptive Technologies, ICDT 2025

Authors: Kumar, Aditya; Tomar, Ritika; Raghav, Divyansh; Yadav, Arvind (Galgotias University, Department of Computer Science and Engineering, Uttar Pradesh, Greater Noida, India; Vellore Institute of Technology, Department of Electrical Engineering, Tamil Nadu, Vellore, India)

ISBN: (print) 9798331519582

This work presents a smart wheelchair system that users can operate intuitively, without conventional controls, by means of a Convolutional Neural Network (CNN) that recognizes hand gestures in real time. Programmed hand signals replace conventional button-operated controllers, giving users mobility and freedom despite restricted physical capability. The system's basic components are an L298 motor driver to regulate wheelchair movement, a Raspberry Pi for real-time image processing, and an 8x8 LED matrix for visual feedback. Data preprocessing, supervised learning model training, and testing in many environmental settings improve the gesture recognition accuracy. Testing reveals that the system offers fast response times, accurate gesture detection, and consistent control. This work supports assistive technology by giving wheelchair users a flexible, adaptive, user-friendly control mechanism. Future research should look at expanding the gesture variety, improving CNN performance for faster real-time processing, and integrating sophisticated safety elements such as obstacle detection and voice commands. © 2025 IEEE.

Keywords: image enhancement
