ISBN (digital): 9789464593617
ISBN (print): 9798331519773
The huge computation burden of state-of-the-art video coding technologies can be mitigated with Region-of-Interest (ROI) techniques that limit the highest coding effort to salient regions. However, the complexity overhead of saliency detection can easily cancel out the speed gain of ROI coding. This work introduces a lightweight ROI tracking technique that can be used in place of compute-intensive ROI detection to guide a video encoder in inter coding. Low computational overhead is achieved by feeding motion vectors (MVs) of a video encoder back to our neural network that is trained for accurate estimation of ROI movement and size changes. The network training is carried out with our new dataset that is also released in this work to foster the development of head tracking techniques in applications like video conferencing. Our experimental results demonstrate substantial speedups with minimal accuracy tradeoffs over traditional salient object detection (SOD) methods. In scenarios where a single ROI is tracked with a 64-frame detection interval, our solution obtains up to 50-fold speedup with an accuracy of 87% and an average ROI center error of 16 pixels. These results confirm that our ROI tracking approach is a promising technique for low-cost and low-power streaming media applications.
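The abstract does not disclose the network internals, but the core idea of reusing encoder motion vectors to propagate an ROI between detection intervals can be illustrated with a minimal, non-learned sketch (the function name and the median-vector heuristic below are assumptions of this sketch, standing in for the paper's trained estimator):

```python
import numpy as np

def update_roi(roi, mv_field):
    """Shift an ROI box by the median motion vector inside it.

    roi: (x, y, w, h) in pixels; mv_field: H x W x 2 array of per-pixel
    motion vectors (dx, dy), e.g. encoder MVs upsampled to pixel grid.
    The median makes the estimate robust to outlier vectors.
    """
    x, y, w, h = roi
    region = mv_field[y:y + h, x:x + w]       # MVs under the current ROI
    dx = float(np.median(region[..., 0]))     # robust translation estimate
    dy = float(np.median(region[..., 1]))
    return (int(round(x + dx)), int(round(y + dy)), w, h)
```

A learned estimator, as in the paper, would additionally predict size changes rather than only translation.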
With the advances of embedded GPU programming models like GLES and OpenCL, mobile processors have gained more parallel computing capability, which enables real-time image processing on portable devices. GLES i...
There is a strong need for non-reference video quality metrics for user-generated video content to prevent loss of video quality caused by distortion during recording, compression, and signal transmission. Here we contribute to advancing the issue of streaming quality by creating a large-scale dataset with video compression and transmission artefacts. Our final dataset consists of 4.1 million video quality perceptual thresholds by users. We also created the first non-reference video quality metric that includes the psychophysical features of the user's video experience, which provides stability in predicting the user's subjective rating of a video. Our experimental results show that the proposed video quality metric achieves the most stable performance on three independent video datasets. We believe our study will expand further research into deep learning-based video quality metric modelling.
ISBN (digital): 9798350368949
ISBN (print): 9798350368956
In the rapidly evolving landscape of digital transactions, the efficiency and accuracy of billing systems are paramount. The checkout process, in both ordinary retail shops and online shops, demands speed and accuracy. AutoBill is a new AI checkout system that transforms the current retail checkout method through machine learning, augmented by image recognition and real-time data storage. It is built on the Raspberry Pi platform, using a TensorFlow object detection model created with Google's Teachable Machine, and integrates MongoDB for dynamic data handling. AutoBill removes the human factor from billing, providing a seamless and efficient solution that reduces manual intervention and enhances overall transaction accuracy. The paper details the system design, implementation, and performance evaluation, which show promise for revolutionizing contactless shopping experiences.
ISBN (digital): 9798350386974
ISBN (print): 9798350386981
In order to achieve unattended tape storage management, this article designs a tape barcode recognition and positioning technology based on video and images. The algorithm uses the YOLOv5s network model to quickly recognize the tape barcodes in the image, as well as the QR codes used to record actual positions on the plane, and then uses the ZBar toolkit to decode them. Finally, the QR codes serve as geometric correction control points for the image, and the actual plane position of each tape is calculated from the pixel position of its barcode. In tests on the dataset, the recognition time for each tape is 0.007 s, and the detection rate (the proportion of tapes that can be found and correctly recognized) is 97.13%, achieving the goal of efficient tape inventory. Moreover, when the camera position remains unchanged, the positioning accuracy of the tapes reaches 99.99% after geometric correction of the image. The experimental results show that the proposed approach achieves its goals well.
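The geometric-correction step can be sketched as fitting a pixel-to-plane transform from the QR-code control points and then mapping each barcode's pixel center through it. Assuming, as a simplification, an affine model (the paper does not specify the correction model, and the function names here are illustrative):

```python
import numpy as np

def fit_affine(px, plane):
    """Least-squares affine map from pixel to plane coordinates.

    px, plane: (N, 2) arrays of matched points, N >= 3. The QR codes,
    whose true plane positions are known, serve as control points.
    """
    A = np.hstack([px, np.ones((len(px), 1))])   # rows: [x, y, 1]
    M, *_ = np.linalg.lstsq(A, plane, rcond=None)
    return M                                      # 3x2 coefficient matrix

def to_plane(M, pt):
    """Map one barcode pixel position to its plane position."""
    return np.array([pt[0], pt[1], 1.0]) @ M
```

A projective (homography) model would handle camera tilt more faithfully; the affine version keeps the sketch to a single least-squares solve.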
ISBN (digital): 9798350349399
ISBN (print): 9798350349405
The 1-ms visual feedback system is critical for seamless actuation in robotics, as any delay affects its performance in handling dynamic situations. Specular reflections cause problems in many visual technologies, making specular detection crucial in 1-ms visual feedback systems. However, existing real-time methods, which target Neumann architecture, fail to achieve the 1-ms delay due to spatial memory paths resulting from extensive frame-based processing. This research aims to develop a 1-ms specular detection system from both algorithm and architecture perspectives, proposing 1) a temporal clustering and temporal reference based specular detection method, which leverages temporal domain information to address the requirements of frame-based processing; and 2) a global-local integrated specular detection architecture, which enables the coexistence of local and global processing within a 1-ms stream-based architecture. The proposed methods are implemented on FPGA. The evaluation shows that the proposed system supports sensing and processing a 1000-fps sequence with a delay of 0.941 ms/frame.
Exposure errors in images, including both underexposure and overexposure, significantly diminish images' contrast and visual appeal. Existing deep learning-based exposure correction methods either require large networks or long inference times and are thus not applicable to embedded devices and real-time applications. To address these issues, a lightweight network is proposed in this paper to correct exposure errors with limited memory occupation and few inference steps. It adopts the Laplacian pyramid to incrementally recover the color and details of the image through a layer-by-layer procedure. A structural re-parameterization module is designed both to reduce model size for inference speed-up and to improve performance with a multi-branch learning structure. Extensive experiments demonstrate that our method achieves a better performance-efficiency trade-off than other exposure correction methods.
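The layer-by-layer recovery described above rests on the Laplacian pyramid being a lossless decomposition. A bare-bones sketch (with a crude 2x decimation standing in for the usual Gaussian blur, and helper names of this sketch's own invention) shows the build/reconstruct round trip:

```python
import numpy as np

def down(img):
    """2x decimation (crude stand-in for Gaussian blur + subsample)."""
    return img[::2, ::2]

def up(img):
    """Nearest-neighbour 2x upsample."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_pyramid(img, levels):
    """Laplacian pyramid: each level stores the detail lost by downsampling."""
    laps, cur = [], img
    for _ in range(levels):
        small = down(cur)
        laps.append(cur - up(small))   # high-frequency residual
        cur = small
    laps.append(cur)                   # coarsest low-frequency band
    return laps

def reconstruct(laps):
    """Invert the decomposition: add detail back layer by layer."""
    cur = laps[-1]
    for lap in reversed(laps[:-1]):
        cur = up(cur) + lap
    return cur
```

A correction network in this style predicts an adjustment per pyramid level, letting small sub-networks handle color at the coarse level and detail at the fine levels.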
Real-time eyeblink detection in the wild can widely serve fatigue detection, face anti-spoofing, emotion analysis, etc. Existing research efforts generally focus on single-person cases in trimmed videos. However, the multi-person scenario within untrimmed videos is also important for practical applications, yet has not received much attention. To address this, we shed light on this research field for the first time with essential contributions on dataset, theory, and practice. In particular, a large-scale dataset termed MPEblink, involving 686 untrimmed videos with 8748 eyeblink events, is proposed under multi-person conditions. The samples are captured from unconstrained films to reveal "in the wild" characteristics. Meanwhile, a real-time multi-person eyeblink detection method is also proposed. Different from existing counterparts, our proposition runs in a one-stage spatio-temporal way with end-to-end learning capacity. Specifically, it simultaneously addresses the sub-tasks of face detection, face tracking, and human instance-level eyeblink detection. This paradigm holds two main advantages: (1) eyeblink features can be facilitated via the face's global context (e.g., head pose and illumination condition) with joint optimization and interaction, and (2) addressing these sub-tasks in parallel rather than sequentially saves time remarkably, meeting the real-time running requirement. Experiments on MPEblink verify the essential challenges of real-time multi-person eyeblink detection in the wild for untrimmed video. Our method also outperforms existing approaches by large margins at a high inference speed.
Moving targets always defocus and shift outside the scene in video synthetic aperture radar (video SAR) image sequences. However, the shadows of moving targets are immune to these issues and can reveal the true positions of the moving targets. As such, by tracking the shadows of moving targets in the video SAR image sequence, it becomes feasible to keep track of these targets. Nevertheless, due to the small pixel size and time-varying characteristics of the target shadow, current prevailing tracking methods often prove insufficient for direct tracking of the shadow. In this letter, a shadow-assisted tracking method for moving targets based on a multilevel discriminant correlation filters network (MDCFnet) is proposed. First, we design a reverse feature pyramid network (RFPN) that integrates multiple high-level features into low-level features to obtain features with higher distinguishability and resolution, thereby enhancing the final tracking accuracy and precision. Furthermore, we devise multilevel discriminant correlation filters (MDCFs) to perform filtering-based tracking over multiple feature maps. Results on real datasets demonstrate that the proposed method outperforms other state-of-the-art methods.
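The abstract leaves the MDCFs unspecified; the underlying discriminant correlation filter idea, in its simplest single-feature closed form (a MOSSE-style filter, used here only to illustrate what one level of such a filter stack computes, not the paper's exact formulation), can be sketched as:

```python
import numpy as np

def train_dcf(x, g, lam=1e-4):
    """Closed-form correlation filter H = G * conj(X) / (|X|^2 + lam).

    x: feature patch (e.g., a shadow template), g: desired response with
    a peak at the target location, lam: regularizer for stability.
    """
    X, G = np.fft.fft2(x), np.fft.fft2(g)
    return (G * np.conj(X)) / (X * np.conj(X) + lam)

def respond(H, z):
    """Filter a search patch; the response peak gives the target shift."""
    return np.real(np.fft.ifft2(H * np.fft.fft2(z)))
```

A multilevel variant, as in the MDCFs, trains one such filter per feature map and fuses the response maps before locating the peak.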
An effective tool for violence detection is in high demand to address the rising crime rate in today's era. Artificial Intelligence can play a significant role in violence detection and monitoring to tackle vari...