Search results - Inner Mongolia University Library

2024 International Conference on Image Processing and Artificial Intelligence, ICIPAI 2024

Authors: Liu, Rui; Jia, Zhenhong; Huang, Xiaohui; Wang, Jiajia; Zhou, Gang; Shi, Fei. School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China; Key Laboratory of Signal Detection and Processing, Xinjiang Uygur Autonomous Region, Xinjiang University, Urumqi 830046, China

ISBN: (print) 9781510681514

Video surveillance requires simultaneous monitoring of multiple areas. Consequently, real-time automatic change detection of the monitored areas becomes very important. Under wide field-of-view conditions, the combination of intricate environmental factors and a substantial presence of random noise can degrade visual fidelity and lower the signal-to-noise ratio of the video images acquired through the image sensor. As a consequence, detecting subtle changes becomes challenging for the surveillance system. To address these problems, we propose a change detection method that leverages improved difference images and super fast and robust fuzzy c-means with constraints clustering. Initially, we employ an improved log-ratio operator and an improved mean-ratio operator to generate two distinct difference images. Subsequently, the wavelet fusion algorithm is applied to merge these two difference images, effectively integrating their distinctive features and producing a fused difference image with differentiability. Then, the new difference image is subjected to soft-threshold primary classification and cumulative distribution function normalization to obtain the difference image after primary classification. Finally, the super fast and robust fuzzy c-means with constraints clustering algorithm is employed for the ultimate classification, enabling the separation of changed and unchanged areas within the image. © 2024 SPIE.

Keywords: Clustering algorithms
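The abstract above builds its difference images from a log-ratio and a mean-ratio operator. As a rough illustration only, the sketch below implements the classical textbook forms of the two operators on small grayscale images stored as nested lists; the paper's "improved" variants are not described in this record, and the helper names (`log_ratio`, `local_mean`, `mean_ratio`) and the 3x3 window are assumptions made for the example.

```python
import math

def log_ratio(img1, img2):
    """Classical log-ratio difference image: |log((b + 1) / (a + 1))| per pixel."""
    return [[abs(math.log((b + 1.0) / (a + 1.0))) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

def local_mean(img, y, x, radius=1):
    """Mean of the window around (y, x), clipped at the image borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return sum(vals) / len(vals)

def mean_ratio(img1, img2, radius=1):
    """Classical mean-ratio difference image: 1 - min(m1, m2) / max(m1, m2)."""
    out = []
    for y in range(len(img1)):
        row = []
        for x in range(len(img1[0])):
            m1 = local_mean(img1, y, x, radius)
            m2 = local_mean(img2, y, x, radius)
            hi = max(m1, m2)
            # Two all-zero neighbourhoods count as "no change".
            row.append(0.0 if hi == 0 else 1.0 - min(m1, m2) / hi)
        out.append(row)
    return out

# Hypothetical 3x3 frames: only the centre pixel changes.
before = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
after = [[10, 10, 10], [10, 90, 10], [10, 10, 10]]
lr = log_ratio(before, after)
mr = mean_ratio(before, after)
```

The log-ratio image responds only at the changed pixel, while the mean-ratio image spreads the response over the neighbourhood; this difference in behaviour is presumably why the abstract fuses the two.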


Reinforcement Learning for Adaptive Video Compressive Sensing


ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, Vol. 14, No. 5, pp. 1-21

Authors: Lu, Sidi; Yuan, Xin; Katsaggelos, Aggelos K.; Shi, Weisong. William & Mary, Dept Comp Sci, McGlothlin St Hall 126, 251 Jamestown Rd, Williamsburg, VA 23185, USA; Westlake Univ, Sch Engn, 600 Dunyu Rd, Hangzhou 310030, Zhejiang, Peoples R China; Northwestern Univ, Dept Elect & Comp Engn, 2145 Sheridan Rd, Tech Room M468, Evanston, IL 60208, USA; Univ Delaware, Dept Comp & Informat Sci, Smith Hall, 18 Amstel Ave, Newark, DE 19716, USA

We apply reinforcement learning to video compressive sensing to adapt the compression ratio. Specifically, video snapshot compressive imaging (SCI), which captures high-speed video using a low-speed camera, is considered in this work, in which multiple (B) video frames can be reconstructed from a snapshot measurement. One research gap in previous studies is how to adapt B in the video SCI system for different scenes. In this article, we fill this gap utilizing reinforcement learning (RL). An RL model, as well as various convolutional neural networks for reconstruction, are learned to achieve adaptive sensing of video SCI systems. Furthermore, the performance of an object detection network that directly uses the video SCI measurements without reconstruction is also used to perform RL-based adaptive video compressive sensing. Our proposed adaptive SCI method can thus be implemented at low cost and in real time. Our work takes the technology one step further towards real applications of video SCI.

Keywords: Image processing; compressive sensing; reinforcement learning
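To make the role of B in the abstract above concrete, the following sketch shows the standard video SCI forward model, in which B frames are each modulated by a mask and summed into a single 2D snapshot. It is a minimal illustration under stated assumptions: the function name `sci_measurement`, the binary masks, and the tiny 2x2 frames are hypothetical, and neither the RL policy that selects B nor the reconstruction networks are modelled.

```python
def sci_measurement(frames, masks):
    """Sum B mask-modulated frames into one snapshot: Y = sum_b M_b * X_b."""
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    h, w = len(frames[0]), len(frames[0][0])
    y = [[0.0] * w for _ in range(h)]
    for frame, mask in zip(frames, masks):
        for r in range(h):
            for c in range(w):
                # Element-wise modulation, then accumulation over time.
                y[r][c] += mask[r][c] * frame[r][c]
    return y

# B = 2 hypothetical frames of size 2x2 with complementary binary masks.
frames = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
masks = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
snapshot = sci_measurement(frames, masks)
```

A larger B packs more frames into each snapshot (a higher compression ratio) at the cost of a harder reconstruction problem, which is the trade-off the abstract proposes to adapt per scene.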



STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION


49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Yang, Yang; Kartynnik, Yury; Li, Yunpeng; Tang, Jiuqiang; Li, Xing; Sung, George; Grundmann, Matthias. Google LLC, Mountain View, CA 94043, USA

ISBN: (print) 9798350344868; 9798350344851

We present StreamVC, a streaming voice conversion solution that preserves the content and prosody of any source speech while matching the voice timbre from any target speech. Unlike previous approaches, StreamVC produces the resulting waveform at low latency from the input signal even on a mobile platform, making it applicable to real-time communication scenarios like calls and video conferencing, and addressing use cases such as voice anonymization in these scenarios. Our design leverages the architecture and training strategy of the SoundStream neural audio codec for lightweight high-quality speech synthesis. We demonstrate the feasibility of learning soft speech units causally, as well as the effectiveness of supplying whitened fundamental frequency information to improve pitch stability without leaking the source timbre information.

Keywords: Voice conversion; On-device neural audio processing; Real-time voice changer
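The abstract above mentions supplying "whitened" fundamental frequency (f0) information. The record does not say how the whitening is done, so the sketch below assumes one plausible reading: normalizing the voiced f0 values of an utterance to zero mean and unit variance, which removes the speaker's absolute pitch register while keeping the contour shape. The function name `whiten_f0` and the convention that unvoiced frames are marked with 0 are assumptions for the example, not details from the paper.

```python
import math

def whiten_f0(f0_hz):
    """Normalize voiced f0 values to zero mean and unit variance.

    Frames with f0 <= 0 are treated as unvoiced and mapped to 0.0.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return [0.0] * len(f0_hz)
    mean = sum(voiced) / len(voiced)
    var = sum((f - mean) ** 2 for f in voiced) / len(voiced)
    std = math.sqrt(var) or 1.0  # guard against a perfectly flat contour
    return [(f - mean) / std if f > 0 else 0.0 for f in f0_hz]

# Two hypothetical contours with the same shape, one octave apart.
low = whiten_f0([100.0, 110.0, 120.0, 0.0])
high = whiten_f0([200.0, 220.0, 240.0, 0.0])
```

Under this assumption both contours whiten to the same sequence, so a downstream model would see the pitch movement but not the source speaker's register.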


Application of Computer 3D Image Vision Algorithm in Intelligent Image Recognition System


2023 5th International Conference on Artificial Intelligence and Computer Applications, ICAICA 2023

Authors: Li, Yuan; Yu, Xin. Modern Finance Industry School, Shandong Institute of Commerce and Technology, Shandong, Jinan, China

ISBN: (print) 9798350323313

In this paper, the 3D space imaging model of machine vision is constructed. Starting from the traditional machine vision image processing algorithm flow, the image denoising process and target tracking process are optimized. The method uses the camera to collect image and video information of the measured object and transmits it to the controller. The controller corrects the signal obtained by the wireless sensor in the database to reproduce the position of the measured object and the 3D image. A real-time tracking method of motion trajectory based on computer vision is presented, covering autonomous object capture, 3D positioning, and motion trajectory tracking. Simulation experiments show that this method is quite different from conventional image processing methods. This method has the advantages of low computational cost, fast running speed, and good real-time performance. It meets the needs of embedded image processing. © 2023 IEEE.

Keywords: Image recognition


Research on Adaptive Bitrate Algorithm for UAV Video Transmission


9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Authors: Ren, Anhu; Chen, Yang; Li, Yufei. School of Electronic and Information Engineering, Xi'an Technological University, Xi'an, China

ISBN: (print) 9798350376548

The highly dynamic nature of mobile networks makes it difficult to guarantee the real-time performance and stability of UAV video transmission, which greatly affects the user's Quality of Experience (QoE). Adaptive Bitrate (ABR) is an effective improvement measure, but most of the ABR algorithms commonly used in real-time video are based on fixed rule control and often perform poorly in complex real-world network environments. Therefore, we propose UABR, an adaptive bitrate algorithm based on deep reinforcement learning for UAV platforms, which estimates the optimal video bitrate by monitoring the real-time network status of the UAV and adjusts the encoding in a timely manner. The UABR algorithm introduces an entropy regularization term to improve environmental exploration capabilities, uses an importance sampling method to improve data utilization efficiency, and accelerates algorithm convergence by limiting the update range of old and new strategies. Experimental results show that compared with the GCC algorithm, the bandwidth estimation error of the UABR algorithm is reduced by 37.88%, and the average QoE is improved by 27.42%, significantly improving the quality of real-time video transmission. © 2024 IEEE.

Keywords: Deep reinforcement learning
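The three ingredients named in the abstract above (an entropy regularization term, importance sampling, and a limit on how far the new policy may move from the old one) resemble the clipped surrogate objective used in PPO-style methods. The record does not name the underlying algorithm, so the sketch below is an assumption about the general form, not UABR itself; `clip_eps`, `ent_coef`, and all function names are illustrative.

```python
import math

def clipped_objective(new_logp, old_logp, advantage, clip_eps=0.2):
    """Clipped surrogate for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    ratio = math.exp(new_logp - old_logp)  # importance-sampling ratio pi_new / pi_old
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)

def entropy(probs):
    """Shannon entropy of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def objective(new_logp, old_logp, advantage, probs, clip_eps=0.2, ent_coef=0.01):
    """Quantity to maximize: clipped surrogate plus an entropy bonus."""
    surrogate = clipped_objective(new_logp, old_logp, advantage, clip_eps)
    return surrogate + ent_coef * entropy(probs)

# The new policy is 1.5x more likely to pick a good action; the gain is capped at 1.2.
capped = clipped_objective(math.log(1.5), math.log(1.0), advantage=1.0)
```

Clipping the ratio is what bounds each policy update, and the entropy bonus rewards keeping the action distribution (here, hypothetical bitrate choices) spread out during exploration.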


VLSI Implementation of Reconfigurable Canny Edge Detection Algorithm


11th International Conference on Big Data Analytics in Astronomy, Science, and Engineering (BDA)

Authors: Senthilkumar, K. K.; Avantika, E.; Gayathri, B.; Dhandapani, Vaithiyanathan. Prince Shri Venkateshwara Padmavathy Engn Coll, Chennai, Tamil Nadu, India; Natl Inst Technol Delhi, New Delhi, India

ISBN: (print) 9783031585012; 9783031585029

Real-time video and image processing are used in various industrial, medical, consumer electronics and embedded device applications. These applications typically demonstrate an increasing demand for computing power and system complexity. Hence, edge detection is the most common and widely used technique in image or video processing applications. Several traditional Canny edge detection methods use fixed thresholding techniques to compare the pixel values. This sacrifices the edge detection performance and increases the computational complexity. Hence, the Canny edge detection algorithm is preferred to enhance the image quality with reduced complexity. It adjusts the quality of the image by manipulating the sigma and threshold parameters and detects the edges accurately by eliminating the noise. The reconfigurable Canny edge detection algorithm presents a procedure for detecting edges without multipliers. The new algorithm uses a low-complexity, non-uniform histogram gradient to compute thresholds and variable sigma values, and uses add and shift operations instead of multipliers to reduce the area and sigma. The simulation is done in the ModelSim platform using VHDL code, which outputs bit sequences. Comparing the reconfigurable Canny edge detection algorithm with the traditional algorithm shows improvements of around 21% and 80% in consumed power and delay, respectively.

Keywords: Image processing; Edge Detection; Canny Edge Detection; VLSI
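The abstract above describes detecting edges without multipliers by relying on add and shift operations. As a small software illustration of that idea (not the paper's VHDL design), the sketch below computes the Sobel gradient used in the Canny gradient stage with only additions, subtractions, and a left shift for the weight of 2; the function name `sobel_shift_add` and the use of an L1 magnitude are assumptions for the example.

```python
def sobel_shift_add(img, y, x):
    """Sobel gradients at an interior pixel using only add, subtract, and shift."""
    p = img
    # Horizontal gradient: right column minus left column, middle row weighted by 2.
    gx = ((p[y - 1][x + 1] - p[y - 1][x - 1])
          + ((p[y][x + 1] - p[y][x - 1]) << 1)
          + (p[y + 1][x + 1] - p[y + 1][x - 1]))
    # Vertical gradient: bottom row minus top row, middle column weighted by 2.
    gy = ((p[y + 1][x - 1] - p[y - 1][x - 1])
          + ((p[y + 1][x] - p[y - 1][x]) << 1)
          + (p[y + 1][x + 1] - p[y - 1][x + 1]))
    # |gx| + |gy| approximates the magnitude without squaring or a square root.
    return gx, gy, abs(gx) + abs(gy)

# Hypothetical 3x3 patch containing a vertical edge.
patch = [[0, 0, 255], [0, 0, 255], [0, 0, 255]]
gx, gy, mag = sobel_shift_add(patch, 1, 1)
```

In hardware, a shift by one bit is just rewiring, so a datapath of this shape needs adders only, which is consistent with the area and power motivation stated in the abstract.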



Towards Practical Consistent Video Depth Estimation


ACM International Conference on Multimedia Retrieval (ICMR)

Authors: Li, Pengzhi; Ding, Yikang; Li, Linge; Guan, Jingwei; Li, Zhiheng. Tsinghua Univ, Beijing, Peoples R China; Huawei, Shenzhen, Peoples R China

ISBN: (print) 9798400701788

Monocular depth estimation algorithms aim to explore the possible links between 2D and 3D data, but challenges remain for existing methods to predict consistent depth from a casual video. Relying on camera poses and the optical flow in the time-consuming test-time training phases makes these methods fail in many scenarios and cannot be used for practical applications. In this work, we present a data-driven post-processing method to overcome these challenges and achieve online processing. Based on a deep recurrent network, our method takes the adjacent original and optimized depth map as inputs to learn temporal consistency from the dataset and achieves higher depth accuracy. Our approach can be applied to multiple single-frame depth estimation models and used for various real-world scenes in real time. In addition, to tackle the lack of a temporally consistent video depth training dataset of dynamic scenes, we propose an approach to generate the training video sequences dataset from a single image based on inferring motion field. To the best of our knowledge, this is the first data-driven plug-and-play method to improve the temporal consistency of depth estimation for casual videos. Extensive experiments on three datasets and three depth estimation models show that our method outperforms the state-of-the-art methods.

Keywords: Video Depth estimation; Temporal consistency


Formation of an algorithm for determining the degree of human involvement based on the analysis of the movement of the pupils of the operator and data on the position of the head and body


Conference on Real-Time Image Processing and Deep Learning

Authors: Semenishchev, Evgenii; Zhdanova, Marina; Zelensky, Aleksandr; Gracheva, Inessa; Lyakhov, Daniil; Mitugov, Nikita; Voronin, Viacheslav. Tula State Univ TulSU, 92 Sq Lenina, Tula 300012, Tula Region, Russia; Moscow State Tech Univ STANKIN, 1-A Vadkovsky, Moscow 127055, Russia

ISBN: (digital) 9781510661714

ISBN: (print) 9781510661707; 9781510661714

The article proposes an algorithm for parallel analysis of visual data obtained by a machine vision system, information recorded in the human visible spectrum, and information received by a range camera. An algorithm for the formation of stable features as elements of the human body, head and pupils of a person and parallel tracking of their increment is proposed. Trend lines in element displacement are highlighted and the high-frequency component is eliminated based on a combined criterion. The image is preliminarily processed to reduce the effect of the noise component based on a multi-criteria objective function. The test data used to evaluate the effectiveness comprise a video stream with a resolution of 1024x768 (8-bit, color image, visible range), 3D data, and expert evaluation data.

Keywords: image smoothing; activity analysis; preprocessing; descriptors; multicriteria method


SPOTSECURE: Parking Reservation System with Plate Number Recognition through Image Processing


6th International Conference on Image, Video Processing, and Artificial Intelligence, IVPAI 2024

Authors: Austria, Yolanda D.; Acerado, Jhon Kenneth A.; Butac, Aloysius Atheos L.; Cariño, Caryll Franz M.; Marquez, Carlos Miguel T.; Mirabueno, Ma. Concepcion A. Computer Engineering Department, College of Engineering, Adamson University, 900 San Marcelino St., Ermita, Manila 1000, Philippines

ISBN: (print) 9781510681781

This research addresses urban parking challenges by allowing users to reserve parking spaces via a mobile app. The system integrates automated barriers and AI-powered cameras for accurate license plate recognition, ensuring secure and seamless parking access. It includes user registration, slot selection, plate number entry, and payment, with all data securely stored in a cloud database. Real-time notifications and a robust database management system enhance user experience and operational efficiency. Additionally, the system features automated bollards and a payment system to further streamline parking management. Experimental results show a 95% accuracy in license plate recognition, significantly improving the efficiency and security of parking reservations. This innovative approach combines mobile technology, machine learning, and automated solutions to provide a stress-free and secure parking experience. © 2024 SPIE.

Keywords: Efficiency


IoT-based nano wireless sensor approach for detection of ships using mixed convolutional neural network approach


SIGNAL IMAGE AND VIDEO PROCESSING, 2024, Vol. 18, No. 11, pp. 8185-8194

Authors: Gupta, Vishal; Rahmani, Mohammad Khalid Imam. NIT Delhi, Dept Comp Sci Engn, Delhi, India; Saudi Elect Univ, Coll Comp & Informat, Riyadh 11673, Saudi Arabia

Ships and other maritime objects are often unable to endure the harsh and dynamic sea environment. Collecting real-time data and detecting these objects using various sensors such as RADARs, Synthetic Aperture RADARs, and mounted RADARs present significant challenges due to numerous influencing factors. To address this issue, our research aims to develop an Internet of Things (IoT)-based multi-scale and multi-scene ship identification system. This system leverages a multi-scale neural network integrated with a high-response convolutional neural network (CNN)-based Kalman filter architecture. To construct this model, we selected various ship categories and initially employed a base CNN model to develop a new model with different convolutional layers. Our approach utilizes mixed methods for tracking and detecting objects, with a focus on small ships. The dataset is processed through multiple neural network layers, and we implemented the Kalman filter to estimate and predict the ships' positions. Additionally, using the YOLOv3 model, we achieved improved accuracy and reduced error rates through mathematical optimization. Our method utilizes a dataset of 5,604 samples and incorporates a hybrid approach with YOLOv3. Our model demonstrates significant improvements for both medium-sized and small ships. The proposed work provides both qualitative and quantitative advancements. Our model exceeded the best results from parallel experiments by 3.9% and 1.2% in terms of Average Precision (AP). Furthermore, YOLOv3 achieved a performance score of 97.34% across various metric parameters, while our proposed approach attained the highest scores of 97.8% and 94.87%, respectively.

Keywords: Convolutional neural network; Yolo-3; Image processing; Kalman filter; IoT
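The abstract above uses a Kalman filter to estimate and predict ship positions. The record gives no state model, so the sketch below shows only the basic predict/update cycle on a single coordinate under a random-walk assumption; a tracker of the kind described would more likely use a multi-dimensional state with velocity. The function name `kalman_1d` and the noise parameters `q` and `r` are illustrative assumptions.

```python
def kalman_1d(measurements, q=1e-3, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    q: process noise variance, r: measurement noise variance,
    x0, p0: initial estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q            # predict: state unchanged, uncertainty grows
        k = p / (p + r)      # Kalman gain: trust in the new measurement
        x = x + k * (z - x)  # update with the measurement residual
        p = (1.0 - k) * p    # uncertainty shrinks after the update
        estimates.append(x)
    return estimates

# Hypothetical detections of a stationary target at coordinate 5.0.
track = kalman_1d([5.0] * 50)
```

Each detection (for example, a bounding-box centre from the detector) would play the role of `z`; the filter smooths noisy detections and can carry the estimate forward when a detection is missing.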
