With the rapid development of computer, network, and multimedia technology, multimedia data is growing exponentially. Video, as an important part of multimedia data, has a complex structure and...
ISBN:
(Print) 9798350376975; 9798350376968
Recent technological advances in virtual reality (VR) and augmented reality (AR) enable users to experience a high-quality virtual world. When experiencing the virtual world with VR, the user's entire view becomes the virtual world, and physical movement is generally limited because the user cannot see their surroundings in the real world. When experiencing the virtual world with AR, special sensors such as LiDAR are generally used to detect the real space and superimpose the virtual world on it. However, it is difficult for devices without such sensors to detect the real space and superimpose a virtual world at an appropriate position. This study proposes two methods for replacing the background: one using depth estimation and one using semantic segmentation. The study also confirms that the system achieves sufficient removal accuracy and response time when an image size appropriate for the environment is used, and that a safe and highly immersive virtual-world experience can be achieved.
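A minimal sketch of the depth-estimation variant described above, assuming a per-pixel depth map is already available from a monocular depth-estimation model: pixels beyond a distance threshold are treated as background and replaced with a frame from the virtual world. The file names, threshold value, and placeholder depth map are illustrative assumptions, not details from the paper.

import cv2
import numpy as np

def replace_background(frame_bgr, depth, virtual_bgr, depth_thresh=3.0):
    """Composite the virtual world over everything farther than depth_thresh (meters)."""
    virtual = cv2.resize(virtual_bgr, (frame_bgr.shape[1], frame_bgr.shape[0]))
    mask = depth > depth_thresh                      # True where the pixel counts as background
    out = frame_bgr.copy()
    out[mask] = virtual[mask]
    return out

if __name__ == "__main__":
    frame = cv2.imread("camera_frame.png")           # hypothetical camera frame
    virtual = cv2.imread("virtual_world.png")        # hypothetical virtual-world background
    assert frame is not None and virtual is not None
    # Placeholder depth map; in practice this would come from a depth-estimation network.
    depth = np.full(frame.shape[:2], 5.0, dtype=np.float32)
    cv2.imwrite("composited.png", replace_background(frame, depth, virtual))

The semantic-segmentation variant would work the same way, with the mask produced by a segmentation network instead of a depth threshold.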
ISBN:
(Print) 9783031456503; 9783031456510
One of the interesting fields in video processing is motion detection and human action recognition (HAR) in video. In applications where both the objects in the scene and the camera may be moving, cancelling the camera movement is very important for accurately extracting motion features. HAR systems usually use image matching/registration algorithms to remove the camera movement: the source (fixed) image frame is compared with the moved image frame, and the best match is determined geometrically. In video processing, the availability of a sequence of frames makes it possible to correct errors using previous data, but it also requires a fast frame registration algorithm. Accordingly, this article proposes a method to detect and minimize camera movement in video using phase information. In addition to acceptable speed and the ability to be implemented online, the proposed method, by combining texture and phase congruency (PC), can significantly increase the accuracy of detecting the objects in the scene. The proposed method was implemented on a HAR dataset that includes camera movement, and its ability to compensate for camera motion and preserve object motion was verified. Finally, the speed and accuracy of the proposed method were compared with a number of the latest image registration methods, and its efficiency in terms of camera movement cancellation and execution time is discussed.
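As an illustration of compensating a global camera shift with phase information, the sketch below uses OpenCV's standard phase correlation between consecutive grayscale frames; this is a simplified stand-in, not the paper's texture plus phase-congruency method. The video file name is hypothetical.

import cv2
import numpy as np

def cancel_camera_motion(prev_gray, curr_gray):
    """Estimate the global shift of curr relative to prev and warp curr to align with prev."""
    (dx, dy), _response = cv2.phaseCorrelate(np.float32(prev_gray), np.float32(curr_gray))
    # Translate the current frame by the estimated shift (sign per cv2.phaseCorrelate convention).
    M = np.float32([[1, 0, -dx], [0, 1, -dy]])
    h, w = curr_gray.shape
    return cv2.warpAffine(curr_gray, M, (w, h))

if __name__ == "__main__":
    cap = cv2.VideoCapture("har_clip.mp4")           # hypothetical HAR video with camera motion
    ok, first = cap.read()
    if not ok:
        raise SystemExit("could not open har_clip.mp4")
    prev = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        stabilized = cancel_camera_motion(prev, curr)
        motion = cv2.absdiff(stabilized, prev)       # residual motion after camera compensation
        prev = curr

A full HAR pipeline would also need to handle rotation and parallax, which pure translational phase correlation cannot model.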
A blurred image is an image that has undergone a blurring or smoothing effect, resulting in a loss of sharpness and clarity. Blurring is a technique used in image processing to reduce noise, remove unwanted details, o...
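For illustration, a blurring (smoothing) effect of the kind described can be applied with a Gaussian filter; the file names and kernel parameters below are arbitrary choices, not taken from the paper.

import cv2

img = cv2.imread("sharp_input.png")             # hypothetical input image
assert img is not None
blurred = cv2.GaussianBlur(img, (9, 9), 2.0)    # 9x9 kernel, sigma = 2.0
cv2.imwrite("blurred_output.png", blurred)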
ISBN:
(Print) 9789464593617; 9798331519773
The aim of this research is to refine knowledge transfer based on audio-image temporal agreement for audio-text cross retrieval. To address the limited availability of paired non-speech audio-text data, learning methods that transfer the knowledge acquired from a large amount of paired audio-image data to a shared audio-text representation have been investigated, suggesting the importance of how audio-image co-occurrence is learned. Conventional approaches in audio-image learning assign a single image, randomly selected from the corresponding video stream, to the entire audio clip, assuming their co-occurrence. However, this may not accurately capture the temporal agreement between the target audio and image, because a single image can only represent a snapshot of a scene while the target audio changes from moment to moment. To address this problem, we propose two methods for audio and image matching that effectively capture temporal information: (i) Nearest Match, wherein an image is selected from multiple time frames based on its similarity with the audio, and (ii) Multiframe Match, wherein audio and image pairs of multiple time frames are used. Experimental results show that method (i) improves audio-text retrieval performance by selecting the nearest image that aligns with the audio information and transferring the learned knowledge. Conversely, method (ii) improves the performance of audio-image retrieval but does not show significant improvements in audio-text retrieval performance. These results indicate that refining audio-image temporal agreement may contribute to better knowledge transfer to audio-text retrieval.
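A minimal sketch of the Nearest Match idea, assuming an audio-clip embedding and per-frame image embeddings are already available: the frame whose embedding is most similar to the audio is selected as the training pair. The encoders, embedding dimension, and tensors below are placeholders, not the paper's models.

import torch
import torch.nn.functional as F

def nearest_match(audio_emb: torch.Tensor, frame_embs: torch.Tensor) -> int:
    """audio_emb: (D,), frame_embs: (T, D). Returns the index of the best-matching frame."""
    sims = F.cosine_similarity(frame_embs, audio_emb.unsqueeze(0), dim=-1)  # (T,)
    return int(sims.argmax())

if __name__ == "__main__":
    torch.manual_seed(0)
    audio_emb = torch.randn(512)        # placeholder audio-clip embedding
    frame_embs = torch.randn(8, 512)    # placeholder embeddings of 8 video frames
    idx = nearest_match(audio_emb, frame_embs)
    print(f"frame {idx} is paired with the audio clip")

Multiframe Match would instead keep all T frame embeddings and learn from every audio-image pair rather than the single nearest one.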
ISBN:
(Print) 9798350318920; 9798350318937
The inclination of a spherical camera results in non-upright panoramic images. To carry out upright adjustment, traditional methods first estimate the camera inclination angles and then resample the image according to the estimated rotation to generate an upright image. Since resampling an image is time-consuming, a lookup table is usually used to achieve high processing speed; however, the content of a lookup table depends on the rotation angles and also requires extra memory. In this paper we propose a new approach for panorama upright adjustment that directly generates an upright panoramic image from an input non-upright one, without rotation estimation or lookup tables as intermediate steps. The proposed approach formulates panorama upright adjustment as a pixelwise image-to-image mapping problem, and the mapping is generated directly from the input non-upright panoramic image via an end-to-end neural network. As shown in the experiments of this paper, the proposed method results in a lightweight network, as small as 163 MB, with high processing speed, as fast as 9 ms, for a 256x512-pixel panoramic image.
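A toy sketch of the pixelwise image-to-image mapping formulation: a small network predicts, for every output pixel, where to sample in the non-upright panorama, and grid sampling applies that mapping. The architecture below is an illustrative stand-in, not the paper's network.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyUprightNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 3, padding=1), nn.Tanh(),   # per-pixel sampling coordinates in [-1, 1]
        )

    def forward(self, pano):                              # pano: (B, 3, H, W)
        grid = self.net(pano).permute(0, 2, 3, 1)         # (B, H, W, 2) mapping for each output pixel
        return F.grid_sample(pano, grid, align_corners=True)

if __name__ == "__main__":
    pano = torch.rand(1, 3, 256, 512)                     # placeholder non-upright panorama
    upright = ToyUprightNet()(pano)
    print(upright.shape)                                  # torch.Size([1, 3, 256, 512])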
ISBN:
(Print) 9798350304831
While the real-time analysis of dash cam video is of great practical importance for improving road safety, commercial dash cams lack the resources necessary to perform such video analytics. Using clouds for this is impractical due to high latency and high bandwidth consumption. In this paper, we present eDashA, the first edge-based system that demonstrates the potential of near real-time video analytics using a network of mobile devices on the move. In particular, it simultaneously processes videos produced by two dash cams with different angles (outward-facing and inward-facing) using one or more mobile devices on the move. Further, we devise several optimization techniques and incorporate them into eDashA: simultaneous download and analysis, scheduling, segmentation, and early stopping. We have implemented eDashA as an Android app and evaluated it using two dash cams and several heterogeneous smartphones. Experimental results show the feasibility of real-time video analytics on the move.
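A simplified sketch of the segmentation and early-stopping ideas, with worker devices simulated by a thread pool: the clip is split into segments, segments are scheduled across workers, and processing stops as soon as an event is detected. The segment analyzer is a stub; eDashA itself is an Android app and is not reproduced here.

from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_segment(segment_id):
    """Stub for per-segment analytics; returns True if an event of interest is detected."""
    return segment_id == 3                        # pretend segment 3 contains the event

def run_pipeline(num_segments, num_devices):
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        futures = {pool.submit(analyze_segment, s): s for s in range(num_segments)}
        for fut in as_completed(futures):
            if fut.result():                      # early stopping on the first detection
                for other in futures:
                    other.cancel()                # skip segments not yet started
                return futures[fut]
    return None

if __name__ == "__main__":
    hit = run_pipeline(num_segments=10, num_devices=3)
    print(f"event detected in segment {hit}")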
ISBN:
(Print) 9789819786916; 9789819786923
In surveillance video transmission, video quality is greatly degraded when transmitting at a low bit rate due to limited bandwidth. To combat this issue and enable quicker transmission without sacrificing image quality, we introduce SVRNet (short for Surveillance Video Restoration Network), a Video Super-Resolution (VSR) model tailored for enhancing downscaled and compressed videos after transmission. It incorporates a distinct "separate-process-merge" strategy that segregates the foreground and background, adaptively enhances each with a different SR model, and finally outputs the merged SR result. Furthermore, we significantly enhance video quality by incorporating a novel GTGE module as a substream architecture, employing high-resolution frames to refine the output while requiring only a minimal amount of network bandwidth. Extensive experiments demonstrate that our SVRNet and GTGE modules can effectively super-resolve surveillance videos and outperform other state-of-the-art models.
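A minimal sketch of the separate-process-merge strategy, assuming a foreground mask is available: foreground and background are upscaled by different routines and then merged. Plain interpolation stands in for SVRNet's learned SR branches; the mask and frame below are placeholders.

import cv2
import numpy as np

def separate_process_merge(lr_frame, fg_mask, scale=4):
    """lr_frame: HxWx3 uint8 low-resolution frame; fg_mask: HxW bool foreground mask."""
    h, w = lr_frame.shape[:2]
    size = (w * scale, h * scale)
    fg_sr = cv2.resize(lr_frame, size, interpolation=cv2.INTER_CUBIC)   # "heavy" foreground branch
    bg_sr = cv2.resize(lr_frame, size, interpolation=cv2.INTER_LINEAR)  # "light" background branch
    mask_up = cv2.resize(fg_mask.astype(np.uint8), size,
                         interpolation=cv2.INTER_NEAREST).astype(bool)
    return np.where(mask_up[..., None], fg_sr, bg_sr)                   # merge the two branches

if __name__ == "__main__":
    lr = np.random.randint(0, 256, (90, 160, 3), dtype=np.uint8)        # placeholder LR frame
    mask = np.zeros((90, 160), dtype=bool)
    mask[30:60, 60:100] = True                                          # placeholder foreground region
    print(separate_process_merge(lr, mask).shape)                       # (360, 640, 3)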
To detect deepfake videos, most effective detection approaches need a huge number of samples for training, including both real and fake samples. However, fake samples are not easy to obtain. To fi...
Authors:
Zou, Zhengxia; Zhao, Rui; Shi, Tianyang; Qiu, Shuang; Shi, Zhenwei
Affiliations:
Beihang Univ, Sch Astronaut, Dept Guidance Nav & Control, Beijing 100191, Peoples R China
NetEase, Fuxi AI Lab, Hangzhou 310052, Zhejiang, Peoples R China
Univ Chicago, Booth Sch Business, Chicago, IL 60637, USA
Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing Key Lab Digital Media, Beijing 100191, Peoples R China
Beihang Univ, State Key Lab Virtual Real Technol & Syst, Sch Astronaut, Beijing 100191, Peoples R China
We propose a vision-based framework for dynamic sky replacement and harmonization in videos. Unlike previous sky editing methods that either focus on static photos or require real-time pose signals from the camera's inertial measurement units, our method is purely vision-based, places no requirements on the capturing devices, and can be applied to both online and offline processing scenarios. Our method runs in real time and is free of manual interactions. We decompose video sky replacement into several proxy tasks, including motion estimation, sky matting, and image blending. We derive the motion equation of an object at infinity on the image plane under the camera's motion, and propose "flow propagation", a novel method for robust motion estimation. We also propose a coarse-to-fine sky matting network to predict an accurate sky matte and design an image blending step to improve harmonization. Experiments conducted on videos diversely captured in the wild show the high fidelity and good generalization capability of our framework in both visual quality and lighting/motion dynamics. We also introduce a new method for content-aware image augmentation and show that it is beneficial to visual perception in autonomous driving scenarios. Our code and animated results are available at https://***/jiupinjia/SkyAR
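A minimal sketch of the final blending step, assuming a sky matte in [0, 1] has already been predicted by the matting network: the new sky is alpha-blended into the frame. The matte, sky template, and frame below are placeholders; motion estimation and relighting are not shown.

import cv2
import numpy as np

def blend_sky(frame_bgr, sky_bgr, sky_matte):
    """frame_bgr, sky_bgr: HxWx3 uint8; sky_matte: HxW float in [0, 1], 1 = sky."""
    sky = cv2.resize(sky_bgr, (frame_bgr.shape[1], frame_bgr.shape[0]))
    alpha = sky_matte[..., None].astype(np.float32)
    out = alpha * sky.astype(np.float32) + (1.0 - alpha) * frame_bgr.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    frame = np.random.randint(0, 256, (256, 512, 3), dtype=np.uint8)   # placeholder video frame
    sky = np.random.randint(0, 256, (256, 512, 3), dtype=np.uint8)     # placeholder sky template
    matte = np.zeros((256, 512), dtype=np.float32)
    matte[:128] = 1.0                                                  # top half treated as sky
    cv2.imwrite("sky_replaced.png", blend_sky(frame, sky, matte))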