ISBN (print): 9783031510229
The proceedings contain 92 papers. The special focus in this conference is on Image Analysis and Processing. The topics include: An Effective CNN-Based Super Resolution Method for Video Coding; Medical Transformers for Boosting Automatic Grading of Colon Carcinoma in Histological Images; FERMOUTH: Facial Emotion Recognition from the MOUTH Region; Consensus Ranking for Efficient Face Image Retrieval: A Novel Method for Maximising Precision and Recall; Towards Explainable Navigation and Recounting; Towards Facial Expression Robustness in Multi-scale Wild Environments; Depth Camera Face Recognition by Normalized Fractal Encodings; Automatic Generation of Semantic Parts for Face Image Synthesis; Improved Bilinear Pooling for Real-Time Pose Event Camera Relocalisation; Continual Source-Free Unsupervised Domain Adaptation; End-to-End Asbestos Roof Detection on Orthophotos Using Transformer-Based YOLO Deep Neural Network; OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data; UAV Multi-object Tracking by Combining Two Deep Neural Architectures; GLR: Gradient-Based Learning Rate Scheduler; A Large-Scale Analysis of Athletes’ Cumulative Race Time in Running Events; Uncovering Lies: Deception Detection in a Rolling-Dice Experiment; Active Class Selection for Dataset Acquisition in Sign Language Recognition; MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking; A Differentiable Entropy Model for Learned Image Compression; Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation; Self-Similarity Block for Deep Image Denoising; SCENE-pathy: Capturing the Visual Selective Attention of People Towards Scene Elements; Not with My Name! Inferring Artists’ Names of Input Strings Employed by Diffusion Models; Benchmarking of Blind Video Deblurring Methods on Long Exposure and Resource Poor Settings; LieToMe: An LSTM-Based Method for Deception Detection by Hand Movements; Spatial Transformer Generative Adversarial Network for Image Super
This work introduces a perspective-corrected video see-through mixed-reality head-mounted display with edge-preserving occlusion and low-latency capabilities. To realize the consistent spatial and temporal composition of a captured real world containing virtual objects, we perform three essential tasks: 1) reconstructing captured images to match the user's view; 2) occluding virtual objects with nearer real objects, to provide users with correct depth cues; and 3) reprojecting the virtual and captured scenes to be matched and to keep up with users' head motions. Captured image reconstruction and occlusion-mask generation require dense and accurate depth maps. However, estimating these maps is computationally difficult, which results in longer latencies. To obtain an acceptable balance between spatial consistency and low latency, we rapidly generated depth maps by focusing on edge smoothness and disocclusion (instead of fully accurate maps) to shorten the processing time. Our algorithm refines edges via a hybrid method involving infrared masks and color-guided filters, and it fills disocclusions using temporally cached depth maps. Our system combines these algorithms in a two-phase temporal warping architecture based upon synchronized camera pairs and displays. The first phase of warping reduces registration errors between the virtual and captured scenes. The second presents virtual and captured scenes that correspond with the user's head motion. We implemented these methods on our wearable prototype and performed end-to-end measurements of its accuracy and latency. We achieved an acceptable latency due to head motion (less than 4 ms) and spatial accuracy (less than 0.1 degrees in size and less than 0.3 degrees in position) in our test environment. We anticipate that this work will help improve the realism of mixed reality systems.
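The color-guided depth refinement mentioned in this abstract follows the general shape of a guided filter, where the color image steers smoothing of the depth map so that depth edges line up with image edges. A minimal numpy sketch of that general idea (the radius, epsilon, and function names are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def box_filter(x, r):
    """Mean over a (2r+1)x(2r+1) window via integral images, edge-padded."""
    k = 2 * r + 1
    pad = np.pad(x, r, mode="edge")
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # zero row/col so window sums are 4 lookups
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def guided_filter(guide, depth, r=4, eps=1e-3):
    """Smooth `depth` while preserving edges present in the `guide` image."""
    mean_I = box_filter(guide, r)
    mean_p = box_filter(depth, r)
    cov_Ip = box_filter(guide * depth, r) - mean_I * mean_p
    var_I = box_filter(guide * guide, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)          # local linear model: q = a*I + b
    b = mean_p - a * mean_I
    return box_filter(a, r) * guide + box_filter(b, r)
```

Where the guide is locally flat, the filter reduces to plain averaging; where the guide has an edge, the local linear model lets the filtered depth follow it.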
Video surveillance, also known as closed-circuit television (CCTV), is a fast-expanding sector that has been around for more than 30 years and has seen many technological advancements. In the modern world, ...
The spatiotemporal data of railway infrastructure plays an important role in the development of railway informatization, but existing collection technologies have problems such as low efficiency, high cost, and many l...
A prototype of a blind-assistance system that utilizes machine learning for real-time object detection and classification to help visually impaired people navigate independently without relying on external assistanc...
ISBN (print): 9781510671577; 9781510671560
Intraprocedural 3D real-time magnetic resonance imaging (MRI) provides a way for accurate and precise radiofrequency catheter targeting during ventricular tachycardia ablation. However, the limited data acquisition time needed to freeze cardiac motion results in highly undersampled k-space data that are challenging to reconstruct. In this work, we evaluated several deep learning (DL) based methods for real-time reconstruction of highly undersampled 3D real-time cardiac MRI. Algorithm reconstruction performance and speed were compared between classical algorithms and DL-based methods. Generative adversarial networks with attention layers in the generator were used to perform reconstructions in the image domain, which strived to balance reconstruction speed and image quality. In addition, variational networks were implemented by iterating data consistency in k-space and enforcing image smoothness via neural network-based regularization. In a preliminary study of heartbeat-resolved highly undersampled 3D cardiac MRI for 11 healthy volunteers, we observed that DL reconstruction methods provided good image quality with a significant increase in computational speed.
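The variational-network idea of iterating data consistency in k-space can be illustrated with a small numpy sketch: after each regularization step, measured k-space samples are re-imposed on the current image estimate. The soft-averaging weight `lam`, the placeholder smoother standing in for the learned neural regularizer, and all function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def data_consistency(x, y, mask, lam=1.0):
    """Blend the estimate's k-space with measured samples y where mask is True."""
    k = np.fft.fft2(x)
    k = np.where(mask, (k + lam * y) / (1.0 + lam), k)
    return np.fft.ifft2(k)

def unrolled_recon(y, mask, n_iters=5, lam=10.0, denoise=None):
    """Alternate a regularizer with k-space data consistency (unrolled iterations)."""
    if denoise is None:
        # stand-in for the learned regularizer: mild smoothing along one axis
        denoise = lambda im: 0.5 * im + 0.5 * np.roll(im, 1, axis=0)
    x = np.fft.ifft2(np.where(mask, y, 0.0))  # zero-filled starting estimate
    for _ in range(n_iters):
        x = data_consistency(denoise(x), y, mask, lam)
    return x
```

As `lam` grows, measured samples are enforced exactly; smaller values trade fidelity to the (noisy) measurements against the regularizer's smoothness prior.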
Face masks are necessary during the worldwide pandemic to prevent the transmission of infectious diseases. This research proposes a deep learning-based system for detecting face masks in live video feeds in real-time....
ISBN (print): 9798350384901; 9798350384895
This paper presents an innovative approach to real-time sign language recognition using Long Short-Term Memory networks (LSTM), aimed at enhancing communication accessibility for the deaf and hard-of-hearing community. We address the challenge of understanding and interpreting sign language, which is critical for millions worldwide, yet restricted to those proficient in it. Our research contributes to bridging this communication gap by developing a deep learning model capable of recognizing a broad spectrum of sign language gestures and sentences with high accuracy and speed. Utilizing a rich dataset comprising diverse sign language gestures, collected in collaboration with a professional video production studio and proficient sign language users, we employ LSTM networks integrated with Dense layers to effectively capture the complex spatial and temporal patterns of sign language. The architecture of our model is specifically designed to accommodate the nuanced dynamics of sign language, with an emphasis on real-time processing. Through rigorous training and validation, our model demonstrates an outstanding accuracy rate of 92% on a comprehensive testing dataset, alongside remarkable real-time processing capabilities. The system's efficiency in recognizing a wide array of sign gestures nearly instantaneously underscores its potential applicability in various real-world scenarios, including assistive technologies and human-computer interaction. This study not only showcases the practicality and efficacy of LSTM networks in real-time sign language recognition but also marks a significant step towards more inclusive and accessible communication technologies. Our future work includes integrating this system with the Langue des Signes Quebecoise website, further advancing the goal of universal communication accessibility.
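The LSTM-plus-Dense pattern the abstract describes — a recurrent pass over per-frame features followed by a softmax classification head — can be sketched in plain numpy. The layer sizes, parameter names, and random initialization below are illustrative assumptions, not the authors' trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hid, n_cls):
    w = lambda *shape: rng.normal(0.0, 0.1, shape)
    return {
        "W": w(4 * n_hid, n_in + n_hid),  # input/forget/cell/output gates, stacked
        "b": np.zeros(4 * n_hid),
        "Wy": w(n_cls, n_hid),            # dense classification head
        "by": np.zeros(n_cls),
    }

def classify_sequence(x_seq, p):
    """Run an LSTM over per-frame features, then softmax over gesture classes."""
    n_hid = p["b"].size // 4
    h = np.zeros(n_hid)
    c = np.zeros(n_hid)
    for x in x_seq:                       # one feature vector per video frame
        z = p["W"] @ np.concatenate([x, h]) + p["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # gated cell update
        h = sigmoid(o) * np.tanh(c)
    logits = p["Wy"] @ h + p["by"]        # classify from the final hidden state
    e = np.exp(logits - logits.max())
    return e / e.sum()                    # class probabilities
```

The gating is what lets the model carry information across the span of a gesture, which is why recurrent architectures suit sign sequences better than per-frame classifiers.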
Nowadays, analyzing football videos using computer vision techniques has attracted increasing attention. Significant event detection, football video summarization, football result prediction, statistics, etc. are exciting applications in this area. On the other hand, deep learning approaches are very successful methods for image and video analysis that need much data. Nevertheless, to the best of our knowledge, publicly available datasets in this area are small or individual, which is not enough for such deep learning-based approaches. To fill this gap, a public dataset, namely IAUFD*, was collected, annotated, and prepared for research in this direction. The IAUFD contains 100,000 real-world images from 33 football videos totaling 2,508 minutes, annotated in 10 event categories: the goal, center of the field, celebration, red card, yellow card, the ball, stadium, the referee, penalty kick, and free kick. It is believed that these moments are fundamental and useful for any high-level action or event exploration. For generalization of our dataset, we paid attention to varied weather (e.g., sunny, rainy, cloudy), seasons, times of day, and locations. We also used two deep neural networks (VggNet-13 and ResNet-18) to evaluate our proposed dataset as the baseline for future studies and comparison.
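Baseline evaluation on a multi-class event dataset like this is usually reported per class, since classes such as "goal" and "stadium" can be heavily imbalanced. A small numpy sketch of that bookkeeping, independent of the VggNet/ResNet models themselves (function names are illustrative):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t, p] counts samples of true class t predicted as class p."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_accuracy(cm):
    """Correct (diagonal) counts divided by per-class totals; 0 for empty classes."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)
```

Averaging the per-class accuracies (macro averaging) keeps rare event categories from being drowned out by frequent ones in the headline number.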
The purpose of this study is to investigate the use of 360° video technology to monitor the unsafe behaviour of workers in the construction industry. To achieve this, a survey questionnaire was designed and distr...