ISBN (print): 9781728198354
Adaptive block partitioning is responsible for large gains in current image and video compression systems. This method is able to compress large stationary image areas with only a few symbols, while maintaining a high level of quality in more detailed areas. Current state-of-the-art neural-network-based image compression systems, however, use only one scale to transmit the latent space. In previous publications, we proposed RDONet, a scheme to transmit the latent space at multiple spatial resolutions. Following this principle, we extend a state-of-the-art compression network with a second hierarchical latent-space level to enable multi-scale processing, and we extend the existing rate variability capabilities of RDONet with a gain unit. With this, we outperform an equivalent traditional autoencoder, achieving 7% rate savings. Furthermore, we show that even though we add an additional latent space, the complexity increases only marginally, and the decoding time can potentially even be decreased.
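The two-level design can be sketched roughly as follows. This is a minimal PyTorch illustration of a hierarchical (coarse + fine) latent space with a per-channel gain unit for rate variability, not the authors' RDONet implementation; all layer sizes and the gain-unit formulation are assumptions.

```python
import torch
import torch.nn as nn

class GainUnit(nn.Module):
    """Per-channel gain/inverse-gain vectors that rescale the latent around
    quantization, giving several operating rates from a single model."""
    def __init__(self, channels, num_rate_points=4):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(num_rate_points, channels))
        self.inv_gain = nn.Parameter(torch.ones(num_rate_points, channels))

    def scale(self, y, rate_idx):
        return y * self.gain[rate_idx].view(1, -1, 1, 1)

    def unscale(self, y_hat, rate_idx):
        return y_hat * self.inv_gain[rate_idx].view(1, -1, 1, 1)

class TwoLevelAutoencoder(nn.Module):
    """Toy two-level latent space: a fine latent y1 plus a spatially coarser
    latent y2; both are quantized and would be transmitted."""
    def __init__(self, ch=128):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 5, 2, 2), nn.GELU(),
                                  nn.Conv2d(ch, ch, 5, 2, 2))
        self.enc2 = nn.Conv2d(ch, ch, 5, 2, 2)  # coarser, second latent level
        self.dec2 = nn.ConvTranspose2d(ch, ch, 5, 2, 2, output_padding=1)
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(2 * ch, ch, 5, 2, 2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(ch, 3, 5, 2, 2, output_padding=1))
        self.gain1, self.gain2 = GainUnit(ch), GainUnit(ch)

    def quantize(self, y):
        # straight-through rounding (training-time noise proxy omitted)
        return y + (torch.round(y) - y).detach()

    def forward(self, x, rate_idx=0):
        y1 = self.enc1(x)
        y2 = self.enc2(y1)
        y2_hat = self.gain2.unscale(self.quantize(self.gain2.scale(y2, rate_idx)), rate_idx)
        y1_hat = self.gain1.unscale(self.quantize(self.gain1.scale(y1, rate_idx)), rate_idx)
        ctx = self.dec2(y2_hat)                      # coarse context, upsampled
        x_hat = self.dec1(torch.cat([y1_hat, ctx], dim=1))
        return x_hat, (y1_hat, y2_hat)

x = torch.randn(1, 3, 256, 256)
x_hat, latents = TwoLevelAutoencoder()(x, rate_idx=2)
print(x_hat.shape, latents[0].shape, latents[1].shape)
```

The coarse latent carries the large stationary regions cheaply, while the fine latent refines detailed areas, which mirrors the role of adaptive block partitioning in classical codecs.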
A variety of information about real-time scenes is carried by images and videos. Processing these images and videos intelligently helps in many domains such as computer vision, object detection, deep learning and 3D reconstruction, with wide usage in applications such as autopilots, augmented reality, smart vehicles, etc. The quality of images and videos plays a vital role in real-time systems. One such scenario is where images are captured without sufficient illumination. Images captured by cameras without sufficient light are noisy and suffer from information loss. Dark images have two main aspects that make their study a difficult task: their low dynamic range and their high propensity for generating high noise levels. Hence, a deep-learning-based approach is adopted. For this purpose, a Generative Adversarial Network (GAN) based Extremely Dark Video Enhancement Network (GEVE) model is proposed. The main objective of GEVE is to train the model with low-/normal-light image pairs. Thus, the GAN network learns the translation between low-light images and images captured under normal illumination, and automatically translates original images taken under extremely low-light conditions into high-quality images. It is clearly observed that the proposed GEVE outperforms the known state-of-the-art techniques. We are of the view that the proposed system is an ideal candidate for handling dark image/video frames. (C) 2021 Published by Elsevier B.V.
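A minimal sketch of the kind of paired low-/normal-light GAN training described above, assuming a pix2pix-style setup in PyTorch; the generator, discriminator and loss weights below are placeholders, not the GEVE architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Toy encoder-decoder that maps a low-light image to an enhanced one."""
    def __init__(self, ch=64):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(3, ch, 4, 2, 1), nn.LeakyReLU(0.2),
                                  nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.up = nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.ReLU(),
                                nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        return self.up(self.down(x))

class Discriminator(nn.Module):
    """PatchGAN-style critic over (low-light input, candidate output) pairs."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(6, ch, 4, 2, 1), nn.LeakyReLU(0.2),
                                 nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2),
                                 nn.Conv2d(ch * 2, 1, 4, 1, 1))

    def forward(self, low, img):
        return self.net(torch.cat([low, img], dim=1))

def train_step(G, D, opt_g, opt_d, low, normal, l1_weight=100.0):
    # --- discriminator: real pairs vs. generated pairs ---
    fake = G(low).detach()
    real_pred, fake_pred = D(low, normal), D(low, fake)
    d_loss = 0.5 * (F.binary_cross_entropy_with_logits(real_pred, torch.ones_like(real_pred))
                    + F.binary_cross_entropy_with_logits(fake_pred, torch.zeros_like(fake_pred)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- generator: fool D while staying close to the ground-truth exposure ---
    fake = G(low)
    pred = D(low, fake)
    g_loss = (F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
              + l1_weight * F.l1_loss(fake, normal))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
low, normal = torch.rand(2, 3, 128, 128), torch.rand(2, 3, 128, 128)
print(train_step(G, D, opt_g, opt_d, low, normal))
```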
Synthetic Aperture Radar (SAR) images have a wide range of applications due to their all-weather and all-day working conditions. However, SAR images with different scenarios and imaging conditions are insufficient or ...
We address the problem of multi-object 3D pose control in image diffusion models. Instead of conditioning on a sequence of text tokens, we propose to use a set of per-object representations, Neural Assets, to control ...
ISBN (print): 9798350391558; 9798350379990
This work presents a novel approach to real-time criminal detection through the use of cutting-edge face recognition technology. Accuracy and reliability, scalability, environmental variability, camera quality, and resource constraints are the major challenges of this problem. In order to improve public safety and support law enforcement, the system uses the Multi-Task Cascade Neural Network (MTCNN) to reliably identify and recognize faces in difficult situations, such as low light or obscured views. Due to MTCNN's strong deep learning capabilities, people of interest may be quickly identified, and potentially prevented from committing crimes, in busy or dimly lit settings with high-accuracy identification. Even with few reference photos, the technology can match identified faces to a database of known criminals, guaranteeing flexibility in a range of scenarios. One of its primary features is its 90% accuracy in real-time analysis of live video feeds from security cameras, which facilitates quick reactions to potential threats and improves community safety. This technology, which combines face detection, identification, and real-time processing, is a major step forward for law enforcement in their fight against crime and in maintaining community security.
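A sketch of MTCNN-based detection followed by embedding matching against a gallery of known faces. It assumes the facenet-pytorch package and a cosine-similarity threshold, neither of which is specified in the abstract; file names in the usage comment are hypothetical.

```python
import torch
from facenet_pytorch import MTCNN, InceptionResnetV1  # assumed implementation stack
from PIL import Image

device = 'cuda' if torch.cuda.is_available() else 'cpu'
mtcnn = MTCNN(keep_all=True, device=device)                  # detect and align all faces
embedder = InceptionResnetV1(pretrained='vggface2').eval().to(device)

def embed_faces(image):
    """Return L2-normalised 512-d embeddings for every face found in a PIL image."""
    faces = mtcnn(image)                                     # tensor (n, 3, 160, 160) or None
    if faces is None:
        return None
    with torch.no_grad():
        emb = embedder(faces.to(device))
    return torch.nn.functional.normalize(emb, dim=1)

def match_against_gallery(image, gallery, threshold=0.6):
    """gallery: dict name -> reference embedding (even a single photo per person).
    Returns (name, similarity) per detected face, or 'unknown' below the threshold."""
    probes = embed_faces(image)
    if probes is None:
        return []
    names = list(gallery.keys())
    refs = torch.stack([gallery[n] for n in names]).to(device)   # (m, 512)
    sims = probes @ refs.T                                       # cosine similarity
    results = []
    for row in sims:
        score, idx = row.max(dim=0)
        results.append((names[idx] if score >= threshold else 'unknown', float(score)))
    return results

# usage (hypothetical files): build the gallery once, then scan camera frames
# gallery = {'suspect_01': embed_faces(Image.open('suspect_01.jpg'))[0]}
# print(match_against_gallery(Image.open('frame.jpg'), gallery))
```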
Real-time control, diversified functions, system integration and miniaturization are important development directions of video electronics systems. Embedded design based on FPGA can manage system resources more reaso...
In recent years, with the extensive application of deep learning methods, face recognition technology has been greatly developed. Aiming at the problem of video surveillance in power network, a video surveillance meth...
ISBN (digital): 9781665496209
ISBN (print): 9781665496209
Transformer has shown outstanding performance in time-series data processing, which can definitely facilitate quality assessment of video sequences. However, the quadratic time and memory complexities of Transformer potentially impede its application to long video sequences. In this work, we study a mechanism of sharing attention across video clips in the video quality assessment (VQA) scenario. Consequently, an efficient architecture based on integrating shared multi-head attention (MHA) into Transformer is proposed for VQA, which greatly eases the time and memory burden. A long video sequence is first divided into individual clips. The quality features derived by an image quality model on each frame in a clip are aggregated by a shared MHA layer. The aggregated features across all clips are then fed into a global Transformer encoder for quality prediction at the sequence level. The proposed model with a lightweight architecture demonstrates promising performance in no-reference VQA (NR-VQA) modelling on publicly available databases. The source code can be found at https://***/junyongyou/lagt_vqa.
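The shared-attention idea can be illustrated with a short PyTorch sketch: one multi-head attention module (shared weights) aggregates frame-level quality features within each clip, and a global Transformer encoder then attends only across the much shorter sequence of clip features. Dimensions, clip length and layer counts are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class SharedMHAClipVQA(nn.Module):
    """Sketch of attention sharing across clips for NR-VQA (not the authors' code).
    Per-frame quality features are assumed to come from an image-quality backbone."""
    def __init__(self, feat_dim=512, clip_len=16, n_heads=8, n_layers=2):
        super().__init__()
        self.clip_len = clip_len
        self.clip_query = nn.Parameter(torch.randn(1, 1, feat_dim))  # learnable aggregation token
        # ONE MHA module, reused (weights shared) for every clip
        self.shared_mha = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        enc_layer = nn.TransformerEncoderLayer(feat_dim, n_heads, dim_feedforward=1024,
                                               batch_first=True)
        self.global_encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.head = nn.Linear(feat_dim, 1)                           # sequence-level quality score

    def forward(self, frame_feats):
        # frame_feats: (batch, n_frames, feat_dim); n_frames divisible by clip_len here
        b, t, d = frame_feats.shape
        clips = frame_feats.reshape(b * t // self.clip_len, self.clip_len, d)
        q = self.clip_query.expand(clips.size(0), -1, -1)
        clip_feat, _ = self.shared_mha(q, clips, clips)              # aggregate frames per clip
        clip_feat = clip_feat.reshape(b, -1, d)                      # (batch, n_clips, feat_dim)
        encoded = self.global_encoder(clip_feat)                     # attention across clips only
        return self.head(encoded.mean(dim=1)).squeeze(-1)

# toy input: 128 frames of 512-d frame-quality features for one video
feats = torch.randn(1, 128, 512)
print(SharedMHAClipVQA()(feats).shape)   # torch.Size([1])
```

Because full self-attention is never computed over all frames at once, the quadratic cost applies only within clips and across the small set of clip tokens.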
ISBN (print): 9798350395334; 9798350395327
This project introduces a transformative object detection system designed to enhance the navigational capabilities of visually impaired individuals through the application of advanced computer vision technologies. Utilizing the You Only Look Once (YOLO) model, paired with the Common Objects in Context (COCO) dataset, this system provides real-time, accurate object detection and classification. The core functionality of the application allows for the processing of both static images and live video feeds, enabling blind users to receive auditory announcements of nearby objects, thereby assisting with spatial awareness and environmental interaction. The system leverages a pre-trained YOLO model to ensure robust detection performance, achieving a peak detection accuracy of 99%. By delivering object labels and bounding box coordinates audibly, the application serves as a critical tool in improving the daily independence and quality of life for people with visual impairments. This project not only highlights the potential of deep learning in assistive technologies but also underscores the importance of adaptive solutions in inclusive technology development.
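A sketch of the detect-and-announce loop, assuming an off-the-shelf COCO-pretrained YOLO model (here via the ultralytics package) and pyttsx3 for offline speech output; the abstract does not name a specific implementation stack, so these library choices are assumptions.

```python
import cv2
import pyttsx3
from ultralytics import YOLO   # assumed YOLO implementation; the paper does not name one

model = YOLO("yolov8n.pt")     # COCO-pretrained weights (80 classes)
tts = pyttsx3.init()           # offline text-to-speech engine

def announce_objects(frame, conf_threshold=0.5):
    """Detect objects in a frame and speak their labels with a coarse position cue."""
    result = model(frame, verbose=False)[0]
    announcements = []
    for box in result.boxes:
        conf = float(box.conf[0])
        if conf < conf_threshold:
            continue
        label = model.names[int(box.cls[0])]
        x1, _, x2, _ = box.xyxy[0].tolist()
        # crude left/ahead/right cue from the horizontal box centre
        cx = (x1 + x2) / 2 / frame.shape[1]
        side = "left" if cx < 0.33 else "right" if cx > 0.66 else "ahead"
        announcements.append(f"{label} {side}")
    if announcements:
        tts.say(", ".join(announcements))
        tts.runAndWait()
    return announcements

# live video loop (webcam index 0); static images can be passed to announce_objects directly
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    announce_objects(frame)
cap.release()
```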
Hyperspectral imaging and artificial intelligence (AI) have transformed imaging and data processing through their ability to capture and analyze detailed spectral information. This paper explores the integration of hy...