Search results - Inner Mongolia University Library

2023 International Conference on Image, Signal Processing, and Pattern Recognition, ISPP 2023

Authors: Wang, Zhengshuai; Qiu, Liankui; Li, Yinggang. Affiliations: College of Information Engineering, Henan University of Science and Technology, Luoyang, Henan 471023, China

ISBN: (print) 9781510666351

At present, most unmanned aerial vehicle (UAV) smoke detection systems transmit video back to a ground station computer for analysis to determine whether a fire has occurred. Because image transmission takes time and is subject to various sources of interference, both the response time of smoke detection and the computational load of subsequent image processing increase. To reduce the response time, this paper proposes a smoke detection method that runs on the UAV itself. An improved YUV color model is used to filter the video images acquired by the UAV and divide them into blocks, and the spatiotemporal and dynamic features of smoke are extracted. These features are trained and classified using a support vector machine (SVM) to detect the presence of smoke in the video image. Experimental results show that, compared with commonly used smoke detection methods, the accuracy of smoke detection is significantly improved and the response time is greatly reduced. © 2023 SPIE.

Keywords: Smoke
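
As a rough illustration of the color-filtering step described in the abstract above, the following Python sketch converts RGB pixels to a luma/chroma representation and flags greyish, mid-brightness pixels as smoke candidates. It is not the paper's improved YUV model, which the abstract does not specify. The conversion is the standard BT.601 full-range (JFIF) form, and the function names and thresholds (`chroma_tol`, `y_min`, `y_max`) are hypothetical values chosen only for illustration.

```python
def rgb_to_yuv(r, g, b):
    """BT.601 full-range conversion (JFIF form, chroma centered on 128)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, u, v


def is_smoke_candidate(r, g, b, chroma_tol=12.0, y_min=80.0, y_max=220.0):
    """Hypothetical rule: smoke is assumed to be greyish (low chroma)
    and neither very dark nor saturated white."""
    y, u, v = rgb_to_yuv(r, g, b)
    low_chroma = abs(u - 128.0) <= chroma_tol and abs(v - 128.0) <= chroma_tol
    return y_min <= y <= y_max and low_chroma


def candidate_fraction(block):
    """Fraction of smoke-candidate pixels in a block of (r, g, b) tuples."""
    if not block:
        return 0.0
    return sum(is_smoke_candidate(*px) for px in block) / len(block)


# A tiny four-pixel block: grey, saturated red, light grey, near black.
block = [(150, 150, 150), (200, 30, 30), (160, 158, 162), (20, 20, 20)]
fraction = candidate_fraction(block)
```

Under these assumptions, a block whose candidate fraction exceeds some chosen threshold would be passed on to feature extraction and a classifier such as an SVM, while the remaining blocks would be discarded early.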

Automated Pseudo-Label Generation and Parallel Computing for Enhanced Few-Shot Medical Image Segmentation

2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference

Authors: Trong-Duc Nguyen; Tien-Dung Do; Thanh-Ha Do. Affiliations: VNU Univ Sci, Hanoi, Vietnam; Post & Telecommun Inst Technol PTIT, Hanoi, Vietnam

ISBN: (digital) 9798350367331

ISBN: (print) 9798350367331; 9798350367348

Few-shot semantic segmentation is a technique with significant potential for medical image segmentation tasks. Most existing few-shot semantic segmentation methods require fully annotated labels for training. However, these methods may not be suitable for medical images, where data collection and labeling are challenging. To address this issue, this paper proposes an enhanced few-shot semantic segmentation model with a new pre-processing step that generates pseudo-labels automatically. Parallel computing is also used to accelerate image pre-processing. Experiments on MRI image datasets demonstrate the effectiveness of the new approach, which outperforms conventional few-shot semantic segmentation methods.

Keywords: Training; Computational modeling; Semantic segmentation; Magnetic resonance imaging; Manuals; Parallel processing; Real-time systems; Labeling; Optimization; Biomedical imaging

Laboratory equipment image data patrol device based on Internet of Things

2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition (CVIPPR)

Authors: Zhang, Zhangliyong; Liang, Liangyanxin; Meng, Mengjing; Duan, Duanxiaomeng; Yang, Yangyubo; Zang, Zhangbo. Affiliations: Beijing Elect Power Sci & Smart Chip Technol Co, Beijing, Peoples R China; China Elect Power Res Inst, Beijing, Peoples R China

ISBN: (print) 9798400716607

With the continuous development of Internet of Things technology, laboratory equipment management is gradually becoming intelligent and remote. Aiming at the data inspection of laboratory equipment, this paper proposes a laboratory equipment image data patrol device based on Internet of Things technology. Through the acquisition, processing and transmission of equipment image data, real-time monitoring and evaluation of equipment operation status and performance are realized. The research has reference value for improving the management efficiency and operation performance of laboratory equipment.

Keywords: Internet of Things (IoT) technology; laboratory equipment management; image data inspection; remote monitoring; equipment performance evaluation

Real-time optical flow processing on embedded GPU: an hardware-aware algorithm to implementation strategy

Journal of Real-Time Image Processing, 2022, vol. 19, no. 2, pp. 317-329

Authors: Seznec, Mickael; Gac, Nicolas; Orieux, Francois; Naik, Alvin Sashala. Affiliations: Thales Res & Technol, Palaiseau, France; Univ Paris Saclay, Lab Signaux & Syst, Cent Supelec, CNRS, Gif Sur Yvette, France

Determining the optical flow of a video is a compute-intensive task essential for computer vision. To achieve this processing in real time, the whole algorithm deployment chain must be designed for efficiency first. The development is usually divided into two parts: first, designing an algorithm that meets precision constraints; then, implementing and optimizing its execution on the targeted platform. We argue that unifying those operations enhances performance on the embedded processor. This paper is based on an industrial use case of computer vision. The objective is to determine dense optical flow in real time on an embedded GPU platform: the Nvidia AGX Xavier. The CLG (combined local-global) optical flow method, initially chosen, is analyzed to understand the convergence speed of its underlying optimization problem. The Jacobi solver is selected for implementation because of its parallel nature. The whole multi-level processing is then ported to the GPU, using several specific optimization strategies. In particular, we analyze the impact of fusing the solver's iterations with the roofline model. As a result, with a 30 W power budget, our implementation runs at 60 FPS on 640 x 512 images with four-level processing. Hopefully, this example should provide feedback on the issues that arise when trying to port a method to a parallel platform and serve for further implementations of computer vision algorithms on specialized hardware.

Keywords: Algorithm design; Optical flow; GPU optimization; Linear solvers; Image processing
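
The abstract above notes that the Jacobi solver was chosen for its parallel nature. The short Python sketch below shows why on a generic small linear system: each component of the new iterate depends only on the previous iterate, so all components can be updated independently. This is a plain CPU illustration under assumed inputs (a hypothetical 3 x 3 diagonally dominant system), not the paper's CLG optical-flow system or its GPU implementation.

```python
def jacobi(a, b, iterations=50):
    """Approximate the solution of a x = b by Jacobi iteration.

    Convergence is guaranteed here only under the assumption that
    `a` is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Every x_new[i] reads only the previous iterate x, never the
        # partially updated one, so the n updates are independent and
        # could be assigned to separate threads.
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x


# Hypothetical test system whose exact solution is (1, 2, 3).
a = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
solution = jacobi(a, b)
```

By contrast, a Gauss-Seidel sweep reuses freshly updated components within the same iteration, which typically converges in fewer iterations but introduces the data dependencies that make a straightforward parallel mapping harder.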

Onboard Person Retrieval System With Model Compression: A Case Study on Nvidia Jetson Orin AGX

IEEE Access, 2025, vol. 13, pp. 8257-8269

Authors: Chaudhari, Jay N.; Galiyawala, Hiren; Sharma, Paawan; Shukla, Pancham; Raval, Mehul S. Affiliations: Ahmedabad Univ, Sch Engn & Appl Sci, Ahmadabad 380009, India; RyDOT Infotech Pvt Ltd, Ahmadabad 380027, India; Pandit Deendayal Energy Univ, Sch Technol, Gandhinagar 382007, India; Imperial Coll London, London SW7 2AZ, England

A person retrieval system (PRS) in video surveillance identifies an individual based on descriptive attributes, a task that employs several computationally intensive deep learning models. We implement and analyse a PRS for pre-recorded videos on a graphics processing unit (GPU) and Nvidia Jetson Orin AGX. This paper presents a new Person Attribute Recognition (PAR) architecture, CorPAR, using three backbone networks: ConvNeXt, ResNet-50, and EfficientNet-B0. It enhances the F1-score by 4.1% with ConvNeXt-Base, by 1.63% with ResNet-50, and by 8.07% with EfficientNet-B0, surpassing the performance of the state-of-the-art Weighted-PAR method. The proposed method uses model compression techniques such as quantisation and pruning with L1 regularisation to assess their impact on person retrieval. The study reveals that the PRS utilising EfficientNet-B0 with 32-bit quantisation achieves the best performance, delivering a throughput of 22 frames per second and a True Positive Rate of 71% on Nvidia Jetson Orin AGX, matching the performance of a model implemented using a GPU.

Keywords: Clothing; Image edge detection; Surveillance; Quantization (signal); Real-time systems; Performance evaluation; Image color analysis; Graphics processing units; Videos; Computational modeling; Edge device; model compression; person attribute recognition; person retrieval; pruning; quantization; surveillance

REAL-TIME CAPABILITY OF DLR'S BEAMFORMING SYNTHETIC APERTURE RADAR PROCESSING ARCHITECTURE

49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Schlemon, Maron; Schulz, Martin; Scheiber, Rolf; Jaeger, Marc; Oliva, Joel Amao. Affiliations: German Aerosp Ctr DLR, Microwaves & Radar Inst, Cologne, Germany; Tech Univ Munich, Chair Comp Architecture & Parallel Syst, Munich, Germany

ISBN: (print) 9798350374520; 9798350374513

Synthetic Aperture Radar (SAR) enables the generation of realistic and high-resolution 2D or 3D representations of landscapes. Typically, radar instruments are deployed in specially equipped, low-flying aircraft that capture a significant amount of raw data, necessitating image reconstruction processing. However, the aircraft's limited onboard processing capabilities (power, size, weight, cooling, and communication bandwidth to ground stations) and the need to generate multiple SAR products, such as slant-range and geo-coded images during a single flight, require efficient onboard processing and transmission to the ground station. This paper outlines the processing architecture of the digital beamforming SAR (DBFSAR) employed by the German Aerospace Center (DLR) and the specific measures implemented to enable onboard processing. We elucidate the essential software optimizations and their integration into the SAR onboard routines, facilitating (near) real-time capability under certain conditions. Furthermore, we share the insights gained from our work and discuss their applicability to other processing scenarios with limited resource availability.

Keywords: Real-Time SAR; Resource-Constrained Computing; On-Board SAR Processing

Simultaneous context and motion learning in video prediction

Signal, Image and Video Processing, 2023, vol. 17, no. 8, pp. 3933-3942

Authors: Vu, Duc-Quang; Thu, Trang Phung T. Affiliations: Thai Nguyen Univ Educ, Dept CSIS, Thai Nguyen, Vietnam; Natl Cent Univ, Dept CSIE, Taoyuan, Taiwan; Thai Nguyen Univ, Thai Nguyen, Vietnam

Video prediction aims to generate future frames from several given past frames. It has many applications, including abnormal action recognition, future traffic prediction, long-term planning and autonomous driving. Recently, various deep learning-based methods have been proposed to address this task. However, these methods seem to focus only on increasing network performance and ignore computational cost. Several methods even require two separate networks to handle two different input types such as RGB, temporal gradient and optical flow. This makes them increasingly complex and demands an extremely high computational cost and memory space. In this paper, we introduce a simple yet robust approach that learns appearance and motion features simultaneously in a single network, regardless of the diversity of input video modalities. Moreover, we present a lightweight autoencoder network for addressing this issue. Our framework is evaluated on various benchmarks such as the KTH, KITTI and BAIR datasets. The experimental results show that our approach achieves competitive performance compared to state-of-the-art video prediction methods with only 34.24MB of memory space and 2.59GFLOPs. With a smaller model size and lower computational cost, our framework achieves a shorter inference time than the other methods. Moreover, needing only 2.934 s to predict the next frame, our framework is a promising approach for real-time deployment on embedded or mobile devices without a GPU.

Keywords: Video prediction; Simultaneous context and motion learning; Future frame prediction

Design and implementation of polarizing optical image frame grabber

2024 International Conference on Image, Signal Processing, and Pattern Recognition, ISPP 2024

Authors: Huang, Shi-Zhao; Wang, En-Liang. Affiliations: Anhui Xinhua University, No. 555 Wangjiang West Road, Hefei, Anhui 230088, China

ISBN: (print) 9781510680425

A polarization image frame capture device based on the Camera Link interface is proposed and implemented. The device adopts a large-capacity buffer, multi-bus switching and DSP technology, which solves the problems of multi-channel asynchronous parallel image acquisition and high-speed real-time polarization image synthesis. It can capture and synthesize larger polarization image frames in real time. © 2024 SPIE.

Keywords: Cameras

MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ma, Haoyu; Mahdizadehaghdam, Shahin; Wu, Bichen; Fan, Zhipeng; Gu, Yuchao; Zhao, Wenliang; Shapira, Lior; Xie, Xiaohui. Affiliations: Univ Calif Irvine, Irvine, CA 92697, USA; Meta GenAI, Menlo Pk, CA, USA; Natl Univ Singapore, Singapore, Singapore

ISBN: (print) 9798350353013; 9798350353006

Recent advances in generative AI have significantly enhanced image and video editing, particularly in the context of text prompt control. State-of-the-art approaches predominantly rely on diffusion models to accomplish these tasks. However, the computational demands of diffusion-based methods are substantial, often necessitating large-scale paired datasets for training, and therefore challenging the deployment in real applications. To address these issues, this paper breaks down the text-based video editing task into two stages. First, we leverage a pre-trained text-to-image diffusion model to simultaneously edit a few keyframes in a zero-shot way. Second, we introduce an efficient model called MaskINT, which is built on non-autoregressive masked generative transformers and specializes in frame interpolation between the edited keyframes, using the structural guidance from intermediate frames. Experimental results suggest that our MaskINT achieves comparable performance with diffusion-based methodologies, while significantly improving the inference time. This research offers a practical solution for text-based video editing and showcases the potential of non-autoregressive masked generative transformers in this domain.

Keywords: masked transformers; video editing; video generation

Lane Detection for Autonomous Vehicles using Image Transformation Techniques

5th IEEE International Conference for Emerging Technology, INCET 2024

Authors: Shettar, Poonam M.; Jadhav, Abhijeet; Karoshi, Divya; Aishwarya, G.; Kulkarni, Akash. Affiliations: SoECE, KLE Technological University, Hubballi, India

ISBN: (print) 9798350361155

Lane detection is critical in autonomous driving and advanced driver assistance systems (ADAS), furnishing vital information for vehicle navigation and safety. The study introduces a lane detection methodology leveraging established image processing techniques within image transformation frameworks, including the study of kernels. This approach accurately detects lane markings in real-time images or video streams with the help of Gaussian blur and Canny edge detection. The system's evaluation primarily focuses on structured roads under standard conditions, demonstrating its efficacy in such environments. However, the potential for enhancing vehicle autonomy and safety across varied driving scenarios remains prominent. Leveraging image transformation capabilities and the insights gained from kernel studies, this research advances computer vision applications in the automotive sector, facilitating the evolution of more intelligent and adaptable driving ***, the study introduces a method for assessing the accuracy of the detected lanes by calculating the intersection over union (IOU). © 2024 IEEE.

Keywords: Autonomous vehicles
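
The abstract above mentions scoring detected lanes by intersection over union (IoU). A minimal Python sketch of that metric on binary masks follows. The mask representation (nested lists of 0/1 values), the function name `iou`, and the convention of returning 1.0 when both masks are empty are assumptions made here, not details taken from the paper.

```python
def iou(pred, truth):
    """Intersection over union of two same-sized binary masks.

    Each mask is a list of rows, and each row is a list of 0/1 values.
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2 x 4 masks: the predicted lane is shifted one column
# to the left of the ground-truth lane.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0]]
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
score = iou(pred, truth)
```

In this example the masks share two pixels and cover six pixels in total, so the score is 2/6. A score of 1.0 means the detected lane region coincides exactly with the reference region.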
