Search Results - Inner Mongolia University Library

4th International Conference on Intelligent Systems and Pattern Recognition, ISPR 2024

Authors: Sehairi, Kamal; Bouhafs, Abdelkader; Boucherit, Ibtissam Mama; Bouwmans, Thierry; Chouireb, Fatima; Department of Physics Laboratory of Applied Sciences and Didactics École Normale Supérieure de Laghouat Laghouat Algeria; Department of Electronics Telecommunication Signals and Systems Laboratory University Amar Telidji of Laghouat Laghouat Algeria; Department of Computer Science Lab. Mathématiques Image et Applications Université de La Rochelle La Rochelle France

ISBN: (print) 9783031821523

Identifying and locating objects in images and videos, including elements like traffic signs, vehicles, buildings, and people, constitutes a fundamental and demanding task in computer vision, known as object detection. Due to the high computational complexity of this technique and the large amount of data carried by the video signal, it is nearly impossible for ordinary general-purpose processors (GPPs) or CPUs to run these techniques in real time, especially in embedded systems applications. Therefore, special hardware that can acquire, control, or execute in parallel is required. These specialized hardware systems include Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Visual Processing Units (VPUs), Tensor Processing Units (TPUs), Neural Processing Units (NPUs), and Graphics Processing Units (GPUs). This work presents the benefits of accelerating traditional object detection methods on a high-end embedded system, the Jetson Nano Developer Kit. This single-board computer is equipped with the Tegra K1 System on Chip (SoC), which is composed of a quad-core ARM A15 and an embedded Kepler GPU with 192 cores. Computing acceleration was achieved using the CUDA OpenCV library for both the Histogram of Oriented Gradients (HOG) and the Haar Cascade Classifier. For VGA resolution, results reveal that the GPU implementation on this embedded system is 1.4× faster than the CPU for the HOG method and 2× faster for the Haar Cascade Classifier method. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: System-on-chip


MKD-YOLO: Multi-Scale and Knowledge-Distilling YOLO for Efficient PPE Compliance Detection


International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Juntao Zan; Yang Fang; Qilie Liu; Uswah Khairuddin; Yan Li; Kaiwei Sun; School of Communications and Information Engineering Chongqing University of Posts and Telecommunications Chongqing China; Chongqing Key Laboratory of Image Cognition Chongqing University of Posts and Telecommunications Chongqing China; Malaysia-Japan International Institute of Technology University of Technology Malaysia Kuala Lumpur Malaysia; Department of Electrical and Computer Engineering Inha University Incheon South Korea; Key Laboratory of Data Engineering and Visual Computing Chongqing University of Posts and Telecommunications Chongqing China

ISBN: (digital) 9798350368741

ISBN: (print) 9798350368758

YOLO-based models are widely used for personal protective equipment (PPE) compliance detection due to their excellent detection performance and efficiency. However, most YOLO models are not competent for detection tasks in complex industrial scenarios such as remote surveillance and extremely small targets. In addition, there is a lack of effective model lightweighting and knowledge transfer approaches for industrial deployment. To this end, this paper proposes a Multi-scale and Knowledge-Distilling YOLO (MKD-YOLO) based on YOLOv8n for efficient PPE compliance detection. Specifically, in the backbone stage, we design an Efficient Multi-Scale Enhanced Convolution (C2f-EMSEC) module and a Large Spatial Pyramid Pooling-Fast (LSPPF) module for multi-scale and global-contextual feature learning as well as reducing model complexity. Then, in the neck stage, a refined Bidirectional Feature Pyramid Network (BPNet) is designed to capture fine-grained details for extremely small object detection. Moreover, we apply channel-wise knowledge distillation to facilitate model lightweighting and domain-specific knowledge transfer learning. Experiments on our proposed dataset and public datasets show that the proposed MKD-YOLO achieves a new state-of-the-art (SOTA) detection performance and efficiency for practical PPE compliance detection tasks. Codes and the dataset are available at https://***/z1Zjt/MKD-YOLO.
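The abstract names channel-wise knowledge distillation but does not give its formulation. As a rough illustration only, the sketch below uses one common form of the idea: each channel of a feature map is turned into a probability distribution over spatial positions with a temperature-scaled softmax, and the student is penalized by the KL divergence from the teacher's distribution. The function names, the flat-list data layout, and the `temperature` parameter are assumptions made for this example, not details taken from the MKD-YOLO paper.

```python
import math


def log_softmax(channel, temperature):
    """Log of a temperature-scaled softmax over one channel's spatial positions."""
    scaled = [v / temperature for v in channel]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return [v - log_sum for v in scaled]


def channel_wise_kd_loss(teacher, student, temperature=1.0):
    """KL(teacher || student) per channel, averaged over channels.

    teacher, student: lists of channels, each channel a flat list of
    activations (hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for t_channel, s_channel in zip(teacher, student):
        log_p = log_softmax(t_channel, temperature)
        log_q = log_softmax(s_channel, temperature)
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * total / len(teacher)


if __name__ == "__main__":
    teacher_maps = [[0.1, 2.0, 0.3, 0.0], [1.0, 1.0, 3.0, 0.5]]
    student_maps = [[0.0, 1.0, 0.5, 0.2], [0.5, 1.5, 2.0, 0.5]]
    print(channel_wise_kd_loss(teacher_maps, student_maps, temperature=2.0))
```

Because the softmax is taken per channel, the loss depends only on where each channel concentrates its activation, not on its absolute scale or offset.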

Keywords: Personal protective equipment; YOLO; Representation learning; Convolution; Surveillance; Speech enhancement; Feature extraction; Neck; Kernel; Knowledge transfer


A 3D MRI Brain Image Segmentation and Reconstruction System Based on Augmented Reality Technology


10th China Health Information Processing Conference, CHIP 2024

Authors: Zhao, Wang; Lu, Peixin; Hu, LianTing; Lu, Long; School of Management Engineering Henan University of Engineering Zhengzhou China; School of Safety Science and Emergency Management Wuhan University of Technology Wuhan China; Guangdong Provincial People’s Hospital Guangzhou China; School of Information Management Wuhan University Wuhan China; The Center for Healthcare Big Data Research The Big Data Institute Wuhan University Wuhan China; Institute of Pediatrics Guangzhou Women and Children’s Medical Center Guangzhou Medical University Guangzhou China; School of Public Health Wuhan University Wuhan China

ISBN: (print) 9789819637546

Background: With the rapid development of information technology and the digitization of medical devices, the diagnosis of many diseases relies on medical imaging equipment. At present, diagnostic imaging equipment such as CT and nuclear magnetic resonance provides two-dimensional planar images of diseases. Doctors urgently need to accurately determine the spatial location, size, and geometry of a lesion and its spatial relationship with the surrounding tissue. Therefore, it is very important to use computer technology to segment 3D MRI images, determine the location of lesions, and then perform 3D reconstruction. Method: At present, the results of automatic recognition and marking of brain images are displayed in two dimensions, so 3D visualization technology is needed for reconstruction. In addition, virtual and real content can be combined, with additional information superimposed on the brain image for integrated display. The research focus of this paper includes two main parts: disease segmentation and 3D reconstruction visualization. First, a disease segmentation method based on 3D MRI brain image files was designed, and then the feature extraction and 3D reconstruction functions were designed, thereby forming a complete process of disease region segmentation and three-dimensional reconstruction. Results: This study is based on a three-dimensional MRI brain image segmentation algorithm. The algorithm is technologically advanced, highly accurate, and can effectively identify the location of the disease. This study then used the Unity tool to implement a three-dimensional reconstruction and visual display program for brain image disease segmentation. As a result, the doctor can quickly and intuitively grasp the spatial information inside the brain and the information of the lesion.

Keywords: Image segmentation


Impact of human visual perception of color on very low bit-rate image coding


Proceedings of SPIE - The International Society for Optical Engineering, 1994, vol. 2308, pt. 1, pp. 39-46

Author: Rajala, Sarah A.; North Carolina State Univ. Raleigh NC USA

ISBN: (print) 081941638X; 9780819416384

One of the keys to obtaining acceptable quality imagery/video encoded at very low bit rates is to transmit only that information which is critical to human perception. To successfully achieve this goal, one must not only understand the human visual system, but also be able to utilize this information in the design of the codec. This paper will present an overview of the properties associated with color science and human visual perception, and how they could make an impact on very low bit-rate image coding.

Keywords: Image coding


Model-based image coding


Proceedings of SPIE - The International Society for Optical Engineering, 1994, vol. 2308, pt. 2, pp. 1035-1049

Author: Aizawa, Kiyoharu; Univ. of Tokyo Bunkyo-ku Tokyo Jpn

Image coding schemes are described from the point of view of their associated image models. Among the work related to these paradigms in the Univ. of Tokyo, 3-D model-based coding and 2-D deformable triangle based mot...

ISBN: (print) 081941638X; 9780819416384

Keywords: Image coding


CONTOUR SIMPLIFICATION BY A NEW NONLINEAR FILTER FOR REGION-BASED CODING


Conference on Visual Communications and Image Processing '94

Authors: GU, C; KUNT, M; SWISS FED INST TECHNOL, SIGNAL PROC LAB, CH-1015 LAUSANNE, SWITZERLAND

ISBN: (print) 081941638X

The guiding principle of this study is to find an optimum way to simplify the contours produced by a second generation coding scheme based on morphological segmentation. For this purpose, evaluations of existing methods for contour simplification are carried out first. Based on the human visual phenomenon, a new nonlinear filter by means of majority operation is designed to simplify the contours in order to obtain an optimum compromise between the cost for contour coding and visual quality. Applications for region-based still image coding and video coding are demonstrated. Experimental results have shown an average of 20% reduction of bits for contour coding while keeping good visual quality.
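The abstract does not specify how its majority-operation filter is defined. Purely as an illustration of the general idea, the sketch below applies a plain majority (mode) filter to a segmentation label map: each pixel takes the most frequent label in its square neighborhood, which removes small spurs and notches from region boundaries. The function name, the `radius` parameter, and the tie-breaking rule are assumptions for this example and should not be read as the authors' filter.

```python
from collections import Counter


def majority_filter(labels, radius=1):
    """Replace each label by the most frequent label in its (2r+1) x (2r+1) window.

    labels: 2-D list of integer region labels (hypothetical input layout).
    The window is clipped at the image border.
    """
    height, width = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    for y in range(height):
        for x in range(width):
            counts = Counter()
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    counts[labels[yy][xx]] += 1
            best = max(counts.values())
            # Keep the current label on ties so stable boundaries do not flip.
            if counts[labels[y][x]] < best:
                result[y][x] = min(l for l, c in counts.items() if c == best)
    return result


if __name__ == "__main__":
    segmentation = [
        [0, 0, 0, 1, 1],
        [0, 0, 1, 1, 1],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 1],
    ]
    for row in majority_filter(segmentation):
        print(row)
```

A smoother boundary generally has a shorter, more regular contour, which is the property a contour coder benefits from; the trade-off against visual quality is governed here by the window size.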

Keywords: Image segmentation; Image compression; Nonlinear filtering; Image filtering; Visualization; Image processing algorithms and systems; Mathematical morphology; Video coding; 3D image processing; Image quality


Real-time Learned Image Codec on FPGA


IEEE International Conference on Visual Communications and Image Processing (VCIP)

Authors: Sun, Heming; Yi, Qingyang; Lin, Fangzheng; Yu, Lu; Katto, Jiro; Fujita, Masahiro; Waseda Univ Tokyo Japan; Zhejiang Univ Hangzhou Zhejiang Peoples R China; JST PRESTO Kawaguchi Saitama Japan; Univ Tokyo Tokyo Japan; AIST Tsukuba Ibaraki Japan

This demo paper presents a real-time learned image codec on FPGA. Using a Xilinx VCU128, the proposed system reaches 720P@30fps coding, which is 7.76x faster than prior work.

ISBN: (print) 9781665475921


Keywords: Image coding; Neural network; FPGA


A VISUAL MODEL FOR OPTIMIZING THE DESIGN OF IMAGE PROCESSING ALGORITHMS


1994 IEEE International Conference on Image Processing (ICIP-94)

Author: DALY, S; EASTMAN KODAK CO, ROCHESTER, NY 14650

ISBN: (print) 0818669527

The paper describes an algorithm for the assessment of image fidelity. The algorithm includes an image processing model of the human visual system for luminance still imagery. The major components of the algorithm, which model the visual system as three main sensitivity variations, are described. These address the sensitivity as a function of gray level, as a function of spatial frequency, and as a function of image content. To quantify the performance of the algorithm, specific psychophysical experiments were simulated, and these results are shown. © 1994 IEEE.

Keywords: Image processing


A Marked Point Process Model For Visual Perceptual Groups Extraction


IEEE International Conference on Visual Communications and Image Processing (VCIP)

Authors: Mbarki, Amal; Naouai, Mohamed; Univ Tunis EL MANAR Fac Sci Tunis Tunis Tunisia

ISBN: (print) 9781728180687

Perceptual organization is the process of assigning each part of a scene to a specified association of features to be a part of the same organization. In the twentieth century, Gestalt psychologists formalized how image features tend to be grouped by giving a set of organizing principles. In this paper, we propose an approach for the detection of perceptual groups in an image. We are mainly interested in features grouped by the proximity law of Gestalt. We conceive an object-based model within a stochastic framework using a marked point process (MPP). We use a Bayesian learning method to extract perceptual groups in a scene. The proposed model, tested on synthetic images, demonstrates efficient detection of perceptual groups in noisy images.

Keywords: Visual perception; Perceptual organisation; Image processing; Marked point process


Learning to Fly with a Video Generator


IEEE International Conference on Visual Communications and Image Processing (VCIP) - Visual Communications in the Era of AI and Limited Resources

Authors: Chia-Chun Chung; Wen-Hsiao Peng; Teng-Hu Cheng; Chia-Hau Yu; Natl Yang Ming Chiao Tung Univ Hsinchu Taiwan

ISBN: (print) 9781728185514

This paper demonstrates a model-based reinforcement learning framework for training a self-flying drone. We implement the Dreamer proposed in a prior work as an environment model that responds to the action taken by the drone by predicting the next video frame as a new state signal. The Dreamer is a conditional video sequence generator. This model-based environment avoids the time-consuming interactions between the agent and the environment, largely speeding up the training process. This demonstration showcases for the first time the application of the Dreamer to train an agent that can finish the racing task in the AirSim simulator.

Keywords: Training; Visual communication; Image processing; Atmospheric modeling; Video sequences; Reinforcement learning; Predictive models
