ISBN (Print): 9798350355291; 9798350355284
In the advanced field of image processing and computer vision (IP/CV), there is a trend toward utilising parallel processing in computer architectures for enhanced efficiency, striking a balance between general-purpose capabilities and hardware-specific processes. The RISC-V standard, now backed by a wide array of compilers, frameworks, and operating systems, is paving the way for innovative cores. Our introduction of a Multi-Processor System on Chip (MPSoC), MPRISC-V, is a testament to this evolution. This system incorporates a Network on Chip (NoC) for robust intra-chip communication. The Processing System (PS) seamlessly integrates and manages it through a user-friendly API crafted to simplify the development cycle. To ascertain its effectiveness, we tested it on a Zynq UltraScale+ MPSoC device, deploying a Sobel-based application benchmark. By evaluating its efficiency in terms of cycles/pixel, our findings underscore its potential and spotlight areas ripe for further enhancement.
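As an illustration of the benchmark described above (not the authors' code), the sketch below shows a plain-Python Sobel gradient-magnitude filter and the cycles/pixel efficiency metric; the function names and the pure-software formulation are our own assumptions.

```python
def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel kernels on a 2-D list of numbers.
    Only the valid interior region is returned (no border padding)."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel
    h, w = len(img), len(img[0])
    out = []
    for i in range(h - 2):
        row = []
        for j in range(w - 2):
            gx = sum(img[i + u][j + v] * kx[u][v] for u in range(3) for v in range(3))
            gy = sum(img[i + u][j + v] * ky[u][v] for u in range(3) for v in range(3))
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out

def cycles_per_pixel(cycle_count, height, width):
    """Normalise a raw cycle count by image size (the paper's efficiency metric)."""
    return cycle_count / (height * width)
```

A vertical step edge produces the expected maximal horizontal gradient, and the metric simply divides the measured cycle count by the pixel count.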
ISBN (Print): 9798331541859; 9798331541842
Facial expression generation in computer vision is essential for improving human-computer interaction by enabling machines to interpret and respond to human emotions effectively. This area has attracted considerable research interest. In this context, we introduce a new approach for generating facial expressions from a single neutral image and a target expression label. Our method, referred to as the Motion-Oriented Diffusion Model (MODM), leverages latent diffusion techniques, which are known for their ability to learn complex latent spaces and integrate controlled stochasticity to diversify generated content. The main idea of MODM is separating the embedding space into identity and motion domains, and applying diffusion to the motion latent space only. This strategy enhances our model's capability to generate various facial expressions while ensuring that the identity details remain consistent across different expressions. To assess the effectiveness of MODM, we perform qualitative and quantitative evaluations using the MUG facial expression database. The preliminary results demonstrate that MODM can generate realistic videos of the six basic facial expressions, preserving the identity of the input subject while accurately representing different emotional states. Additionally, our study highlights promising directions for potential future research and improvements.
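The core idea of diffusing only the motion part of the latent space can be sketched as follows; this is a minimal, hypothetical illustration (the split point `id_dim`, the additive-noise schedule, and the function names are our assumptions, not MODM's actual implementation).

```python
import random

def split_latent(z, id_dim):
    """Partition a latent vector into identity and motion components."""
    return z[:id_dim], z[id_dim:]

def noise_motion_only(z, id_dim, t, sigma=1.0):
    """One illustrative forward-diffusion step: Gaussian noise is added to
    the motion part only; the identity part passes through untouched, so
    identity details remain fixed while expressions are diversified."""
    ident, motion = split_latent(z, id_dim)
    noised = [m + sigma * t * random.gauss(0.0, 1.0) for m in motion]
    return ident + noised
```

Because noise never touches the first `id_dim` coordinates, identity is preserved by construction regardless of the noise level `t`.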
Image classification is one of the main areas of computer vision, which is important in applications like self-driving automotive/vehicle systems. While working with image/video data, it requires huge amounts of resources...
Depth information is useful in many image processing and computer vision applications, but in photography, depth information is lost in the process of projecting a real-world scene onto a 2D plane. Extracting depth in...
ISBN (Print): 9781665493468
Machine learning-based algorithms using fully convolutional networks (FCNs) have been a promising option for medical image segmentation. However, such deep networks silently fail if input samples are drawn far from the training data distribution, thus causing critical problems in automatic data-processing pipelines. To overcome such out-of-distribution (OoD) problems, we propose a novel OoD score formulation and its regularization strategy by applying an auxiliary add-on classifier to an intermediate layer of an FCN, where the auxiliary module is helpful for analyzing the encoder output features by taking their class information into account. Our regularization strategy trains the module along with the FCN via the principle of outlier exposure, so that our model can be trained to distinguish OoD samples from normal ones without modifying the original network architecture. Our extensive experimental results demonstrate that the proposed approach can successfully conduct effective OoD detection without loss of segmentation performance. In addition, our module can provide reasonable explanation maps along with OoD scores, which can enable users to analyze the reliability of predictions.
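The paper proposes its own OoD score formulation; as an illustrative stand-in, the common maximum-softmax-probability baseline over the auxiliary classifier's logits can be sketched as follows (the function names are ours, and the actual formulation in the paper differs).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ood_score(aux_logits):
    """Generic OoD score from an auxiliary classifier head: negative maximum
    softmax probability, so higher scores flag more anomalous inputs."""
    return -max(softmax(aux_logits))
```

A confident prediction (one dominant logit) yields a score near -1, while a uniform, uncertain prediction scores higher, flagging the sample as more likely out-of-distribution.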
Recent technological advancements have significantly improved indoor autonomous vision systems (IAvSs), underscoring the critical need to enhance their capability to interpret real-world environments in a manner similar to human perception. In response to this challenge, this paper introduces DEADFL-UNet, a groundbreaking framework that enhances the existing EADFL-UNet architecture. EADFL-UNet utilized the EfficientNetB3 model, supplemented by a new Super Attention Block and CBW-FL Loss Function, to tackle the significant data imbalance found in the NYUv2 dataset. Our enhancement focuses on using the MobileNetv2 model in conjunction with several fine-tuning techniques to maximize depth characteristics in tandem with RGB ones inside the prior architecture. By applying the proposed techniques, we achieved an improvement of approximately 6% in mIoU (Mean Intersection over Union) compared to the original EADFL-UNet model, which was previously published. Furthermore, the difference between the fine-tuned and non-fine-tuned versions is 1.91% in mIoU, demonstrating the significant effectiveness of the fine-tuning technique. To confirm the real-time FPS (Frames Per Second) performance of each model, this technique has undergone extensive testing and assessment using standard metrics, not only on pre-existing datasets but also in a ROS2 (Robot Operating System 2) simulation environment. These proven techniques have potential for various applications in autonomous systems, such as robotic vision, GPS (Global Positioning System) position tracking, autonomous vehicles, and security, improving accuracy and efficiency.
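The mIoU metric used in the comparison above can be computed as follows; this is the standard textbook definition over flattened label lists, not code from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union over flattened label lists; classes
    absent from both prediction and ground truth are skipped so they do
    not distort the average."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

Per-class IoU is intersection over union of the predicted and ground-truth masks; mIoU averages these, so a class mispredicted on a few pixels still drags the mean down noticeably on imbalanced datasets like NYUv2.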
Point source detection algorithms play a pivotal role across diverse applications, influencing fields such as astronomy, biomedical imaging, environmental monitoring, and beyond. This article reviews the algorithms used for space imaging applications from ground and space telescopes. The main difficulties in detection arise from the incomplete knowledge of the impulse function of the imaging system, which depends on the aperture, atmospheric turbulence (for ground-based telescopes), and other factors, some of which are time-dependent. Incomplete knowledge of the impulse function decreases the effectiveness of the algorithms. In recent years, deep learning techniques have been employed to mitigate this problem and have the potential to outperform more traditional approaches. The success of deep learning techniques in object detection has been observed in many fields, and recent developments can further improve the accuracy. However, deep learning methods are still in the early stages of adoption and are used less frequently than traditional approaches. In this review, we discuss the main challenges of point source detection, as well as the latest developments, covering both traditional and current deep learning methods. In addition, we present a comparison between the two approaches to better demonstrate the advantages of each methodology.
The rapid development of machine vision applications demands hardware that can sense and process visual information in a single monolithic unit to avoid redundant data transfer. Here, we design and demonstrate a monolithic vision enhancement chip with light-sensing, memory, digital-to-analog conversion, and processing functions by implementing 619 pixels with 8582 transistors and physical dimensions of 10 mm by 10 mm based on a wafer-scale two-dimensional (2D) monolayer molybdenum disulfide (MoS2). The light-sensing function with analog MoS2 transistor circuits offers low noise and high photosensitivity. Furthermore, we adopt a MoS2 analog processing circuit to dynamically adjust the photocurrent of individual imaging sensors, which yields a high dynamic light-sensing range greater than 90 decibels. The vision chip enables applications such as contrast enhancement and noise reduction in image processing. This large-scale monolithic chip based on 2D semiconductors shows multiple functions with light sensing, memory, and processing for artificial machine vision applications, exhibiting the potential of 2D semiconductors for future electronics.
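As a software analogy (not the chip's analog circuitry), a linear contrast stretch and the decibel dynamic-range figure quoted above can be sketched as follows; the function names and the simple min/max stretch are our assumptions.

```python
import math

def stretch_contrast(pixels, out_max=255):
    """Linear contrast stretch: map the input range [min, max] onto
    [0, out_max]. A digital analogue of per-pixel gain adjustment."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0 for _ in pixels]   # flat input: nothing to stretch
    return [round((p - lo) * out_max / (hi - lo)) for p in pixels]

def dynamic_range_db(i_max, i_min):
    """Dynamic range in decibels from the ratio of the largest to the
    smallest detectable photocurrent: 20 * log10(i_max / i_min)."""
    return 20.0 * math.log10(i_max / i_min)
```

A >90 dB range corresponds to a max/min photocurrent ratio above roughly 3 x 10^4, which is why per-sensor gain adjustment is needed to keep bright and dark regions simultaneously usable.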
ISBN (Print): 9798350318920; 9798350318937
Monocular depth estimation is an important step in many downstream tasks in machine vision. We address the topic of estimating monocular depth from defocus blur, which can yield more accurate results than semantics-based depth estimation methods. Existing monocular depth-from-defocus techniques are sensitive to the particular camera that the images are taken with. We show how several camera-related parameters affect the defocus blur using optical-physics equations and how they make the defocus blur depend on these parameters. The simple correction procedure we propose can alleviate this problem and does not require any retraining of the original model. We created a synthetic dataset which can be used to test the camera-independent performance of depth-from-defocus models. We evaluate our model on both synthetic and real datasets (DDFF12 and NYU Depth v2) obtained with different cameras and show that our methods are significantly more robust to changes of camera. Code: https://github.com/sleekEagle/defocus_***
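The camera dependence of defocus blur can be illustrated with the standard thin-lens circle-of-confusion formula; this is textbook optics, and whether the paper uses exactly this parameterization is our assumption.

```python
def coc_diameter(f, n, s_focus, s_obj):
    """Circle-of-confusion diameter from the thin-lens model:
        c = f^2 * |s_obj - s_focus| / (n * s_obj * (s_focus - f))
    f: focal length, n: f-number, s_focus: focus distance,
    s_obj: object distance (all lengths in the same units).
    The blur depends on f and n, which is exactly the camera dependence
    that a depth-from-defocus model must be corrected for."""
    return (f * f * abs(s_obj - s_focus)) / (n * s_obj * (s_focus - f))
```

An object at the focus distance is perfectly sharp (c = 0), and opening the aperture (smaller f-number) enlarges the blur for the same scene, so two cameras produce different blur-to-depth mappings.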
ISBN (Print): 9798350318920; 9798350318937
Within (semi-)automated visual industrial inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small, pixel-sized defect patterns on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To alleviate this issue and advance the current state of the art in unsupervised visual inspection, this work proposes a DifferNet-based solution enhanced with attention modules: AttentDifferNet. It improves image-level detection and classification capabilities on three visual anomaly detection datasets for industrial inspection: InsPLAD-fault, MVTec AD, and Semiconductor Wafer. In comparison to the state of the art, AttentDifferNet achieves improved results, which are, in turn, highlighted throughout our qualitative and quantitative study. Our quantitative evaluation shows an average improvement over DifferNet of 1.77 +/- 0.25 percentage points in overall AUROC across all three datasets, reaching SOTA results on InsPLAD-fault, an industrial inspection in-the-wild dataset. As our variants of AttentDifferNet show great prospects in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for industrial anomaly detection both in the wild and in controlled environments.
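The AUROC figure reported above can be computed with the standard rank-based (Mann-Whitney) estimator; this is a generic sketch, not the authors' evaluation code.

```python
def auroc(scores_anomalous, scores_normal):
    """Rank-based AUROC: the probability that a randomly chosen anomalous
    sample scores higher than a randomly chosen normal one, with ties
    counted as 0.5. Equivalent to the Mann-Whitney U statistic, normalised."""
    wins = 0.0
    for a in scores_anomalous:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_anomalous) * len(scores_normal))
```

Perfect separation of anomalous and normal scores yields 1.0, and a percentage-point improvement in AUROC, as reported, directly measures better ranking of defective over defect-free images.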