Search results - Inner Mongolia University Library

2023 5th International Conference on Artificial Intelligence and Computer Applications, ICAICA 2023

Authors: Li, Yuan; Yu, Xin

Affiliation: Modern Finance Industry School, Shandong Institute of Commerce and Technology, Shandong, Jinan, China

ISBN: (print) 9798350323313

In this paper, a 3D spatial imaging model for machine vision is constructed. Starting from the traditional machine vision image processing pipeline, the image denoising and target tracking steps are optimized. The method uses a camera to collect image and video information of the measured object and transmits it to the controller. The controller corrects the signal obtained by the wireless sensor in the database to reproduce the position of the measured object and the 3D image. A real-time motion trajectory tracking method based on computer vision is presented, covering autonomous object capture, 3D positioning, and motion trajectory tracking. Simulation experiments show that this method differs considerably from conventional image processing methods: it requires little computation, runs fast, and has good real-time performance. It meets the needs of embedded image processing. © 2023 IEEE.

Keywords: Image recognition

Is Grad-CAM Explainable in Medical Images?

8th International Conference on Computer Vision and Image Processing (CVIP)

Authors: Suara, Subhashis; Jha, Aayush; Sinha, Pratik; Sekh, Arif Ahmed

Affiliation: XIM Univ, Bhubaneswar, India

ISBN: (digital) 9783031581816

ISBN: (print) 9783031581809; 9783031581816

Explainable Deep Learning has gained significant attention in the field of artificial intelligence (AI), particularly in domains such as medical imaging, where accurate and interpretable machine learning models are crucial for effective diagnosis and treatment planning. Grad-CAM is a baseline that highlights the most critical regions of an image used in a deep learning model's decision-making process, increasing interpretability and trust in the results. It is applied in many computer vision (CV) tasks such as classification and explanation. This study explores the principles of Explainable Deep Learning and its relevance to medical imaging, discusses various explainability techniques and their limitations, and examines medical imaging applications of Grad-CAM. The findings highlight the potential of Explainable Deep Learning and Grad-CAM in improving the accuracy and interpretability of deep learning models in medical imaging. The code is available in (https://***/ beasthunter758/GradEML).

Keywords: Explainable Deep Learning; Gradient-weighted Class Activation Mapping (Grad-CAM); Medical Image Analysis
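The abstract above describes Grad-CAM as highlighting the image regions that matter most to a model's decision. As a minimal sketch of the commonly cited formulation (not the code from the listed repository), each feature map is weighted by the spatial average of its gradient, and the weighted sum is passed through a ReLU. The function name `grad_cam` and the toy inputs are assumptions made for illustration.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from feature maps and their gradients.

    activations, gradients: nested lists shaped [channels][height][width].
    Both are assumed to come from one convolutional layer of some model.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weight = global average of the gradient over spatial positions.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]
    # Weighted sum of the feature maps, followed by ReLU.
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return [[max(0.0, value) for value in row] for row in cam]


# Toy input (made up): two 2x2 feature maps with constant gradients.
acts = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
heatmap = grad_cam(acts, grads)
```

In practice the heatmap is usually upsampled to the input resolution and overlaid on the image; that step is omitted here.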

Model Pruning for Infrared-Visible Image Fusion

2nd International Conference on Machine Vision, Image Processing and Imaging Technology, MVIPIT 2024

Authors: Chen, Qi; Feng, Rui

Affiliation: State Grid Beijing Electric Power Company, Beijing, China

ISBN: (print) 9798331543037

Infrared-visible image fusion combines complementary information from both modalities, enhancing scene perception in applications such as surveillance and autonomous driving. However, existing deep learning-based methods are often computationally expensive. Our approach involves training an over-parameterized fusion network, applying structured pruning to reduce model complexity, and fine-tuning the pruned model to maintain performance. The pruning process leverages the L2-norm of the Restormer blocks, ensuring that less critical components are removed while preserving essential fusion quality. Experiments on the benchmark datasets demonstrate that our approach achieves high fusion quality with significantly reduced computational costs. Ablation studies further validate the effectiveness of our pruning strategy. ©2024 IEEE.

Keywords: Image fusion
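The abstract above says that pruning is guided by the L2-norm of the network's blocks. The exact criterion is not given in this record, so the following is only a sketch of one plausible reading: rank blocks by the L2-norm of their weights and drop the lowest-ranked fraction. The names `select_blocks_to_keep` and `prune_ratio`, and the flat weight lists, are hypothetical.

```python
import math


def l2_norm(weights):
    """Euclidean norm of a flat list of weights."""
    return math.sqrt(sum(w * w for w in weights))


def select_blocks_to_keep(block_weights, prune_ratio):
    """Return indices of the blocks kept after removing the lowest-norm ones.

    block_weights: one flat weight list per block (hypothetical layout).
    prune_ratio: fraction of blocks to remove, between 0 and 1.
    """
    norms = [l2_norm(weights) for weights in block_weights]
    num_pruned = int(len(block_weights) * prune_ratio)
    # Sort block indices by ascending norm; the first num_pruned are removed.
    order = sorted(range(len(norms)), key=lambda index: norms[index])
    pruned = set(order[:num_pruned])
    return [i for i in range(len(block_weights)) if i not in pruned]


# Toy example with made-up weights for four blocks.
blocks = [[3.0, 4.0], [0.1, 0.2], [1.0, 1.0], [0.0, 0.5]]
kept = select_blocks_to_keep(blocks, prune_ratio=0.5)
```

As the abstract notes, the pruned model would then be fine-tuned to recover fusion quality; that step depends on the training setup and is not shown.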

Instruct Me More! Random Prompting for Visual In-Context Learning

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Authors: Zhang, Jiahao; Wang, Bowen; Li, Liangzhi; Nakashima, Yuta; Nagahara, Hajime

Affiliation: Osaka Univ, Suita, Osaka, Japan

ISBN: (print) 9798350318920; 9798350318937

Large-scale models trained on extensive datasets have emerged as the preferred approach due to their high generalizability across various tasks. In-context learning (ICL), a popular strategy in natural language processing, uses such models for different tasks by providing instructive prompts but without updating model parameters. This idea is now being explored in computer vision, where an input-output image pair (called an in-context pair) is supplied to the model with a query image as a prompt to exemplify the desired output. The efficacy of visual ICL often depends on the quality of the prompts. We thus introduce a method coined Instruct Me More (InMeMo), which augments in-context pairs with a learnable perturbation (prompt), to explore its potential. Our experiments on mainstream tasks reveal that InMeMo surpasses the current state-of-the-art performance. Specifically, compared to the baseline without learnable prompt, InMeMo boosts mIoU scores by 7.35 and 15.13 for foreground segmentation and single object detection tasks, respectively. Our findings suggest that InMeMo offers a versatile and efficient way to enhance the performance of visual ICL with lightweight training. Code is available at https://***/Jackieam/InMeMo.

Keywords: Algorithms; Algorithms and algorithms formulations; Image recognition and understanding; Machine learning architectures
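The abstract above reports gains in mIoU. For readers unfamiliar with the metric, here is a minimal sketch of intersection-over-union on binary masks and its mean over several samples. Averaging over samples and returning 1.0 for an empty union are assumptions of this sketch, not details taken from the paper, and the mask data is made up.

```python
def iou(pred, target):
    """Intersection over union of two flat binary masks (0/1 values)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over a list of (prediction, target) mask pairs."""
    return sum(iou(pred, target) for pred, target in pairs) / len(pairs)


# Toy masks: IoU is 1/2 for the first pair and 2/3 for the second.
samples = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([0, 1, 1, 1], [0, 1, 1, 0]),
]
score = mean_iou(samples)
```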

Towards 360° image compression for machines via modulating pixel significance

Multimedia Tools and Applications, 2024, vol. 83, no. 42, pp. 90271-90288

Authors: Zheng, Silin; Shen, Xuelin; Zhang, Qiudan; Chen, Zhuo; Yang, Wenhan; Wang, Xu

Affiliations: College of Computer Science and Software Engineering, Shenzhen University, Guangdong, Shenzhen 51800, China; Guangdong, Shenzhen 51800, China; Peng Cheng Laboratory, Guangdong, Shenzhen 51800, China

The rapid growth of computer vision-based applications, including smart cities and autonomous driving, has created a pressing demand for efficient 360° image compression and computer vision analytics. In most circumstances, 360° image compression and computer vision face challenges arising from the oversampling inherent in the Equirectangular Projection (ERP). However, these two fields often employ divergent technological approaches. While image compression aims to reduce redundancy, computer vision analytics attempts to compensate for the semantic distortion caused by the projection process, resulting in a potential conflict between the two objectives. This paper explores a potential route, i.e., 360° Image Coding for Machine (360-ICM), which offers an image processing framework that addresses both object deformation and oversampling redundancy within a unified framework. The key innovation lies in inferring a pixel-wise significance map by jointly considering the requirements of redundancy removal and object deformation offsetting. The significance map is subsequently fed to a deformation-aware image compression network, guiding the bit allocation process as an external condition. More specifically, we employ a deformation-aware image compression network characterized by the Spatial Feature Transform (SFT) layer, which is capable of performing complex affine transformations of high-level semantic features and is essential in dealing with the deformation. The image compression network and significance inference network are jointly trained under the supervision of a 360° image-specified object detection network, obtaining a compact representation that is both analytics-oriented and deformation-aware. Extensive experimental results have demonstrated the superiority of the proposed method over existing state-of-the-art image codecs in terms of rate-analytics performance. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part

Machine Learning Applications: From Computer Vision to Robotics (1st ed.)

2023

Authors: Indranath Chatterjee; Sheetal Zalte

ISBN: (digital) 9781394173358

ISBN: (print) 9781394173327

Machine Learning Applications: a practical resource on the importance of machine learning and deep learning applications in various technologies and real-world situations. Machine Learning Applications discusses methodological advancements of machine learning and deep learning; presents applications in image processing, including face and vehicle detection, image classification, object detection, and image segmentation; delivers real-world applications in healthcare to identify diseases and diagnosis, such as creating smart health records and medical imaging diagnosis; and provides real-world examples, case studies, use cases, and techniques to enable the reader's active learning. Composed of 13 chapters, this book also introduces real-world applications of machine and deep learning in blockchain technology, cyber security, and climate change. AI and robotic applications in mechanical design are also discussed, including robot-assisted surgeries, security, and space exploration. The book describes the importance of each subject area and details why they are so important to us from a societal and human perspective. Edited by two highly qualified academics and contributed to by established thought leaders in their respective fields, Machine Learning Applications includes information on: content-based medical image retrieval (CBMIR), covering face and vehicle detection, multi-resolution and multisource analysis, manifold and image processing, and morphological processing; smart medicine, including machine learning and artificial intelligence in medicine, risk identification, tailored interventions, and association rules; AI and robotics applications for transportation and infrastructure (e.g., autonomous cars and smart cities), along with global warming and climate change; and identifying diseases and diagnosis, drug discovery and manufacturing, medical imaging diagnosis, personalized medicine, and smart health records. With its practical approach to the subject, Ma

Keywords: Artificial Intelligence

Hybrid framework for correcting water-to-air image sequences

APPLIED OPTICS, 2024, vol. 63, no. 33, pp. 8575-8582

Authors: Cao, Yiqian; Cai, Chengtao; Meng, Haiyang

Affiliations: Harbin Engn Univ, Coll Intelligent Syst Sci & Engn, Harbin 150001, Peoples R China; Harbin Engn Univ, Minist Educ, Key Lab Intelligent Technol & Applicat Marine Equi, Harbin 150001, Peoples R China; Heilongjiang Prov Key Lab Environm Intelligent Per, Harbin 150001, Peoples R China; Shanghai Aerosp Control Technol Inst, Shanghai 201109, Peoples R China

When an underwater camera captures aerial targets, the received light undergoes refraction at the water-air interface. In particular, calm water compresses the image, while turbulent water causes nonlinear distortion in the captured images. However, existing methods for correcting water-to-air distortion often produce images that remain distorted or shifted overall. To address this issue, we propose a multi-strategy hybrid framework to process image sequences effectively, particularly for high-precision applications. Our framework includes a spatiotemporal crossover block to transform and merge features, effectively addressing the template-free problem. Additionally, we introduce an enhancement network to produce a high-quality template in the first stage and a histogram template method to maintain high chromaticity and reduce template noise in the correction stage. Furthermore, our framework incorporates a new registration scheme to facilitate sequence transfer and processing. Compared to existing algorithms, our approach achieves a high restoration level in terms of morphology and color for publicly available image sequences. © 2024 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.

Keywords: Image enhancement; Image metrics; Light propagation; Machine vision; Segmentation; Transforms

Research on Abnormal Target Recognition of Full Information Mobile Monitoring Based on Machine Vision

6th European-Alliance-for-Innovation (EAI) International Conference on Advanced Hybrid Information Processing (ADHIP)

Authors: Wei, Yudong; Xia, Yuhong

Affiliation: Univ Elect Sci & Technol China, Chengdu Coll, Chengdu 611731, Peoples R China

ISBN: (print) 9783031288661; 9783031288678

When the color of a moving object is close to that of the background, the accuracy of moving object recognition is affected. Therefore, a moving object recognition method based on machine vision is designed. To reduce distortion at image edge positions, the moving object is calibrated and corrected by vision. To keep the influence of noise within a controllable range, the full-information mobile monitoring image is enhanced while preserving image details. The edge features obtained from the view and the template are computed using moments, and their similarity is obtained. Then the contour features of the moving monitoring target are extracted based on machine vision. After the background region is segmented, trajectory center-point information such as speed and direction is used to determine whether the trajectory represents an abnormal event. The proposed method is tested on the INRIA dataset and the Vehicle Reld dataset, and the results show that it can improve accuracy and recall and has good detection performance.

Keywords: Machine vision; Full information; Mobile monitoring; Abnormal target; Target identification; Monitoring objectives

Revolutionizing Machine Vision: Advanced Convolutional Strategies for Rapid Image Processing

13th International Conference of Information and Communication Technology, ICTech 2024

Authors: Wu, Hanlei

Affiliation: Pittsburgh Institute, Sichuan University, Chengdu, China

ISBN: (print) 9798350376258

This paper presents a comprehensive examination of innovative strategies aimed at enhancing machine vision technology, particularly in the context of energy efficiency and processing speed, critical factors for applications like facial recognition. The study focuses on three distinct approaches: an optimized two-dimensional convolution algorithm, a novel Field-Programmable Gate Array (FPGA) implementation, and advancements in multichannel meta-imagers. Firstly, the paper discusses an optimized algorithm for two-dimensional convolutions, a fundamental operation in machine vision. This advanced algorithm significantly reduces computational complexity. For instance, in executing a two-dimensional 3×3 cyclic convolution, the proposed method reduces the number of necessary multiplications from 81 to merely 13, offering a substantial improvement in efficiency. Secondly, the paper explores an innovative FPGA implementation of the two-dimensional convolution algorithm. This implementation is designed to minimize the use of shift registers, multipliers, and adders. As a result, it utilizes fewer Look-Up Tables (LUTs), leading to energy and time savings in executing the convolution process. The paper details the architecture of this FPGA-based approach and its implications for energy consumption and processing speed in machine vision applications. Finally, the paper introduces a novel technique called the Avg-Topk method, addressing a critical challenge in the pooling layer of convolutional neural networks. This method combines the benefits of average pooling with the advantages of max pooling, aiming to enhance the accuracy of the pooling layer without compromising on efficiency. The Avg-Topk method represents a significant step forward in optimizing the pooling process within machine vision systems. In summary, this paper delves into groundbreaking methods to improve the speed and energy efficiency of machine vision systems, offering valuable insights and potential solution

Keywords: Shift registers
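The abstract above describes the Avg-Topk method as combining average pooling and max pooling but gives no formula. A minimal sketch, assuming the method averages the k largest values in each pooling window (so k = 1 behaves like max pooling and k equal to the window size behaves like average pooling). The function names and the non-overlapping square windows are choices made for this illustration.

```python
def avg_topk(window, k):
    """Average of the k largest values in a flat window (k >= 1)."""
    top = sorted(window, reverse=True)[:k]
    return sum(top) / len(top)


def avg_topk_pool2d(image, size, k):
    """Apply top-k average pooling over non-overlapping size x size windows."""
    pooled = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            window = [
                image[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(avg_topk(window, k))
        pooled.append(row)
    return pooled


# Toy 4x4 input pooled with 2x2 windows.
image = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 1, 2, 2],
    [1, 1, 2, 2],
]
pooled = avg_topk_pool2d(image, size=2, k=2)
```

With k = 2 the bottom-left window, which holds one outlier (9) among ones, pools to 5.0, between the max-pooling result of 9 and the average-pooling result of 3.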

Automatic visual recognition, detection and classification of weeds in cotton fields based on machine vision

CROP PROTECTION, 2025, vol. 187

Authors: Memon, Muhammad Sohail; Chen, Shuren; Shen, Baoguo; Liang, Runzhi; Tang, Zhong; Wang, Shuai; Zhou, Weiwei; Memon, Noreena

Affiliations: Jiangsu Univ, Key Lab Modern Agr Equipment & Technol, Minist Educ, Zhenjiang 212013, Jiangsu, Peoples R China; Jiangsu Univ, Sch Agr Engn, Zhenjiang 212013, Jiangsu, Peoples R China; Sindh Agr Univ, Fac Agr Engn, Dept Farm Power & Machinery, Tandojam 70060, Pakistan; Jiangsu Aviat Tech Coll, Zhenjiang Key Lab UAV Applicat Technol, Zhenjiang 212134, Peoples R China

Crops and weeds are in continuous competition for the same resources, which may reduce crop yields by up to 31% and increase the costs of agricultural inputs by up to 22% of cultivation. Weeds further impact crop production, and their detection is crucial for effective crop management. In this research, we targeted common weeds of cotton fields, specifically i) Digitaria sanguinalis (L.) Scop, ii) Amaranthus retroflexus L., iii) Acalypha australis, L., iv) Cephalanoplos segetum, and v) Chenopodium album L. Image processing techniques such as grayscale conversion, binarization, and Gaussian and morphological filters were utilized. These methods are based on machine vision and facilitate rapid and straightforward weed detection by segmenting, scrutinizing, and comparing input images. Plant height and area were measured over 32 days of cotton planting and fitted to derive a growth law as a function of planting days, which was used to distinguish cotton from weeds. We conducted recognition experiments by dividing images into four quadrants and categorizing weeds as either inter-row or intra-row. The inter-row planting information was used to identify inter-row weeds, and leaf pixel area and circularity were used to identify intra-row weeds, which reduced the algorithm's running time and improved real-time performance. The experimental results indicated that the inter-row weed recognition rate was 89.4%, with an average processing time of 102 ms. For intra-row weeds, the recognition rate was 84.6%, and the overall recognition rate for cotton was 85.0%, with a mean processing time of 437 ms. Furthermore, the present research underscores recent advancements such as machine vision and high-resolution imaging, which have significantly improved the accuracy of automated weed identification in cotton fields while acknowledging ongoing challen

Keywords: Weed detection; Inter-row weeds; Intra-row weeds; Machine vision algorithms; Weed segmentation; Cotton crop; Precision farming
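The abstract above uses leaf pixel area and circularity to identify intra-row weeds but does not state the formula or thresholds. A minimal sketch, assuming the widely used shape descriptor 4πA/P² (1.0 for a perfect circle, smaller for elongated or irregular shapes). The function name and example shapes are illustrative only.

```python
import math


def circularity(area, perimeter):
    """Shape circularity 4*pi*A / P^2, assuming perimeter > 0."""
    return 4.0 * math.pi * area / (perimeter ** 2)


# A circle of radius 2 has circularity 1; a unit square has pi / 4.
circle_score = circularity(math.pi * 2.0 ** 2, 2.0 * math.pi * 2.0)
square_score = circularity(1.0, 4.0)
```

Any area or circularity thresholds for separating cotton leaves from weeds would need to be derived from the image data; none are given in this record.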
