Search Results - Inner Mongolia University Library

6th International Conference on Machine Vision and Applications, ICMVA 2023

ISBN: (print) 9781450399531

The proceedings contain 28 papers. The topics discussed include: on-demand multiclass imaging for sample scarcity in industrial environments; a multistage framework for detection of very small objects; exploiting self-imposed constraints on RGB and LiDAR for unsupervised training; detection of fibrillatory episodes in atrial fibrillation rhythms via topology-informed machine learning; multi-scale feature enhancement network for face forgery detection; feature consistent point cloud registration in building information modeling; integrating user gaze with verbal instruction to reliably estimate robotic task parameters in a human-robot collaborative environment; recovering image information from speckle noise by image processing; road lane segmentation using vehicle trajectory tracking and lane demarcation lines; and digital holography vs. display holography - what are their differences and what do they have in common?

Image Captioning for Information Generation

2023 International Conference on Computer Communication and Informatics, ICCCI 2023

Authors: Vohra, Gurvansh; Gupta, Lakshay; Bansal, Deepika; Gupta, Bhoomi. Affiliation: Maharaja Agrasen Institute of Technology, Department of IT, Delhi, India

ISBN: (print) 9798350348217

In AI applications for natural language definitions, image captioning is a rapidly expanding field. It attempts to capture meaningful interpretations of the interactions between picture data acquired from datasets and sentence meanings. It combines a Long Short-Term Memory (LSTM) type RNN over phrases with CNN (Convolutional Neural Network) picture-reading techniques so that the system can understand an image and describe it in a common language. This study combines computer vision and natural language processing (NLP) to create a machine that enhances visual information such as photos by giving braille-legible captions, so that people who are blind can better understand what is going on around them. © 2023 IEEE.

Keywords: Computer vision

A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, pp. 1-65

Authors: Zhou, Ce; Li, Qian; Li, Chen; Yu, Jun; Liu, Yixin; Wang, Guangjing; Zhang, Kai; Ji, Cheng; Yan, Qiben; He, Lifang; Peng, Hao; Li, Jianxin; Wu, Jia; Liu, Ziwei; Xie, Pengtao; Xiong, Caiming; Pei, Jian; Yu, Philip S.; Sun, Lichao. Affiliations: Michigan State Univ, E Lansing, MI 48824, USA; Beihang Univ, Beijing, Peoples R China; Lehigh Univ, Bethlehem, PA, USA; Macquarie Univ, Sydney, Australia; Nanyang Technol Univ, Singapore, Singapore; Univ Calif San Diego, San Diego, CA, USA; Salesforce AI Res, Palo Alto, CA, USA; Duke Univ, Durham, NC, USA; Univ Illinois, Chicago, IL, USA

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is trained on large-scale data, providing a solid parameter initialization for a wide range of downstream applications. In contrast to earlier methods that use convolution and recurrent modules for feature extraction, BERT learns bidirectional encoder representations from Transformers, trained on large datasets as contextual language models. Similarly, the Generative Pretrained Transformer (GPT) method employs Transformers as feature extractors and is trained on large datasets using an autoregressive paradigm. Recently, ChatGPT has demonstrated significant success in large language models, utilizing autoregressive language models with zero-shot or few-shot prompting. The remarkable success of PFMs has driven significant breakthroughs in AI, leading to numerous studies proposing various methods, datasets, and evaluation metrics, which increases the demand for an updated survey. This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, and other data modalities. It covers the basic components and existing pretraining methods used in natural language processing, computer vision, and graph learning, while also exploring advanced PFMs for different data modalities and unified PFMs that address data quality and quantity. Additionally, the review discusses key aspects such as model efficiency, security, and privacy, and provides insights into future research directions and challenges in PFMs. Overall, this survey aims to shed light on the research of the PFMs on scalability, security, logical reasoning ability, cross-domain learning ability, and user-friendly interactive ability for artificial general intelligence.

Keywords: Pretrained foundation models; Natural language processing; Computer vision; Graph learning; ChatGPT; BERT; GPT-4

Research on Visual Target Detection and Recognition of Shopping Robots Based on Improved YOLO Algorithm

7th International Conference on Wireless Communications, Networking and Applications, WCNA 2023

Authors: Lu, Yufan. Affiliation: Zhejiang Gongshang University, Hangzhou, China

ISBN: (print) 9789819624089

This research aims to improve the visual target detection and recognition capabilities of shopping robots in various sales environments by optimizing and improving the YOLO algorithm, in order to improve accuracy and real-time performance. The research method embeds spatial hierarchical sampling technology to adapt to image processing of different sizes, uses a separate convolutional neural network structure to reduce computational complexity, and trains a more concise network model by refining the effective data of complex models. Experimental results show that the improved YOLO algorithm performs well: its average accuracy is significantly improved under weak-light, medium-light, and strong-light environments, especially in the detection of small items. The study shows that the improved algorithm significantly improves the visual recognition capabilities of shopping assistance robots, enabling them to provide more accurate and faster services in real shopping environments. © The Author(s) 2025.

Keywords: Machine vision

Machine vision enabled characterization of defects and their fatigue effects in additively manufactured steels

Applications of Machine Learning 2024

Authors: Cotrina, J. Amorin; Uysalel, C.; Olumor, I.; Torresani, E.; Olevsky, E.; Ghazinejad, M. Affiliations: Dept. of Mechanical and Aerospace Engineering, UC San Diego, La Jolla, CA 92093, United States; Dept. of Mechanical Engineering, San Diego State University, San Diego, CA 92182, United States

ISBN: (print) 9781510679368

We characterized manufacturing-induced defects in 316L stainless steels - fabricated by direct metal laser sintering (DMLS) - and investigated their roles in the fatigue behavior of steel parts. The primary defects targeted are porosities, inner cracks, and edge cracks. We used Convolutional Neural Networks (CNNs) to detect and classify these defects and moved toward a machine vision-based metrology technique for metal additive manufacturing (AM). The Moore cyclic loading method was applied to characterize the fatigue behavior of 316L samples. The results indicate a strong correlation between the quality of additive manufacturing, defect levels, and the fatigue properties of the steel samples. Specifically, samples with lower defect levels exhibited significantly higher load endurance and longer life cycles. To further explore the influence of defects on mechanical behavior, we applied image processing techniques to measure the density, size, morphology, and location of defects in the steels. The quantification of AM defect features paves the way for a deeper understanding of microstructure - macro-behavior relations and enhanced fatigue prediction models in additively manufactured steels. © 2024 SPIE.

Keywords: Laser heating

Design and Development of Industrial Vision Sensor (IVIS) for Next Generation Industrial Applications

19th IEEE-India-Council International Conference (INDICON)

Authors: Daniel, Jerry J.; Thomas, Lijo; Lajitha, C. S.; Mathew, Jacob T.; Jithin, S.; Mohan, Anju; Kumar, Kichu S. Affiliation: C DAC, Control & Instrumentat Grp, Thiruvananthapuram, Kerala, India

ISBN: (print) 9781665473507

Industrial automation is undergoing a tremendous change due to the proliferation of concepts such as the Internet of Things (IoT), Cyber-Physical Systems (CPS) and the tactile internet, which enable the interconnection of factory floor devices and the enterprise network on a wider and more fine-grained scale. Vision sensor deployments are gaining great momentum in factories, as they improve the quality and productivity of the systems being inspected. Smart vision sensors[1] remove the need for additional infrastructure to run image processing algorithms and vision applications, by running the vision logic directly on the device and controlling or monitoring various parameters in the field based on the image processing outputs. The Industrial Vision Sensor (IVIS) is an industrial smart camera with a CMOS image sensor[2] and a powerful on-board processing system capable of supporting machine vision applications, improving product and process quality and thereby improving yield and profit. IVIS is capable of extracting application-specific information from the captured images and making decisions based on the image processing algorithms implemented on the system, to realize a stand-alone, intelligent, decision-making automation system. In this paper we present the design and development of IVIS, its application domains and preliminary test results.

Keywords: CPS; IVIS; CMOS

Intelligent Optimization of Computer Image Processing Technology Analysis

3rd EAI International Conference on Application of Big Data, Blockchain, and Internet of Things for Education Informatization, BigIoT-EDU 2023

Authors: Wei, Huayong. Affiliation: Anhui Communications Vocational and Technical College, Hefei, China

ISBN: (print) 9783031631382

An intelligent optimization algorithm is an advanced computing technique that simulates the biological evolution process in nature or the logical thinking of human beings to find a solution to a problem. In computer image processing, intelligent optimization algorithms are widely used, mainly in image enhancement, image restoration, image segmentation, feature extraction, image recognition and so on. Intelligent optimization algorithms have developed rapidly, and many excellent algorithms with different characteristics have emerged, which have achieved good results in practical applications. Image analysis is the basis for realizing machine vision, including image enhancement, image fusion, image recognition, image tracking, image retrieval and many other technologies. It is in great demand in medicine, transportation, military, aerospace and other fields. In particular, the development of many industries and fields such as intelligent robots, smart medicine, and smart cities has brought many optimization challenges to image analysis. At present, image analysis based on swarm intelligence optimization algorithms has become an important research hotspot. © ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2024.

Keywords: Image enhancement

Prediction of Handwritten Classification using CNN Techniques

9th International Conference on Signal Processing and Communication, ICSC 2023

Authors: Sanu, Kumar; Bhandari, Rahul. Affiliation: Computer Science and Engineering, Chandigarh University, Chandigarh, India

ISBN: (print) 9798350383201

Today's computer vision industry makes extensive use of image recognition. A popular method of image recognition is digit recognition. The recognition of handwritten numbers is one of the most well-known difficulties in computer vision and machine learning applications. In essence, this model proposes an online approach for recognizing handwritten digits (HDR) by utilizing convolutional neural networks (CNN). As a training sample, the technique uses the MNIST dataset, which contains centered 28x28 grayscale images of handwritten numbers. It has 10,000 test cases and 60,000 training examples. This paper shows the accuracy rate and loss of the model. Lastly, trials are executed using a variety of random handwritten 28x28 pixel digits. © 2023 IEEE.

Keywords: Computer vision

A Survey on Attention Mechanisms for Medical Applications: Are We Moving Toward Better Algorithms?

IEEE ACCESS, 2022, vol. 10, pp. 98909-98935

Authors: Goncalves, Tiago; Rio-Torto, Isabel; Teixeira, Luis F.; Cardoso, Jaime S. Affiliation: Univ Porto, Inst Syst & Comp Engn Technol & Sci, P-4200465 Porto, Portugal

The increasing popularity of attention mechanisms in deep learning algorithms for computer vision and natural language processing made these models attractive to other research domains. In healthcare, there is a strong need for tools that may improve the routines of clinicians and patients. Naturally, the use of attention-based algorithms for medical applications occurred smoothly. However, because healthcare is a domain that depends on high-stakes decisions, the scientific community must ponder whether these high-performing algorithms fit the needs of medical applications. With this motto, this paper extensively reviews the use of attention mechanisms in machine learning methods (including Transformers) for several medical applications, based on the types of tasks that may integrate several work pipelines of the medical domain. This work distinguishes itself from its predecessors by proposing a critical analysis of the claims and potentialities of attention mechanisms presented in the literature through an experimental case study on medical image classification with three different use cases. These experiments focus on the process of integrating attention mechanisms into established deep learning architectures, the analysis of their predictive power, and a visual assessment of their saliency maps generated by post-hoc explanation methods. This paper concludes with a critical analysis of the claims and potentialities presented in the literature about attention mechanisms and proposes future research lines in medical applications that may benefit from these frameworks.

Keywords: Biomedical imaging; Computer architecture; Transformers; Medical services; Deep learning; Artificial intelligence; Biomedical equipment; Computer vision; Artificial intelligence; attention mechanisms; computer vision; deep learning; medical applications; medical image analysis; transformers

Visual information perception system of coal mine comprehensive excavation working face for edge computing terminal

IET Image Processing, 2024, vol. 18, no. 12, pp. 3681-3698

Authors: Zhao, Dongyang; Su, Guoyong; Wang, Pengyu. Affiliations: Anhui Univ Sci & Technol, State Key Lab Min Response & Disaster Prevent & Co, 168 Taifeng St, Huainan, Anhui, Peoples R China; Anhui Univ Sci & Technol, Sch Mech & Elect Engn, Huainan, Peoples R China

Aiming at the problems of low detection accuracy, high computational complexity and long-time consumption of visual perception model in a complex mining environment, this research designs a visual information perception system of coal mine comprehensive excavation working face for an edge computing terminal. Firstly, the C3-Fast feature extraction module, spatial pyramid pooling with cross-stage partial connection (SPPCSPC) pooling module, bi-directional feature pyramid network and lightweight decoupled detection head are used to optimize the YOLOv5s model, so as to construct the FSBD-YOLOv5s multi-object detection model. Secondly, the pruning and distillation algorithm is used to lighten the FSBD-YOLOv5s model, and the model complexity is greatly reduced while maintaining the model detection accuracy. Further, the lightweight FSBD-YOLOv5s model is migrated and deployed to the edge computing terminal platform and the TensorRT engine is used to accelerate model inference. Finally, experiments are carried out based on the data set of the coal mine comprehensive excavation working face. The experimental results show that on the edge computing terminal platform, the parameters and computational volume of the lightweight FSBD-YOLOv5s model are reduced by 50.8% and 34.0%, while its detection accuracy and speed reach 94.0% and 43.7 fps, which can fully satisfy the requirements of the accuracy and real-time for the coal mine engineering applications. In the complex operation scene of coal mine, due to adverse environmental factors such as uneven illumination, high dust and mixed man-machine multi-target, the speed and measurement accuracy of traditional visual perception model decrease sharply. In order to solve the above problems, this study proposes to build a visual information perception system for coal mine comprehensive excavation working face for edge computing terminal and combines channel pruning algorithm, knowledge extraction algorithm and TensorRT acceleration e

Keywords: computer vision; convolutional neural nets; embedded systems; feature extraction; image recognition; object detection; visual perception
