Identification of medicinal plants is essential for the development of effective medicines and the preservation of biodiversity. The goal of the artificial intelligence discipline of computer vision is to make it poss...
Transformer with self-attention has revolutionized the field of natural language processing, and has recently inspired Transformer-style architecture designs with competitive results in numerous computer vision tasks. Nevertheless, most existing designs directly employ self-attention over a 2D feature map to obtain the attention matrix from pairs of isolated queries and keys at each spatial location, leaving the rich contexts among neighboring keys under-exploited. In this work, we design a novel Transformer-style module, the Contextual Transformer (CoT) block, for visual recognition. This design fully capitalizes on the contextual information among input keys to guide the learning of a dynamic attention matrix, thus strengthening the capacity of visual representation. Technically, the CoT block first contextually encodes the input keys via a 3 x 3 convolution, yielding a static contextual representation of the inputs. We then concatenate the encoded keys with the input queries and learn a dynamic multi-head attention matrix through two consecutive 1 x 1 convolutions. The learnt attention matrix is multiplied by the input values to produce the dynamic contextual representation of the inputs. The fusion of the static and dynamic contextual representations is finally taken as the output. Our CoT block is appealing in that it can readily replace each 3 x 3 convolution in ResNet architectures, yielding a Transformer-style backbone named Contextual Transformer Networks (CoTNet). Through extensive experiments over a wide range of applications (e.g., image recognition, object detection, instance segmentation, and semantic segmentation), we validate the superiority of CoTNet as a stronger backbone.
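To make the mechanism concrete, the following is a minimal PyTorch sketch of the CoT block as described above, not the authors' implementation: the channel widths, head count, and fusion by simple addition are assumptions, and the paper's local attention is simplified here to a per-head spatial softmax.

```python
import torch
import torch.nn as nn

class CoTBlock(nn.Module):
    """Hedged sketch of a Contextual Transformer block."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        # 3 x 3 convolution contextually encodes the keys (static context).
        self.key_embed = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )
        self.value_embed = nn.Conv2d(dim, dim, kernel_size=1, bias=False)
        # Two consecutive 1 x 1 convolutions learn the dynamic multi-head
        # attention from the concatenation of encoded keys and queries.
        self.attn_embed = nn.Sequential(
            nn.Conv2d(2 * dim, dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, heads, kernel_size=1),
        )

    def forward(self, x):                        # x: (B, C, H, W), queries = x
        b, c, h, w = x.shape
        k_static = self.key_embed(x)             # static contextual repr.
        v = self.value_embed(x).view(b, self.heads, c // self.heads, h * w)
        attn = self.attn_embed(torch.cat([k_static, x], dim=1))
        attn = attn.view(b, self.heads, 1, h * w).softmax(dim=-1)
        k_dynamic = (attn * v).view(b, c, h, w)  # attention-weighted values
        return k_static + k_dynamic              # fuse static + dynamic context

y = CoTBlock(64)(torch.randn(2, 64, 14, 14))     # -> (2, 64, 14, 14)
```

Because the block preserves the input shape, it can stand in for a 3 x 3 convolution inside a residual stage, which is how a CoTNet-style backbone would be assembled.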
Classification of galaxies is traditionally associated with their morphologies through visual inspection of images. The amount of data to come renders this task inhuman, and machine learning (mainly deep learning) has been called to the rescue for more than a decade. However, the results look mixed, and there seems to be a shift away from the paradigm of the traditional morphological classification of galaxies. In this paper, I want to show that the algorithms are indeed very sensitive to the features present in images, features that do not necessarily correspond to the Hubble or de Vaucouleurs vision of a galaxy. However, this does not preclude getting the correct insights into the physics of galaxies. I have applied a state-of-the-art 'traditional' machine learning clustering tool called Fisher-EM, a latent discriminant subspace Gaussian mixture model algorithm, to 4458 galaxies carefully classified into 18 types by the EFIGI project. The optimum number of clusters given by the integrated complete likelihood criterion is 47. The correspondence with the EFIGI classification is correct, but it appears that the Fisher-EM algorithm gives great importance to the distribution of light, which translates to characteristics such as the bulge-to-disc ratio, the inclination, or the presence of foreground stars. The discrimination of some physical parameters (bulge-to-total luminosity ratio, (B-V)_T, intrinsic diameter, presence of flocculence or dust, and arm strength) is very comparable in the two classifications.
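Fisher-EM itself is distributed as an R package; as a rough Python analogue of the model-selection step only, the sketch below scans cluster counts for a plain Gaussian mixture using scikit-learn's BIC score. Note the substitutions: the paper fits a latent discriminant subspace mixture and selects 47 clusters with the integrated complete likelihood (ICL) criterion, neither of which scikit-learn implements, and the feature matrix here is random stand-in data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))   # stand-in for per-galaxy image features

# Scan the number of mixture components and keep the best criterion value
# (BIC here; the paper uses ICL, which penalizes cluster overlap as well).
best_k, best_bic = None, np.inf
for k in range(2, 51):
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(X)
    bic = gmm.bic(X)
    if bic < best_bic:
        best_k, best_bic = k, bic
print(f"selected number of clusters: {best_k}")
```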
Automatic vision-based inspection systems have played a key role in product quality assessment for decades through the segmentation, detection, and classification of defects. Historically, machine learning frameworks based on hand-crafted feature extraction, selection, and validation relied on a combined approach of parameterized image processing algorithms and explicit human knowledge. The outstanding performance of deep learning (DL) for vision systems, in automatically discovering a feature representation suitable for the corresponding task, has exponentially increased the number of scientific articles and commercial products aimed at industrial quality assessment. In this context, this article reviews more than 220 relevant articles from the related literature published until February 2023, covering the recent consolidation and advances in the field of fully automatic DL-based surface defect inspection systems deployed in various industrial applications. The analyzed papers have been classified according to a bi-dimensional taxonomy that considers both the specific defect recognition task and the employed learning paradigm. The dependency on large, high-quality labeled datasets and the different neural architectures employed to achieve an overall perception of both well-visible and subtle defects, through the supervision of fine and/or coarse data annotations, have been assessed. The results of our analysis highlight a growing research interest in enriching defect representation power, especially by transferring pre-trained layers to an optimized network and by explaining network decisions to suggest trustworthy retention or rejection of the products being evaluated.
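The "transferring pre-trained layers" strategy the review highlights can be illustrated with a short torchvision sketch; the ResNet-18 backbone, the freezing policy, and the four defect classes are illustrative assumptions, not choices made in the surveyed papers.

```python
import torch.nn as nn
from torchvision import models

# Reuse an ImageNet-pre-trained backbone and retrain only a small head
# for defect classification (a common transfer-learning recipe).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False                    # freeze transferred layers
backbone.fc = nn.Linear(backbone.fc.in_features, 4)  # e.g. 4 defect classes
# Only backbone.fc receives gradients; earlier layers supply the
# transferred feature representation.
```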
This paper presents a novel approach for enhancing vehicle safety and navigation through an integrated system for lane detection, vehicle alignment, and automatic braking using visual feedback. Our proposed system emp...
The growing popularity of vision transformers (ViTs) in remote sensing image classification is due to their ability to effectively capture long-range dependencies. However, their high computational cost and memory footprint limit their applicability, particularly for small-scale datasets and resource-constrained environments. To address these challenges, we propose the multiscale multihead compact convolutional transformer (MSHCCT), a lightweight yet powerful model that integrates convolutional tokenization with small-scale ViTs to enhance multiscale feature representation while maintaining computational efficiency. Despite a modest increase in parameters and training time, MSHCCT achieves superior classification accuracy and robustness on high-resolution aerial scenes. Importantly, our approach eliminates the need for model pretraining, additional datasets, or multisensor data fusion, ensuring a computationally efficient and practical solution for remote sensing applications. The code will be made publicly available at https://***/aj1365/MSHCCT
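As an illustration of convolutional tokenization feeding a compact transformer encoder, here is a minimal PyTorch sketch in the spirit of MSHCCT; the channel widths, depth, head count, and mean-pooling classifier are assumptions, and the model's full multiscale multihead design is not reproduced.

```python
import torch
import torch.nn as nn

class ConvTokenizer(nn.Module):
    """Strided convolutions replace patch slicing: each output position
    becomes one token with a convolutional receptive field."""

    def __init__(self, in_ch: int = 3, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim // 2, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim // 2, dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        t = self.net(x)                       # (B, dim, H/4, W/4)
        return t.flatten(2).transpose(1, 2)   # (B, N, dim) token sequence

class CompactViT(nn.Module):
    def __init__(self, dim=128, heads=4, depth=2, n_classes=10):
        super().__init__()
        self.tokenizer = ConvTokenizer(dim=dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        tokens = self.encoder(self.tokenizer(x))
        return self.head(tokens.mean(dim=1))  # pool tokens by averaging

logits = CompactViT()(torch.randn(2, 3, 64, 64))  # -> (2, 10)
```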
Multitask learning (MTL) is a challenging puzzle, particularly in the realm of computer vision (CV). Setting up vanilla deep MTL requires either hard or soft parameter sharing schemes that employ greedy search to find the optimal network designs. Despite its widespread application, the performance of MTL models is vulnerable to under-constrained parameters. In this article, we draw on the recent success of the vision transformer (ViT) to propose a multitask representation learning method called multitask ViT (MTViT), which uses a multiple-branch transformer to sequentially process the image patches (i.e., the tokens in the transformer) associated with the various tasks. Through the proposed cross-task attention (CA) module, a task token from each task branch is regarded as a query for exchanging information with the other task branches. In contrast to prior models, our proposed method extracts intrinsic features with the built-in self-attention mechanism of the ViT and requires only linear, rather than quadratic, memory and computation complexity. Comprehensive experiments are carried out on two benchmark datasets, NYU-Depth V2 (NYUDv2) and CityScapes, which show that our proposed MTViT outperforms or is on par with existing convolutional neural network (CNN)-based MTL methods. In addition, we apply our method to a synthetic dataset in which task relatedness is controlled. Surprisingly, the experimental results reveal that MTViT exhibits excellent performance when tasks are less related.
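The cross-task attention idea can be sketched as follows; the use of nn.MultiheadAttention, the embedding width, and the residual update are assumptions rather than the authors' code. Because the query is a single task token, the attention cost grows linearly with the number of tokens in the other branch, consistent with the linear-complexity claim above.

```python
import torch
import torch.nn as nn

class CrossTaskAttention(nn.Module):
    """Hedged sketch: one branch's task token queries another branch."""

    def __init__(self, dim: int = 192, heads: int = 3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, task_token, other_tokens):
        # task_token:   (B, 1, dim) from this task branch
        # other_tokens: (B, N, dim) from another task branch
        exchanged, _ = self.attn(query=task_token, key=other_tokens,
                                 value=other_tokens)
        return task_token + exchanged      # residual information exchange

ca = CrossTaskAttention()
out = ca(torch.randn(2, 1, 192), torch.randn(2, 64, 192))  # -> (2, 1, 192)
```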
Color cast, an aberration common in digital images, poses challenges in various image processing applications, affecting image quality and visual perception. This research investigates diverse methodologies for colo...
Glaucoma is a group of eye conditions that impair the optic nerve, which is responsible for sending visual data from the eye to the brain. Glaucoma impacts 3.54% of adults aged 40 to 80 around the world. Early detection of glaucoma is crucial, as it can prevent total optic nerve damage, which would cause irreversible vision loss. Specialists can diagnose glaucoma medically, but treatment options are either expensive or time-consuming and require ongoing care from medical professionals. There have been numerous initiatives to streamline all components of the glaucoma classification process; however, these models make it challenging for users to comprehend the key predictors, rendering them unreliable for use by medical experts. This study uses eye fundus images to classify glaucoma patients with three distinct deep learning techniques: a convolutional neural network (CNN), Visual Geometry Group 16 (VGG16), and the Global Context Network (GC-Net). In addition, several data pre-processing techniques are used to avoid overfitting and achieve high accuracy. This research compares and analyses the performance of the various architectures using the aforementioned techniques. The CNN model had the best accuracy, 83%, in contrast to the other deep learning models.
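The pre-processing used to curb overfitting is not detailed in the abstract; a typical torchvision augmentation pipeline for fundus images might look like the sketch below, with all transform choices and values being assumptions.

```python
from torchvision import transforms

# Resizing plus light geometric and photometric augmentation: a common
# recipe to reduce overfitting on small fundus-image datasets.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
])
# Apply train_tf to each PIL fundus image before feeding CNN/VGG16/GC-Net.
```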
Electron microscopy (EM) enables capturing high-resolution images of very small structures in biological and non-biological specimens, such as membrane proteins, viruses, subcellular structures, nanoparticles, or material surfaces. Electron microscopy plays a critical role in research, development, and diagnosis in many applications of the biological, physical, chemical, and material sciences. Thanks to advances in instrumentation, electron microscopy generates large amounts of complex data that are no longer feasible to analyze manually. There is a growing need for computational methods and tools for the automated analysis of electron microscopy data generated across a variety of research fields. Recent advances in artificial intelligence and machine learning, particularly in deep learning, have revolutionized image processing and computer vision. In this work, we explored deep-learning-guided image processing and computer vision solutions to address the growing high-performance processing needs of image data acquired using electron microscopy. The proposed solutions involved novel multi-step, 2D/3D fusion approaches to address the unique challenges of complex, low-contrast, noisy electron microscopy imagery, and self-supervised, semi-supervised, or meta-learning schemes to address the challenges caused by a lack of, or limited amounts of, labeled training data. These image analysis solutions were used for the detection, segmentation, and quantification of various biological structures of interest, such as proteins, viruses, and mitochondrial or neural structures, and non-biological structures of interest, such as carbon nanotube forests. Experiments conducted on the proposed methods showed robust and promising results towards automated, objective, and quantitative analysis of electron microscopy image data, which is of great value for biology, medicine, and material science applications.
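The abstract does not specify which self-supervised schemes were used; as one illustration of the general idea for label-scarce EM data, the sketch below pre-trains a tiny encoder on a rotation-prediction pretext task. The encoder, data, and task are stand-ins, not the work's actual method.

```python
import torch
import torch.nn as nn

# Tiny CNN encoder stand-in; the pretext head predicts one of 4 rotations.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 4),
)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

imgs = torch.randn(8, 1, 64, 64)          # stand-in unlabeled EM patches
k = torch.randint(0, 4, (8,))             # random multiples of 90 degrees
rotated = torch.stack([torch.rot90(im, int(r), dims=(1, 2))
                       for im, r in zip(imgs, k)])

# Self-supervision: the rotation label comes from the data itself.
loss = nn.functional.cross_entropy(encoder(rotated), k)
loss.backward()
opt.step()
```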