Search results - Inner Mongolia University Library

Ensemble of vision transformer architectures for efficient Alzheimer's Disease classification

BRAIN INFORMATICS, 2024, Vol. 11, No. 1, p. 25

Authors: Shaffi, Noushath; Viswan, Vimbi; Mahmud, Mufti. Affiliations: Sultan Qaboos Univ Coll Sci Dept Comp Sci POB 36 Muscat 123 Oman; Univ Technol & Appl Sci Coll Comp & Informat Sci Sohar OM311 Oman; Nottingham Trent Univ Dept Comp Sci Nottingham NG11 8NS England; Nottingham Trent Univ Med Technol Innovat Facil Nottingham NG11 England; Nottingham Trent Univ Comp & Informat Res Ctr Nottingham NG11 8NS England

Transformers have dominated the landscape of Natural Language Processing (NLP) and revolutionized generative AI applications. Vision Transformers (VT) have recently become a new state-of-the-art for computer vision applications. Motivated by the success of VTs in capturing short and long-range dependencies and their ability to handle class imbalance, this paper proposes an ensemble framework of VTs for the efficient classification of Alzheimer's Disease (AD). The framework consists of four vanilla VTs, and ensembles formed using hard and soft-voting approaches. The proposed model was tested using two popular AD datasets: OASIS and ADNI. The ADNI dataset was employed to assess the models' efficacy under imbalanced and data-scarce conditions. The ensemble of VT saw an improvement of around 2% compared to individual models. Furthermore, the results are compared with state-of-the-art and custom-built Convolutional Neural Network (CNN) architectures and Machine Learning (ML) models under varying data conditions. The experimental results demonstrated an overall performance gain of 4.14% and 4.72% accuracy over the ML and CNN algorithms, respectively. The study has also identified specific limitations and proposes avenues for future research. The codes used in the study are made publicly available.

Keywords: Vision transformer; Convolutional neural networks; Machine learning models; Alzheimer's Disease; Swin transformer; Data efficient image transformers; Bidirectional encoder representation from image transformers
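
The abstract above mentions ensembles formed by hard and soft voting. As a rough illustration only (this is not the authors' code, and the four probability vectors are made-up values), the two schemes can be sketched in plain Python. Hard voting takes the majority of the per-model labels, while soft voting averages the per-model class probabilities before picking a class, so a single very confident model can change the outcome.

```python
from collections import Counter


def hard_vote(labels):
    """Return the most frequent label among the per-model predictions."""
    # Ties resolve to the label seen first (Counter preserves insertion order).
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-model class probabilities and return the argmax class index."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical outputs of four models for one scan (class 0 vs. class 1).
probs = [[0.90, 0.10], [0.45, 0.55], [0.40, 0.60], [0.45, 0.55]]
labels = [max(range(2), key=p.__getitem__) for p in probs]

print("hard vote:", hard_vote(labels))
print("soft vote:", soft_vote(probs))
```

With these illustrative numbers, three of the four models lean slightly toward class 1, so the hard vote picks class 1, whereas the one confident model pulls the averaged probabilities toward class 0.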

Sketch-to-image synthesis via semantic masks

MULTIMEDIA TOOLS AND APPLICATIONS, 2024, Vol. 83, No. 10, pp. 29047-29066

Authors: Baraheem, Samah S.; Tam V. Nguyen. Affiliations: Umm Al Qura Univ Dept Comp Sci Al Lith Saudi Arabia; Univ Dayton Dept Comp Sci Dayton OH 45469 USA

Sketch-to-image is an important task to reduce the burden of creating a color image from scratch. Unlike previous sketch-to-image models, where the image is synthesized in an end-to-end manner, leading to an unnaturalistic image, we propose a method by decomposing the problem into subproblems to generate a more naturalistic and reasonable image. It first generates an intermediate output which is a semantic mask map from the input sketch through instance and semantic segmentation in two levels, background segmentation and foreground segmentation. Background segmentation is formed based on the context of the foreground objects. Then, the foreground segmentations are sequentially added to the created background segmentation. Finally, the generated mask map is fed into an image-to-image translation model to generate an image. Our proposed method works with 92 distinct classes. Compared to state-of-the-art sketch-to-image models, our proposed method outperforms the previous methods and generates better images.

Keywords: Sketch-to-image generation; Sketch-to-image synthesis; Computer vision; Generative adversarial networks; Instance and semantic segmentation; Machine learning

Tensor Network Methods for Hyperparameter Optimization and Compression of Convolutional Neural Networks

APPLIED SCIENCES-BASEL, 2025, Vol. 15, No. 4, p. 1852

Authors: Naumov, A.; Melnikov, A.; Perelshtein, M.; Melnikov, Ar.; Abronin, V.; Oksanichenko, F. Affiliation: Terra Quantum AG Kornhausstr 25 CH-9000 St Gallen Switzerland

Neural networks have become a cornerstone of computer vision applications, with tasks ranging from image classification to object detection. However, challenges such as hyperparameter optimization (HPO) and model compression remain critical for improving performance and deploying models on resource-constrained devices. In this work, we address these challenges using Tensor Network-based methods. For HPO, we propose and evaluate the TetraOpt algorithm against various optimization algorithms. These evaluations were conducted on subsets of the NATS-Bench dataset, including CIFAR-10, CIFAR-100, and ImageNet subsets. TetraOpt consistently demonstrated superior performance, effectively exploring the global optimization space and identifying configurations with higher accuracies. For model compression, we introduce a novel iterative method that combines CP, SVD, and Tucker tensor decompositions. Applied to ResNet-18 and ResNet-152, we evaluated our method on the CIFAR-10 and Tiny ImageNet datasets. Our method achieved compression ratios of up to 14.5x for ResNet-18 and 2.5x for ResNet-152. Additionally, the inference time for processing an image on a CPU remained largely unaffected, demonstrating the practicality of the method.

Keywords: tensor train optimisation; hyperparameter optimisation; image classification; computer vision and pattern recognition; convolutional neural networks; model compression; CP decomposition; SVD decomposition; Tucker decomposition

Computer Vision Techniques in Manufacturing

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, Vol. 53, No. 1, pp. 105-117

Authors: Zhou, Longfei; Zhang, Lin; Konz, Nicholas. Affiliations: MIT Comp Sci & Artificial Intelligence Lab 77 Massachusetts Ave Cambridge MA 02139 USA; Beihang Univ Sch Automat Sci & Elect Engn Beijing 100191 Peoples R China; Duke Univ Dept Elect & Comp Engn Durham NC 27708 USA

Computer vision (CV) techniques have played an important role in promoting the informatization, digitization, and intelligence of industrial manufacturing systems. Considering the rapid development of CV techniques, we present a comprehensive review of the state of the art of these techniques and their applications in manufacturing industries. We survey the most common methods, including feature detection, recognition, segmentation, and three-dimensional modeling. A system framework of CV in the manufacturing environment is proposed, consisting of a lighting module, a manufacturing system, a sensing module, CV algorithms, a decision-making module, and an actuator. Applications of CV to different stages of the entire product life cycle are then explored, including product design, modeling and simulation, planning and scheduling, the production process, inspection and quality control, assembly, transportation, and disassembly. Challenges include algorithm implementation, data preprocessing, data labeling, and benchmarks. Future directions include building benchmarks, developing methods for nonannotated data processing, developing effective data preprocessing mechanisms, customizing CV models, and opportunities aroused by 5G.

Keywords: Image edge detection; Image segmentation; Task analysis; Robot sensing systems; Sensors; Feature detection; Three-dimensional displays; Assembly; computer vision (CV); deep learning; inspection; machine intelligence; machine learning; manufacturing; production; robotics; survey

Skywatch: Advanced Machine Learning Techniques for Distinguishing UAVs from Birds in Airspace Security

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, Vol. 15, No. 11, pp. 1065-1078

Authors: Alqaraleh, Muhyeeddin; Alzboon, Mowafaq Salem; Al-Batah, Mohammad Subhi. Affiliations: Zarqa Univ Fac Informat Technol Dept Software Engn Zarqa Jordan; Jadara Univ Fac Sci & Informat Technol Dept Comp Sci Irbid Jordan

This study addresses the critical challenge of distinguishing Unmanned Aerial Vehicles (UAVs) from birds in real-time for airspace security in both military and civilian contexts. As UAVs become increasingly common, advanced systems must accurately identify them in dynamic environments to ensure operational safety. We evaluated several machine learning algorithms, including K-Nearest Neighbors (kNN), AdaBoost, CN2 Rule Induction, and Support Vector Machine (SVM), employing a comprehensive methodology that included data preprocessing steps such as image resizing, normalization, and augmentation to optimize training on the "Birds vs. Drone Dataset." The performance of each model was assessed using evaluation metrics such as accuracy, precision, recall, F1 score, and Area Under the Curve (AUC) to determine their effectiveness in distinguishing UAVs from birds. Results demonstrate that kNN, AdaBoost, and CN2 Rule Induction are particularly effective, achieving high accuracy while minimizing false positives and false negatives. These models excel in reducing operational risks and enhancing surveillance efficiency, making them suitable for real-time security applications. The integration of these algorithms into existing surveillance systems offers robust classification capabilities and real-time decision-making under challenging conditions. Additionally, the study highlights future directions for research in computational performance optimization, algorithm development, and ethical considerations related to privacy and surveillance. The findings contribute to both the technical domain of machine learning in security and broader societal impacts, such as civil aviation safety and environmental monitoring.

Keywords: Unmanned Aerial Vehicles (UAVs); machine learning; image recognition; real-time processing; security; computer vision; image processing
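
The abstract above lists accuracy, precision, recall, and F1 score among its evaluation metrics. As a minimal sketch (not taken from the paper; the label vectors are invented, and AUC is left out because it needs ranked scores rather than hard labels), these four metrics can be computed from confusion-matrix counts for a binary bird-vs-UAV task:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = UAV, 0 = bird.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here a false positive is a bird flagged as a UAV and a false negative is a missed UAV, which are the two error types the abstract says the models minimize.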

The JPEG Pleno Learning-Based Point Cloud Coding Standard: Serving Man and Machine

IEEE ACCESS, 2025, Vol. 13, pp. 43289-43315

Authors: Guarda, Andre F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando. Affiliations: Inst Telecomunicacoes P-1049001 Lisbon Portugal; Politecn Leiria ESTG P-2411901 Leiria Portugal; Univ Lisbon Inst Super Tecn P-1049001 Lisbon Portugal

Efficient point cloud coding has become increasingly critical for multiple applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations may functionally make the difference. Deep learning has emerged as a powerful tool in this domain, offering advanced techniques for compressing point clouds more efficiently than conventional coding methods while also allowing effective computer vision tasks performed in the compressed domain thus, for the first time, making available a common compressed visual representation effective for both man and machine. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard offering efficient lossy coding of static point clouds, targeting both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the also learning-based JPEG AI standard. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state-of-the-art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, achieving significant rate reductions. Color compression performance is less competitive but this is overcome by the power of a full learning-based coding framework for both geometry and color and the associated effective compressed domain processing.

Keywords: Transform coding; Point cloud compression; Encoding; Standards; Image coding; Three-dimensional displays; Geometry; Image color analysis; Artificial intelligence; Codecs; JPEG Pleno standard; learning-based coding; man and machine; point cloud coding

Exploring Image Transformations with Diffusion Models: A Survey of Applications and Implementation Code

9th Annual Conference on Machine Learning, Optimization and Data Science (LOD)

Authors: Arellano, Silvia; Otero, Beatriz; Tous, Ruben. Affiliation: Univ Politecn Cataluna Barcelona Spain

ISBN: (print) 9783031539657; 9783031539664

Diffusion Models have become increasingly popular in recent years and their applications span a wide range of fields. This survey focuses on the use of diffusion models in computer vision, especially in the branch of image transformations. The objective of this survey is to provide an overview of state-of-the-art applications of diffusion models in image transformations, including image inpainting, super-resolution, restoration, translation, and editing. This survey presents a selection of notable papers and repositories including practical applications of diffusion models for image transformations. The applications are presented in a practical and concise manner, facilitating the understanding of concepts behind diffusion models and how they function. Additionally, it includes a curated collection of GitHub repositories featuring popular examples of these subjects.

Keywords: Diffusion Models; Image Transformations; Applications; Computer vision; Inpainting; Restoration; Translation; Editing; Super-resolution

Application and Prospects of Artificial Intelligence (AI)-Based Technologies in Fruit Production Systems

APPLIED FRUIT SCIENCE, 2025, Vol. 67, No. 1, pp. 1-18

Authors: Dutta, Sudip Kumar; Bhutia, Birshika; Misra, Tanuj; Mishra, V. K.; Singh, S. K.; Patel, V. B. Affiliations: ICAR Res Complex Res Complex NEH Reg Sikkim Ctr Gangtok 737102 Sikkim India; Rani Lakshmi Bai Cent Agr Univ Jhansi 284003 UP India; ICAR Res Complex Res Complex NEH Reg Umiam 793103 Meghalaya India; Indian Council Agr Res Hort Sci Div KAB 2 New Delhi 110012 India

The process of cultivating soil for crop planting and domesticating animals is known as agriculture. A growing agriculture sector indicates an improving economy. Agriculture is considered as the initial pillar that supports global food safety. Additionally, it controls the majority of the global economy. Since we depend on agriculture for survival, it needs to be regularly supervised by us. In this global era of computerization, humans depend entirely on cyberspace material as it is super-fast and takes less time as compared to humans. Hence, human vision can be replicated by computer vision. Visual data and information are processed and analyzed using computer hardware and software. It covers the procedures for gathering, sending, processing, filtering, storing, and comprehending visual data. The study of computational theory can direct computer vision research, and a variety of applications offer a solid foundation and research platform. The use of machine vision has recently increased in response to the growing need for fast and precise ways to track the production of fruit. Machine learning (ML) algorithms make it possible to swiftly and reliably analyze enormous amounts of data, regardless of complexity. It is already widely used in many domains, such as credit analysis, fraud detection, defect sophisticated spam filters, picture recognition patterns, prediction models, and inspection of product features. But with so many options available, it is critical to understand the unique qualities of each approach and the optimal situation in which to apply it. In this review, we have discussed in detail the use of artificial intelligence (AI) in fruit production and summarized more than 110 research applications of AI in fruit production technology. As of now, this review is the first compilation work on the application and prospects of AI-based technology in fruit production systems. This review will provide a single-point comprehensive source of information for acad

Keywords: Artificial intelligence; Computer vision; Fruits; Machine learning; Neural network

MoistNet: Machine vision-based deep learning models for wood chip moisture content measurement

EXPERT SYSTEMS WITH APPLICATIONS, 2025, Vol. 259

Authors: Rahman, Abdur; Street, Jason; Wooten, James; Marufuzzaman, Mohammad; Gude, Veera G.; Buchanan, Randy; Wang, Haifeng. Affiliations: Mississippi State Univ Dept Ind & Syst Engn Mississippi State MS 39762 USA; Mississippi State Univ Dept Sustainable Bioprod Mississippi State MS 39762 USA; Mississippi State Univ Dept Agr & Biol Engn Mississippi State MS 39762 USA; Purdue Univ Northwest Purdue Univ Northwest Water Inst PWI Hammond IN 46323 USA; US Army Engineer Res & Dev Ctr 3909 Halls Ferry Rd Vicksburg MS 39180 USA

Quick and reliable measurement of wood chip moisture content is an everlasting problem for numerous forest-reliant industries such as biofuel, pulp and paper, and bio-refineries. Moisture content is a critical attribute of wood chips due to its direct relationship with the final product quality. Conventional techniques for determining moisture content, such as oven-drying, possess some drawbacks in terms of their time-consuming nature, potential sample damage, and lack of real-time feasibility. Furthermore, alternative techniques, including NIR spectroscopy, electrical capacitance, X-rays, and microwaves, have demonstrated potential; nevertheless, they are still constrained by issues related to portability, precision, and the expense of the required equipment. Hence, there is a need for a moisture content determination method that is instant, portable, non-destructive, inexpensive, and precise. This study explores the use of deep learning and machine vision to predict moisture content classes from RGB images of wood chips. A large-scale image dataset comprising 1,600 RGB images of wood chips has been collected and annotated with ground truth labels, utilizing the results of the oven-drying technique. Two high-performing neural networks, MoistNetLite and MoistNetMax, have been developed leveraging Neural Architecture Search (NAS) and hyperparameter optimization. The developed models are evaluated and compared with state-of-the-art deep learning models. Results demonstrate that MoistNetLite achieves 87% accuracy with minimal computational overhead, while MoistNetMax exhibits exceptional precision with a 91% accuracy in wood chip moisture content class prediction. With improved accuracy (9.6% improvement in accuracy by MoistNetMax compared to the best baseline model ResNet152V2) and faster prediction speed (MoistNetLite being twice as fast as MobileNet), our proposed MoistNet models hold great promise for the wood chip processing industry to be efficiently deployed on p

Keywords: Wood chip; Moisture content; Deep learning; Machine vision; Neural architecture search; Hyperparameter optimization

Skin image analysis for detection and quantitative assessment of dermatitis, vitiligo and alopecia areata lesions: a systematic literature review

BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, Vol. 25, No. 1, pp. 1-17

Authors: Kallipolitis, Athanasios; Moutselos, Konstantinos; Zafeiriou, Argyrios; Andreadis, Stelios; Matonaki, Anastasia; Stavropoulos, Thanos G.; Maglogiannis, Ilias. Affiliations: Univ Piraeus Dept Digital Syst Piraeus Greece; Pfizer Ctr Digital Innovat Thessaloniki Greece

Vitiligo, alopecia areata, atopic, and stasis dermatitis are common skin conditions that pose diagnostic and assessment challenges. Skin image analysis is a promising noninvasive approach for objective and automated detection as well as quantitative assessment of skin diseases. This review provides a systematic literature search regarding the analysis of computer vision techniques applied to these benign skin conditions, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The review examines deep learning architectures and image processing algorithms for segmentation, feature extraction, and classification tasks employed for disease detection. It also focuses on practical applications, emphasizing quantitative disease assessment, and the performance of various computer vision approaches for each condition while highlighting their strengths and limitations. Finally, the review denotes the need for disease-specific datasets with curated annotations and suggests future directions toward unsupervised or self-supervised approaches. Additionally, the findings underscore the importance of developing accurate, automated tools for disease severity score calculation to improve ML-based monitoring and diagnosis in dermatology.

Keywords: Skin image analysis; Benign skin lesions; Dermatitis; Alopecia Areata; Vitiligo; Machine learning
