Search results - Inner Mongolia University Library

Semantic segmentation and deep CNN learning vision-based crack recognition system for concrete surfaces: development and implementation

Signal, Image and Video Processing, 2025, vol. 19, no. 4, pp. 1-15

Authors: Abbas, Yassir M.; Alghamdi, Hussam. Affiliation: King Saud Univ, Coll Engn, Dept Civil Engn, Riyadh 12372, Saudi Arabia

The enhancement of machine learning (ML) models relies heavily on the volume and integrity of data, emphasizing the importance of efficient data collection and rigorous ground-truth labeling. While deep learning, particularly its subset convolutional neural network (CNN), has shown promise in crack detection, there remains a need for sophisticated algorithms to identify structural defects accurately. This study presents a novel deep CNN model tailored for the binary classification of concrete surfaces, addressing a significant need in infrastructure engineering. The deep CNN model was developed using a comprehensive dataset (40,000 images each measuring 227 x 227 pixels). Various metrics, including precision, sensitivity, binary accuracy, and F1 score, were utilized to evaluate the model's performance. Additionally, the model's generalization capability was assessed by testing its proficiency in accurately classifying unseen data. The study demonstrates that the model's predictive performance improves with additional epochs, indicating enhanced learning over learning cycles. Validation metrics suggest potential generalization capability despite slight accuracy declines, showcasing the model's robustness in accurately classifying positive instances. The findings reveal significant advancements in deep CNN models for concrete material classification, surpassing previous comparable models. Employing CNN models holds promising outcomes for quality control and repair processes in infrastructure engineering applications. Future research directions include exploring the application of the deep CNN model to classify alternative materials and assessing its generalization capability using larger and more diverse datasets. Overall, this study contributes to the advancement of ML techniques in infrastructure engineering, with implications for optimizing material classification processes and enhancing infrastructure repair outcomes.

Keywords: Concrete surface classification; Convolutional neural network; Deep learning; Image recognition; Machine learning
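
The abstract above names precision, sensitivity, binary accuracy, and F1 score as its evaluation metrics. As a minimal sketch of how these four quantities are conventionally derived from binary predictions (the function name `binary_metrics` and the toy labels are illustrative assumptions, not taken from the paper):

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, sensitivity (recall), accuracy and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {"precision": precision, "sensitivity": sensitivity,
            "accuracy": accuracy, "f1": f1}


# Toy labels (hypothetical): 1 = cracked surface, 0 = intact surface.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

With one missed crack and one false alarm among eight samples, all four metrics in this toy case come out equal, which is a coincidence of the example rather than a general property.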

MoistNet: machine vision-based deep learning models for wood chip moisture content measurement

Expert Systems with Applications, 2025, vol. 259

Authors: Rahman, Abdur; Street, Jason; Wooten, James; Marufuzzaman, Mohammad; Gude, Veera G.; Buchanan, Randy; Wang, Haifeng. Affiliations: Mississippi State Univ, Dept Ind & Syst Engn, Mississippi State, MS 39762, USA; Mississippi State Univ, Dept Sustainable Bioprod, Mississippi State, MS 39762, USA; Mississippi State Univ, Dept Agr & Biol Engn, Mississippi State, MS 39762, USA; Purdue Univ Northwest, Purdue Univ Northwest Water Inst PWI, Hammond, IN 46323, USA; US Army Engineer Res & Dev Ctr, 3909 Halls Ferry Rd, Vicksburg, MS 39180, USA

Quick and reliable measurement of wood chip moisture content is a long-standing problem for numerous forest-reliant industries such as biofuel, pulp and paper, and bio-refineries. Moisture content is a critical attribute of wood chips due to its direct relationship with the final product quality. Conventional techniques for determining moisture content, such as oven-drying, are time-consuming, can damage the sample, and are not feasible in real time. Furthermore, alternative techniques, including NIR spectroscopy, electrical capacitance, X-rays, and microwaves, have demonstrated potential; nevertheless, they are still constrained by issues related to portability, precision, and the expense of the required equipment. Hence, there is a need for a moisture content determination method that is instant, portable, non-destructive, inexpensive, and precise. This study explores the use of deep learning and machine vision to predict moisture content classes from RGB images of wood chips. A large-scale image dataset comprising 1,600 RGB images of wood chips has been collected and annotated with ground truth labels, utilizing the results of the oven-drying technique. Two high-performing neural networks, MoistNetLite and MoistNetMax, have been developed leveraging Neural Architecture Search (NAS) and hyperparameter optimization. The developed models are evaluated and compared with state-of-the-art deep learning models. Results demonstrate that MoistNetLite achieves 87% accuracy with minimal computational overhead, while MoistNetMax exhibits exceptional precision with a 91% accuracy in wood chip moisture content class prediction. With improved accuracy (9.6% improvement in accuracy by MoistNetMax compared to the best baseline model ResNet152V2) and faster prediction speed (MoistNetLite being twice as fast as MobileNet), our proposed MoistNet models hold great promise for the wood chip processing industry to be efficiently deployed on p

Keywords: Wood chip; Moisture content; Deep learning; Machine vision; Neural architecture search; Hyperparameter optimization

Optimizing Robotic Manipulation With Decision-RWKV: A Recurrent Sequence Modeling Approach for Lifelong Learning

Journal of Computing and Information Science in Engineering, 2025, vol. 25, no. 3, p. 031004

Authors: Dong, Yujian; Wu, Tianyu; Song, Chaoyang. Affiliations: Southern Univ Sci & Technol, Sch Design, 1088 Xueyuan Rd, Shenzhen 518055, Peoples R China; Southern Univ Sci & Technol, Dept Mech & Energy Engn, 1088 Xueyuan Rd, Shenzhen 518055, Peoples R China

Models based on the transformer architecture have seen widespread application across fields such as natural language processing (NLP), computer vision, and robotics, with large language models (LLMs) like ChatGPT revolutionizing machine understanding of human language and demonstrating impressive memory capacity and reproduction capabilities. Traditional machine learning algorithms struggle with catastrophic forgetting, detrimental to the diverse and generalized abilities required for robotic deployment. This article investigates the receptance weighted key value (RWKV) framework, known for its advanced capabilities in efficient and effective sequence modeling, integration with the decision transformer (DT), and experience replay architectures. It focuses on potential performance enhancements in sequence decision-making and lifelong robotic learning tasks. We introduce the decision-RWKV (DRWKV) model and conduct extensive experiments using the D4RL database within the OpenAI Gym environment and on the D'Claw platform to assess the DRWKV model's performance in single-task tests and lifelong learning scenarios, showing its ability to handle multiple subtasks efficiently. The code for all algorithms, training, and image rendering in this study is available online (open source).

Keywords: foundation models; recurrent models; lifelong learning; robot learning; artificial intelligence; computational foundations for engineering optimization; data-driven engineering; engineering informatics; machine learning for engineering applications; multiphysics modeling and simulation

Chromosome analysis using a hybrid deep CNN and structural feature-based grouping model

Multimedia Tools and Applications, 2025, pp. 1-30

Authors: Isfahani, Farahnaz Peiravi; Pourghassem, Hossein; Mahdavi-Nasab, Homayoun; Naghsh, Alireza. Affiliations: Department of Electrical Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Iran; Digital Processing and Machine Vision Research Center, Najafabad Branch, Islamic Azad University, Najafabad, Iran

Chromosome analysis and classification are essential in clinical applications to diagnose various structural and numerical abnormalities. Recently, karyotype analysis using intelligent image processing methods, especially deep learning, has attracted significant attention as a genetic abnormality test. This paper presents a novel chromosome classification algorithm that uses high-level features extracted from deep convolutional neural networks (DCNN) along with morphological features designed to identify and modify the classes of misclassified chromosomes. Initially, chromosomes are classified using a DCNN. Some structural features, such as centromere and banding profile, are then extracted to group chromosomes again. Based on the results of the two preceding methods, a decision strategy is utilized to identify misclassified chromosomes. Here, a final DCNN-based strategy is introduced to assign misclassified chromosomes to the associated classes. The proposed method can be used in parallel with other chromosome classification methods to modify misclassified chromosomes and promote the accuracy of the classification. Evaluation results show that the proposed algorithm outperforms relevant state-of-the-art algorithms regarding the classification precision and accuracy of 99.66 and 96.52%, respectively. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Convolutional neural networks

Blur Patch Classification Approach to Single-Image Depth Estimation

17th International Conference on Machine Vision, ICMV 2024

Authors: Kim, Huijun; Lee, Deokwoo. Affiliation: Keimyung University, Dalgubeol-daero, Dalseo-gu, Daegu1095, Korea, Republic of

ISBN: (print) 9781510688278

Depth information is useful in many image processing and computer vision applications, but in photography, depth information is lost in the process of projecting a real-world scene onto a 2D plane. Extracting depth information from such images is a challenging task. In this paper, we propose a method to train a deep neural network to classify an image patch (16x16 in size) into 15 levels based on the level of blur. Blur is related to the distance between the focal plane and the object. The input image is shifted using a sliding window technique at 8 pixel intervals and the trained blur classifier evaluates each blur level. The obtained blur maps are subjected to a refinement process to quantitatively assess their accuracy and impact on the final result, and the final blur maps are compared with the labels of the actual input data to estimate the depth map. The proposed method demonstrates that depth information can be successfully extracted from a single image by classifying the focus levels. © 2025 SPIE.

Keywords: Image classification
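
The abstract above describes scanning the input with a 16x16 window at 8-pixel intervals and assigning each patch one of 15 blur levels. The following is a minimal sketch of that scanning step only. The trained network is not available here, so `toy_blur_level` is a hypothetical stand-in that maps local contrast to a level from 0 to 14; the function names and the synthetic image are likewise assumptions for illustration.

```python
def sliding_patches(image, patch=16, stride=8):
    """Yield (row, col, patch_rows) for every full patch inside a 2D list image."""
    height = len(image)
    width = len(image[0]) if height else 0
    for r in range(0, height - patch + 1, stride):
        for c in range(0, width - patch + 1, stride):
            yield r, c, [row[c:c + patch] for row in image[r:r + patch]]


def toy_blur_level(patch_rows, levels=15):
    """Hypothetical placeholder for the trained classifier (NOT the paper's model).

    Maps the intensity range of an 8-bit patch to a level in [0, levels - 1]:
    high contrast is treated as sharp (level 0), flat patches as most blurred.
    """
    values = [v for row in patch_rows for v in row]
    contrast = max(values) - min(values)          # 0..255 for 8-bit input
    return (levels - 1) - min(levels - 1, contrast * levels // 256)


def blur_map(image, classify, patch=16, stride=8):
    """Apply classify to every patch and return the levels as a 2D grid."""
    rows = {}
    for r, _c, p in sliding_patches(image, patch, stride):
        rows.setdefault(r, []).append(classify(p))
    return [rows[r] for r in sorted(rows)]


# Synthetic 32x32 8-bit image, used only to exercise the scan.
image = [[(r * 7 + c * 13) % 256 for c in range(32)] for r in range(32)]
grid = blur_map(image, toy_blur_level)
```

A 32x32 input scanned this way gives a 3x3 grid of levels, since full patches start at offsets 0, 8, and 16 along each axis. The refinement and depth-mapping stages mentioned in the abstract are not sketched here.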

Optimizing zero-shot text-based segmentation of remote sensing imagery using SAM and Grounding DINO

Artificial Intelligence in Geosciences, 2025, vol. 6, no. 1

Authors: Diab, Mohanad; Kolokoussis, Polychronis; Brovelli, Maria Antonia. Affiliations: Politecn Milan, Dept Civil & Environm Engn, I-20133 Milan, Italy; Natl Tech Univ Athens, Sch Rural Surveying & Geoinformat Engn, Athens 15780, Greece

The use of AI technologies in remote sensing (RS) tasks has been the focus of many individuals in both the professional and academic domains. This integration offers the potential for more accessible interfaces and tools that allow people with little or no experience to interact intuitively with RS data in multiple formats. However, the use of AI and AI agents to help automate RS-related tasks is still in its infancy, with some frameworks and interfaces built on top of well-known vision language models (VLM) such as GPT-4, segment anything model (SAM), and grounding DINO. These tools show promise and outline the potential and limitations of existing solutions built on said models. In this work, the state-of-the-art AI foundation models (FM) are reviewed and used in a multi-modal manner to ingest RS imagery input and perform zero-shot object detection using natural language. The natural language input is used to define the classes or labels the model should look for; both inputs are then fed to the pipeline. The pipeline presented in this work makes up for the shortcomings of the general knowledge FMs by stacking pre-processing and post-processing applications on top of the FMs; these applications include tiling to produce uniform patches of the original image for faster detection and outlier rejection of redundant bounding boxes using statistical and machine learning methods. The pipeline was tested with UAV, aerial and satellite images taken over multiple areas. The accuracy for the semantic segmentation showed improvement from the original 64% to approximately 80%-99% by utilizing the pipeline and techniques proposed in this work. GitHub Repository: MohanadDiab/LangRS.

Keywords: Foundation models; Multi-modal models; Vision language models; Semantic segmentation; Segment anything model; Earth observation; Remote sensing
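
The abstract above mentions two steps wrapped around the foundation models: tiling the image into uniform patches and rejecting outlier bounding boxes by statistical means. The sketch below illustrates one plausible form of each. It is not the LangRS implementation: the function names, the choice of a median absolute deviation (MAD) rule on box area, and the threshold `k` are all assumptions made for illustration.

```python
from statistics import median


def tile_boxes(width, height, tile):
    """Return (x0, y0, x1, y1) windows of size tile x tile covering the image.

    Edge windows are shifted inward so every tile has the same size.
    Assumes width >= tile and height >= tile.
    """
    xs = sorted({min(x, width - tile) for x in range(0, width, tile)})
    ys = sorted({min(y, height - tile) for y in range(0, height, tile)})
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]


def reject_area_outliers(boxes, k=3.0):
    """Keep boxes whose area lies within k * MAD of the median area.

    Assumed rule for illustration only. If the MAD is zero (no usable
    spread estimate), all boxes are returned unchanged.
    """
    if not boxes:
        return []
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    med = median(areas)
    mad = median(abs(a - med) for a in areas)
    if mad == 0:
        return list(boxes)
    return [b for b, a in zip(boxes, areas) if abs(a - med) <= k * mad]


tiles = tile_boxes(width=100, height=80, tile=40)

# Hypothetical detections: four similar boxes and one spanning the whole scene.
detections = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 31, 30),
              (40, 40, 50, 51), (0, 0, 500, 500)]
kept = reject_area_outliers(detections)
```

Shifting the last row and column of tiles inward keeps every patch the same size at the cost of some overlap; padding the image instead would be an equally reasonable choice.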

Momentum-Space Tunable Metasurfaces for Switchable Image Processing

Advanced Optical Materials, 2025

Authors: Zhang, Kai; Wang, Shuo; Qiu, Jumin; Yang, Muyi; Liu, Tingting; Xiao, Shuyuan; Staude, Isabelle; Pertsch, Thomas; Wang, Yu; Zou, Chengjun. Affiliations: Chinese Acad Sci, Inst Microelect, Beitucheng West Rd 3, Beijing 100029, Peoples R China; Univ Chinese Acad Sci, Beijing 101408, Peoples R China; Nanchang Univ, Sch Phys & Mat Sci, Nanchang 330031, Peoples R China; Friedrich Schiller Univ Jena, Inst Solid State Phys, D-07743 Jena, Germany; Friedrich Schiller Univ Jena, Abbe Ctr Photon, D-07745 Jena, Germany; Nanchang Univ, Sch Informat Engn, Nanchang 330031, Peoples R China; Nanchang Univ, Inst Adv Study, Nanchang 330031, Peoples R China

The exceptional ability of optical metasurfaces to manipulate light has enabled integrated analog computing and image processing in ultracompact, energy-efficient platforms that support high speeds. To date, metasurfaces have demonstrated various analog processing functions, including differentiation, convolution, and classification. However, a fundamental limitation of existing designs is their static functionality, which restricts adaptability to diverse application scenarios. To address this challenge, momentum-space reconfigurable metasurfaces operating in the near-infrared range are experimentally demonstrated, capable of switchable image processing functions including image differentiation and bright-field imaging. These meta-devices are achieved by integrating nematic liquid crystals with silicon metasurfaces that support resonances of quasi-bound states in the continuum (quasi-BICs). The quasi-BIC modes enable further design freedom over the angular dispersion of metasurfaces. The results showcase an electrically tunable, CMOS-compatible approach to reconfigurable optical computing, offering significant potential for applications such as online training of diffractive neural networks, machine vision, and augmented reality.

Keywords: dielectric metasurfaces; liquid crystals; momentum-space tunable metasurfaces; optical image processing

Adversarial Encoder-Driven Filter for Targeted Image Tagging: A Novel Approach to Visual Content Manipulation

6th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2025

Authors: Mckee, Cole; Flowers, Dominic; Wood, Jesse; Shafer, Ethan. Affiliation: United States Military Academy, Department of Electrical Engineering and Computer Science, West Point, NY 10996, United States

ISBN: (print) 9798331506520

Computer vision, driven by artificial intelligence, has become pervasive in diverse applications such as self-driving cars and law enforcement. However, the susceptibility of these systems to attacks has raised significant concerns among researchers. This paper addresses the vulnerability of image tagging algorithms, particularly focusing on misclassifications induced by autoencoders. We present experiments conducted on Amazon Rekognition, where we developed a specialized autoencoder to manipulate the latent space, forcing it to align with specific tags. By integrating this manipulated latent space with other images, we demonstrate the ability to increase the confidence of a specific tag on Amazon Rekognition, leading to more false positives of the chosen tag. Our study showcases a practical method to exploit Amazon's Rekognition image tagging algorithm using a black box approach. © 2025 IEEE.

Keywords: Adversarial machine learning

Comparative Study on Evaluating the Performance of Automated Bacterial Colony Counting with Available APP and Software on Generated Image Dataset

SN Computer Science, 2025, vol. 6, no. 4, pp. 1-16

Authors: Arora, Prachi; Tewary, Suman; Krishnamurthi, Srinivasan; Kumari, Neelam. Affiliations: School of Computing, Indian Institute of Information Technology Una (IIIT-Una), Una 177209, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India; Thin Film Coating Facility, CSIR-Central Scientific Instruments Organisation (CSIR-CSIO), Sector 30-C, Chandigarh 160030, India; Materials Science and Sensor Applications, CSIR-Central Scientific Instruments Organisation (CSIR-CSIO), Sector 30-C, Chandigarh 160030, India; Advanced Materials and Processes Division, CSIR-National Metallurgical Laboratory (CSIR-NML), Jamshedpur 831007, India; MTCC-Gene bank, CSIR-Institute of Microbial Technology (CSIR-IMTECH), Sector 39-A, Chandigarh 160039, India

Recent developments in image analysis and interpretation using computer vision techniques have shown potential for novel applications in microbiology laboratories to support the task of automation, aiming for faster and more reliable detection. Image processing techniques and machine learning models can be valuable tools in the screening process, helping technicians spend less time classifying no-growth results and quickly separating the categories for further analysis. In this context, creating a dataset of different bacterial strain images is a fundamental objective for developing and improving the accuracy of image processing models. Therefore, this work acquired a dataset of images of water samples with different bacterial strains on a petri dish, following a standardized process with controlled conditions of positioning and lighting. The image acquisition device was also developed with a light-emitting diode (LED) and diffuser as a lighting source and a smartphone camera with 16 MP resolution. In addition, this work compares the accuracy of the proposed algorithm with the available apps and software using the custom-built imaging device. The resulting dataset consists of 100 images, which is helpful for researchers working in image processing to develop an algorithm for automated counting of bacterial colonies on petri dishes. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Bacterial colonies; Image datasets; Image processing; Imaging device; Segmentation; Water samples

Design and validation of a prism-based single-camera system for multi-viewpoint stereoscopic imaging

Applied Optics, 2025, vol. 64, no. 13, pp. 3744-3751

Authors: Liu, Huiting; Xie, Yuxin; Zhou, Sicheng; Zhao, Shuai; Zhou, Chongqing; Cai, Bolin; Zhang, Lei; Wang, Keyi. Affiliations: Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei 230026, China; National Synchrotron Radiation Laboratory, University of Science and Technology of China, Hefei 230029, China; School of Internet, Anhui University, Hefei 230039, China

Stereoscopic imaging from multiple viewpoints offers critical advantages in machine vision and related applications, yet traditional multi-camera setups require complex synchronization mechanisms. This study introduces a prism-based single-camera stereoscopic imaging system featuring a bonded multi-sub-eye structure of prisms and plano-convex lenses. The prisms deflect light paths, while the plano-convex lenses ensure approximate parallel beam output, enabling the formation of stereo images in four distinct regions on a single image sensor. The system was experimentally validated by capturing four-viewpoint images of small objects using a single camera, achieving three-dimensional reconstruction and demonstrating its feasibility for machine vision tasks. This compact and efficient design eliminates synchronization challenges, offering a cost-effective solution for applications such as three-dimensional reconstruction and defect detection. © 2025 Optica Publishing Group.

Keywords: Stereo image processing
