Search Results - Inner Mongolia University Library

Optical Metrology and Inspection for Industrial Applications XI 2024

Authors: Zelensky, A.; Gapon, N.; Zhdanova, M.; Voronin, V.; Ilukhin, Y.; Gribkov, A. Affiliations: Scientific-Manufacturing Complex «Technological Centre», Zelenograd, Russia; Don State Technical University, Rostov-on-Don, Russia; Center for Cognitive Technology and Machine Vision, Moscow State University of Technology «STANKIN», Moscow, Russia

ISBN: (print) 9781510682108

Image segmentation is the critical step in different imaging and especially optical inspection applications: detection and recognition of objects, classification, analysis, and identification. Also, image gradient, as a preprocessing step, is an essential tool in image processing in many research areas, such as edge detection, segmentation, inpainting, etc. However, these tools have limitations and could be more accurate since the capture devices usually generate low-resolution images, which are primarily noisy and blurry. It is critical to receive useful gradient estimation on noisy color images while preserving the sharp edges. In the present paper, we develop a new gradient by integrating the quaternion framework with local polynomial approximation and the intersection of confidence intervals based on anisotropic gradient concepts for color image processing applications. We apply the proposed gradient technique in a modified active contour method to perform an automated segmentation for optical inspection applications. Computer simulations on the segmentation dataset for optical inspection applications show that the new adaptive quaternion anisotropic gradient exhibits fewer color artefacts than state-of-the-art techniques. © 2024 SPIE.

Keywords: Polynomial approximation

All-optical geometric image transformations enabled by ultrathin metasurfaces

NATURE COMMUNICATIONS, 2023, Vol. 14, No. 1, pp. 1-8

Authors: Zhang, Xingwang; Zhang, Xiaojie; Duan, Yao; Zhang, Lidan; Ni, Xingjie. Affiliation: Penn State Univ, Dept Elect Engn, University Pk, PA 16802, USA

Image processing plays a vital role in artificial visual systems, which have diverse applications in areas such as biomedical imaging and machine vision. In particular, optical analog image processing is of great interest because of its parallel processing capability and low power consumption. Here, we present ultra-compact metasurfaces performing all-optical geometric image transformations, which are essential for image processing to correct image distortions, create special image effects, and morph one image into another. We show that our metasurfaces can realize binary image transformations by modifying the spatial relationship between pixels and converting binary images from Cartesian to log-polar coordinates with unparalleled advantages for scale- and rotation-invariant image preprocessing. Furthermore, we extend our approach to grayscale image transformations and convert an image with Gaussian intensity profile into another image with flat-top intensity profile. Our technique will potentially unlock new opportunities for various applications such as target tracking and laser manufacturing. Metasurfaces enable all-optical geometric coordinate transformations, converting images with altered pixel spatial relations, which can facilitate fast, energy-efficient preprocessing for tasks like object tracking, or aid in laser manufacturing.

Keywords: Metamaterials; Nanophotonics and plasmonics

Identification of Emotions From Facial Gestures in a Teaching Environment With the Use of Machine Learning Techniques

IEEE ACCESS, 2023, Vol. 11, pp. 38010-38022

Authors: Villegas-Ch, William Eduardo; Garcia-Ortiz, Joselin; Sanchez-Viteri, Santiago. Affiliations: Univ Las Amer, Escuela Ingn Cibersegur, FICA, Quito 170125, Ecuador; Univ Int Ecuador, Dept Sistemas, Quito 170411, Ecuador

Educational models currently integrate a variety of technologies and computer applications that seek to improve learning environments. With this objective, information technologies have increasingly adapted to assume the role of educational assistants that support the teacher, the students, and the areas enrolled in educational quality. One of the technologies that are gaining strength in the academic field is computer vision, which is used to monitor and identify the state of mind of students during the teaching of a subject. To do this, machine learning algorithms monitor student gestures and classify them to identify the emotions they convey in a teaching environment. These systems allow the evaluation of emotional aspects, based on two main elements, the first is the generation of an image database with the emotions generated in a learning environment such as interest, commitment, boredom, concentration, relaxation, and enthusiasm. The second is an emotion recognition system, through the recognition of facial gestures using non-invasive techniques. This work applies techniques for the recognition and processing of facial gestures and the classification of emotions focused on learning. This system helps the tutor in a modality of face-to-face education and allows him to evaluate emotional aspects and not only cognitive ones. This arises from the need to create a base of images focused on the spontaneous learning of emotions since most of the works reviewed focus on these acted-out emotions.

Keywords: Education; Emotion recognition; Face recognition; Real-time systems; Object recognition; Image resolution; Computer vision; Neural networks; emotion recognition; neural networks; teaching

Semantic segmentation and deep CNN learning vision-based crack recognition system for concrete surfaces: development and implementation

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, Vol. 19, No. 4, pp. 1-15

Authors: Abbas, Yassir M.; Alghamdi, Hussam. Affiliation: King Saud Univ, Coll Engn, Dept Civil Engn, Riyadh 12372, Saudi Arabia

The enhancement of machine learning (ML) models relies heavily on the volume and integrity of data, emphasizing the importance of efficient data collection and rigorous ground-truth labeling. While deep learning, particularly its subset convolutional neural network (CNN), has shown promise in crack detection, there remains a need for sophisticated algorithms to identify structural defects accurately. This study presents a novel deep CNN model tailored for the binary classification of concrete surfaces, addressing a significant need in infrastructure engineering. The deep CNN model was developed using a comprehensive dataset (40,000 images each measuring 227 x 227 pixels). Various metrics, including precision, sensitivity, binary accuracy, and F1 score, were utilized to evaluate the model's performance. Additionally, the model's generalization capability was assessed by testing its proficiency in accurately classifying unseen data. The study demonstrates that the model's predictive performance improves with additional epochs, indicating enhanced learning over learning cycles. Validation metrics suggest potential generalization capability despite slight accuracy declines, showcasing the model's robustness in accurately classifying positive instances. The findings reveal significant advancements in deep CNN models for concrete material classification, surpassing previous comparable models. Employing CNN models holds promising outcomes for quality control and repair processes in infrastructure engineering applications. Future research directions include exploring the application of the deep CNN model to classify alternative materials and assessing its generalization capability using larger and more diverse datasets. Overall, this study contributes to the advancement of ML techniques in infrastructure engineering, with implications for optimizing material classification processes and enhancing infrastructure repair outcomes.

Keywords: Concrete surface classification; Convolutional neural network; Deep learning; Image recognition; Machine learning

How Culturally Aware Are Vision-Language Models?

6th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2025

Authors: Burda-Lassen, Olena; Chadha, Aman; Goswami, Shashank; Jain, Vinija. Affiliations: Independent Research Scientist, United States; Stanford University/Amazon Inc., United States; Stanford University, United States

ISBN: (print) 9798331506520

An image is often considered worth a thousand words, and certain images can tell rich and insightful stories. Can these stories be told via image captioning? Images from folklore genres, such as mythology, folk dance, cultural signs, and symbols, are vital to every culture. Our research compares the performance of four popular vision-language models (GPT-4V, Gemini Pro Vision, LLaVA, and OpenFlamingo) in identifying culturally specific information in such images and creating accurate and culturally sensitive image captions. We also propose a new evaluation metric, the Cultural Awareness Score (CAS), which measures the degree of cultural awareness in image captions. We provide a dataset MOSAIC-1.5k labeled with ground truth for images containing cultural background and context and a labeled dataset with assigned Cultural Awareness Scores that can be used with unseen data. Creating culturally appropriate image captions is valuable for scientific research and can be beneficial for many practical applications. We envision our work will promote a deeper integration of cultural sensitivity in AI applications worldwide. By making the dataset and Cultural Awareness Score available to the public, we aim to facilitate further research in this area, encouraging the development of more culturally aware AI systems that respect and celebrate global diversity. © 2025 IEEE.

Keywords: Visual languages

Enabling Instance Segmentation: A Semi-Automatic Method for Thermal Event Annotation

IEEE TRANSACTIONS ON PLASMA SCIENCE, 2024, Vol. 52, No. 9, pp. 3521-3527

Authors: Jablonski, Bartlomiej; Makowski, Dariusz; Sitjes, Aleix Puig; Jabonski, Marcin. Affiliations: Lodz Univ Technol, Dept Microelect & Comp Sci, PL-90924 Lodz, Poland; Max Planck Inst Plasma Phys, D-17491 Greifswald, Germany

Contemporary infrared imaging systems in thermonuclear fusion devices are preventing thermal overloads on plasma-facing components (PFCs) relying on the surface temperature. Automatic delineation and classification of thermal events would facilitate scene understanding, contributing to advanced machine protection, control, and physics exploration applications. However, the absence of image annotations, which require a significant amount of expert labor and are prone to inconsistencies, limits the use of deep learning computer vision methods in fusion devices. A semi-automatic annotation method based on deterministic infrared image processing is proposed to reduce annotation efforts while maintaining consistency. The method exploits discharge sequence properties to minimize expert involvement. It was evaluated on infrared images from the Wendelstein 7-X (W7-X) stellarator by comparing the generated annotation with manually prepared ground-truth annotations. The generated annotations have a high mean similarity to the manual annotations, measured with Sorensen-Dice coefficient (SDC), equal to 0.825 with a sample standard deviation of 0.030. Furthermore, a customized metric temperature over limit weighted SDC (tlwSDC), which weighs pixel severity based on the surface temperature relative to the PFC temperature limit, is proposed, and this mean similarity is equal to 0.904 with a sample standard deviation of 0.018. Encouraging results for an infrared image from the W Environment in Steady-state Tokamak (WEST) tokamak indicate that the method might be cross-device viable. The proposed semi-automatic method enabled the generation of an annotated image dataset and, consequently, the training of the first W7-X instance segmentation model.

Keywords: Infrared image processing; instance segmentation; semi-automatic annotation; thermal event

Multifocus Camera Optics with 5^x Extending the Depth of Field

Conference on Optics, Photonics, and Digital Technologies for Imaging Applications VIII

Authors: Laskin, Alexander; Laskin, Vadim; Ostrun, Aleksei. Affiliations: AdlOpt GmbH, Rudower Chaussee 29, D-12489 Berlin, Germany; St Petersburg Natl Res Univ Informat Technol Mech, Kronverkskiy Pr 49, St Petersburg 197101, Russia

ISBN: (print) 9781510673151; 9781510673144

Extending the depth of field (DOF) of imaging optics is a longstanding challenge in machine vision, microscopy, photography and cinematography. This paper presents a method to extend DOF of camera lenses up to 5 times by using foto-foXXus - multi-focus quasi afocal optics. The foto-foXXus devices are implemented as achromatic aplanatic optical systems installed in front of camera lenses in such a way that the combined optical system has simultaneously several focuses separated along the optical axis. When applied for imaging a scene, such a combined optical system forms along the optical axis several images of each object of the extended DOF. The inevitable decrease in contrast of the common image, resulting from defocusing of some images from the plane of camera sensor (or film), can be enhanced using specific algorithms in the stage of image processing, which is nowadays an obligatory part of image capture in machine vision or microscopy. This method is very effective in capturing black-and-white objects, such as QR-codes, or in computer vision-based robotic arms for detecting the shape and size of objects. Direct measurements of the modulation transfer function (MTF) and through-focus MTF curves for a system consisting of a foto-foXXus and a state-of-the-art machine vision objective confirm the increase in depth of focus of the combined optical system and, consequently, depth of field in the object space. The paper presents description of the foto-foXXus devices, measurements data of MTF and through-focus MTF-curves using the MTF test bench, as well as examples of imaging real objects demonstrating effective extending depth of field.

Keywords: extended depth of field; DOF; imaging; camera optics; machine vision; microscopy; industrial inspection; photography; cinematography

Automated image and video object detection based on hybrid heuristic-based U-net segmentation and faster region-convolutional neural network-enabled learning

MULTIMEDIA TOOLS AND APPLICATIONS, 2023, Vol. 82, No. 3, pp. 3459-3484

Authors: Palle, Rajashekar Reddy; Boda, Ravi. Affiliation: Koneru Lakshmaiah Educ Fdn KLEF, Dept ECE, Hyderabad, Telangana, India

Object detection is one of the major areas of computer vision, which adopts machine learning approaches in diverse contributions. Nowadays, the machine learning field has been directed through Deep Neural Networks (DNNs) that takes eminent features of progressions in data availability and computing power. In all the cases, the quality of images and videos are biased and noisy, and thus, the distributions of data are also considered as imbalanced and disturbed. Different techniques are developed for solving the abovementioned challenges, which are mostly considered based on deep learning and computer vision. Though, traditional algorithms constantly offer poor detection for dense and small objects and yet fail the detection of objects through random geometric transformations. One of the categories of deep learning called Convolutional Neural Network (CNN) is famous and well-matched method for image-related tasks, in which the network is trained for discovering the numerous features like colour differences, corners, and edges in the images and videos that are combined into more complex shapes. This proposal intends to develop improved object detection in images and videos with the advancements of deep learning models. The three main phases of the proposed object detection model are (a) pre-processing, (b) segmentation, and (c) detection. Once the pre-processing of the image is performed by median filtering approach, the adaptive U-Net segmentation is performed for the object segmentation using the newly proposed Sun Flower-Deer Hunting Optimization Algorithm (SF-DHOA). The maximization of segmentation accuracy and dice coefficient is considered as the main objective of the proposed segmentation. The hybrid meta-heuristic algorithm termed SF-DHOA is proposed with Sun Flower Optimization (SFO) and Deer Hunting Optimization Algorithm (DHOA), which is used for optimally tuning the U-Net by optimizing the encoder depth and the number of epoch. Further, the detection is per

Keywords: Object detection; Deep learning; Convolutional neural network; Adaptive U-net segmentation; Sun flower-deer hunting optimization algorithm; Modified faster region-convolutional neural network

Guest Editorial: Special Issue on Security and Privacy in Machine Learning

IEEE Transactions on Artificial Intelligence, 2023, Vol. 4, No. 5, pp. 986-987

Authors: Luo, Wenjian; Jin, Yaochu; Huang, Catherine. Affiliations: Harbin Institute of Technology, Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, School of Computer Science and Technology, Shenzhen, China; Bielefeld University, Chair of Nature Inspired Computing and Engineer, Faculty of Technology, Bielefeld, Germany; Google Inc., Mountain View, CA, United States

Machine learning plays an increasingly important role in the field of artificial intelligence, and obtains fantastic performance in various real-world applications, including image classification, computer vision, natural language processing, and recommendation systems, among many others. Meanwhile, in the era of Big Data, both security and privacy are of paramount importance. Machine learning vulnerabilities and privacy-preserving machine learning have attracted growing interest in the fields of artificial intelligence, information security, and data privacy.

Keywords: Special Issues And Sections; Security; Privacy; Machine Learning; Image Classification; Computer Vision; Data Privacy; Information Security

ECGConvT: A Hybrid CNN and Vision Transformer Model for Enhanced 12-Lead ECG Images Classification

IEEE ACCESS, 2024, Vol. 12, pp. 193043-193056

Authors: Khalid, Mudassar; Pluempitiwiriyawej, Charnchai; Abdulkadhem, Abdulkadhem A.; Afzal, Imran; Truong, Tien. Affiliations: Chulalongkorn Univ, Dept Elect Engn, Bangkok 10330, Thailand; Al Mustaqbal Univ, Coll Sci, Dept Cyber Secur, Babylon 51001, Hillah, Iraq; Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China; Univ Calif Berkeley, Sch Econ & Cognit Sci, Berkeley, CA 94720, USA

Cardiovascular diseases, which are currently the major causes of death globally, can be largely ameliorated through early detection and categorization. Electrocardiogram (ECG) tests have emerged as widely employed, low-cost and non-invasive procedures for evaluating electrical activities of the heart and diagnosing cardiovascular ailments. In this research, we use deep learning techniques to detect specific cardiac disorders like myocardial infarction (MI), arrhythmia, past history of myocardial infarction (PMI) and normal ECG patterns on a dataset containing patients with heart disease. We propose the ECGConvT framework, which combines a Convolutional Neural Network (CNN) module for extracting local features and a Vision Transformer (ViT) module for capturing global features. The final classification is achieved by combining the two using a Multilayer Perceptron (MLP) module. The experimental results indicate the promise of ECGConvT in ECG image classification, where it outperforms other approaches, showing an average accuracy of 98.5%, F1-score of 98.7%, recall of 98.8%, and precision of 98.5%. In order to meet the practical needs of clinical applications, we implemented a lightweight post-processing step to reduce the size of the model.

Keywords: Electrocardiography; Accuracy; Heart; Feature extraction; Convolutional neural networks; Arrhythmia; Transformers; Deep learning; Cardiovascular diseases; Convolution; ECG images classification; electrocardiogram; machine learning; vision transformer
