Image segmentation is a critical step in various imaging and especially optical inspection applications: detection and recognition of objects, classification, analysis, and identification. Also, image gradient, as...
Image processing plays a vital role in artificial visual systems, which have diverse applications in areas such as biomedical imaging and machine vision. In particular, optical analog image processing is of great interest because of its parallel processing capability and low power consumption. Here, we present ultra-compact metasurfaces performing all-optical geometric image transformations, which are essential in image processing to correct image distortions, create special image effects, and morph one image into another. We show that our metasurfaces can realize binary image transformations by modifying the spatial relationship between pixels and converting binary images from Cartesian to log-polar coordinates, with unparalleled advantages for scale- and rotation-invariant image preprocessing. Furthermore, we extend our approach to grayscale image transformations and convert an image with a Gaussian intensity profile into another image with a flat-top intensity profile. Our technique will potentially unlock new opportunities for various applications such as target tracking and laser manufacturing. Metasurfaces enable all-optical geometric coordinate transformations, converting images with altered pixel spatial relations, which can facilitate fast, energy-efficient preprocessing for tasks like object tracking, or aid in laser manufacturing.
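To make the scale- and rotation-invariance argument concrete, the following is a minimal numerical sketch of the Cartesian to log-polar mapping, not the metasurface design itself. The function name `to_log_polar` and the sample points are hypothetical choices made for illustration.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (log-radius, angle) about the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("log-polar coordinates are undefined at the centre")
    return math.log(r), math.atan2(dy, dx)


# Illustrative point, then the same point scaled by 2 and rotated by 90 degrees.
rho_base, theta_base = to_log_polar(3.0, 4.0)
rho_scaled, theta_scaled = to_log_polar(6.0, 8.0)      # scaled by a factor of 2
rho_rotated, theta_rotated = to_log_polar(-4.0, 3.0)   # rotated by 90 degrees

# Scaling becomes a pure shift along the log-radius axis ...
scale_shift = rho_scaled - rho_base
# ... and rotation becomes a pure shift along the angle axis.
rotation_shift = theta_rotated - theta_base

print(scale_shift, rotation_shift)
```

In log-polar coordinates, scaling and rotation of the input both reduce to translations, which is why the mapping is a convenient preprocessing step for methods that are already insensitive to shifts.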
Educational models currently integrate a variety of technologies and computer applications that seek to improve learning environments. With this objective, information technologies have increasingly adapted to assume the role of educational assistants that support the teacher, the students, and the areas involved in educational quality. One of the technologies gaining strength in the academic field is computer vision, which is used to monitor and identify the state of mind of students while a subject is being taught. To do this, machine learning algorithms monitor student gestures and classify them to identify the emotions they convey in a teaching environment. These systems allow the evaluation of emotional aspects based on two main elements. The first is the generation of an image database of the emotions that arise in a learning environment, such as interest, commitment, boredom, concentration, relaxation, and enthusiasm. The second is an emotion recognition system that recognizes facial gestures using non-invasive techniques. This work applies techniques for the recognition and processing of facial gestures and the classification of emotions focused on learning. The system helps the tutor in face-to-face education and allows the tutor to evaluate emotional aspects and not only cognitive ones. It arises from the need to create an image database focused on spontaneous learning emotions, since most of the works reviewed focus on acted-out emotions.
The enhancement of machine learning (ML) models relies heavily on the volume and integrity of data, emphasizing the importance of efficient data collection and rigorous ground-truth labeling. While deep learning, particularly its subset of convolutional neural networks (CNNs), has shown promise in crack detection, there remains a need for sophisticated algorithms to identify structural defects accurately. This study presents a novel deep CNN model tailored for the binary classification of concrete surfaces, addressing a significant need in infrastructure engineering. The deep CNN model was developed using a comprehensive dataset (40,000 images, each measuring 227 × 227 pixels). Various metrics, including precision, sensitivity, binary accuracy, and F1 score, were used to evaluate the model's performance. Additionally, the model's generalization capability was assessed by testing its proficiency in accurately classifying unseen data. The study demonstrates that the model's predictive performance improves with additional epochs, indicating enhanced learning over successive training cycles. Validation metrics suggest potential generalization capability despite slight accuracy declines, showcasing the model's robustness in accurately classifying positive instances. The findings reveal significant advancements in deep CNN models for concrete material classification, surpassing previous comparable models. Employing CNN models holds promise for quality control and repair processes in infrastructure engineering applications. Future research directions include exploring the application of the deep CNN model to classify alternative materials and assessing its generalization capability using larger and more diverse datasets. Overall, this study contributes to the advancement of ML techniques in infrastructure engineering, with implications for optimizing material classification processes and enhancing infrastructure repair outcomes.
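The evaluation metrics named above follow from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `binary_metrics`, the label convention (1 = positive class) and the sample labels are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, sensitivity (recall), accuracy and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    # Confusion-matrix counts, with 1 as the positive class (an assumption).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity,
            "accuracy": accuracy, "f1": f1}


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity alongside precision matters for defect detection, since a model can reach high accuracy on an imbalanced dataset while still missing positive instances.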
An image is often considered worth a thousand words, and certain images can tell rich and insightful stories. Can these stories be told via image captioning? Images from folklore genres, such as mythology, folk dance,...
Contemporary infrared imaging systems in thermonuclear fusion devices prevent thermal overloads on plasma-facing components (PFCs) by relying on the surface temperature. Automatic delineation and classification of thermal events would facilitate scene understanding, contributing to advanced machine protection, control, and physics exploration applications. However, the absence of image annotations, which require a significant amount of expert labor and are prone to inconsistencies, limits the use of deep learning computer vision methods in fusion devices. A semi-automatic annotation method based on deterministic infrared image processing is proposed to reduce annotation effort while maintaining consistency. The method exploits discharge sequence properties to minimize expert involvement. It was evaluated on infrared images from the Wendelstein 7-X (W7-X) stellarator by comparing the generated annotations with manually prepared ground-truth annotations. The generated annotations have a high mean similarity to the manual annotations, measured with the Sørensen-Dice coefficient (SDC), equal to 0.825 with a sample standard deviation of 0.030. Furthermore, a customized metric, the temperature-over-limit weighted SDC (tlwSDC), which weighs pixel severity based on the surface temperature relative to the PFC temperature limit, is proposed; with this metric, the mean similarity is equal to 0.904 with a sample standard deviation of 0.018. Encouraging results for an infrared image from the W Environment in Steady-state Tokamak (WEST) indicate that the method might be cross-device viable. The proposed semi-automatic method enabled the generation of an annotated image dataset and, consequently, the training of the first W7-X instance segmentation model.
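For reference, the plain Sørensen-Dice coefficient between two binary masks can be sketched as below. This covers only the standard SDC; the tlwSDC weighting is not specified in the abstract and is not reproduced. The function name `dice_coefficient` and the tiny masks are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Sørensen-Dice coefficient of two equally shaped binary masks (lists of rows)."""
    if len(mask_a) != len(mask_b) or any(
        len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")

    intersection = size_a = size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            size_a += 1 if a else 0
            size_b += 1 if b else 0
            intersection += 1 if (a and b) else 0

    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks count as identical
    return 2 * intersection / (size_a + size_b)


# Made-up 2x3 masks standing in for a generated and a manual annotation.
generated = [[0, 1, 1],
             [0, 1, 0]]
manual = [[0, 1, 0],
          [0, 1, 1]]
score = dice_coefficient(generated, manual)
print(score)
```

The coefficient ranges from 0 (no overlap) to 1 (identical masks), so the reported mean of 0.825 indicates substantial but not complete agreement with the manual annotations.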
ISBN: (print) 9781510673151; 9781510673144
Extending the depth of field (DOF) of imaging optics is a longstanding challenge in machine vision, microscopy, photography and cinematography. This paper presents a method to extend the DOF of camera lenses up to 5 times by using foto-foXXus multi-focus quasi-afocal optics. The foto-foXXus devices are implemented as achromatic aplanatic optical systems installed in front of camera lenses in such a way that the combined optical system simultaneously has several foci separated along the optical axis. When applied to imaging a scene, such a combined optical system forms several images of each object within the extended DOF along the optical axis. The inevitable decrease in contrast of the resulting image, caused by the defocusing of some images relative to the plane of the camera sensor (or film), can be compensated using specific algorithms at the image processing stage, which is nowadays an obligatory part of image capture in machine vision and microscopy. This method is very effective for capturing black-and-white objects, such as QR codes, and in computer vision-based robotic arms for detecting the shape and size of objects. Direct measurements of the modulation transfer function (MTF) and through-focus MTF curves for a system consisting of a foto-foXXus and a state-of-the-art machine vision objective confirm the increase in depth of focus of the combined optical system and, consequently, in depth of field in the object space. The paper presents a description of the foto-foXXus devices, measurement data for the MTF and through-focus MTF curves obtained on the MTF test bench, and examples of imaging real objects that demonstrate the effective extension of the depth of field.
Object detection is one of the major areas of computer vision, and machine learning approaches have contributed to it in diverse ways. Nowadays, the machine learning field is driven by Deep Neural Networks (DNNs), which take advantage of advances in data availability and computing power. In all cases, the images and videos are biased and noisy, and thus the data distributions are also considered imbalanced and disturbed. Different techniques have been developed to address these challenges, mostly based on deep learning and computer vision. However, traditional algorithms consistently offer poor detection of dense and small objects and fail to detect objects under random geometric transformations. The Convolutional Neural Network (CNN), one category of deep learning, is a well-known and well-suited method for image-related tasks, in which the network is trained to discover numerous features in images and videos, such as colour differences, corners, and edges, that are combined into more complex shapes. This proposal intends to develop improved object detection in images and videos using advances in deep learning models. The three main phases of the proposed object detection model are (a) pre-processing, (b) segmentation, and (c) detection. Once the image has been pre-processed with a median filtering approach, adaptive U-Net segmentation is performed for object segmentation using the newly proposed Sun Flower-Deer Hunting Optimization Algorithm (SF-DHOA). Maximizing the segmentation accuracy and the dice coefficient is the main objective of the proposed segmentation. The hybrid meta-heuristic algorithm SF-DHOA combines Sun Flower Optimization (SFO) and the Deer Hunting Optimization Algorithm (DHOA), and is used to tune the U-Net optimally by optimizing the encoder depth and the number of epochs. Further, the detection is per
Machine learning plays an increasingly important role in the field of artificial intelligence and achieves excellent performance in various real-world applications, including image classification, computer vision, natural language processing, and recommendation systems, among many others. Meanwhile, in the era of Big Data, both security and privacy are of paramount importance. Machine learning vulnerabilities and privacy-preserving machine learning have attracted growing interest in the fields of artificial intelligence, information security, and data privacy.
Cardiovascular diseases, which are currently the major causes of death globally, can be largely ameliorated through early detection and categorization. Electrocardiogram (ECG) tests have emerged as widely employed, low-cost and non-invasive procedures for evaluating the electrical activity of the heart and diagnosing cardiovascular ailments. In this research, we use deep learning techniques to detect specific cardiac conditions, namely myocardial infarction (MI), arrhythmia, a history of myocardial infarction (PMI), and normal ECG patterns, on a dataset of patients with heart disease. We propose the ECGConvT framework, which combines a Convolutional Neural Network (CNN) module for extracting local features and a Vision Transformer (ViT) module for capturing global features. The final classification is achieved by combining the two using a Multilayer Perceptron (MLP) module. The experimental results indicate the promise of ECGConvT in ECG image classification, where it outperforms other approaches with an average accuracy of 98.5%, F1-score of 98.7%, recall of 98.8% and precision of 98.5%. To meet the practical needs of clinical applications, we implemented a lightweight post-processing step to reduce the size of the model.