ISBN (digital): 9798331504168
ISBN (print): 9798331504175
The human visual system serves as an inspiration for efficient image sensors for applications in robotics, sensing, and computer vision. Inspired by the retina, we present a perovskite-nanowire-based artificial vision system for integrated sensing and data preprocessing. The sensor shows a stable response to different learning and forgetting visual stimuli. We demonstrate the capabilities of the artificial vision system with a crossbar array and its integration with perovskite-based memory for recognizing different shapes.
In recent years, vision Transformers (ViTs) have gained significant attention in the field of computer vision for their impressive performance in various tasks, including image recognition, machine translation, question answering, text classification, and image captioning. ViTs perform better than CNN-based models on several benchmark image datasets, such as ImageNet, with fewer parameters and less computation. The self-attention part takes on the feature-extraction role of the convolutional neural network (CNN). The proposed work provides a vision-transformer-based framework for 2D ear recognition in which self-attention is applied jointly with CNNs. Adjustments and fine-tuning have been made based on the specific characteristics of the ear dataset and the desired performance requirements. In deep learning, CNNs have become the de facto choice mainly because their inductive biases let them learn spatially local representations; learning the global representation through the self-attention mechanism of ViTs further enhances recognition accuracy. This is made possible by applying the transformer directly to the sequence of image patches, which improves image classification performance. The proposed work uses various image patch sizes during model training. The experimental analysis shows that a patch size of 16 x 16 achieves the highest accuracy, 99.36%. The proposed model has been validated on the Kaggle and IITD-II datasets. The efficiency of the proposed model relative to existing models is also reported.
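The patch-sequence idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_into_patches` and the 224 x 224 single-channel input are assumptions made for illustration, since the abstract states only the 16 x 16 patch size.

```python
from typing import List, Sequence


def split_into_patches(image: Sequence[Sequence[float]], patch: int) -> List[List[float]]:
    """Split a 2D image (rows of pixel values) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch row by row into a single token vector.
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


# Hypothetical 224 x 224 single-channel input; the abstract does not state the image size.
image = [[0.0] * 224 for _ in range(224)]
tokens = split_into_patches(image, 16)
print(len(tokens), len(tokens[0]))  # 196 256
```

Under these assumptions, each image becomes a sequence of 196 tokens of 256 values each. A smaller patch size yields a longer sequence, which is one reason patch size affects both accuracy and cost.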
The rapid growth of artificial intelligence (AI) technology and its applications in recent years has transformed the process of data analytics in many scientific fields, including geoscience. Geoscience has traditionally been a descriptive science and fundamentally relies upon visual recognition and identification of different geological features, from satellite images to subsurface seismic data, to study Earth's history. Geological image data offers immense potential for applying advanced AI methods, such as deep learning, to improve and optimize different geological and geophysical characterization workflows. Despite the increasing efforts and interest toward using AI in geosciences, its actual potential remains untapped, and further exploration is required. The prospect of AI application in geosciences is primarily hindered by the following: (i) limited availability of high-quality labeled datasets and (ii) inherent imbalance in dataset distributions. These limitations are compounded by overreliance on transfer learning to mitigate such issues, at the expense of interpretability (the AI black-box problem). In this study, a robust and effortless strategy is proposed to overcome the limitations and simultaneously reduce our dependence on transfer learning. Among the various methods available to mitigate these issues, only traditional data augmentation is heavily used in geosciences. This study, therefore, explored and developed a workflow by combining three readily available methods to maximize the performance of machine learning algorithms when dealing with a limited and imbalanced geoscience dataset. Here, the proposed method follows three robust and straightforward end-to-end steps: (i) combining traditional and advanced data augmentation (e.g., CutOut and CutMix) techniques to enhance localization and generalization performance; (ii) employing an algorithm-level class weight method to minimize detrimental impact and performance bias due to class
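The class-weight step named in this abstract can be illustrated with a minimal sketch. The abstract does not say which weighting scheme the authors use. The sketch assumes the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), which scikit-learn also uses for `class_weight="balanced"`. The function name and the label set are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced label set; class names are illustrative only.
labels = ["sandstone"] * 6 + ["shale"] * 3 + ["fault"] * 1
weights = balanced_class_weights(labels)
for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

With this scheme every class contributes the same total weight to the loss, so rare classes are not drowned out by frequent ones.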
Current post-earthquake damage assessment methodologies are not only time-consuming but also subjective in nature and difficult to document. Recent advancements in artificial intelligence and technological devices make it possible to accomplish this task automatically, efficiently, and objectively. Our vision for an automated post-earthquake evaluation begins with image data, such as that obtained by an Unmanned Aerial Vehicle, which is then processed to detect damage and generate a Finite Element Method (FEM) model. This thesis aims to realize this vision for free-standing stone masonry buildings. The main objective of the current research is to propose robust and computationally efficient methodologies to automatically generate 3D models for free-standing stone masonry buildings and provide information on damage detected in RGB images. This allows for an effective and more objective post-earthquake damage assessment with straightforward documentation, allowing future correlation of damage information with the mechanical properties of the model. RGB images were used for two purposes, i.e., 3D model generation and damage detection. For the 3D models, an image-based pipeline was developed to automatically create level of detail (LOD) models, specifically LOD3, using structure-from-motion and semantic segmentation, in order to produce a geometrical representation of a building. In contrast to existing works, the method does not rely on post-processing of extremely precise 3D models, does not use predefined templates, does not require human manipulation, and provides semantic understanding of the final model's components. Cracks were detected using state-of-the-art deep learning approaches, which were complemented with a TOPO-Loss function that does not require pixel-precise labels and emphasizes the continuity of the crack topology. When assessing the mechanical effect of a crack, not only the crack geometry but also the crack opening in Mode I and II are impo
ISBN (digital): 9781510649569
ISBN (print): 9781510649569; 9781510649552
The article presents the use of machine vision to automate quality control (QC) of metallic surfaces. QC includes detection of selected defects of the metallic surface, namely scratches and cracks. Imaging using the scatter method is proposed, resulting in greater contrast. The article provides a detailed description of the measurement stand, image acquisition method, and image analysis algorithm. The project's principal aim is to construct an automatic system that checks the state of the surface at a frequency of 6 Hz.
This master's thesis focuses on the cutting-edge application of AI in developing intrusion detection systems (IDS) for unmanned aerial vehicles (UAVs) in smart cities. The objective is to address the escalating problem of UAV intrusions, which pose a significant risk to the safety and security of citizens and critical infrastructure. The thesis explores the current state of the art and provides a comprehensive understanding of recent advancements in the field, encompassing both physical and network attacks. The literature review examines various techniques and approaches employed in the development of AI-based IDS, including machine learning algorithms, computer vision technologies, and edge computing. A proposed solution leveraging computer vision technologies is presented to detect and identify intruding UAVs in the sky effectively. The system employs machine learning algorithms to analyze video feeds from city-installed cameras, enabling real-time identification of potential intrusions. The proposed approach encompasses the detection of unauthorized drones, dangerous UAVs, and UAVs carrying suspicious payloads. Moreover, the thesis introduces a CycleGAN network for image denoising that can translate noisy images to clean images without the need for paired training data. This approach employs two generators and two discriminators, incorporating a cycle consistency loss that ensures the generated images align with their corresponding input images. Furthermore, a distributed architecture is proposed for processing collected images using an edge-offloading approach within the UAV network. This architecture allows flying and ground cameras to leverage the computational capabilities of their IoT peers to process captured images. A hybrid neural network is developed to predict, based on input tasks, the potential edge computers capable of real-time processing. The edge-offloading approach reduces the computational burden on the centralized system a
Clearance measurement in parts assembly is trending toward high-precision, noncontact methods. This work aims to measure clearance by image processing based on machine vision. The machine vision system is to highlight ...
At present, many adaptations of object detection have been developed. Object detection means identifying the name of an object and its other characteristics in an image or a video. This field is known to ...
Face presentation attacks, also known as spoofing attacks, pose a substantial threat to biometric systems that rely on facial recognition systems, such as access control systems, mobile payments, and identity verifica...
ISBN (print): 9789819759330; 9789819759347
The field of computer vision research has been experiencing rapid and remarkable development in recent years, aiming to analyze image and video data through increasingly sophisticated machine learning models. In this research domain, capturing and extracting relevant features plays a crucial role in understanding the detailed content and semantics of image and video data. Among these features, skeleton data, which represents the position and movements of human body parts and is simple and independent of external factors, has proven highly effective in solving human action recognition problems. Consequently, many researchers have shown interest and proposed various skeleton data extraction models following different approaches. In this study, we introduce the Omni-TransPose model for skeleton data extraction, constructed by combining the OmniPose model with the Transformer architecture. We conducted experiments on the MPII dataset, using the Percentage of Correct Keypoints (PCK) metric to evaluate the effectiveness of the new model. The experimental results were compared with the original OmniPose model, demonstrating a significant improvement in skeleton extraction and recognition, thereby enhancing the capability of human action recognition. This work promises to provide an efficient and powerful method for human action recognition, with broad potential applications in practical scenarios.
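The PCK metric used in this abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function name `pck`, the reference length `ref_length`, and the threshold factor `alpha` are assumptions: a predicted keypoint is counted as correct when it lies within `alpha * ref_length` of the ground truth. The abstract does not state which reference length or threshold was used.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pck(pred: Sequence[Point], truth: Sequence[Point],
        ref_length: float, alpha: float = 0.5) -> float:
    """Fraction of predicted keypoints within alpha * ref_length of the ground truth."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and of equal length")
    limit = alpha * ref_length
    correct = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= limit)
    return correct / len(truth)


# Hypothetical keypoints in pixels; their distances to the ground truth are 1, 6, 0 and 9.
truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred = [(11.0, 10.0), (20.0, 26.0), (30.0, 30.0), (49.0, 40.0)]
score = pck(pred, truth, ref_length=10.0)
print(score)  # 0.5
```

With `ref_length=10.0` and the default `alpha=0.5`, the distance limit is 5 pixels, so two of the four keypoints count as correct.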