Depth information is useful in many image processing and computer vision applications, but in photography, depth information is lost in the process of projecting a real-world scene onto a 2D plane. Extracting depth in...
ISBN (print): 9781665493468
Machine learning-based algorithms using fully convolutional networks (FCNs) have been a promising option for medical image segmentation. However, such deep networks silently fail if input samples are drawn far from the training data distribution, thus causing critical problems in automatic data processing pipelines. To overcome such out-of-distribution (OoD) problems, we propose a novel OoD score formulation and its regularization strategy by applying an auxiliary add-on classifier to an intermediate layer of an FCN, where the auxiliary module is helpful for analyzing the encoder output features by taking their class information into account. Our regularization strategy trains the module along with the FCN via the principle of outlier exposure, so that our model can be trained to distinguish OoD samples from normal ones without modifying the original network architecture. Our extensive experimental results demonstrate that the proposed approach can successfully conduct effective OoD detection without loss of segmentation performance. In addition, our module can provide reasonable explanation maps along with OoD scores, which can enable users to analyze the reliability of predictions.
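A minimal sketch of the general mechanism described above (not the paper's exact score or loss): an auxiliary classifier head is attached to the encoder's intermediate features, an OoD score is derived from its logits, and the head is trained with an outlier-exposure term. The energy-style score, the uniform-target exposure loss, and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxOoDHead(nn.Module):
    """Illustrative add-on classifier attached to an FCN encoder's intermediate features."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) encoder output features -> class logits
        return self.fc(self.pool(feats).flatten(1))

def ood_score(logits: torch.Tensor) -> torch.Tensor:
    # Energy-style score as a stand-in for the paper's formulation:
    # larger value -> more likely out-of-distribution.
    return -torch.logsumexp(logits, dim=1)

def outlier_exposure_loss(logits_in, labels_in, logits_out, lam=0.5):
    # In-distribution samples: ordinary cross-entropy on the auxiliary head.
    ce = F.cross_entropy(logits_in, labels_in)
    # Exposed outliers: push predictions toward the uniform distribution.
    uniform_kl = -(F.log_softmax(logits_out, dim=1)).mean()
    return ce + lam * uniform_kl
```

Because only the auxiliary head and its loss are added, the original segmentation network and its architecture stay untouched, matching the claim in the abstract.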
ISBN (print): 9798350318920; 9798350318937
Human Pose (HP) estimation is actively researched because of its wide range of applications. However, even estimators pre-trained on large datasets may not perform satisfactorily due to a domain gap between the training and test data. To address this issue, we present our approach combining Active Learning (AL) and Transfer Learning (TL) to adapt HP estimators to individual video domains efficiently. For efficient learning, our approach quantifies (i) the estimation uncertainty based on the temporal changes in the estimated heatmaps and (ii) the unnaturalness in the estimated full-body HPs. These quantified criteria are then effectively combined with the state-of-the-art representativeness criterion to select uncertain and diverse samples for efficient HP estimator learning. Furthermore, we reconsider the existing Active Transfer Learning (ATL) method to introduce novel ideas related to the retraining methods and Stopping Criteria (SC). Experimental results demonstrate that our method enhances learning efficiency and outperforms comparative methods. Our code is publicly available at: https://***/ImIntheMiddle/VATL4Pose-WACV2024
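The sample-selection idea can be sketched roughly as below. The min-max normalization, the equal default weights, and the temporal-difference uncertainty proxy are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def temporal_heatmap_uncertainty(heatmaps):
    # heatmaps: (T, J, H, W) estimated joint heatmaps over consecutive frames.
    # Larger frame-to-frame change is taken as a proxy for estimation uncertainty.
    diffs = np.abs(np.diff(heatmaps, axis=0))
    return diffs.reshape(len(heatmaps) - 1, -1).mean(axis=1)

def normalize(x):
    # Min-max normalize a criterion so different scales are comparable.
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-8)

def select_samples(uncertainty, unnaturalness, representativeness, k, weights=(1.0, 1.0, 1.0)):
    """Combine per-frame criteria and return indices of the k frames to annotate next."""
    w_u, w_n, w_r = weights
    score = (w_u * normalize(uncertainty)
             + w_n * normalize(unnaturalness)
             + w_r * normalize(representativeness))
    return np.argsort(-score)[:k]  # highest combined score first
```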
The rapid development of machine vision applications demands hardware that can sense and process visual information in a single monolithic unit to avoid redundant data transfer. Here, we design and demonstrate a monolithic vision enhancement chip with light-sensing, memory, digital-to-analog conversion, and processing functions by implementing 619 pixels with 8582 transistors and physical dimensions of 10 mm by 10 mm based on a wafer-scale two-dimensional (2D) monolayer molybdenum disulfide (MoS2). The light-sensing function with analog MoS2 transistor circuits offers low noise and high photosensitivity. Furthermore, we adopt a MoS2 analog processing circuit to dynamically adjust the photocurrent of individual imaging sensors, which yields a high dynamic light-sensing range greater than 90 decibels. The vision chip enables image-processing applications such as contrast enhancement and noise reduction. This large-scale monolithic chip based on 2D semiconductors combines light sensing, memory, and processing for artificial machine vision applications, demonstrating the potential of 2D semiconductors for future electronics.
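As a quick sanity check on the quoted figure, assuming the dynamic range is reported as 20·log10 of the ratio between the largest and smallest resolvable photocurrents (the usual convention for image sensors):

```python
import math

def dynamic_range_db(i_max: float, i_min: float) -> float:
    # Dynamic range in decibels from the largest and smallest resolvable photocurrents.
    return 20.0 * math.log10(i_max / i_min)

# A signal ratio of roughly 3.2e4 between the strongest and weakest detectable
# photocurrents already corresponds to the reported >90 dB range.
print(dynamic_range_db(3.2e4, 1.0))  # ~90.1 dB
```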
ISBN (print): 9798350318920; 9798350318937
Within (semi-)automated visual industrial inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small defect patterns in pixel size on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To alleviate this issue and advance the current state of the art in unsupervised visual inspection, this work proposes a DifferNet-based solution enhanced with attention modules: AttentDifferNet. It improves image-level detection and classification capabilities on three visual anomaly detection datasets for industrial inspection: InsPLAD-fault, MVTec AD, and Semiconductor Wafer. Compared to the state of the art, AttentDifferNet achieves improved results, which are highlighted throughout our qualitative and quantitative study. Our quantitative evaluation shows an average improvement over DifferNet of 1.77 +/- 0.25 percentage points in overall AUROC across the three datasets, reaching state-of-the-art results on InsPLAD-fault, an industrial inspection in-the-wild dataset. As our AttentDifferNet variants show great promise in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for industrial anomaly detection both in the wild and in controlled environments.
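The abstract does not specify the attention design, so the following is only a generic squeeze-and-excitation-style channel-attention block, shown as one common way such modules are inserted into a feature extractor; it is not necessarily the exact AttentDifferNet variant.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention: a common way to add
    attention to a backbone's feature maps (illustrative, not the paper's exact design)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-weight each feature channel before the features are passed on
        # to the downstream density estimator.
        return x * self.gate(x)
```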
Deep learning, a subset of machine learning within artificial intelligence, has been successful in medical image analysis in vascular surgery. Unlike traditional computer-based segmentation methods that manually extract features from input images, deep learning methods learn image features and classify data without making prior assumptions. Convolutional neural networks, the main type of deep learning for computer vision processing, are neural networks with multilevel architecture and weighted connections between nodes that can "auto-learn" through repeated exposure to training data without manual input or supervision. These networks have numerous applications in vascular surgery imaging analysis, particularly in disease classification, object identification, semantic segmentation, and instance segmentation. The purpose of this review article was to review the relevant concepts of machine learning image analysis and its application to the field of vascular surgery. (c) 2023 Elsevier Inc. All rights reserved.
Inspection of components using machine vision technologies provides solutions for quality and process control. This technique is used in various applications such as automotive, pharmaceutical, food and beverage, elect...
In the era of rapid technological advancement, computer vision has emerged as a transformative force, reshaping the landscape of Artificial Intelligence (AI) and Machine Learning (ML). This comprehensive review paper ...
Airborne platforms and satellites provide rich sensor data in the form of hyperspectral images (HSI), which are crucial for numerous vision-related tasks such as feature extraction, image enhancement, and data synthesis. This article reviews the contextual importance and applications of generative artificial intelligence (GAI) in the advancement of HSI processing. GAI methods address the inherent challenges of HSI data, such as high dimensionality, noise, and the need to preserve spectral-spatial correlations, rendering them indispensable for modern HSI analysis. Generative neural networks, including generative adversarial networks and denoising diffusion probabilistic models, are highlighted for their superior performance in classification, segmentation, and object identification tasks, often surpassing traditional approaches such as U-Nets, autoencoders, and deep convolutional neural networks. Diffusion models showed competitive performance in tasks such as feature extraction and image resolution enhancement, particularly in terms of inference time and computational cost. Transformer architectures combined with attention mechanisms further improved the accuracy of generative methods, particularly for preserving spectral and spatial information in tasks such as image translation, data augmentation, and data synthesis. Despite these advancements, challenges remain, particularly in developing computationally efficient models for super-resolution and data synthesis. In addition, novel evaluation metrics tailored to the complex nature of HSI data are needed. This review underscores the potential of GAI in addressing these challenges while presenting its current strengths, limitations, and future research directions.
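For readers unfamiliar with denoising diffusion probabilistic models, the forward (noising) process they rely on can be sketched for a hyperspectral cube as follows; the linear beta schedule, number of steps, and cube dimensions are illustrative assumptions.

```python
import torch

def ddpm_forward_sample(x0: torch.Tensor, t: int, betas: torch.Tensor):
    """Sample x_t ~ q(x_t | x_0) for a hyperspectral cube x0 of shape (bands, H, W).

    Uses the closed form x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t is the cumulative product of (1 - beta) up to step t.
    """
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)[t]
    eps = torch.randn_like(x0)
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * eps, eps

# Example with an assumed linear beta schedule over 1000 steps:
betas = torch.linspace(1e-4, 0.02, 1000)
x0 = torch.randn(200, 64, 64)  # toy 200-band hyperspectral patch
x_t, eps = ddpm_forward_sample(x0, t=500, betas=betas)
```

A denoising network trained to predict `eps` from `x_t` then reverses this process at inference time, which is how such models generate or enhance HSI data.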
ISBN (digital): 9781624107115
ISBN (print): 9781624107115
In recent years, algorithms based on machine learning have significantly advanced many technical areas, including computer vision. Since the performance of machine learning applications is data-dependent, a sufficient amount of high-quality data must be available to achieve robust and stable performance. However, the collection of large amounts of real-world data that covers the operational parameters of the AI-based system is often a difficult task because of availability, cost, or even potential danger. Therefore, synthetic data generation is often used to supplement data sets with additional required data samples. In this paper, we propose a baseline for an automated toolchain to generate synthetic image data of aircraft for machine-learning computer vision applications using a flight simulator. Scenario-based approaches have shown applicability to systematically generate valid test cases for system safety evaluation. We leverage a similar approach to generate data for training of AI-based systems. Our approach requires the user to create scenario models using our modelling tool. These models define the operational ranges for a set of parameters that characterize executable scenarios. The scenarios defined by the models are used to automatically produce images from simulations carried out with the FlightGear open-source flight simulator. We distinguish between a static and a dynamic simulation approach. The static approach generates a sequence of independent scenes, while the dynamic approach creates situations that mimic a collision avoidance scenario. With our approach, we can automatically generate large amounts of raw image data covering the relevant parameter ranges based on the models created by the user.
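A rough sketch of how such a scenario-driven generation loop might look for the static approach; the parameter names, ranges, and FlightGear startup flags below are illustrative assumptions rather than the toolchain's actual interface.

```python
import random

# Assumed scenario model: each parameter has an operational range from which
# executable scenarios are sampled (names and ranges are illustrative only).
SCENARIO_MODEL = {
    "aircraft": ["c172p", "777-200"],
    "lat": (47.0, 48.0),            # degrees
    "lon": (8.0, 9.0),              # degrees
    "altitude_ft": (1000, 10000),
    "heading_deg": (0, 359),
}

def sample_scenario(model):
    """Draw one concrete scenario from the operational ranges."""
    return {
        "aircraft": random.choice(model["aircraft"]),
        "lat": random.uniform(*model["lat"]),
        "lon": random.uniform(*model["lon"]),
        "altitude_ft": random.uniform(*model["altitude_ft"]),
        "heading_deg": random.uniform(*model["heading_deg"]),
    }

def flightgear_command(s):
    # Typical FlightGear-style startup options; the exact flags used by the
    # paper's toolchain may differ.
    return [
        "fgfs",
        f"--aircraft={s['aircraft']}",
        f"--lat={s['lat']:.5f}",
        f"--lon={s['lon']:.5f}",
        f"--altitude={s['altitude_ft']:.0f}",
        f"--heading={s['heading_deg']:.0f}",
    ]

if __name__ == "__main__":
    for _ in range(3):  # static approach: independent scenes
        cmd = flightgear_command(sample_scenario(SCENARIO_MODEL))
        print(" ".join(cmd))  # e.g. pass to subprocess.run(cmd) to launch the simulator
```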