ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Anomaly localization is a practical technology for improving industrial production line efficiency. Because anomalies are manifold and hard to collect, existing unsupervised methods usually rely on anomaly synthesis. However, most of them are biased toward synthesizing structural defects while ignoring the underlying logical constraints. To fill this gap and boost anomaly localization performance, we propose an edge-manipulation-based anomaly synthesis framework, named LogicAL, that produces photo-realistic logical and structural anomalies. We introduce a logical anomaly generation strategy that is adept at breaking logical constraints and a structural anomaly generation strategy that complements structural defect synthesis. We further improve anomaly localization performance by introducing edge reconstruction into the network structure. Extensive experiments on the challenging MVTecLOCO, MVTecAD, VisA and MADsim datasets verify the advantage of the proposed LogicAL on both logical and structural anomaly localization.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Mitigating bias in machine learning models is a critical endeavor for ensuring fairness and equity. In this paper, we propose a novel approach to address bias by leveraging pixel image attributions to identify and regularize regions of images containing significant information about bias attributes. Our method utilizes a model-agnostic approach to extract pixel attributions by employing a convolutional neural network (CNN) classifier trained on small image patches. By training the classifier to predict a property of the entire image using only a single patch, we achieve region-based attributions that provide insights into the distribution of important information across the image. We propose utilizing these attributions to introduce targeted noise into datasets with confounding attributes that bias the data, thereby constraining neural networks from learning these biases and emphasizing the primary attributes. Our approach demonstrates its efficacy in enabling the training of unbiased classifiers on heavily biased datasets.
ISBN (print): 9781665448994
Hypomimia, also known as "facial masking", is a common symptom of Parkinson's Disease (PD). PD is a neurological disorder characterized by non-motor and motor impairments. Hypomimia is the reduction of facial expressiveness, including emotional expressions. In this work, we explore the use of static and dynamic features for the analysis of evoked facial gestures in PD patients. The main contributions of this work are: (1) we propose a multimodal PD detection system based on both static and dynamic features obtained from evoked face gestures; (2) we propose a novel set of 17 dynamic features to characterize facial expressiveness and demonstrate that facial dynamics features can be used to improve PD detection; and (3) we analyze different evoked facial expressions and their performance for PD detection. Different expressions activate different Action Units (AUs), and we analyze to what extent each of these AUs contributes to PD detection. The results show that static features generated by pre-trained deep architectures yield up to 77.36% accuracy for PD detection, and that combining them with dynamic features improves PD detection by up to 13.46 percentage points (from 75.00% to 88.46%). Our experiments also suggest differences in the performance of evoked face gestures in this PD detection task.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Anomaly Detection and Segmentation (AD&S) is crucial for industrial quality control. While existing methods excel at generating anomaly scores for each pixel, practical applications require a binary segmentation to identify anomalies. Because labeled anomalies are absent in many real scenarios, standard practice binarizes these maps using statistics derived from a validation set containing only nominal samples, resulting in poor segmentation performance. This paper addresses this problem by proposing a test-time training strategy to improve segmentation performance. At test time, we can extract rich features directly from anomalous samples to train a classifier that discriminates defects effectively. Our general approach can work downstream of any AD&S method that provides an anomaly score map as output, even in multimodal settings. We demonstrate the effectiveness of our approach over baselines through extensive experimentation and evaluation on MVTec AD and MVTec 3D-AD.
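The nominal-only binarization baseline mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's test-time training method. The abstract says only that "some statistics" from nominal validation samples are used, so the mean-plus-k-standard-deviations rule, the value k = 3, the function names, and the toy score maps below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def nominal_threshold(validation_maps, k=3.0):
    """Threshold = mean + k * std over all pixel scores of nominal-only maps.

    The mean + k*std rule and the default k are illustrative assumptions.
    """
    scores = [s for amap in validation_maps for row in amap for s in row]
    return mean(scores) + k * pstdev(scores)


def binarize(anomaly_map, threshold):
    """Return a 0/1 mask marking pixels whose score exceeds the threshold."""
    return [[1 if s > threshold else 0 for s in row] for row in anomaly_map]


# Toy 2x2 anomaly score maps (made-up numbers, nominal samples only).
nominal_maps = [
    [[0.10, 0.12], [0.11, 0.09]],
    [[0.08, 0.10], [0.13, 0.07]],
]
# Toy test map with one clearly high-scoring pixel.
test_map = [[0.10, 0.90], [0.12, 0.11]]

threshold = nominal_threshold(nominal_maps)
mask = binarize(test_map, threshold)
print(threshold, mask)
```

Because the threshold is computed without seeing any anomalous pixel, it depends only on how nominal scores are spread, which may be one reason such a baseline segments poorly.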
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
The recent advancements in Text-to-Video Artificial Intelligence Generated Content (AIGC) have been remarkable. Compared with traditional videos, the assessment of AIGC videos encounters various challenges: visual inconsistencies that defy common sense, discrepancies between content and the textual prompt, and distribution gaps between various generative models. To address these challenges, in this work we categorize the assessment of AIGC video quality into three dimensions: visual harmony, video-text consistency, and domain distribution gap. For each dimension, we design specific modules to provide a comprehensive quality assessment of AIGC videos. Furthermore, our research identifies significant variations in visual quality, fluidity, and style among videos generated by different text-to-video models. Predicting the source generative model can make the AIGC video features more discriminative, which enhances the quality assessment performance. The proposed method was used by the third-place winner of the NTIRE 2024 Quality Assessment for AI-Generated Content - Track 2 Video, demonstrating its effectiveness.
ISBN (print): 9781665448994
Functional data analysis (FDA) is focused on various statistical tasks, including inference, for observations that vary over a continuum, which are not effectively addressed by multivariate methods. A feature of these functional observations is the presence of two distinct forms of variability: amplitude that describes differences in magnitudes of features, e.g., extrema, and phase that describes differences in timings of amplitude features. One area of focus in FDA is the classification of new observations based on previously observed training data that has been split into pre-defined classes. Existing methods fail to directly account for both phase and amplitude variability, and work under the restrictive assumption that functional observations are measured on a common, fine grid over the input domain. In this work, we address these issues directly by formulating a Bayesian hierarchical model for irregular, fragmented or sparsely sampled functional observations, where training data from different classes are available. Our approach builds on a recently developed inferential framework for incomplete functional observations and the elastic FDA framework for characterizing amplitude and phase variability. The approach operates by inferring individual parameters that separately track amplitude and phase, which can be combined to infer complete functions underlying each observation, and a class parameter, which can be used to discern the class membership of an observation based on the training data. We validate the proposed framework using simulation studies and real data applications, and showcase the advantages of this perspective when both amplitude and phase variability are present in the data.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
This paper introduces a novel cross-camera domain adaptation method to address the challenges associated with achieving consistency and adaptability in cardiovascular disease (CVD) risk assessment using retinal images captured by conventional and portable cameras. The proposed method leverages an enhanced ordinal CVD risk classification approach to predict CVD risk levels, effectively capturing the ordinal relationship and implicit information embedded within retinal images. Additionally, a plug-and-play risk consistency loss is incorporated into the image translation model to ensure alignment in risk assessment between different image domains. Experimental evaluations on diverse datasets demonstrate the effectiveness and superiority of the proposed method in achieving consistent CVD risk assessment across various camera models. The results highlight the potential of the proposed approach to enhance early detection and intervention of CVD, utilizing the convenience and cost-effectiveness of portable retinal imaging technology. Overall, this research contributes to the field of computer-aided medical imaging by providing a robust and adaptable solution for CVD risk assessment, ultimately benefiting patients and healthcare providers in their efforts to combat CVD.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Hyperspectral imaging offers manifold opportunities for applications that may not, or only partially, be achieved within the visual spectrum. Our paper presents a novel approach for Single-Label Hyperspectral Image Classification, demonstrated through the example of a key challenge faced by agricultural seed producers: seed purity testing. We employ Self-Supervised Learning and Masked Image Modeling techniques to tackle this task. Recognizing the challenges and costs associated with acquiring hyperspectral data, we aim to develop a versatile method capable of working with visible-spectrum data, arbitrary combinations of spectral bands (multispectral data), and hyperspectral sensor data. By integrating RGB and hyperspectral data, we leverage the detailed spatial information from RGB images and the rich spectral information from hyperspectral data to enhance the accuracy of seed classification. Through evaluations in various real-life scenarios, we demonstrate the flexibility, scalability, and efficiency of our approach.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
While there are many models for instance segmentation, PolarMask stands out as a unique one that represents an object in a Polar coordinate system. With an anchor-box-free design and a single-stage framework that conducts detection and segmentation at the same time, PolarMask has been shown to balance efficiency and accuracy. Hence, it can be easily connected with other downstream real-time applications. In this work, we observe that there are two deficiencies associated with PolarMask: (i) an inability to represent concave objects and (ii) inefficiency in using ray regression. We propose MP-PolarMask (Multi-Point PolarMask) by taking advantage of multiple Polar systems. The main idea is to extend from one main Polar system to four auxiliary Polar systems, making the model capable of representing more complicated convex-and-concave-mixed shapes. We validate MP-PolarMask on both general objects and food objects of the COCO dataset, and the results demonstrate significant improvements of 13.69% in AP_L and 7.23% in AP over PolarMask with 36 rays.
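To make the single-center Polar representation concrete, here is a minimal sketch of how one center point plus a list of ray lengths can be turned into a contour. It is not the PolarMask or MP-PolarMask implementation. The uniform angular spacing, the function names, and the toy center and lengths are assumptions for illustration; only the count of 36 rays comes from the abstract.

```python
import math


def polar_to_contour(center, ray_lengths):
    """Convert a center and per-ray lengths into (x, y) contour points.

    Rays are assumed to be uniformly spaced over 360 degrees, starting
    along the positive x-axis (an illustrative convention).
    """
    cx, cy = center
    n = len(ray_lengths)
    points = []
    for i, r in enumerate(ray_lengths):
        theta = 2.0 * math.pi * i / n
        points.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return points


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Toy example: 36 rays of equal length approximate a disc of radius 10.
contour = polar_to_contour((50.0, 50.0), [10.0] * 36)
area = polygon_area(contour)
print(len(contour), area)
```

Since each angle carries exactly one length, a contour built this way crosses every ray from the center only once, which is consistent with the limitation on concave objects that the abstract points out.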
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable, and there is a call in these fields for interpretable models. Recent work in interpretable computer vision provides transparency to these formerly black boxes by utilizing prototypes for case-based explanations, achieving high accuracy in applications including mammography. However, these models struggle with precise feature localization, reasoning on large portions of an image when only a small part is relevant. This paper addresses this gap by proposing a novel multi-scale interpretable deep learning model for mammographic mass margin classification. Our contribution not only offers an interpretable model with reasoning aligned with radiologist practices, but also provides a general architecture for computer vision with user-configurable prototypes ranging from coarse- to fine-grained.