ISBN: 9781665448994 (print)
Line art plays a fundamental role in illustration and design, and allows designs to be polished iteratively. However, because it lacks color, it can struggle to convey a final design. In this work, we propose an interactive colorization approach based on a conditional generative adversarial network that takes both the line art and color hints as inputs to produce a high-quality colorized image. Our approach is based on a U-net architecture with a multi-discriminator framework. We propose a Concatenation and Spatial Attention module that generates more consistent, higher-quality line art colorizations from user-given hints. We evaluate on a large-scale illustration dataset, and comparisons with existing approaches corroborate the effectiveness of our approach.
ISBN: 9781665448994 (print)
In this paper, we analyze the classification performance of neural network structures without parametric inference. Making use of neural architecture search, we empirically demonstrate that it is possible to find random-weight architectures, a deep prior, that enable a linear classifier to perform on par with fully trained deep counterparts. Through ablation experiments, we exclude the possibility of winning a weight initialization lottery and confirm that suitable deep priors do not require additional inference. In an extension to continual learning, we investigate the possibility of incremental learning free of catastrophic interference. Under the assumption that classes originate from the same data distribution, a deep prior found on only a subset of classes is shown to allow discrimination of further classes through training of a simple linear classifier.
ISBN: 9781665487399 (digital); 9781665487399 (print)
Learning a common representation space between vision and language allows deep networks to relate objects in an image to their corresponding semantic meaning. We present a model that learns a shared Gaussian mixture representation, imposing the compositionality of the text onto the visual domain without explicit location supervision. By combining the spatial transformer with a representation learning approach, we learn to split images into separately encoded patches to associate visual and textual representations in an interpretable manner. On variations of MNIST and CIFAR10, our model is able to perform weakly supervised object detection and demonstrates its ability to extrapolate to unseen combinations of objects.
ISBN: 9781665487399 (digital); 9781665487399 (print)
Honey fraud and adulteration are an increasing concern globally. Hyperspectral imaging and machine learning can detect adulterated honey within a known set of honey types for which data has been captured at different sugar concentrations. Previous work in this area has used a minimal number of honey types, as sample preparation and data capture are time-consuming. This paper develops a new approach using variational autoencoders (VAEs) for generating adulterated honey data for unseen honey types. The results show that the binary adulteration detector achieves on average 81.3% accuracy on unseen honey types when the generated data is added to the existing training data. Without the generated data during training, the classifier achieves only 44% accuracy on unseen honey types.
ISBN: 9781665487399 (digital); 9781665487399 (print)
We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize semantic segmentation and class-agnostic instance classification in a single network using a pillar-based bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance. We further propose a local clustering algorithm to propagate instance IDs by merging semantic segmentation and affinity predictions. Our experiments on the nuScenes dataset show that our approach outperforms previous proposal-free methods and is comparable to proposal-based methods, which require extra annotation from object detection.
ISBN: 9781665448994 (print)
Deep learning and pattern recognition in smart farming have seen rapid growth as a bridge between crop science and computer vision. One important application is the segmentation of agricultural anomalies such as weeds, standing water, and cloud shadow. Our research focuses on the aerial farmland image dataset known as Agriculture-Vision. We propose a data fusion of the R, G, B, and NIR modalities that enhances feature extraction, and we propose the Efficient Fused Pyramid Network (Fuse-PN) for anomaly pattern segmentation. The proposed encoder module is a bottom-up pathway built on a compound-scaled network, and the decoder module is a top-down pyramid network that enhances features at different scales by combining rich semantic features with lateral connections to low-level features. The proposed approach achieves a mean Dice similarity score of 0.8271 for six agricultural anomaly patterns of the Agriculture-Vision dataset and outperforms various approaches in the literature.
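For readers unfamiliar with the metric reported above, the following is a minimal sketch of the Dice similarity score for binary masks, averaged over classes. The function names `dice_score` and `mean_dice` are hypothetical, the treatment of two empty masks as a perfect match is an assumption, and the paper may aggregate scores differently (for example per image rather than per class).

```python
def dice_score(pred, target):
    """Dice similarity between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(preds, targets):
    """Average the per-class Dice scores (one mask pair per class)."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.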
ISBN: 9781665487399 (print)
There have been outstanding strides towards a future with autonomous vehicles on our roads. While the perception of autonomous vehicles performs well under closed-set conditions, it still struggles to handle the unexpected. This survey provides an extensive overview of anomaly detection techniques based on camera, lidar, radar, multimodal, and abstract object-level data. We provide a systematization that covers the detection approach, the corner case level, the ability to run online, and further attributes. We outline the state of the art and point out current research gaps.
ISBN: 9781665448994 (print)
As the demand for deep learning solutions increases, the need for explainability becomes even more fundamental. In this setting, particular attention has been given to visualization techniques that try to attribute the right relevance to each input pixel with respect to the output of the network. In this paper, we focus on Class Activation Mapping (CAM) approaches, which provide an effective visualization by taking weighted averages of the activation maps. To enhance the evaluation and reproducibility of such approaches, we propose a novel set of metrics to quantify explanation maps, which show better effectiveness and simplify comparisons between approaches. To evaluate the appropriateness of the proposal, we compare different CAM-based visualization methods on the entire ImageNet validation set, fostering proper comparisons and reproducibility.
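The weighted combination of activation maps that CAM approaches rely on can be sketched as follows. This is an illustration only: the function name `class_activation_map` is hypothetical, the maps are plain nested lists rather than network tensors, and the final clamp to non-negative values is an assumption borrowed from common CAM variants, not something the abstract specifies. How the per-channel weights are obtained differs between CAM methods and is not shown.

```python
def class_activation_map(activations, weights):
    """Combine K activation maps (each an H x W nested list) into one map.

    activations: list of K maps; weights: list of K per-channel weights.
    """
    if len(activations) != len(weights):
        raise ValueError("need one weight per activation map")
    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    # Accumulate the weighted contribution of every channel at every pixel.
    for fmap, w in zip(activations, weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    # Assumption: keep only positive evidence, as several CAM variants do.
    return [[max(0.0, v) for v in row] for row in cam]
```

The resulting map is usually upsampled to the input resolution and overlaid on the image; that step is omitted here.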
ISBN: 9781665448994 (print)
This paper describes a CNN in which all CNN-style 2D convolution operations that lower to matrix-matrix multiplication are fully binary. The network is derived from a common building block structure that is consistent with a constructive proof outline showing that binary neural networks are universal function approximators. A top-1 accuracy of 71.24% on the 2012 ImageNet validation set was achieved with a two-step training procedure, and implementation strategies optimized for binary operands are provided.
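To illustrate why binary operands are attractive for matrix multiplication, the sketch below computes the dot product of two vectors with entries in {+1, -1} using only XOR and a bit count. This is the widely used XNOR/popcount identity, shown under the assumption that +1 is encoded as a set bit; it is not taken from the paper, and the names `pack_signs` and `binary_dot` are hypothetical.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] == +1."""
    bits = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors from their packed bits."""
    # Positions where the signs differ contribute -1; all others contribute +1.
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches
```

Each entry of a binary matrix-matrix product is one such dot product between a row of the first matrix and a column of the second, so the multiply-accumulate loop reduces to bitwise operations on packed words.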
ISBN: 9781665448994 (print)
Machine learning models have started to outperform medical experts in some classification tasks. Meanwhile, the question of how these classifiers produce certain results is attracting increasing research attention. Current interpretation methods provide a good starting point for investigating such questions, but they still largely lack a connection to the problem domain. In this work, we show how explanations of an AI system for skin image analysis can be made more domain-specific. We combine Local Interpretable Model-agnostic Explanations (LIME) with the ABCD rule, a diagnostic approach used by dermatologists, and present the results using a Deep Neural Network (DNN) based skin image classifier.