The AdderNet was recently developed as a way to implement deep neural networks without needing multiplication operations to combine weights and inputs. Instead, absolute values of the difference between weights and in...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
The AdderNet was recently developed as a way to implement deep neural networks without needing multiplication operations to combine weights and inputs. Instead, absolute values of the difference between weights and inputs are used, greatly reducing the gate-level implementation complexity. Training of AdderNets is challenging, however, and the loss curves during training tend to fluctuate significantly. In this paper we propose the Conjugate Adder Network, or CAddNet, which uses the difference between the absolute values of conjugate pairs of inputs and the weights. We show that this can be implemented simply via a single minimum operation, resulting in a roughly 50% reduction in logic gate complexity as compared with AdderNets. The CAddNet method also stabilizes training as compared with AdderNets, yielding training curves similar to standard CNNs.
Architectures based on siamese networks with triplet loss have shown outstanding performance on the image-based similarity search problem. This approach attempts to discriminate between positive (relevant) and negativ...
详细信息
ISBN:
(纸本)9781665448994
Architectures based on siamese networks with triplet loss have shown outstanding performance on the image-based similarity search problem. This approach attempts to discriminate between positive (relevant) and negative (irrelevant) items. However, it undergoes a critical weakness. Given a query, it cannot discriminate weakly relevant items, for instance, items of the same type but different color or texture as the given query, which could be a serious limitation for many real-world search applications. Therefore, in this work, we present a quadruplet-based architecture that overcomes the aforementioned weakness. Moreover, we present an instance of this quadruplet network, which we call Sketch-QNet, to deal with the color sketch-based image retrieval (CSBIR) problem, achieving new state-of-the-art results.
Manifold learning has been effectively used in computervision applications for dimensionality reduction that improves classification performance and reduces computational load. Grassmann manifolds are well suited for...
详细信息
ISBN:
(纸本)9780769549903
Manifold learning has been effectively used in computervision applications for dimensionality reduction that improves classification performance and reduces computational load. Grassmann manifolds are well suited for computervision problems because they promote smooth surfaces where points are represented as subspaces. In this paper we propose Grassmannian Sparse Representations (GSR), a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least squares loss L1-norm minimization for optimal classification. We further introduce a new descriptor that we term Motion Depth Surface (MDS) and compare its classification performance against the traditional Motion History Image (MHI) descriptor. We demonstrate the effectiveness of GSR on computationally intensive 3D action sequences from the Microsoft Research 3D-Action and 3D-Gesture datasets.
We present a semantic segmentation algorithm for RGB remote sensing images. Our method is based on the Dilated Stacked U-Nets architecture. This state-of-the-art method has been shown to have good performance in other...
详细信息
ISBN:
(数字)9781538661000
ISBN:
(纸本)9781538661000
We present a semantic segmentation algorithm for RGB remote sensing images. Our method is based on the Dilated Stacked U-Nets architecture. This state-of-the-art method has been shown to have good performance in other applications. We perform additional post-processing by blending image tiles and degridding the result. Our method gives competitive results on the DeepGlobe dataset.
Dynamic vision sensor event cameras produce a variable data rate stream of brightness change events. Event production at the pixel level is controlled by threshold, bandwidth, and refractory period bias current parame...
详细信息
ISBN:
(纸本)9781665448994
Dynamic vision sensor event cameras produce a variable data rate stream of brightness change events. Event production at the pixel level is controlled by threshold, bandwidth, and refractory period bias current parameter settings. Biases must be adjusted to match application requirements and the optimal settings depend on many factors. As a first step towards automatic control of biases, this paper proposes fixed-step feedback controllers that use measurements of event rate and noise. The controllers regulate the event rate within an acceptable range using threshold and refractory period control, and regulate noise using bandwidth control. Experiments demonstrate model validity and feedback control.
Climate change is a pressing issue that is currently affecting and will affect every part of our lives. It's becoming incredibly vital we, as a society, address the climate crisis as a universal effort, including ...
详细信息
ISBN:
(纸本)9781665448994
Climate change is a pressing issue that is currently affecting and will affect every part of our lives. It's becoming incredibly vital we, as a society, address the climate crisis as a universal effort, including those in the computervision (CV) community. In this work, we analyze the total cost of CO2 emissions by breaking it into (1) the architecture creation cost and (2) the life-time evaluation cost. We show that over time, these costs are non-negligible and are having a direct impact on our future. Importantly, we conduct an ethical analysis of how the CV-community is unintentionally overlooking its own ethical AI principles by emitting this level of CO2. To address these concerns, we propose adding "enforcement" as a pillar of ethical AI and provide some recommendations for how architecture designers and broader CV community can curb the climate crisis.
Being able to detect irrelevant test examples with respect to deployed deep learning models is paramount to properly and safely using them. In this paper, we address the problem of rejecting such out-of-distribution (...
详细信息
ISBN:
(纸本)9781665448994
Being able to detect irrelevant test examples with respect to deployed deep learning models is paramount to properly and safely using them. In this paper, we address the problem of rejecting such out-of-distribution (OOD) samples in a fully sample-free way, i.e., without requiring any access to in-distribution or OOD samples. We propose several indicators which can be computed alongside the prediction with little additional cost, assuming white-box access to the network. These indicators prove useful, stable and complementary for OOD detection on frequently-used architectures. We also introduce a surprisingly simple, yet effective summary OOD indicator. This indicator is shown to perform well across several networks and datasets and can furthermore be easily tuned as soon as samples become available. Lastly, we discuss how to exploit this summary in real-world settings.
Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, exis...
详细信息
ISBN:
(纸本)9798350302493
Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, existing methods require large training sets from a specific domain of interest, and often fail on general-content images. In this paper, we propose a diverse completion method that does not require a training set and can thus treat arbitrary images from any domain. Our internal diverse completion (IDC) approach draws inspiration from recent single-image generative models that are trained on multiple scales of a single image, adapting them to the extreme setting in which only a small portion of the image is available for training. We illustrate the strength of IDC on several datasets, using both user studies and quantitative comparisons.
Despite the rapid progress in deep visual recognition, modern computervision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
Despite the rapid progress in deep visual recognition, modern computervision datasets significantly overrepresent the developed world and models trained on such datasets underperform on images from unseen geographies. We investigate the effectiveness of unsupervised domain adaptation (UDA) of such models across geographies at closing this performance gap. To do so, we first curate two shifts from existing datasets to study the Geographical DA problem, and discover new challenges beyond data distribution shift: context shift, wherein object surroundings may change significantly across geographies, and subpopulation shift, wherein the intra-category distributions may shift. We demonstrate the inefficacy of standard DA methods at Geographical DA, highlighting the need for specialized geographical adaptation solutions to address the challenge of making object recognition work for everyone.
Few-shot learning features the capability of generalizing from a few examples. In this paper, we first identify that a discriminative feature space, namely a rectified metric space, that is learned to maintain the met...
详细信息
ISBN:
(纸本)9781665448994
Few-shot learning features the capability of generalizing from a few examples. In this paper, we first identify that a discriminative feature space, namely a rectified metric space, that is learned to maintain the metric consistency from training to testing, is an essential component to the success of metric-based few-shot learning. Numerous analyses indicate that a simple modification of the objective can yield substantial performance gains. The resulting approach, called rectified metric propagation (ReMP), further optimizes an attentive prototype propagation network, and applies a repulsive force to make confident predictions. Extensive experiments demonstrate that the proposed ReMP is effective and efficient, and outperforms the state of the arts on various standard few-shot learning datasets.
暂无评论