ISBN (print): 9781665448994
Being able to detect irrelevant test examples with respect to deployed deep learning models is paramount to properly and safely using them. In this paper, we address the problem of rejecting such out-of-distribution (OOD) samples in a fully sample-free way, i.e., without requiring any access to in-distribution or OOD samples. We propose several indicators which can be computed alongside the prediction with little additional cost, assuming white-box access to the network. These indicators prove useful, stable and complementary for OOD detection on frequently-used architectures. We also introduce a surprisingly simple, yet effective summary OOD indicator. This indicator is shown to perform well across several networks and datasets and can furthermore be easily tuned as soon as samples become available. Lastly, we discuss how to exploit this summary in real-world settings.
ISBN (print): 9781665448994
Task 5 of the AI City Challenge 2021, Natural Language-Based Vehicle Tracking, is a natural language-based vehicle retrieval task that requires retrieving a single-camera track using a set of three natural language descriptions of the specific targets. In this paper, we present our methods to tackle the difficulties of this task. Experiments with our approaches on the competition dataset from the AI City Challenge 2021 show that our techniques achieve a Mean Reciprocal Rank score of 0.1701 on the public test dataset and 0.1571 on the private test dataset.
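For readers unfamiliar with the metric reported above, Mean Reciprocal Rank averages, over all queries, the reciprocal of the rank at which the correct item appears. The following is a minimal sketch of the standard definition, not the challenge's official scoring code; the function name and the toy track identifiers are hypothetical.

```python
def mean_reciprocal_rank(rankings, targets):
    """Average of 1/rank of the correct item over all queries.

    rankings: one ranked list of candidate ids per query (best first).
    targets:  the correct id for each query.
    A query whose target is missing from its ranking contributes 0.
    """
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)  # ranks are 1-based
    return total / len(targets)


# Toy example with hypothetical track ids: the correct tracks are ranked 2nd, 1st and 3rd.
rankings = [["t3", "t1", "t2"], ["t2", "t3", "t1"], ["t1", "t2", "t3"]]
targets = ["t1", "t2", "t3"]
mrr = mean_reciprocal_rank(rankings, targets)  # (1/2 + 1 + 1/3) / 3
```

Under this definition, a score of 0.1701 indicates that the correct track is typically ranked several positions below the top of the retrieved list.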
ISBN (print): 9781665448994
For convolutional neural networks (CNNs), a common hypothesis that explains both their generalization capability and their characteristic brittleness is that these models are implicitly regularized to rely on imperceptible high-frequency patterns more than humans do. This hypothesis has seen some empirical validation, but most works do not rigorously divide the image frequency spectrum. We present a model that divides the spectrum into disjoint discs based on the distribution of energy and apply simple feature importance procedures to test whether high frequencies are more important than lower ones. We find evidence that mid- or high-level frequencies are disproportionately important for CNNs. The evidence is robust across different datasets and networks. Moreover, we find that the network's attributes, such as architecture and depth, have diverse effects on frequency bias and robustness in general.
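The abstract above does not spell out how the spectrum is divided, so the following is only one plausible reading: choose concentric radii in a centered power spectrum so that each band holds roughly the same share of total energy. The function name, the equal-share rule, and the toy spectrum are all assumptions made for illustration, not the authors' procedure.

```python
import math


def energy_band_radii(power, num_bands):
    """Return outer radii of concentric bands holding roughly equal energy.

    power: centered 2D power spectrum as a list of rows (DC term in the middle).
    Assumption: bands are split at equal shares of cumulative energy.
    """
    height, width = len(power), len(power[0])
    cy, cx = height // 2, width // 2
    # Pair every frequency bin with its distance from the spectrum centre.
    bins = sorted(
        (math.hypot(y - cy, x - cx), power[y][x])
        for y in range(height)
        for x in range(width)
    )
    total = sum(energy for _, energy in bins)
    radii, cumulative = [], 0.0
    for radius, energy in bins:
        cumulative += energy
        # Close a band each time the next equal-energy share is reached.
        while (len(radii) < num_bands - 1
               and cumulative >= total * (len(radii) + 1) / num_bands):
            radii.append(radius)
    radii.append(bins[-1][0])  # the last band extends to the largest radius
    return radii


# Toy 3x3 spectrum with uniform energy, split into two bands.
power = [[1.0] * 3 for _ in range(3)]
radii = energy_band_radii(power, 2)
```

In this sketch a single bin holding a very large share of the energy (for example the DC term) can produce repeated radii, that is, empty bands; a real analysis would need to decide how to handle that case.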
ISBN (print): 9780769549903
Recently released depth cameras provide effective estimation of 3D positions of skeletal joints in temporal sequences of depth maps. In this work, we propose an efficient yet effective method to recognize human actions based on the positions of joints. First, the body skeleton is decomposed into a set of kinematic chains, and the position of each joint is expressed in a locally defined reference system, which makes the coordinates invariant to body translations and rotations. A multi-part bag-of-poses approach is then defined, which permits the separate alignment of body parts through a nearest-neighbor classification. Experiments conducted on the Florence 3D Action dataset and the MSR Daily Activity dataset show promising results.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Interactive substitute recommendation for fashion products improves the online retail customer experience. Traditional fashion search platforms incorporate product metadata between the query products and the products to be retrieved. In this paper, we propose DAtRNet, an attribute representation network to disentangle the features in the query product. It is used to recommend attribute-aware substitute items based on the conditional similarity of the retrieved products. The proposed architecture relies on attribute-level similarity, providing a fine-grained recommendation. In addition, a concurrent axial attention mechanism is proposed that generates a global information embedding and adaptively re-calibrates the soft attention masks. Overall, the end-to-end framework enables the system to disentangle the attribute features and deal with them independently, enhancing its flexibility towards one or multiple attributes. The proposed method outperforms the state-of-the-art by a significant margin.
ISBN (print): 9781509014378
In this paper, we present a distributed embedded vision system that enables surround scene analysis and vehicle threat estimation. The proposed system analyzes the surroundings of the ego-vehicle using four cameras, each connected to a separate embedded processor. Each processor runs a set of optimized vision-based techniques to detect surrounding vehicles, so that the entire system operates at real-time speeds. This setup has been demonstrated on multiple vehicle testbeds with high levels of robustness under real-world driving conditions and is scalable to additional cameras. Finally, we present a detailed evaluation which shows over 95% accuracy and operation at nearly 15 frames per second.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
In this paper, we analyze the impact of training strategies, transfer learning, and domain knowledge on two biometric problems: three-class oculus classification and fingerprint sensor classification. To analyze these problems, we consider a deep-learning-based architecture and evaluate our results on benchmark contact-lens datasets such as IIIT-D, ND, and IIT-K (our model is publicly available) and on fingerprint datasets such as FVC-2002, FVC-2004, FVC-2006, IIITD-MOLF, and IIT-K. An in-depth feature analysis of the various proposed deep-learning models shows that training in different ways, along with transfer learning and domain knowledge, plays a vital role in determining the learning ability of a network.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Neural Architecture Search (NAS) can automatically design model architectures with better performance. Existing work either searches for a local, block-like architecture and stacks it to construct the entire model, or searches the entire model based on a manually designed benchmark module. No existing method directly searches the architecture of the global (entire) model at the operation level. The purpose of this article is to search the entire model directly in the operation-level search space. We analyze the search spaces of past methods that search for local architectures, and then propose a working mode for global model architecture search named CAM. CAM decouples the architectural parameters of the entire model, so that the entire model architecture search can be completed with few architecture parameters. In the experiments, the proposed method obtains a test error of 2.68% on CIFAR-10 at the global architecture level, which is comparable to state-of-the-art local architecture search methods.
ISBN (print): 9798350302493
Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, existing methods require large training sets from a specific domain of interest, and often fail on general-content images. In this paper, we propose a diverse completion method that does not require a training set and can thus treat arbitrary images from any domain. Our internal diverse completion (IDC) approach draws inspiration from recent single-image generative models that are trained on multiple scales of a single image, adapting them to the extreme setting in which only a small portion of the image is available for training. We illustrate the strength of IDC on several datasets, using both user studies and quantitative comparisons.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
The AdderNet was recently developed as a way to implement deep neural networks without needing multiplication operations to combine weights and inputs. Instead, absolute values of the difference between weights and inputs are used, greatly reducing the gate-level implementation complexity. Training of AdderNets is challenging, however, and the loss curves during training tend to fluctuate significantly. In this paper we propose the Conjugate Adder Network, or CAddNet, which uses the difference between the absolute values of conjugate pairs of inputs and the weights. We show that this can be implemented simply via a single minimum operation, resulting in a roughly 50% reduction in logic gate complexity as compared with AdderNets. The CAddNet method also stabilizes training as compared with AdderNets, yielding training curves similar to standard CNNs.
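To make the contrast described above concrete, the sketch below compares a conventional multiply-accumulate response with an AdderNet-style response, which replaces products with absolute differences between inputs and weights. It illustrates only the baseline AdderNet operation for a single filter position; the conjugate-pair formulation of CAddNet is not reproduced here because the abstract does not give its details. Function names and the toy values are hypothetical.

```python
def multiply_response(inputs, weights):
    """Conventional filter response: sum of elementwise products."""
    return sum(x * w for x, w in zip(inputs, weights))


def adder_response(inputs, weights):
    """AdderNet-style response: negative L1 distance between inputs and weights.

    No multiplications are needed; the response is largest (zero) when the
    input patch matches the weights exactly.
    """
    return -sum(abs(x - w) for x, w in zip(inputs, weights))


# Toy flattened patch and kernel (hypothetical values).
patch = [1.0, -2.0, 0.5]
kernel = [1.0, -1.0, 0.0]
conv_out = multiply_response(patch, kernel)  # 1.0 + 2.0 + 0.0
adder_out = adder_response(patch, kernel)    # -(0.0 + 1.0 + 0.5)
```

Because the adder response is never positive, its output statistics differ from those of a standard convolution, which is one reason such layers are typically followed by normalization.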