ISBN (digital): 9781538661000
ISBN (print): 9781538661000
In this paper, we study deep transfer learning as a way of overcoming object recognition challenges encountered in digital pathology. Through several experiments, we investigate various uses of pre-trained neural network architectures and different schemes for combining them with random forests for feature selection. Our experiments on eight classification datasets show that densely connected and residual networks consistently yield the best performance across strategies. Network fine-tuning and the use of inner-layer features also appear to be the best-performing strategies, with the former yielding slightly superior results.
ISBN (print): 9781665448994
In this paper, we analyze the classification performance of neural network structures without parametric inference. Making use of neural architecture search, we empirically demonstrate that it is possible to find random-weight architectures, which act as a deep prior, that enable a linear classifier to perform on par with fully trained deep counterparts. Through ablation experiments, we exclude the possibility of winning a weight initialization lottery and confirm that suitable deep priors do not require additional inference. In an extension to continual learning, we investigate the possibility of incremental learning that is free of catastrophic interference. Under the assumption that classes originate from the same data distribution, a deep prior found on only a subset of classes is shown to allow discrimination of further classes through training of a simple linear classifier.
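As a rough illustration of the core idea (not the authors' implementation, and without any architecture search), the sketch below freezes a single randomly initialized ReLU layer and trains only a linear readout on top of it. The function names, the layer width of 16, the perceptron update rule, and the toy data are all assumptions made for this example.

```python
import random


def make_random_layer(n_in, n_hidden, rng):
    # Weights are drawn once and never updated afterwards.
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]


def random_features(x, layer):
    # ReLU applied to a fixed random projection of the input.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in layer]


def train_linear_readout(features, labels, epochs=50, lr=0.1):
    # Perceptron-style updates; only the readout weights and bias change.
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            score = sum(w * fi for w, fi in zip(weights, f)) + bias
            err = y - (1 if score > 0 else 0)
            weights = [w + lr * err * fi for w, fi in zip(weights, f)]
            bias += lr * err
    return weights, bias


rng = random.Random(0)
layer = make_random_layer(2, 16, rng)
frozen_copy = [row[:] for row in layer]  # kept to show the layer is untouched

inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # hypothetical toy data
labels = [0, 1, 1, 1]
feats = [random_features(x, layer) for x in inputs]
weights, bias = train_linear_readout(feats, labels)
```

The only trained parameters are `weights` and `bias`; the random layer plays the role of the fixed prior. Whether such a readout performs well depends on the architecture, which is what the search in the abstract addresses.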
ISBN (print): 9781665448994
Recent research has shown that faces can be obfuscated in large-scale datasets with a minimal performance impact on image classification and downstream tasks like object recognition. In this paper, we investigate the role of face obfuscation in video classification datasets and quantify a more significant reduction in performance caused by face blurring. To reduce such performance effects, we propose a generalized distillation approach in which a privacy-preserving action recognition network is trained with privileged information given by face identities. We show, through experiments performed on Kinetics-400, that the proposed approach can fully close the performance gap caused by face anonymization.
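The abstract does not state the exact training objective. As a hedged sketch, the commonly used generalized distillation loss mixes a hard-label term with a term that matches temperature-softened teacher outputs. Here the teacher stands for a network that sees the privileged information; the imitation weight `lam`, the `temperature`, and the example logits are assumptions, not values from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    # Numerically stable softmax with an optional temperature.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    # Cross-entropy between a target distribution and a prediction.
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def generalized_distillation_loss(student_logits, teacher_logits, label,
                                  lam=0.5, temperature=2.0):
    one_hot = [1.0 if k == label else 0.0 for k in range(len(student_logits))]
    student_probs = softmax(student_logits)
    soft_targets = softmax(teacher_logits, temperature)  # softened teacher output
    hard_term = cross_entropy(one_hot, student_probs)
    soft_term = cross_entropy(soft_targets, student_probs)
    return (1.0 - lam) * hard_term + lam * soft_term


student_logits = [2.0, 0.5, -1.0]   # hypothetical student outputs (blurred input)
teacher_logits = [3.0, 0.0, -2.0]   # hypothetical teacher outputs (privileged input)
loss = generalized_distillation_loss(student_logits, teacher_logits, label=0)
```

Setting `lam=0.0` recovers ordinary supervised training, while `lam=1.0` trains the student purely to imitate the teacher.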
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Honey fraud and adulteration are an increasing concern globally. Hyperspectral imaging and machine learning can detect adulterated honey within a known set of honey types for which data have been captured at different sugar concentrations. Previous work in this area has used a minimal number of honey types, as sample preparation and data capture are time-consuming processes. This paper develops a new approach using variational autoencoders (VAEs) to generate adulterated honey data for unseen honey types. The results show that the binary adulteration detector can achieve on average 81.3% accuracy on unseen honey types when the generated data are added to the existing training data. Without the generated data during training, the classifier achieves only 44% accuracy on unseen honey types.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
We tackle a specific and still not widely addressed aspect of AI robustness: seeking invariance (insensitivity) of model performance to hidden factors of variation in the data. To this end, we employ a two-step strategy that (a) performs unsupervised discovery, via generative models, of sensitive factors that cause models to under-perform, and (b) intervenes on models to make their performance invariant to the influence of these sensitive factors. We consider three separate interventions for robustness: data augmentation, semantic consistency, and adversarial alignment. We evaluate our method using metrics that measure trade-offs between invariance (insensitivity) and overall performance (utility), and show the benefits of our method in three settings (unsupervised, semi-supervised, and generalization).
ISBN (print): 9781665448994
AI City Challenge 2021 Task 5, Natural Language-Based Vehicle Tracking, is a natural language-based vehicle retrieval task that requires retrieving a single-camera track using a set of three natural language descriptions of the specific targets. In this paper, we present our methods for tackling the difficulties of this task. Experiments on the competition dataset from the AI City Challenge 2021 show that our techniques achieve a Mean Reciprocal Rank score of 0.1701 on the public test dataset and 0.1571 on the private test dataset.
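For readers unfamiliar with the metric, Mean Reciprocal Rank averages, over all queries, the reciprocal of the rank at which the correct item appears. The sketch below uses the standard definition; the track identifiers and rankings are made up for illustration and are not taken from the challenge data.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """ranked_lists[i] holds the ranked candidate ids for query i;
    targets[i] is the correct id for that query."""
    total = 0.0
    for ranking, target in zip(ranked_lists, targets):
        # Reciprocal of the 1-based rank of the target; 0 if it is missing.
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(targets)


# Hypothetical retrieval results for three text queries.
rankings = [["t3", "t1", "t2"], ["t2", "t3", "t1"], ["t1", "t2", "t3"]]
targets = ["t1", "t2", "t3"]
mrr = mean_reciprocal_rank(rankings, targets)  # (1/2 + 1/1 + 1/3) / 3
```

A score of 1.0 means the correct track is always ranked first; lower scores indicate that it tends to appear further down the ranking.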
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Recognizing risky human behavior in driving is an important visual recognition problem. In this paper, we propose a multi-view temporal action localization system based on grayscale video to achieve action recognition in naturalistic driving. Specifically, we adopt Swin Transformer as the feature extractor and use a single framework to detect boundaries and classes at the same time. We also improve multiple loss functions to impose explicit constraints on the embedded feature distributions. Our proposed framework achieves an overall F1-score of 0.3154 on the A2 dataset.
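The F1-score reported above is the harmonic mean of precision and recall. The sketch below shows the generic formula only; how the challenge matches predicted action segments to ground truth (and hence how true positives are counted) is not described in the abstract, and the counts used here are invented.

```python
def f1_score(true_positives, false_positives, false_negatives):
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted   # fraction of detections that are correct
    recall = true_positives / actual         # fraction of ground truth that is found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 correct detections, 2 spurious, 4 missed.
score = f1_score(6, 2, 4)
```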
ISBN (print): 9781665448994
Machine learning models have started to outperform medical experts in some classification tasks. Meanwhile, the question of how these classifiers produce certain results is attracting increasing research attention. Current interpretation methods provide a good starting point for investigating such questions, but they still largely lack a connection to the problem domain. In this work, we show how explanations of an AI system for skin image analysis can be made more domain-specific. We combine Local Interpretable Model-agnostic Explanations (LIME) with the ABCD rule, a diagnostic approach used by dermatologists, and present the results using a Deep Neural Network (DNN) based skin image classifier.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
We propose a learning-based image compression method that achieves an arbitrary input bitrate via user-guided bit allocation to preferred regions. We verify our hypothesis of incorporating user guidance for bitrate control by experimenting with alternatives that do not have any guidance. We conduct extensive evaluation on the CelebA-HQ and CityScapes datasets using standard quantitative metrics and human studies, showing that our single model for multiple bitrates achieves performance similar to or better than that of previous learned image compression methods that require re-training for each new bitrate.
ISBN (print): 9781665448994
In this paper, we propose an online movement-specific vehicle counting system to realize robust traffic flow analysis at crowded intersections. Our proposed framework adopts PP-YOLO as the vehicle detector and adapts the Deep-Sort algorithm to perform multi-object tracking. To realize online and robust vehicle counting, we further adopt a shape-based movement assignment strategy to differentiate movements, together with carefully designed spatial constraints to effectively reduce false-positive counts. Our proposed framework achieves an overall S1-score of 0.9467, ranking first in the AICITY2021-track1 challenge.