ISBN: (digital) 9781665490627
ISBN: (print) 9781665490627
Branch-and-bound is a systematic enumerative method for combinatorial optimization whose performance relies heavily on the variable selection strategy. State-of-the-art handcrafted heuristic strategies suffer from relatively slow inference time for each selection, while current machine learning methods require a significant amount of labeled data. We propose a new approach that addresses the data labeling and inference latency issues in combinatorial optimization using the reinforcement learning (RL) paradigm. We use imitation learning to bootstrap an RL agent and then use Proximal Policy Optimization (PPO) to further explore globally optimal actions. Then, a value network is used to run Monte Carlo tree search (MCTS) to enhance the policy network. We evaluate our method on four different categories of combinatorial optimization problems and show that it performs strongly compared to state-of-the-art machine learning and heuristics-based methods.
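The abstract above does not give the loss used for the PPO stage. As a rough illustration only, the standard PPO clipped surrogate objective can be sketched as follows; the function name, the batch representation as plain lists, and the default `eps=0.2` are assumptions made for this example, not details from the paper.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch.

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    eps:        clipping range (0.2 is a common default, assumed here)
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, r))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)
```

In a branching setting, each "action" would correspond to choosing a variable to branch on; the clipping keeps the fine-tuned policy from drifting too far from the imitation-learned starting point in a single update.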
Data augmentation is a technique to improve the generalization ability of machine learning methods by increasing the size of the dataset. However, since not every augmentation method is equally effective for every dataset, an appropriate method must be selected carefully. We propose a neural network that dynamically selects the best combination of data augmentation methods using a Gating Network and a mutually beneficial feature consistency loss. The Gating Network controls how much of each data augmentation is used for the representation within the network. The feature consistency loss imposes the constraint that augmented features from the same input pattern should be similar. In the experiments, we demonstrate the effectiveness of the proposed method on the 12 largest time-series datasets from the 2018 UCR Time Series Archive and reveal the relationships between the data augmentation methods through analysis of the proposed method.
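The two ingredients described above can be illustrated with a minimal sketch. It assumes the gate produces one logit per augmentation, that gating is a softmax-weighted sum of feature vectors, and that consistency is measured as mean squared distance from the centroid; none of these specifics are stated in the abstract, and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def gated_mix(features, gate_logits):
    """Combine per-augmentation feature vectors with softmax gate weights.

    features:    one feature vector (list of floats) per augmentation
    gate_logits: one gate score per augmentation (assumed gate output)
    """
    weights = softmax(gate_logits)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def consistency_loss(features):
    """Mean squared distance of each feature vector from the centroid.

    Small values mean the augmented views of one input agree with each other.
    """
    n, dim = len(features), len(features[0])
    centroid = [sum(f[i] for f in features) / n for i in range(dim)]
    return sum((f[i] - centroid[i]) ** 2 for f in features for i in range(dim)) / n
```

With equal gate logits every augmentation contributes equally; raising one logit shifts the mixed representation toward that augmentation's features.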
Machine learning models that can generalize to unseen domains are essential when applied in real-world scenarios involving strong domain shifts. We address the challenging domain generalization (DG) problem, where a model trained on a set of source domains is expected to generalize well in unseen domains without any exposure to their data. The main challenge of DG is that the features learned from the source domains are not necessarily present in the unseen target domains, leading to performance deterioration. We assume that learning a richer set of features is crucial to improve the transfer to a wider set of unknown domains. For this reason, we propose COLUMBUS, a method that enforces new feature discovery via a targeted corruption of the most relevant input and multi-level representations of the data. We conduct an extensive empirical evaluation to demonstrate the effectiveness of the proposed approach, which achieves new state-of-the-art results by outperforming 18 DG algorithms on multiple DG benchmark datasets in the DOMAINBED framework.
A major challenge encountered in the offline evaluation of machine learning models before they are released to production is the discrepancy between the distributions of the offline test data and of the online data, due to, e.g., a biased sampling scheme, data aging issues and occurrence(s) of regime shift. Consequently, the offline evaluation metrics often do not reflect the actual performance of the model online. In this paper, we propose online adaptive metrics, a computationally efficient method which re-weights the offline metrics based on calculating the joint distributions of the model hypotheses over the offline test data vs. the online data. It provides offline metrics which estimate the production performance of the model by taking into account the test data biases. The proposed method is demonstrated by real-life examples on text classification and a commercial natural language understanding system. We show that the online adaptive metrics can provide accurate predictions of online recall and precision even with a small test dataset.
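One simple way to picture the re-weighting idea is importance weighting by the model's predicted label: each offline test example is weighted by how much more (or less) often its predicted label occurs online than offline. The sketch below makes that assumption explicitly; the abstract does not specify the exact weighting scheme, and the function name and data layout are hypothetical.

```python
from collections import Counter


def reweighted_precision_recall(offline, online_preds, positive):
    """Estimate online precision/recall from labeled offline data.

    offline:      list of (predicted_label, true_label) pairs from the test set
    online_preds: list of predicted labels observed in production (unlabeled)
    positive:     the label treated as the positive class
    """
    off_counts = Counter(pred for pred, _ in offline)
    on_counts = Counter(online_preds)
    n_off, n_on = len(offline), len(online_preds)

    tp = fp = fn = 0.0
    for pred, true in offline:
        # Importance weight: online frequency of this prediction / offline frequency.
        w = (on_counts[pred] / n_on) / (off_counts[pred] / n_off)
        if pred == positive and true == positive:
            tp += w
        elif pred == positive:
            fp += w
        elif true == positive:
            fn += w

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

When the online and offline prediction distributions match, every weight is 1 and the function returns the ordinary offline precision and recall.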
Over the past two decades, machine learning and deep learning techniques for forecasting solar flares have had great impact due to their ability to learn from a high-dimensional data space. However, the lack of high-quality data from flaring phenomena is a constraining factor for such tasks. One method to tackle this complex problem is to train classifiers on multivariate time series of magnetic field parameters. In this work, we compare popular deep learning-based multivariate time series classifiers with commonly used machine learning classifiers (i.e., SVM). We explore the role of data augmentation in time series-oriented flare prediction techniques, specifically the deep learning-based ones. We use four time series data augmentation techniques and couple them with selected multivariate time series classifiers to understand how each of them affects the outcome. We show that the deep learning algorithms as well as the augmentation techniques improve our classifiers' performance. After augmentation, the resulting classifiers outperformed the traditional flare forecasting techniques.
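The abstract does not name the four augmentation techniques. Purely for illustration, the sketch below shows two augmentations that are common for multivariate time series, jittering (additive Gaussian noise) and scaling (one random factor per channel); the choice of these two, the function names, and the default noise levels are all assumptions.

```python
import random


def jitter(series, sigma=0.05, rng=None):
    """Add independent Gaussian noise to every time step of every channel.

    series: list of channels, each a list of floats (channels x time steps)
    """
    rng = rng or random.Random()
    return [[x + rng.gauss(0.0, sigma) for x in channel] for channel in series]


def scale(series, sigma=0.1, rng=None):
    """Multiply each channel by one random factor drawn around 1.0."""
    rng = rng or random.Random()
    out = []
    for channel in series:
        factor = rng.gauss(1.0, sigma)
        out.append([x * factor for x in channel])
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the augmented copies reproducible, which helps when comparing classifiers trained with and without augmentation.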
In the fields of machine learning and data mining, unsupervised feature selection plays an important role in processing large amounts of high-dimensional unlabeled data. This paper proposes a novel unsupervised feature selection method based on feature grouping and orthogonal constraints. We consider the domain relationship in the original data and reconstruct the similarity matrix based on the correlation between the features. We use a generalized incoherent regression model based on orthogonal constraints. Furthermore, a graph regularization term with local structure preservation constraints is added to ensure that the feature subset does not lose local structural features in the original data space. In addition, an iterative algorithm is proposed to solve the optimization problem by iteratively updating the global similarity matrix and constructing the weight matrix, pseudo-label matrix and transformation matrix. In experiments on 6 benchmark datasets, the clustering performance of the proposed method exceeds that of state-of-the-art unsupervised feature selection methods. The source code is available at: https://***/misteru/FGOC.
In this paper, we propose a Neural Architecture Search strategy based on self-supervision and semi-supervised learning for the task of semantic segmentation. Our approach builds an optimized neural network (NN) model for this task by solving a jigsaw pretext problem identified by self-supervised learning over unlabeled training data, and by leveraging the structure of the unlabeled data with semi-supervised learning. Dynamic routing with a gradient descent approach is used to find the architecture of the NN model. Experiments on the Cityscapes and PASCAL VOC 2012 datasets show that the found neural network is four times more efficient than a state-of-the-art hand-designed NN model in terms of floating-point operations.
Gait recognition has been greatly improved by deep learning and can achieve relatively high accuracy. These advances depend on the size of the gait data. However, due to public concerns about privacy and the regulations and laws of different countries, it is very difficult, if not impossible, to collect a huge centralized gait database for algorithm training. Federated learning is a privacy-preserving distributed machine learning technique that can help to solve the problem. We propose a federated gait recognition benchmark, FedGait, to train algorithms using distributed gait data. To the best of our knowledge, it is the first such benchmark on gait recognition. FedGait can utilize the gait videos available on multiple clients to learn a robust and generalized model. Based on real-world gait scenarios, we introduce two federated gait recognition scenarios: the institution-based scenario (IBS) and the device-based scenario (DBS). Compared with centralized training, federated learning encounters more serious data heterogeneity and data imbalance problems. Four popular databases, CASIA-B, CASIA-E, ReSGait and OU-MVLP, are used in FedGait to investigate these problems in federated learning. We hope FedGait is a good start toward solving the data privacy problem in gait recognition.
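The abstract does not say which aggregation rule FedGait uses. As a generic illustration of how federated learning combines client models without moving raw gait data, the sketch below implements the widely used federated averaging step, in which the server averages client parameters weighted by local dataset size; the function name and the flat-list parameter representation are assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Size-weighted average of client model parameters.

    client_weights: one flat parameter vector (list of floats) per client
    client_sizes:   number of local training samples on each client
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]
```

Weighting by dataset size matters under the data imbalance the abstract mentions: a client holding three times as many samples pulls the global model three times as strongly.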
Machine learning-based classification algorithms typically operate under the assumptions that the underlying data-generating distribution is stationary and draws from a finite set of categories. In some scenarios, these assumptions might not hold, but identifying violating inputs, here referred to as anomalies, is a challenging task. Recent publications propose deep learning-based approaches that perform anomaly detection and classification jointly by (implicitly) learning a mapping that projects data points to a lower-dimensional space, such that the images of points of one class reside inside a hypersphere, while others are mapped outside of it. In this work, we propose Multi-Class Hypersphere Anomaly Detection (MCHAD), a new hypersphere learning algorithm for anomaly detection in classification settings, as well as a generalization of existing hypersphere learning methods that allows incorporating example anomalies into the training. Extensive experiments on competitive benchmark tasks, as well as theoretical arguments, provide evidence for the effectiveness of our method. Our code is publicly available.
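The inference side of hypersphere-based methods can be sketched in a few lines: an embedded point is assigned to the class whose center is nearest, and flagged as an anomaly when it falls outside that class's hypersphere. This is a generic illustration, not the MCHAD training objective; the shared radius, the function names, and the use of `None` for anomalies are assumptions.

```python
import math


def nearest_center(z, centers):
    """Return (label, distance) of the class center closest to embedding z.

    z:       embedded point (sequence of floats)
    centers: dict mapping class label -> center coordinates
    """
    dists = {label: math.dist(z, c) for label, c in centers.items()}
    label = min(dists, key=dists.get)
    return label, dists[label]


def classify_or_flag(z, centers, radius):
    """Classify z, or return None as the label if it lies outside every hypersphere.

    The distance to the nearest center doubles as an anomaly score.
    """
    label, d = nearest_center(z, centers)
    return (label if d <= radius else None), d
```

For example, with centers `{"cat": (0.0, 0.0), "dog": (10.0, 0.0)}` and `radius=2.0`, a point near the origin is labeled `"cat"`, while a point midway between the centers gets the label `None`.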
Data augmentation has been a prevalent approach for improving the robustness of deep learning models to slight variations in data. Adversarial learning is one such form of data augmentation. In this work, we introduce a framework to generate harder examples for a specific object class and an adversarial attack for the object detection task. We also present our study on the effect of training against such generated harder examples and adversarial samples in object detection. We applied this adversarial learning technique to a YOLOv3 model and, due to the nature of the attack, demonstrated a substantial improvement in average precision (AP) for a single class of the COCO dataset. To the best of our knowledge, we are the first to introduce this kind of class-specific data augmentation strategy in object detection. With our approach, we show an improvement of 23.34% in AP for the Cat class and 3.1% in overall mAP of the YOLOv3 model on clean validation data, and a 43.5% improvement in AP for the Cat class on composite images with class-specific adversarial samples.