Combinatorial optimization problems can be mapped onto Ising models, and their ground state is generally difficult to find. Many heuristics for these problems have been proposed, and one promising approach is to u...
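A minimal sketch of the standard mapping this abstract refers to (general background, not this paper's heuristic): a small Max-Cut instance is encoded as an Ising Hamiltonian H(s) = -1/2 * s^T J s with couplings J_ij = -w_ij, whose ground state corresponds to a maximum cut. The toy graph and all names below are illustrative.

    # Map Max-Cut on a tiny weighted graph to an Ising model and find the
    # ground state by brute force (only feasible for a handful of spins).
    import itertools
    import numpy as np

    W = np.array([[0, 1, 2, 0],
                  [1, 0, 1, 3],
                  [2, 1, 0, 1],
                  [0, 3, 1, 0]], dtype=float)   # toy symmetric edge weights
    J = -W                                       # Max-Cut -> Ising coupling matrix

    def ising_energy(spins, J):
        # H(s) = -1/2 * s^T J s; the 1/2 corrects for double-counting each pair
        return -0.5 * spins @ J @ spins

    energy, ground_state = min(
        (ising_energy(np.array(s, dtype=float), J), s)
        for s in itertools.product([-1, 1], repeat=len(W)))
    print("ground-state energy:", energy, "spins:", ground_state)

For larger instances the exhaustive search above is replaced by heuristics such as simulated annealing, which is the setting the abstract addresses.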
Cross-Domain Few-Shot Learning (CD-FSL) is a recently emerging task that tackles few-shot learning across different domains. It aims at transferring prior knowledge learned on the source dataset to novel target datasets. The CD-FSL task is especially challenging because of the huge domain gap between different datasets. Critically, such a domain gap actually comes from changes of visual styles, and wave-SAN [10] empirically shows that spanning the style distribution of the source data helps alleviate this issue. However, wave-SAN simply swaps the styles of two images. Such a vanilla operation makes the generated styles “real” and “easy”, which still fall into the original set of source styles. Thus, inspired by vanilla adversarial learning, a novel model-agnostic meta Style Adversarial training (StyleAdv) method, together with a novel style adversarial attack method, is proposed for CD-FSL. Particularly, our style attack method synthesizes both “virtual” and “hard” adversarial styles for model training. This is achieved by perturbing the original style with the signed style gradients. By continually attacking styles and forcing the model to recognize these challenging adversarial styles, our model gradually becomes robust to visual styles, thus boosting the generalization ability to novel target datasets. Besides the typical CNN-based backbone, we also employ our StyleAdv method on a large-scale pre-trained vision transformer. Extensive experiments conducted on eight diverse target datasets show the effectiveness of our method. Whether built upon ResNet or ViT, we achieve the new state of the art for CD-FSL. Code is available at https://***/lovelyqian/StyleAdv-CDFSL.
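A minimal sketch of the style-attack idea as read from this abstract: an FGSM-like signed-gradient step on AdaIN-style channel statistics (mean and standard deviation). This is not the authors' released code; the function names and the epsilon value are assumptions.

    # Perturb channel-wise feature statistics with signed gradients to
    # synthesize "virtual" and "hard" adversarial styles.
    import torch
    import torch.nn.functional as F

    def style_adv_attack(feat, labels, classifier_head, epsilon=0.08):
        """feat: (B, C, H, W) backbone features; classifier_head maps pooled
        features to logits. All names and epsilon are illustrative."""
        content_mu = feat.mean(dim=(2, 3), keepdim=True)
        content_sigma = feat.std(dim=(2, 3), keepdim=True) + 1e-6
        normalized = (feat - content_mu) / content_sigma         # style-free content

        mu = content_mu.detach().clone().requires_grad_(True)    # attackable style
        sigma = content_sigma.detach().clone().requires_grad_(True)
        logits = classifier_head((normalized * sigma + mu).mean(dim=(2, 3)))
        F.cross_entropy(logits, labels).backward()

        # One signed-gradient ascent step on the style statistics.
        mu_adv = mu + epsilon * mu.grad.sign()
        sigma_adv = sigma + epsilon * sigma.grad.sign()
        return (normalized * sigma_adv + mu_adv).detach()

Training would then mix clean and adversarially restyled features, in line with the abstract's "continually attacking styles".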
Machine learning is needed to build artificial intelligence (AI), and this requires a large amount of training data. Sometimes, however, you cannot get enough high-quality training data. What’s more, to prevent an AI...
ISBN: (Print) 9781665438599
Recurrent neural networks with a gating mechanism such as an LSTM or GRU are powerful tools to model sequential data. In this mechanism, a forget gate, which was introduced to control information flow in a hidden state in the RNN, has recently been re-interpreted as a representative of the time scale of the state, i.e., a measure of how long the RNN retains information on inputs. On the basis of this interpretation, several parameter initialization methods that exploit prior knowledge on temporal dependencies in data have been proposed to improve learnability. However, the interpretation relies on various unrealistic assumptions, such as that there are no inputs after a certain time point. In this work, we reconsider this interpretation of the forget gate in a more realistic setting. We first generalize the existing theory on gated RNNs so that we can consider the case where inputs are successively given. We then argue that the interpretation of a forget gate as a temporal representation is valid when the gradient of the loss with respect to the state decreases exponentially as time goes back. We empirically demonstrate that existing RNNs satisfy this gradient condition at the initial training phase on several tasks, which is in good agreement with previous initialization methods. On the basis of this finding, we propose an approach to construct new RNNs that can represent a longer time scale than conventional models, which improves learnability for long-term sequential data. We verify the effectiveness of our method by experiments with real-world datasets.
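A minimal sketch of the time-scale interpretation the abstract builds on (general background, not the paper's construction): with a constant forget gate value f the cell state decays roughly like f^t, giving a characteristic time scale T of about 1/(1 - f), and initialization methods invert this by setting the forget-gate bias to log(T - 1) so that sigmoid(bias) = 1 - 1/T. The helper below targets PyTorch's nn.LSTM bias layout and is illustrative.

    import math
    import torch.nn as nn

    def init_forget_bias_for_timescale(lstm: nn.LSTM, T: float) -> nn.LSTM:
        """Set the forget-gate bias of a single-layer nn.LSTM so the initial
        forget gate corresponds to a time scale of roughly T steps."""
        hidden = lstm.hidden_size
        b_f = math.log(T - 1.0)                    # sigmoid(b_f) = 1 - 1/T
        for name, param in lstm.named_parameters():
            if name.startswith("bias"):            # bias_ih_l0 and bias_hh_l0
                # PyTorch orders the gates as (input, forget, cell, output);
                # the two bias vectors are summed, so split b_f between them.
                param.data[hidden:2 * hidden] = b_f / 2.0
        return lstm

    lstm = init_forget_bias_for_timescale(nn.LSTM(16, 32), T=100.0)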
With the continuing advances of sensing devices and IoT applications, database systems need to process data ingestion queries that update the sensor data frequently. To process data ingestion queries with transaction...
We propose a few-shot learning method for unsupervised feature selection, which is the task of selecting a subset of relevant features from unlabeled data. Existing methods usually require many instances for feature selectio...
ISBN: (Print) 9781713845393
The ratio of two probability densities, called a density-ratio, is a vital quantity in machine learning. In particular, a relative density-ratio, which is a bounded extension of the density-ratio, has received much attention due to its stability and has been used in various applications such as outlier detection and dataset comparison. Existing methods for (relative) density-ratio estimation (DRE) require many instances from both densities. However, sufficient instances are often unavailable in practice. In this paper, we propose a meta-learning method for relative DRE, which estimates the relative density-ratio from a few instances by using knowledge in related datasets. Specifically, given two datasets that consist of a few instances, our model extracts the datasets' information by using neural networks and uses it to obtain instance embeddings appropriate for the relative DRE. We model the relative density-ratio by a linear model on the embedded space, whose global optimum can be obtained in closed form. The closed-form solution enables fast and effective adaptation to a few instances, and its differentiability enables us to train our model such that the expected test error for relative DRE is explicitly minimized after adapting to a few instances. We empirically demonstrate the effectiveness of the proposed method on three problems: relative DRE, dataset comparison, and outlier detection.
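A minimal sketch of the closed-form fit the abstract relies on, in its classical non-meta-learned form (RuLSIF with fixed Gaussian kernels rather than learned embeddings; all names and hyperparameters are assumptions): the relative density-ratio r_alpha(x) = p(x) / (alpha p(x) + (1 - alpha) q(x)) is modeled as theta^T phi(x), and the regularized squared-loss fit gives theta = (H + lam I)^{-1} h with H = alpha E_p[phi phi^T] + (1 - alpha) E_q[phi phi^T] and h = E_p[phi].

    import numpy as np

    def gaussian_kernel(X, C, sigma):
        # Pairwise Gaussian kernel values between rows of X and centers C.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def rulsif_fit(Xp, Xq, alpha=0.5, sigma=1.0, lam=1e-3):
        C = Xp                                   # numerator samples as kernel centers
        Pp = gaussian_kernel(Xp, C, sigma)
        Pq = gaussian_kernel(Xq, C, sigma)
        H = alpha * Pp.T @ Pp / len(Xp) + (1 - alpha) * Pq.T @ Pq / len(Xq)
        h = Pp.mean(axis=0)
        theta = np.linalg.solve(H + lam * np.eye(len(C)), h)
        return lambda X: gaussian_kernel(X, C, sigma) @ theta

    rng = np.random.default_rng(0)
    r_hat = rulsif_fit(rng.normal(0, 1, (50, 2)), rng.normal(1, 1, (50, 2)))
    print(r_hat(np.zeros((1, 2))))               # estimated relative ratio at the origin

The proposed meta-learning model would replace the fixed kernel features with neural-network embeddings adapted per dataset pair, but the closed-form solve stays the same shape.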
In this paper, we propose a design method for secure sparse coding via multiple random unitary transforms (RUTs). The proposed method operates as an Encryption-then-Compression (EtC) system. The multiple RUTs will incr...
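A minimal sketch of the property such schemes build on (general background, not the proposed multi-transform design; treating the RNG seed as the secret key is an assumption): a random orthogonal transform Q encrypts a signal as y = Qx while preserving inner products and Euclidean distances, which is what lets sparse coding operate on the encrypted data.

    import numpy as np

    def random_unitary(d, seed=0):
        rng = np.random.default_rng(seed)       # the seed plays the role of the secret key
        Q, R = np.linalg.qr(rng.normal(size=(d, d)))
        return Q * np.sign(np.diag(R))          # sign fix keeps Q orthogonal and unique

    d = 64
    Q = random_unitary(d)
    x, y = np.random.default_rng(1).normal(size=(2, d))
    print(np.allclose(x @ y, (Q @ x) @ (Q @ y)))                           # inner products preserved
    print(np.allclose(np.linalg.norm(x - y), np.linalg.norm(Q @ (x - y)))) # distances preserved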