Multilingual neural machine translation has shown the capability of directly translating between language pairs unseen in training, i.e. zero-shot translation. Despite being conceptually attractive, it often suffers f...
详细信息
Symbolic regression corresponds to an ensemble of techniques that allow to uncover an analytical equation from data. Through a closed form formula, these techniques provide great advantages such as potential scientifi...
详细信息
With the vast development and employment of artificial intelligence applications, research into the fairness of these algorithms has been increased. Specifically, in the natural language processing domain, it has been...
详细信息
Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). While its primary aim is to model, play, and analyse the full range of traditional strategy games, Ludii also has ...
Expert Iteration (ExIt) is an effective framework for learning game-playing policies from self-play. ExIt involves training a policy to mimic the search behaviour of a tree search algorithm -- such as Monte-Carlo tree...
详细信息
ISBN:
(数字)9781728145334
ISBN:
(纸本)9781728145341
Expert Iteration (ExIt) is an effective framework for learning game-playing policies from self-play. ExIt involves training a policy to mimic the search behaviour of a tree search algorithm -- such as Monte-Carlo tree search -- and using the trained policy to guide it. The policy and the tree search can then iteratively improve each other, through experience gathered in self-play between instances of the guided tree search algorithm. This paper outlines three different approaches for manipulating the distribution of data collected from self-play, and the procedure that samples batches for learning updates from the collected data. Firstly, samples in batches are weighted based on the durations of the episodes in which they were originally experienced. Secondly, Prioritized Experience Replay is applied within the ExIt framework, to prioritise sampling experience from which we expect to obtain valuable training signals. Thirdly, a trained exploratory policy is used to diversify the trajectories experienced in self-play. This paper summarises the effects of these manipulations on training performance evaluated in fourteen different board games. We find major improvements in early training performance in some games, and minor improvements averaged over fourteen games.
As of February 2016 Facebook allows users to express their experienced emotions about a post by using five so-called 'reactions'. This research paper proposes and evaluates alternative methods for predicting t...
详细信息
Designing a logo is a long, complicated, and expensive process for any designer. However, recent advancements in generative algorithms provide models that could offer a possible solution. Logos are multi-modal, have v...
详细信息
This paper investigates the performance of different general-game-playing heuristics for games in the Ludii general game system. Based on these results, we train several regression learning models to predict the perfo...
详细信息
Domains such as logo synthesis, in which the data has a high degree of multi-modality, still pose a challenge for generative adversarial networks (GANs). Recent research shows that progressive training (ProGAN) and ma...
详细信息
Domains such as logo synthesis, in which the data has a high degree of multi-modality, still pose a challenge for generative adversarial networks (GANs). Recent research shows that progressive training (ProGAN) and mapping network extensions (StyleGAN) enable both increased training stability for higher dimensional problems and better feature separation within the embedded latent space. However, these architectures leave limited control over shaping the output of the network. This paper explores a conditional extension to the StyleGAN architecture with the aim of firstly, improving on the low resolution results of previous research and, secondly, increasing the controllability of the output through the use of synthetic class-conditions. Furthermore, methods of extracting such class conditions are explored, where the challenge lies in the fact that, visual logo characteristics are hard to define. The introduced conditional style-based generator architecture is trained on the extracted class-conditions in two experiments and studied relative to the performance of an unconditional model. Results show that, whilst the unconditional model more closely matches the training distribution, high quality conditions enabled the embedding of finer details onto the latent space, leading to more diverse output.
Many games often share common ideas or aspects between them, such as their rules, controls, or playing area. However, in the context of General Game Playing (GGP) for board games, this area remains under-explored. We ...
详细信息
暂无评论