ISBN:
(Print) 9781937284961
The proceedings contain 224 papers. The topics discussed include: translation modeling with bidirectional recurrent neural networks; a neural network approach to selectional preference acquisition; learning image embeddings using convolutional neural networks for improved multi-modal semantics; identifying argumentative discourse structures in persuasive essays; policy learning for domain selection in an extensible multi-domain spoken dialogue system; a constituent-based approach to argument labeling with joint inference in discourse parsing; strongly incremental repair detection; semi-supervised Chinese word segmentation using partial-label learning with conditional random fields; and accurate word segmentation and POS tagging for Japanese microblogs: corpus annotation and joint modeling with lexical normalization.
ISBN:
(Print) 9781948087841
Poetry is one of the most beautiful forms of human language art. As a crucial step towards computer creativity, automatic poetry generation has drawn researchers' attention for decades. In recent years, some neural models have made remarkable progress in this task. However, they are all based on maximum likelihood estimation, which only learns common patterns of the corpus and results in loss-evaluation mismatch. Human experts evaluate poetry in terms of some specific criteria, instead of word-level likelihood. To handle this problem, we directly model the criteria and use them as explicit rewards to guide gradient update by reinforcement learning, so as to motivate the model to pursue higher scores. In addition, inspired by writing theories, we propose a novel mutual reinforcement learning schema. We simultaneously train two learners (generators) which learn not only from the teacher (rewarder) but also from each other to further improve performance. We experiment on Chinese poetry. Based on a strong basic model, our method achieves better results and outperforms the current state-of-the-art method.
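As a rough illustration of the training idea described above, the sketch below (Python/PyTorch, not the authors' code) updates two toy generators with REINFORCE, where the reward combines a placeholder criteria-based rewarder with a mutual term given by how likely each sample is under the peer generator. The names Generator, rewarder, sample_with_logprob and the 0.1 weighting are illustrative assumptions.

import torch
import torch.nn as nn

class Generator(nn.Module):
    """Toy autoregressive language model over a small vocabulary."""
    def __init__(self, vocab_size=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):                      # tokens: (B, T)
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)                          # logits: (B, T, V)

def sample_with_logprob(gen, length=20, bos=1):
    """Sample a sequence and accumulate its log-probability."""
    tokens = torch.full((1, 1), bos, dtype=torch.long)
    logprob = 0.0
    for _ in range(length):
        logits = gen(tokens)[:, -1, :]
        dist = torch.distributions.Categorical(logits=logits)
        nxt = dist.sample()
        logprob = logprob + dist.log_prob(nxt).squeeze()
        tokens = torch.cat([tokens, nxt.unsqueeze(1)], dim=1)
    return tokens, logprob

def rewarder(tokens):
    """Placeholder for the learned criteria-based reward (fluency, meaning, ...)."""
    return torch.rand(())                           # stand-in score in [0, 1]

gen_a, gen_b = Generator(), Generator()
opt_a = torch.optim.Adam(gen_a.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(gen_b.parameters(), lr=1e-3)

for step in range(10):
    for gen, opt, peer in [(gen_a, opt_a, gen_b), (gen_b, opt_b, gen_a)]:
        poem, logp = sample_with_logprob(gen)
        with torch.no_grad():
            # Teacher reward plus a mutual term: how plausible the peer
            # generator finds this sample (illustrative 0.1 weighting).
            peer_logits = peer(poem[:, :-1])
            peer_ll = torch.distributions.Categorical(
                logits=peer_logits).log_prob(poem[:, 1:]).mean()
            reward = rewarder(poem) + 0.1 * peer_ll
        loss = -reward * logp        # REINFORCE: push up expected reward
        opt.zero_grad()
        loss.backward()
        opt.step()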
For the WMT 2018 shared task of translating documents pertaining to the Biomedical domain, we developed a scoring formula that uses an unsophisticated and effective method of weighting term frequencies and was integra...
ISBN:
(Print) 9781948087841
In Visual Question Answering, most existing approaches adopt the pipeline of representing an image via pre-trained CNNs, and then using the uninterpretable CNN features in conjunction with the question to predict the answer. Although such end-to-end models might report promising performance, they rarely provide any insight, apart from the answer, into the VQA process. In this work, we propose to break up end-to-end VQA into two steps: explaining and reasoning, in an attempt towards a more explainable VQA by shedding light on the intermediate results between these two steps. To that end, we first extract attributes and generate descriptions as explanations for an image. Next, a reasoning module utilizes these explanations in place of the image to infer an answer. The advantages of such a breakdown include: (1) the attributes and captions reflect what the system extracts from the image and thus can provide some insight into the predicted answer; (2) these intermediate results can help identify failures of the image understanding or the answer inference component when the predicted answer is wrong. We conduct extensive experiments on a popular VQA dataset, and our system achieves performance comparable to the baselines, yet with the added benefits of explainability and the inherent ability to further improve with higher-quality explanations.
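A minimal sketch of the two-step pipeline, assuming placeholder components: attribute_detector, caption_generator, and reasoning_module below are stand-ins rather than the paper's models. The image is first converted into textual explanations, and only that text reaches the answering step, so the intermediate Explanation object can be inspected when diagnosing errors.

from dataclasses import dataclass
from typing import List

@dataclass
class Explanation:
    attributes: List[str]   # e.g. ["dog", "red", "frisbee"]
    captions: List[str]     # e.g. ["a dog catches a red frisbee on the grass"]

def attribute_detector(image) -> List[str]:
    # Placeholder: a real system would use a multi-label visual classifier.
    return ["dog", "red", "frisbee", "grass"]

def caption_generator(image, n: int = 1) -> List[str]:
    # Placeholder: a real system would use an image captioning model.
    return ["a dog catches a red frisbee on the grass"][:n]

def reasoning_module(question: str, context: str) -> str:
    # Placeholder: a real system would use a text-only QA / inference model.
    return "red" if "color" in question.lower() and "red" in context else "unknown"

def explain(image) -> Explanation:
    """Step 1: turn the image into human-readable intermediate results."""
    return Explanation(attribute_detector(image), caption_generator(image, n=3))

def answer(question: str, expl: Explanation) -> str:
    """Step 2: reason over the textual explanation only; the image is no longer used."""
    context = " ".join(expl.captions + expl.attributes)
    return reasoning_module(question, context)

expl = explain(image=None)                  # intermediate results are inspectable
print(expl.attributes, expl.captions)
print(answer("What color is the frisbee?", expl))   # -> "red"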
We describe LMU Munich’s unsupervised machine translation systems for English↔German translation. These systems were used to participate in the WMT18 news translation shared task and more specifically, for the unsupe...
ISBN:
(Print) 9781450356961
Research must be reproducible in order to make an impact on science and to contribute to the body of knowledge in our field. Yet studies have shown that 70% of research from academic labs cannot be reproduced. In software engineering, and more specifically requirements engineering (RE), reproducible research is rare, with datasets not always available or methods not fully described. This lack of reproducible research hinders progress, with researchers having to replicate an experiment from scratch. A researcher starting out in RE has to sift through conference papers, finding ones that are empirical, and then must look through the data available from each empirical paper (if any) to make a preliminary determination of whether the paper can be reproduced. This paper addresses two parts of that problem: identifying RE papers and identifying empirical papers within the RE papers. Recent RE and empirical conference papers were used to learn features and to build an automatic classifier to identify RE and empirical papers. We introduce the Empirical Requirements Research Classifier (ERRC) method, which uses natural language processing and machine learning to perform supervised classification of conference papers. We compare our method to a baseline keyword-based approach. To evaluate our approach, we examine sets of papers from the IEEE Requirements Engineering conference and the IEEE International Symposium on Software Testing and Analysis. We found that the ERRC method performed better than the baseline method in all but a few cases.
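The contrast between the keyword baseline and the supervised classifier can be sketched as follows, assuming scikit-learn; the keyword list, toy training abstracts, and labels are illustrative placeholders rather than the ERRC feature set.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

EMPIRICAL_KEYWORDS = {"case study", "experiment", "survey", "evaluation"}

def keyword_baseline(abstract: str) -> bool:
    """Baseline: flag a paper as empirical if any keyword appears."""
    text = abstract.lower()
    return any(kw in text for kw in EMPIRICAL_KEYWORDS)

# Supervised alternative: learn features from labeled abstracts instead of
# relying on a fixed keyword list (toy data, two examples only).
train_texts = [
    "we conducted a controlled experiment with 40 practitioners ...",
    "we propose a vision for future requirements engineering tools ...",
]
train_labels = [1, 0]  # 1 = empirical, 0 = non-empirical

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)

new_abstract = "a survey of 120 developers about requirements elicitation"
print(keyword_baseline(new_abstract), clf.predict([new_abstract])[0])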
ISBN:
(Digital) 9781510623118
ISBN:
(Print) 9781510623118
Automatically generating rich natural language descriptions for open-domain videos is among the most challenging tasks of computer vision, natural language processing, and machine learning. Based on the general encoder-decoder framework, we propose a bidirectional long short-term memory network with spatial-temporal attention built on multiple features of objects, activities, and scenes, which can learn valuable and complementary high-level visual representations and dynamically focus on the most important context information across diverse frames within different subsets of videos. Experimental results show that our proposed method achieves performance competitive with or better than the state of the art on the MSVD video dataset.
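A condensed sketch of the encoder-decoder idea, not the authors' implementation: fused per-frame features are encoded with a bidirectional LSTM, and the decoder applies temporal attention over the frames at each word step. Dimensions, module names, and the single fused feature input are simplifying assumptions.

import torch
import torch.nn as nn

class BiLSTMAttnCaptioner(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512, vocab=10000):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.attn = nn.Linear(2 * hidden + hidden, 1)   # additive attention score
        self.embed = nn.Embedding(vocab, hidden)
        self.decoder = nn.LSTMCell(hidden + 2 * hidden, hidden)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, frame_feats, captions):
        # frame_feats: (B, T_frames, feat_dim), captions: (B, T_words)
        enc, _ = self.encoder(frame_feats)               # (B, T_frames, 2H)
        B, T, _ = enc.shape
        h = enc.new_zeros(B, self.decoder.hidden_size)
        c = enc.new_zeros(B, self.decoder.hidden_size)
        logits = []
        for t in range(captions.size(1)):
            # Temporal attention: score each frame against the decoder state.
            scores = self.attn(torch.cat(
                [enc, h.unsqueeze(1).expand(-1, T, -1)], dim=-1))  # (B, T, 1)
            alpha = torch.softmax(scores, dim=1)
            context = (alpha * enc).sum(dim=1)            # (B, 2H)
            word = self.embed(captions[:, t])             # (B, H)
            h, c = self.decoder(torch.cat([word, context], dim=-1), (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                 # (B, T_words, vocab)

# Example: 2 videos, 30 frames each, 5-word captions (teacher forcing).
model = BiLSTMAttnCaptioner()
feats = torch.randn(2, 30, 2048)
caps = torch.randint(0, 10000, (2, 5))
print(model(feats, caps).shape)  # torch.Size([2, 5, 10000])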
PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain. ...
ISBN:
(Print) 9781713829546
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences. Sign videos consist of continuous sequences of sign gestures with no clear boundaries in between. Existing SLT models usually represent sign visual features in a frame-wise manner so as to avoid explicitly segmenting the videos into isolated signs. However, these methods neglect the temporal information of signs and lead to substantial ambiguity in translation. In this paper, we explore the temporal semantic structures of sign videos to learn more discriminative features. To this end, we first present a novel sign video segment representation which takes into account multiple temporal granularities, thus alleviating the need for accurate video segmentation. Taking advantage of the proposed segment representation, we develop a novel hierarchical sign video feature learning method via a temporal semantic pyramid network, called TSPNet. Specifically, TSPNet introduces an inter-scale attention to evaluate and enhance local semantic consistency of sign segments and an intra-scale attention to resolve semantic ambiguity by using non-local video context. Experiments show that our TSPNet outperforms the state of the art with significant improvements on the BLEU score (from 9.58 to 13.41) and the ROUGE score (from 31.80 to 34.96) on the largest commonly used SLT dataset.
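The pyramid idea can be sketched roughly as follows (this is not the released TSPNet code): segment features are pooled at several temporal window sizes, self-attention within each scale supplies non-local video context, and an attention step over the concatenated scales reinforces local semantic consistency across granularities before a translation decoder. Window sizes, dimensions, and module names are assumptions.

import torch
import torch.nn as nn

class SegmentPyramid(nn.Module):
    def __init__(self, dim=512, windows=(8, 12, 16), heads=8):
        super().__init__()
        # Pool frame features into overlapping segments at each window size.
        self.pools = nn.ModuleList(
            [nn.AvgPool1d(kernel_size=w, stride=w // 2) for w in windows])
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, frame_feats):
        # frame_feats: (B, T, dim) per-frame visual features
        scales = []
        for pool in self.pools:
            seg = pool(frame_feats.transpose(1, 2)).transpose(1, 2)  # (B, S, dim)
            # Within-scale attention over all segments of this granularity
            # (non-local video context).
            seg, _ = self.intra(seg, seg, seg)
            scales.append(seg)
        pyramid = torch.cat(scales, dim=1)        # segments from all scales
        # Cross-scale attention: each segment attends over segments of other
        # granularities covering the same content.
        fused, _ = self.inter(pyramid, pyramid, pyramid)
        return fused                              # fed to a translation decoder (not shown)

frames = torch.randn(2, 64, 512)
print(SegmentPyramid()(frames).shape)  # torch.Size([2, 31, 512]) with these windows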