This paper presents a novel multi-perspective document revision task. In conventional studies on document revision, tasks such as grammatical error correction, sentence reordering, and discourse relation classificatio...
We describe our initiatives in creating smart cities using digital twin (DT) technology. In these Feature Articles on Urban DTC for Creating Optimized Smart Cities Attentive to the Individual, we first outline the con...
To reduce the environmental burden, it is necessary to adjust the supply and demand of electricity so that the proportion of renewable energy is increased. For solar-power generation, which is widely used, total solar...
Machine learning is needed to build artificial intelligence (AI), and this requires a large amount of training data. Sometimes, however, you cannot get enough high-quality training data. What’s more, to prevent an AI...
We propose a method for next-speaker prediction, the task of predicting which of multiple current listeners will speak in the next turn, in multi-party video conversation. Previous studies used non-verbal features, such as head movements and gaze behavior, for next-speaker prediction in face-to-face conversation. However, in video conversation, these non-verbal features are vague and ineffective because participants look at a screen displaying the other participants. Because non-verbal features carry participant-specific characteristics, training data with rich combinations of participants are needed to predict the next speaker robustly. Previous studies used training data with a limited number of participant combinations because the data consisted only of recordings. The proposed method therefore uses 1) novel non-verbal features for next-speaker prediction in video conversation, specifically facial expressions, hand movements, and speech segments, and 2) data augmentation of participant combinations in the training data. We conducted experiments to evaluate the proposed method, and the results on video-conversation data indicate its effectiveness.
This study proposes introducing facial-expression synchrony features to machine learning to estimate a customer’s psychological information from online business negotiation dialogue data. Synchrony features should model the lead-lag structure, that is, who led the synchrony and who followed it, because the psychology of the leader and the follower can differ. However, conventional synchrony models cannot incorporate lead-lag structure information because they assume that synchrony is the co-occurrence of features in the same frame. To solve this problem, we propose using synchrony features extracted on the basis of windowed time-lagged cross-correlation, which cuts out a short segment from each of the input sequences and computes the cross-correlation between the segments. Since this method measures the similarity of signals across different frames, it is suitable for modeling the lead-lag structure. We conducted experiments on an audiovisual corpus of business negotiation dialogue assessed with various psychological measurements. The results indicate that considering lead-lag information can improve the accuracy of estimating psychological information.
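The windowed time-lagged cross-correlation described above can be illustrated with a minimal sketch. The function name, the window length, the step, and the lag range are assumptions made for this example, and the synthetic signals stand in for facial-expression features; this is not the authors' implementation.

```python
import numpy as np


def windowed_lagged_xcorr(x, y, window, step, max_lag):
    """Pearson correlation between x[t] and y[t + lag] inside sliding windows.

    Returns an array of shape (n_windows, 2 * max_lag + 1); column j holds
    lag = j - max_lag. A peak at a positive lag means y follows x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = range(-max_lag, max_lag + 1)
    # Start far enough from both ends that every shifted window fits.
    starts = range(max_lag, len(x) - window - max_lag + 1, step)
    out = np.zeros((len(starts), 2 * max_lag + 1))
    for i, s in enumerate(starts):
        seg_x = x[s:s + window]
        for j, lag in enumerate(lags):
            seg_y = y[s + lag:s + lag + window]
            # Correlation is undefined for constant segments; leave 0 there.
            if seg_x.std() == 0 or seg_y.std() == 0:
                continue
            out[i, j] = np.corrcoef(seg_x, seg_y)[0, 1]
    return out


# Hypothetical data: y repeats x three frames later, so x leads and y follows.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.roll(x, 3)

corr = windowed_lagged_xcorr(x, y, window=30, step=10, max_lag=5)
peak_lag = int(np.argmax(corr.mean(axis=0))) - 5
print(peak_lag)  # 3: the follower lags the leader by three frames
```

A same-frame correlation corresponds to the single column at lag 0, which is why it cannot tell the leader from the follower; the sign of the peak lag is what carries that information.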
ISBN: (print) 9781665438599
Recurrent neural networks with a gating mechanism such as an LSTM or GRU are powerful tools to model sequential data. In the mechanism, a forget gate, which was introduced to control information flow in a hidden state in the RNN, has recently been re-interpreted as representing the time scale of the state, i.e., a measure of how long the RNN retains information on inputs. On the basis of this interpretation, several parameter initialization methods that exploit prior knowledge on temporal dependencies in data have been proposed to improve learnability. However, the interpretation relies on various unrealistic assumptions, such as that there are no inputs after a certain time point. In this work, we reconsider this interpretation of the forget gate in a more realistic setting. We first generalize the existing theory on gated RNNs so that we can consider the case where inputs are successively given. We then argue that the interpretation of a forget gate as a temporal representation is valid when the gradient of the loss with respect to the state decreases exponentially going back in time. We empirically demonstrate that existing RNNs satisfy this gradient condition at the initial training phase on several tasks, which is in good agreement with previous initialization methods. On the basis of this finding, we propose an approach to construct new RNNs that can represent a longer time scale than conventional models, which will improve the learnability for long-term sequential data. We verify the effectiveness of our method by experiments with real-world datasets.
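The time-scale reading of the forget gate can be made concrete with a small sketch of the simplified setting that the abstract calls unrealistic: a constant forget gate f and no inputs after time 0, so the state follows c_t = f * c_{t-1}. The function names are hypothetical, and the sketch does not cover the generalized theory or the new RNNs proposed in the paper.

```python
import math


def time_scale(forget_gate):
    """Steps until a state scaled by a constant forget gate decays to 1/e."""
    if not 0.0 < forget_gate < 1.0:
        raise ValueError("forget_gate must lie strictly between 0 and 1")
    return -1.0 / math.log(forget_gate)


def decay(c0, forget_gate, steps):
    """Apply c_t = f * c_{t-1} with no new input for the given number of steps."""
    c = c0
    for _ in range(steps):
        c *= forget_gate
    return c


for f in (0.9, 0.99):
    T = time_scale(f)
    remaining = decay(1.0, f, round(T))
    print(f"f={f}: time scale {T:.1f} steps, state after {round(T)} steps {remaining:.3f}")
```

Under these assumptions, a gate value closer to 1 gives a longer time scale (about 9.5 steps for f = 0.9 and about 99.5 steps for f = 0.99), which is the intuition behind initializing gate parameters from prior knowledge of temporal dependencies.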
ISBN: (print) 9781713871088
We propose a few-shot learning method for feature selection that can select relevant features given a small number of labeled instances. Existing methods require many labeled instances for accurate feature selection. However, sufficient instances are often unavailable. We use labeled instances in multiple related tasks to alleviate the lack of labeled instances in a target task. To measure the dependency between each feature and label, we use the Hilbert-Schmidt Independence Criterion, which is a kernel-based independence measure. By modeling the kernel functions with neural networks that take a few labeled instances in a task as input, we can encode task-specific information into the kernels so that the kernels are appropriate for the task. Feature selection with such kernels is performed by using iterative optimization methods, in which each update step is obtained in closed form. This formulation enables us to directly and efficiently minimize the expected test error on features selected by a small number of labeled instances. We experimentally demonstrate that the proposed method outperforms existing feature selection methods.
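As a rough illustration of the dependency measure named above, the following sketch computes the standard biased empirical estimate of the Hilbert-Schmidt Independence Criterion and uses it to score features one by one. The fixed-bandwidth Gaussian kernel, the per-feature scoring, and the synthetic data are assumptions made for this example; the paper instead models the kernels with neural networks and selects features by iterative optimization, neither of which is reproduced here.

```python
import numpy as np


def gaussian_kernel(v, bandwidth):
    """Gram matrix of a Gaussian kernel over the rows of v."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    sq_dists = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def hsic(x, y, bandwidth=1.0):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    K = gaussian_kernel(x, bandwidth)
    L = gaussian_kernel(y, bandwidth)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


# Hypothetical task: the label depends on feature 0 only.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = X[:, 0] + 0.1 * rng.standard_normal(200)

scores = [hsic(X[:, j], y) for j in range(X.shape[1])]
best = int(np.argmax(scores))
print(best)  # 0: the feature that generated the label scores highest
```

With only a few labeled instances these estimates become noisy, which is the gap the abstract's task-adapted kernels are meant to address.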
Many neural network-based out-of-distribution (OoD) detection methods have been proposed. However, they require a large amount of training data for each target task. We propose a simple yet effective meta-learning method to detect...