ISBN: 9781479952106 (print)
In this paper, we address the problem of recognizing group activities that include interactions between human objects based on their motion trajectory analysis. In order to resolve the complexity and ambiguity problems caused by a large number of human objects, we propose a Group Interaction Zone (GIZ) to detect meaningful groups in a scene so as to be robust against noisy information. Two novel features, the Group Interaction Energy feature and the Attraction and Repulsion Features, are proposed to better describe group activities within a GIZ. We demonstrate the effectiveness of our method in comparison with other methods on the public BEHAVE dataset.
In this paper, we propose an improved graph model for Chinese spell checking. The model is based on a graph model for generic errors and two independently trained models for specific errors. First, a graph model repres...
We propose a simple and effective approach to learn translation spans for the hierarchical phrase-based translation model. Our model evaluates if a source span should be covered by translation rules during decoding, w...
ISBN: 9781479928941 (print)
Determination of pitch in noise is challenging because of corrupted harmonic structure. In this paper, we extract pitch using supervised learning, where probabilistic pitch states are directly learned from noisy speech. We investigate two alternative neural networks for modeling the pitch states given the observations. The first is a feedforward deep neural network (DNN), which is trained on static frame-level features. The second is a recurrent deep neural network (RNN), which is trained on sequential frame-level features and is capable of learning temporal dynamics. Both DNNs and RNNs produce accurate probabilistic outputs of pitch states, which are then connected into pitch contours by Viterbi decoding. Our systematic evaluation shows that the proposed pitch tracking approaches are robust to different noise conditions and significantly outperform current state-of-the-art pitch tracking techniques.
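The last step described above, connecting frame-level pitch-state probabilities into a contour with Viterbi decoding, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `viterbi_decode`, the use of raw network posteriors as per-frame scores, the uniform initial distribution, and the toy transition matrix are all assumptions made for illustration.

```python
import numpy as np


def viterbi_decode(log_post, log_trans, log_init=None):
    """Return the highest-scoring state sequence.

    log_post  : (T, S) array, log score of each pitch state per frame
                (assumed here to be the network's log posteriors).
    log_trans : (S, S) array, log_trans[i, j] is the log score of moving
                from state i to state j between consecutive frames.
    log_init  : (S,) array of initial log scores (uniform if omitted).
    """
    T, S = log_post.shape
    if log_init is None:
        log_init = np.full(S, -np.log(S))

    delta = log_init + log_post[0]          # best score ending in each state
    backptr = np.zeros((T, S), dtype=int)   # best predecessor per state

    for t in range(1, T):
        scores = delta[:, None] + log_trans          # rows: previous, cols: current
        backptr[t] = np.argmax(scores, axis=0)
        delta = scores[backptr[t], np.arange(S)] + log_post[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path


# Toy example: three pitch states, one noisy frame in the middle.
post = np.array([[0.1, 0.8, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.4, 0.5],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.8, 0.1]])
trans = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
contour = viterbi_decode(np.log(post), np.log(trans))
```

In this toy case the frame-wise maximum jumps to another state on the third frame, while the decoded path stays on the same state because the transition scores penalize the jump. That smoothing effect is the reason a sequence decoder is applied on top of per-frame outputs.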
ISBN: 9781479928941 (print)
Reverberation distorts human speech and usually has negative effects on speech intelligibility, especially for hearing-impaired listeners. It also causes performance degradation in automatic speech recognition and speaker identification systems. Therefore, the dereverberation problem must be dealt with in daily listening environments. We propose to use deep neural networks (DNNs) to learn a spectral mapping from the reverberant speech to the anechoic speech. The trained DNN produces the estimated spectral representation of the corresponding anechoic speech. We demonstrate that distortion caused by reverberation is substantially attenuated by the DNN, whose outputs can be resynthesized into a dereverberated speech signal. The proposed approach is simple, and our systematic evaluation shows promising dereverberation results, which are significantly better than those of related systems.
Non-invasive brain-computer interfaces (BCIs) allow users to control external devices by their intentions. Currently, most BCI systems are synchronous, which means they rely on cues or tasks to which a subject has to react. It would be more useful for users if they could control a device at their own will (i.e., asynchronous BCIs). However, previous asynchronous BCI systems that rely on non-invasive electroencephalogram (EEG) measurements are not accurate and stable enough for real-world applications. Previously, hybrid BCI systems, relying on simultaneous EEG and near-infrared spectroscopy (NIRS) measurements, have been shown to increase the classification performance of synchronous motor imagery (MI) tasks. In this study, we present a first report on an asynchronous multi-modal hybrid BCI based on simultaneous EEG and NIRS measurements, and propose novel subject-dependent classification strategies for combining both measurements.
Since larger n-gram Language Models (LMs) usually perform better in Statistical Machine Translation (SMT), how to construct efficient large LMs is an important topic in SMT. However, most of the existing LM growing meth...
Algorithms using concepts from information geometry have recently become very popular in machine learning and signal processing. These methods not only have a solid mathematical foundation but also allow the optimization process and the solution to be interpreted from a geometric perspective. In this paper we apply information geometry to Brain-Computer Interfacing (BCI). More precisely, we show that the spatial filter computation in BCI can be cast into an information geometric framework based on divergence maximization. This formulation not only allows many of the recently proposed CSP algorithms to be integrated in a principled manner, but also enables us to easily develop novel CSP variants with different properties. We evaluate the potential of our information geometric framework on a data set containing recordings from 80 subjects.
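For orientation, the classical spatial filter computation that CSP (common spatial patterns) variants build on can be sketched as a joint diagonalization of two class covariance matrices. The sketch below is the standard textbook procedure, not the divergence-based framework proposed in the abstract; the function names `class_covariance` and `csp_filters`, the trace normalization, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def class_covariance(trials):
    """Average trace-normalized spatial covariance of one class.

    trials has shape (n_trials, n_channels, n_samples).
    """
    covs = []
    for x in trials:
        x = x - x.mean(axis=1, keepdims=True)   # remove per-channel mean
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)


def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return 2 * n_pairs spatial filters (as rows) and their eigenvalues.

    The eigenvalue of a filter is the share of variance it assigns to
    class A; values near 0 or 1 indicate discriminative filters.
    """
    c_a = class_covariance(trials_a)
    c_b = class_covariance(trials_b)

    # Whiten the composite covariance so that it becomes the identity.
    evals, evecs = np.linalg.eigh(c_a + c_b)
    whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Diagonalize the whitened class-A covariance (eigenvalues ascending).
    d, u = np.linalg.eigh(whiten @ c_a @ whiten.T)
    filters = u.T @ whiten

    # Keep the filters from both ends of the eigenvalue spectrum.
    idx = np.r_[np.arange(n_pairs), np.arange(len(d) - n_pairs, len(d))]
    return filters[idx], d[idx]


# Synthetic two-class data: class A is strong on channel 0, class B on channel 2.
rng = np.random.default_rng(0)
trials_a = rng.standard_normal((20, 3, 200)) * np.array([3.0, 1.0, 1.0])[:, None]
trials_b = rng.standard_normal((20, 3, 200)) * np.array([1.0, 1.0, 3.0])[:, None]
filters, eigvals = csp_filters(trials_a, trials_b, n_pairs=1)
```

After whitening, the two class covariances sum to the identity, so a filter that captures a large share of class A variance necessarily captures a small share of class B variance, and vice versa. Log-variance of the filtered signals is then commonly used as a classification feature.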
A neuro-driving simulation framework is proposed in this article for studying neural correlates of braking intention in diversified driving situations. In addition, we investigate whether these neural correlates can be used to detect a participant's braking intention prior to the behavioral response. Electroencephalographic (EEG) and electromyographic (EMG) signals were measured from fifteen participants while they were exposed to several kinds of traffic situations in the neuro-driving simulation framework. A novel characteristic feature was then extracted from the measured signals to classify whether or not the driver intended to brake. The proposed feature consists of the readiness potential (RP), event-related desynchronization (ERD), and the event-related potential (ERP) as used in a previous study. Prediction of braking intention based on the proposed feature combination was superior to prediction based on the simple ERP feature used in a previous study.
Conventional paradigms of machine learning assume all the training data are available when learning starts. However, in lifelong learning, the examples are observed sequentially as learning unfolds, and the learner sh...