Classification is an essential task in data mining, machine learning and pattern recognition *** classification models focus on distinctive samples from different categories. There are also fine-grained differences between data instances within a particular category. These differences form the preference information that is essential for human learning and, in our view, could also help classification models. In this paper, we propose a preference-enhanced support vector machine (PSVM) that incorporates preference-pair data into SVM as a specific type of supplementary information. Additionally, we propose a two-layer heuristic sampling method to obtain effective preference pairs, and an extended sequential minimal optimization (SMO) algorithm to fit PSVM. To evaluate our model, we use the knowledge base acceleration-cumulative citation recommendation (KBA-CCR) task on the TREC-KBA-2012 dataset and seven other datasets from UCI, StatLib and ***. The experimental results show that the proposed PSVM performs well on the official evaluation metrics.
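The abstract does not state the PSVM objective, so the following is only a minimal sketch of one plausible way to add preference pairs to a linear soft-margin SVM: the usual hinge loss on labels plus a RankSVM-style pairwise hinge that asks a preferred sample to score at least one margin above the other. The function name `psvm_like_objective`, the weights `c_cls` and `c_pref`, and the pairwise term itself are assumptions for illustration, not the authors' formulation.

```python
def dot(w, x):
    """Plain dot product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def psvm_like_objective(w, b, samples, labels, pref_pairs, c_cls=1.0, c_pref=1.0):
    """Hypothetical objective: L2 regularizer + label hinge + preference hinge.

    samples    : list of feature vectors
    labels     : list of +1 / -1 class labels, one per sample
    pref_pairs : list of (i, j) index pairs meaning "sample i is preferred over sample j"
    """
    regularizer = 0.5 * dot(w, w)
    # Standard soft-margin classification loss.
    cls_loss = sum(
        max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(samples, labels)
    )
    # Assumed pairwise term: the bias cancels in the score difference.
    pref_loss = sum(
        max(0.0, 1.0 - (dot(w, samples[i]) - dot(w, samples[j])))
        for i, j in pref_pairs
    )
    return regularizer + c_cls * cls_loss + c_pref * pref_loss


samples = [[2.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
labels = [1, 1, -1]
pref_pairs = [(0, 1)]  # sample 0 is preferred over sample 1 within the positive class

print(psvm_like_objective([1.0, 0.0], 0.0, samples, labels, pref_pairs))
print(psvm_like_objective([0.5, 0.0], 0.0, samples, labels, pref_pairs))
```

With the first weight vector every label and preference margin is satisfied, so only the regularizer contributes; halving the weights violates both kinds of margin and the objective grows.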
Nowadays surveillance systems are becoming increasingly complex by combining a variety of sensors and systems in order to deliver more accurate decisions. This is due to the fact that the development of a simple and y...
We introduce Zipporah, a fast and scalable data cleaning system. We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the ...
The overall goal of this study was to identify an objective physiological correlate of electric-acoustic pitch matching in unilaterally implanted cochlear implant (CI) participants with residual hearing in the non-imp...
ISBN: 9781538646595 (print)
Various informative factors are mixed in speech signals, which makes decoding any one of them very difficult. An intuitive idea is to factorize each speech frame into individual informative factors, though this turns out to be highly difficult. Recently, we found that speaker traits, which were assumed to be long-term distributional properties, are actually short-time patterns and can be learned by a carefully designed deep neural network (DNN). This discovery motivated the cascade deep factorization (CDF) framework presented in this paper. The proposed framework infers speech factors sequentially, with previously inferred factors used as conditional variables when inferring the others. We show that this approach can effectively factorize speech signals and that the original speech spectrum can be recovered from these factors with high accuracy. This factorization and reconstruction approach is potentially valuable for many speech processing tasks, e.g., speaker recognition and emotion recognition, as demonstrated in the paper.
ISBN: 9781538646595 (print)
Recent studies have shown that speaker patterns can be learned from very short speech segments (e.g., 0.3 seconds) by a carefully designed convolutional & time-delay deep neural network (CT-DNN) model. By forcing the model to discriminate the speakers in the training data, frame-level speaker features can be derived from the last hidden layer. In spite of its good performance, a potential problem of the present model is that it involves a parametric classifier, i.e., the last affine layer, which may absorb some discriminative knowledge and thus cause an 'information leak' in feature learning. This paper presents a full-info training approach that discards the parametric classifier and forces all the discriminative knowledge to be learned by the feature net. Our experiments on the Fisher database demonstrate that this new training scheme produces more coherent features, leading to consistent and notable performance improvement on the speaker verification task.
This paper proposes a packet loss concealment (PLC) technique to increase the robustness of automatic speech recognition (ASR) for speech coded with the G.729 codec over Voice over Internet Protocol (VoIP). Many of the standard ITU-T CELP-based speech coders, such as G.723.1, G.728, and G.729, model speech reproduction in their decoders. These decoders hold enough state information to integrate PLC algorithms directly, and the algorithms are specified as part of their standards, in particular the PLC of ITU-T G.711 Appendix I. Speech is transmitted with optimized source and channel codes, and the channel is simulated by a two-state Markov model of packet loss. The objective of the PLC based on ITU-T G.711 Appendix I is to generate a synthetic speech signal that covers missing data or lost packets in a received bit stream for the ASR application, i.e., to minimize the word error rate.
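The two-state Markov loss channel mentioned above can be sketched as follows. This is a minimal illustration of the usual two-state (Gilbert-style) model, assuming every packet sent while the channel is in the bad state is lost; the function names, the transition probabilities and the seed are illustrative assumptions, since the abstract gives no parameter values.

```python
import random


def simulate_packet_loss(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where the packet is lost.

    The channel starts in the good state. Each packet is lost if and only if
    the channel is in the bad state when it is sent (an assumption of this sketch).
    """
    rng = random.Random(seed)
    in_bad_state = False
    lost = []
    for _ in range(n_packets):
        lost.append(in_bad_state)
        # Draw the state transition that applies before the next packet.
        if in_bad_state:
            if rng.random() < p_bad_to_good:
                in_bad_state = False
        else:
            if rng.random() < p_good_to_bad:
                in_bad_state = True
    return lost


def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of time spent in the bad (loss) state."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


trace = simulate_packet_loss(100_000, p_good_to_bad=0.1, p_bad_to_good=0.4)
empirical_rate = sum(trace) / len(trace)
print(f"empirical loss rate: {empirical_rate:.3f}")
print(f"stationary loss rate: {stationary_loss_rate(0.1, 0.4):.3f}")
```

Compared with independent random drops, this model produces bursts of consecutive losses whose mean length is 1 / p_bad_to_good packets, which is the situation a concealment algorithm has to handle.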
We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to...
The Multitarget Challenge aims to assess how well current speech technology is able to determine whether or not a recorded utterance was spoken by one of a large number of "blacklisted" speakers. It is a for...