The mismatch between closed-set training and open-set testing usually leads to significant performance degradation in the speaker verification task. For existing loss functions, metric learning-based objectives depend str...
Audio-visual speech recognition (AVSR) takes advantage of noise-invariant visual information to improve the robustness of automatic speech recognition (ASR) systems. While previous works mainly focused on the clean co...
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion. We propose to first encode speech signals into a phoneme-rate latent representation with a variational autoencoder enhanced b...
In the process of developing the C919 large aircraft customer service intelligence system, we find that heterogeneous and incomplete data cause inefficient and inaccurate decision-making. To solve this problem, we propose to introduce the idea of ontology modeling and reasoning into competitive intelligence system building in this paper. We first present the building principles and methods of the civil aviation customer service ontology. We then define the classes and properties to contribute a real-world civil aviation customer service ontology, which is published on the Web (http:/***/dataset/cacso). We finally design SWRL rules corresponding to different intelligence analysis targets to support reasoning in our designed competitive intelligence system.
Assumable Logic Programming (ALP), an extension of Answer Set Programming (ASP), has been theoretically demonstrated to possess significant advantages in addressing problems involving incomplete information. Therefore...
Traffic flow prediction is a critical issue in transportation engineering and presents distinct challenges when handling large-scale datasets in the real world. Existing complex spatio-temporal forecasting paradigms u...
Large margin fine-tuning (LMFT) is an effective strategy to improve the speaker verification system's performance and is widely used in speaker verification challenge systems. Because the large margin in the loss ...
End-to-end automatic speech recognition (ASR) systems have gained popularity given their simplified architecture and promising results. However, text-only domain adaptation remains a big challenge for E2E systems. Tex...
Traditional automatic speech recognition (ASR) systems usually focus on individual utterances, without considering long-form speech with useful historical information, which is more practical in real scenarios. Simply...
Although current neural text-to-speech (TTS) models are able to generate high-quality speech, intensity-controllable emotional TTS is still a challenging task. Most existing methods need external optimizations for int...