Electroencephalogram (EEG) signals reflect brain activity, from which a listener's auditory attention cues can be decoded, a task termed auditory attention decoding (AAD). However, e...
This paper proposes a novel neural audio codec, named APCodec+, which is an improved version of APCodec. The APCodec+ takes the audio amplitude and phase spectra as the coding object, and employs an adversarial traini...
In our previous work, we have proposed a neural vocoder called APNet, which directly predicts speech amplitude and phase spectra with a 5 ms frame shift in parallel from the input acoustic features, and then reconstru...
Using sarcasm on social media platforms to express negative opinions towards a person or object has become increasingly ***, detecting sarcasm in various forms of communication can be difficult due to conflicting *** this paper, we introduce a contrasting sentiment-based model for multimodal sarcasm detection (CS4MSD), which identifies inconsistent emotions by leveraging the CLIP knowledge module to produce sentiment features in both text and ***, five external sentiments are introduced to prompt the model to learn sentiment preferences among ***, we highlight the importance of verbal descriptions embedded in illustrations and incorporate additional knowledge-sharing modules to fuse such image-like *** results demonstrate that our model achieves state-of-the-art performance on the public multimodal sarcasm dataset.
We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features...
This paper presents our machine translation system that was developed for the WAT2024 MultiIndic MT shared task. We built our system for the Sindhi-English language pair. We developed two MT systems. The first system ...
In this paper, a Packet Loss Concealment (PLC) algorithm is proposed for G.723.1 CELP-type speech coders to improve the quality of decoded speech in VoIP under burst packet loss. The original PLC method implem...
Lip-to-speech (Lip2speech) synthesis, which predicts corresponding speech from talking face images, has witnessed significant progress with various models and training strategies in a series of independent studies. Ho...
This paper studies the task of speech reconstruction from ultrasound tongue images and optical lip videos recorded in a silent speaking mode, where people only activate their intra-oral and extra-oral articulators wit...
In this paper, we describe our submitted systems to the ADD2023 Challenge Track 3–Deepfake algorithm recognition (AR). This task requires not only identifying known deepfake algorithms in closed-set but also distingu...