There is an imminent need for guidelines and standard test sets to allow direct and fair comparisons of speech emotion recognition (SER). While resources such as the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database have emerged as widely adopted reference corpora for researchers to develop and test models for SER, published work reveals a wide range of assumptions and considerable variety in its use, which challenge reproducibility and generalization. Based on a critical review of the latest advances in SER using IEMOCAP as the use case, our work makes two contributions. First, using an analysis of the recent literature, including the assumptions made and metrics used therein, we provide a set of SER evaluation guidelines. Second, using recent publications with open-sourced implementations, we focus on reproducibility assessment in SER.
This paper presents a system to detect symptoms of allergic rhinitis remotely by using uttered speech and by exploiting its specific spectral characteristics. Based on the principles of adaptive modeling and fundament...
Task-oriented dialogue systems often employ a Dialogue State Tracker (DST) to successfully complete conversations. Recent state-of-the-art DST implementations rely on schemata of diverse services to improve model robu...
Recent deep learning Text-to-speech (TTS) systems have achieved impressive performance by generating speech close to human parity. However, they suffer from training stability issues as well as incorrect alignment of ...
We present a large multi-signer video corpus for the Greek Sign Language (GSL), suitable for the development and evaluation of GSL recognition algorithms. The database has been collected as part of the “SL-ReDu” project, which focuses on the educational use case of systematically teaching GSL as a second language (L2). The project aims to assist this process by allowing self-monitoring and objective assessment of GSL learners’ productions through the use of recognition technology, thus requiring suitable data resources relevant to this use case. To this end, we present the SL-ReDu GSL corpus, an extensive RGB+D video collection of 21 informants with a duration of 36 hours, recorded under studio conditions, consisting of: (i) isolated signs; (ii) continuous signing (annotated at the sentence level); and (iii) fingerspelling of words. We provide a detailed description of the design and acquisition methods used to develop it, along with corpus statistics and a comparison to existing sign language datasets. The SL-ReDu GSL corpus, as well as proposed frameworks for recognition experiments on it, are publicly available at https://***/corpus.
Hyperledger Fabric is an open-source private permissioned blockchain that supports the use of smart contracts (chaincode). It is aimed mainly at private networks of companies. To serve the different needs of each company and to be flexible in customer requirements, it consists of various adaptive components. Although this structure efficiently addresses a wide range of needs, deploying such a network for research purposes or rapid development is complex. In this paper, we present a web-based system architecture for the automated deployment of a Hyperledger Fabric network, and in addition, we describe the tools needed to manage and update such a network. Finally, as a proof-of-concept, we implement the proposed architecture to demonstrate the feasibility of our approach.
Designing powerful adversarial attacks is of paramount importance for the evaluation of ℓp-bounded adversarial defenses. Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms ...
The field of automatic music composition has seen great progress in recent years, specifically with the invention of transformer-based architectures. When using any deep learning model which considers music as a seque...
Starting in 2003 when the first MWE workshop was held with ACL in Sapporo, Japan, this year, the joint workshop of MWE-UD co-located with the LREC-COLING 2024 conference marked the 20th anniversary of MWE workshop eve...