This article introduces the Tenth Dialog System Technology Challenge (DSTC-10). This edition of the DSTC focuses on applying end-to-end dialog technologies for five distinct tasks in dialog systems, namely 1. Incorpor...
This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image. We observe that current metrics are size-sensitive: larger objects dominate the score, while smaller ones tend to be ignored. We argue that the evaluation should be size-invariant because bias based on size is unjustified without additional semantic information. In pursuit of this, we propose a generic approach that evaluates each salient object separately and then combines the results, effectively alleviating the imbalance. We further develop an optimization framework tailored to this goal, achieving considerable improvements in detecting objects of different sizes. Theoretically, we provide evidence supporting the validity of our new metrics and present a generalization analysis of SOD. Extensive experiments demonstrate the effectiveness of our method. The code is available at https://***/Ferry-Li/SI-SOD.
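The idea of scoring each salient object separately and then combining the results can be illustrated with a minimal sketch. It assumes a hypothetical label map in which 0 marks background and each positive integer marks one salient object, and it uses mean absolute error (MAE) as the base metric. The function names and the way regions are defined are illustrative assumptions, not the paper's actual formulation or the code in the linked repository.

```python
def global_mae(pred, labels):
    """Plain MAE over all pixels: every pixel counts equally,
    so errors on large objects outweigh errors on small ones."""
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, labels):
        for p, lab in zip(pred_row, label_row):
            target = 1.0 if lab != 0 else 0.0
            total += abs(p - target)
            count += 1
    return total / count


def per_region_mae(pred, labels):
    """Illustrative size-invariant variant: compute MAE inside each
    region (one per object label, plus background), then average the
    per-region scores so that every region has equal weight."""
    sums, counts = {}, {}
    for pred_row, label_row in zip(pred, labels):
        for p, lab in zip(pred_row, label_row):
            target = 1.0 if lab != 0 else 0.0
            sums[lab] = sums.get(lab, 0.0) + abs(p - target)
            counts[lab] = counts.get(lab, 0) + 1
    region_scores = [sums[lab] / counts[lab] for lab in sums]
    return sum(region_scores) / len(region_scores)


# Toy 2x6 image: object 1 covers 8 pixels, object 2 covers 1 pixel.
labels = [
    [1, 1, 1, 1, 0, 2],
    [1, 1, 1, 1, 0, 0],
]
# Prediction that finds the large object but misses the small one.
pred = [
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
]

print(global_mae(pred, labels))      # one wrong pixel out of twelve
print(per_region_mae(pred, labels))  # one of three regions fully wrong
```

In this toy case the plain MAE is 1/12, which barely registers the missed small object, while the per-region average is 1/3, because the missed object counts as much as the large one.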
Multi-behavioral recommender systems have emerged as a solution to address data sparsity and cold-start issues by incorporating auxiliary behaviors alongside target behaviors. However, existing models struggle to accu...
Recently published graph neural networks (GNNs) show promising performance at social event detection tasks. However, most studies are oriented toward monolingual data in languages with abundant training samples. This ...
Artificial intelligence (AI) empowered edge computing has given rise to a new paradigm and effectively facilitated the development of multimedia applications. The speech assistant is one of the significant services provided by multimedia applications, aiming to offer intelligent interactive experiences between humans and machines. However, malicious attackers may exploit spoofed speech to deceive speech assistants, posing great challenges to the security of multimedia applications. The limited resources of multimedia terminal devices hinder their ability to load speech spoofing detection models. Furthermore, processing and analyzing speech in the cloud can result in poor real-time performance and potential privacy risks. Existing speech spoofing detection methods rely heavily on annotated data and generalize poorly to unseen spoofed speech. To address these challenges, this paper first proposes the Coordinate Attention Network (CA2Net), which consists of coordinate attention blocks and Res2Net blocks. CA2Net can simultaneously extract temporal and spectral speech feature information and represent multi-scale speech features at a granular level. In addition, a contrastive learning-based speech spoofing detection framework named GEMINI is proposed. GEMINI can be effectively deployed on edge nodes and autonomously learn speech features with strong generalization capabilities. GEMINI first performs data augmentation on speech signals and extracts conventional acoustic features to enhance feature robustness. Subsequently, GEMINI uses the proposed CA2Net to further explore discriminative speech features. Then, a tensor-based multi-attention comparison model is employed to maximize the consistency between speech contexts. GEMINI continuously updates CA2Net with contrastive learning, which enables CA2Net to effectively represent speech signals and accurately detect spoofed speech. Extensive experiments on
Due to the "double fading" effect caused by a conventional passive intelligent reflecting surface (IRS), the signal via the reflection link is weak. To enhance the received signal, active elements with the abi...
We revisited ab initio evaluations of the energy barriers along the possible diffusion paths of the defects in rutile TiO2. By using a method carefully considering the cancellation of the self-interaction, Ti intersti...
The AAAI-14 Workshop program was held Sunday and Monday, July 27-28, 2014, at the Québec City Convention Centre in Québec, Canada. The AAAI-14 workshop program included 15 workshops covering a wide range of ...