The multimodal emotion-cause pair extraction (MECPE) task aims to detect emotions, causes, and emotion-cause pairs from multimodal conversations. Existing methods for this task typically concatenate representations of each utterance from distinct modalities and then predict emotion-cause pairs directly. This approach struggles to effectively integrate multimodal features and to capture the subtleties of emotion transitions, which are crucial for accurately identifying causes, thereby limiting overall performance. To address these challenges, we propose a novel model that captures holistic interaction and label constraint (HiLo) features for the MECPE task. HiLo facilitates cross-modality and cross-utterance feature interaction with various attention mechanisms, establishing a robust foundation for precise cause extraction. Notably, our model leverages emotion transition features as pivotal cues to enhance causal inference within conversations. The experimental results demonstrate the superior performance of HiLo, evidenced by an increase of more than 2% in F1 score over existing benchmark methods. Further analysis reveals that our approach adeptly utilizes multimodal and dialogue features, making a significant contribution to the field of emotion-cause analysis. Our code is publicly available at https://***/MVdYmx.
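To make the interaction structure above concrete, here is a minimal PyTorch sketch (not the authors' released code) of how cross-modality attention, cross-utterance attention, and emotion-transition cues could feed an emotion-cause pair scorer; the dimensions, module layout, and pairing head are illustrative assumptions.

```python
# Illustrative sketch only: text features attend to audio/video (cross-modal),
# a second attention layer mixes utterances (cross-utterance), and the change
# between neighbouring utterance states serves as an emotion-transition cue
# for scoring every (emotion utterance, candidate cause utterance) pair.
import torch
import torch.nn as nn

class HiLoSketch(nn.Module):
    def __init__(self, d=256, heads=4):
        super().__init__()
        self.cross_modal = nn.MultiheadAttention(d, heads, batch_first=True)
        self.cross_utt = nn.MultiheadAttention(d, heads, batch_first=True)
        self.pair_scorer = nn.Linear(3 * d, 1)  # [emotion utt; cause utt; transition]

    def forward(self, text, audio, video):
        # text/audio/video: (batch, num_utterances, d)
        modal = torch.stack([audio, video], dim=2)            # (B, U, 2, d)
        B, U, M, d = modal.shape
        q = text.reshape(B * U, 1, d)
        kv = modal.reshape(B * U, M, d)
        fused, _ = self.cross_modal(q, kv, kv)                # text attends to audio/video
        fused = fused.reshape(B, U, d)
        ctx, _ = self.cross_utt(fused, fused, fused)          # cross-utterance interaction
        # emotion-transition cue: change between consecutive utterance states
        # (wraps at the first utterance; padding would be used in practice)
        trans = ctx - torch.roll(ctx, shifts=1, dims=1)
        # score every (emotion utterance i, candidate cause utterance j) pair
        e = ctx.unsqueeze(2).expand(B, U, U, d)
        c = ctx.unsqueeze(1).expand(B, U, U, d)
        t = trans.unsqueeze(2).expand(B, U, U, d)
        return self.pair_scorer(torch.cat([e, c, t], dim=-1)).squeeze(-1)  # (B, U, U)

scores = HiLoSketch()(torch.randn(2, 6, 256), torch.randn(2, 6, 256), torch.randn(2, 6, 256))
```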
Session-based recommender systems (SRSs) are currently a research hot spot in the recommendation domain. Existing methods make recommendations based on the user's current intention (also called short-term preference) during a session, often overlooking the specific preferences associated with these intentions. In reality, users usually exhibit diverse preferences for different intentions, and even for the same intention, individual preferences can vary significantly between users. As users interact with items throughout a session, their intentions can shift accordingly. To enhance recommendation quality, it is crucial not only to consider the user's intentions but also to dynamically learn their varying preferences as these intentions change. In this paper, we propose a novel Intention-sensitive Preference Learning Network (IPLN) comprising three main modules: an intention recognizer, a preference detector, and a prediction layer. Specifically, the intention recognizer infers the user's underlying intention within the current session by analyzing complex relationships among items. Based on the acquired intention, the preference detector learns the intention-specific preference by selectively integrating latent features from items in the user's historical sessions. In addition, the user's general preference is used to refine the obtained preference, reducing potential noise carried over from historical records. Ultimately, the refined preference and the intention jointly guide next-item recommendation in the prediction layer. To validate the effectiveness of the proposed IPLN, we perform extensive experiments on two real-world datasets. The experimental results demonstrate the superiority of IPLN over other state-of-the-art models.
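A minimal sketch of the three-module layout described above, assuming simple attention forms for the intention recognizer and preference detector and a gating step for the refinement with the general preference; these specific choices are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IPLNSketch(nn.Module):
    """Illustrative layout: intention recognizer -> preference detector -> prediction layer."""
    def __init__(self, num_items, d=64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d)
        self.intent_att = nn.Linear(d, 1)   # intention recognizer (attention scores)
        self.pref_query = nn.Linear(d, d)   # preference detector (intention as query)
        self.gate = nn.Linear(2 * d, d)     # refine with the general preference

    def forward(self, session, history, general_pref):
        # session: (B, Ls) item ids of the current session
        # history: (B, Lh) item ids from historical sessions
        # general_pref: (B, d) user's general preference vector
        s = self.item_emb(session)                                  # (B, Ls, d)
        h = self.item_emb(history)                                  # (B, Lh, d)
        # intention recognizer: weighted pooling over current-session items
        a = F.softmax(self.intent_att(s), dim=1)                    # (B, Ls, 1)
        intention = (a * s).sum(dim=1)                              # (B, d)
        # preference detector: historical items attended by the intention
        w = F.softmax(h @ self.pref_query(intention).unsqueeze(-1), dim=1)  # (B, Lh, 1)
        pref = (w * h).sum(dim=1)                                   # (B, d)
        # refine the intention-specific preference with the general preference
        pref = torch.sigmoid(self.gate(torch.cat([pref, general_pref], dim=-1))) * pref
        # prediction layer: score all candidate items
        user = intention + pref
        return user @ self.item_emb.weight.t()                      # (B, num_items)

model = IPLNSketch(num_items=1000)
scores = model(torch.randint(0, 1000, (2, 5)), torch.randint(0, 1000, (2, 20)), torch.randn(2, 64))
```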
Anchor-based Multi-view Subspace Clustering (AMSC) has become a popular tool for large-scale multi-view clustering. However, current AMSC approaches still have several limitations. First, they typically recover the anchor graph structure in the original linear space, restricting their applicability to nonlinear scenarios. Second, they usually overlook the potential benefits of jointly capturing inter-view and intra-view information for enhancing anchor representation learning. Third, these approaches mostly perform anchor-based subspace learning with a specific matrix norm, neglecting the latent high-order correlation across different views. To overcome these limitations, this paper presents an efficient and effective approach termed Large-scale Tensorized Multi-view Kernel Subspace Clustering (LTKMSC). Unlike existing AMSC approaches, LTKMSC exploits both inter-view and intra-view awareness for anchor-based representation building. Concretely, low-rank tensor learning is leveraged to capture the high-order correlation (i.e., the inter-view complementary information) among distinct views, upon which the \(l_{1,2}\) norm is imposed to explore the intra-view anchor graph structure in each view. Moreover, kernel learning is leveraged to explore the nonlinear anchor-sample relationships embedded in multiple views. With the unified objective function formulated, an efficient optimization algorithm with low computational complexity is further designed. Extensive experiments on a variety of multi-view datasets confirm the efficiency and effectiveness of our approach compared with other competitive approaches.
Authors:
Xin Zhang, Hongzhi Feng, M. Shamim Hossain, Yinzhuo Chen, Hongbo Wang, Yuyu Yin
Affiliations: Hangzhou Dianzi University, China; Key Laboratory of Complex Systems Modeling and Simulation, Ministry of Education, China; Zhoushan Tongbo Marine Electronic Information Research Institute, Hangzhou Dianzi University, China; Yunnan Key Laboratory of Service Computing, Yunnan University of Finance and Economics, China; Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Saudi Arabia
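For readers unfamiliar with the anchor-graph idea underlying LTKMSC, the following NumPy sketch illustrates the general pipeline only: an RBF kernel builds a nonlinear sample-anchor graph per view, the per-view graphs are fused, and a spectral embedding of the fused graph is clustered with k-means. The tensor low-rank learning and the \(l_{1,2}\)-regularized solver of the actual method are not reproduced here, and the fusion by simple averaging is a stand-in assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf_anchor_graph(X, anchors, gamma=1.0):
    """Nonlinear (kernel) affinities between n samples and m anchors, row-normalized."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)   # (n, m) squared distances
    K = np.exp(-gamma * d2)
    return K / K.sum(axis=1, keepdims=True)

def anchor_kernel_clustering(views, n_clusters, n_anchors=50, seed=0):
    """Illustrative pipeline: per-view kernel anchor graphs, averaged fusion,
    spectral embedding of the fused graph, then k-means."""
    rng = np.random.default_rng(seed)
    graphs = []
    for X in views:                       # each X: (n_samples, n_features_v)
        idx = rng.choice(len(X), size=n_anchors, replace=False)
        graphs.append(rbf_anchor_graph(X, X[idx]))
    Z = np.mean(graphs, axis=0)           # fuse views (stand-in for tensor fusion)
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    emb = U[:, :n_clusters]               # spectral embedding from the anchor graph
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(emb)

views = [np.random.rand(200, 30), np.random.rand(200, 50)]
labels = anchor_kernel_clustering(views, n_clusters=4)
```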
Action Quality Assessment (AQA) has become crucial in video analysis, finding wide applications in various domains such as healthcare and sports. A significant challenge faced by AQA is the background bias caused by the dominance of the background in videos. In particular, the background bias tends to overshadow subtle foreground differences, which are crucial for precise action evaluation. To address the background bias issue, we propose a novel data augmentation method named Scaled Background Swap. First, the background regions of different video samples are swapped to guide models to focus on the dynamic foreground regions and to mitigate their sensitivity to the background during training. Second, the video's foreground region is up-scaled to further strengthen the models' attention to the critical foreground action information for AQA tasks. By prioritizing foreground motion and swapping backgrounds, the proposed Scaled Background Swap method effectively improves models' accuracy and generalization, and it can be flexibly applied to various video analysis models. Extensive experiments on AQA benchmarks demonstrate that Scaled Background Swap achieves better performance than the baselines. Specifically, the Spearman's rank correlation on the AQA-7 and MTL-AQA datasets reaches 0.8870 and 0.9526, respectively. The code is available at: https://***/Emy-cv/Scaled-Background Swap.
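The augmentation itself can be sketched compactly. The snippet below assumes per-frame foreground masks are available (e.g., from a detector or segmenter) and uses a simple center crop after up-scaling; both choices are illustrative assumptions rather than the authors' exact procedure.

```python
import torch
import torch.nn.functional as F

def scaled_background_swap(video_a, video_b, mask_a, scale=1.2):
    """Sketch of the augmentation: paste the up-scaled foreground of video_a
    onto the background of video_b, given a foreground mask for video_a."""
    # video_*: (T, C, H, W) in [0, 1]; mask_a: (T, 1, H, W) with 1 = foreground
    T, C, H, W = video_a.shape
    # up-scale the masked foreground frames and their masks
    fg = F.interpolate(video_a * mask_a, scale_factor=scale, mode='bilinear',
                       align_corners=False)
    m = F.interpolate(mask_a, scale_factor=scale, mode='nearest')
    # center-crop back to the original spatial size
    top = (fg.shape[-2] - H) // 2
    left = (fg.shape[-1] - W) // 2
    fg = fg[..., top:top + H, left:left + W]
    m = m[..., top:top + H, left:left + W]
    # composite: scaled foreground of A over the background of B
    return m * fg + (1 - m) * video_b

# random mask used here only to demonstrate shapes
aug = scaled_background_swap(torch.rand(8, 3, 112, 112), torch.rand(8, 3, 112, 112),
                             (torch.rand(8, 1, 112, 112) > 0.5).float())
```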