In the process of iron and steel smelting, steel slag is inevitably produced as a byproduct. Accurately identifying steel slag is a prerequisite for controlling the content of steel slag. Conventional vision-based ste...
Kolmogorov-Arnold Networks (KAN) are an emerging neural network architecture in machine learning. The research community has taken great interest in whether KAN can be a promising alternative to the commonly used M...
Automatic code summarization aims to generate concise natural language descriptions (summaries) of source code, which can free software developers from the heavy burden of manual commenting and software maintenance. Ex...
In real-world physiological and psychological scenarios, there often exists a robust complementary correlation between audio and visual signals. Audio-Visual Event Localization (AVEL) aims to identify segments with Audio-Visual Events (AVEs) that contain both audio and visual tracks in unconstrained videos. Prior studies have predominantly focused on audio-visual cross-modal fusion methods, overlooking the fine-grained exploration of the cross-modal information fusion mechanism. Moreover, due to the inherent heterogeneity of multi-modal data, new noise is inevitably introduced during the audio-visual fusion process. To address these challenges, we propose a novel Cross-modal Contrastive Learning Network (CCLN) for AVEL, comprising a backbone network and a branch network. In the backbone network, drawing inspiration from physiological theories of sensory integration, we elucidate the process of audio-visual information fusion, interaction, and integration from an information-flow perspective. Notably, the Self-constrained Bi-modal Interaction (SBI) module is a bi-modal attention structure integrated with audio-visual fusion information, and through gated processing of the audio-visual correlation matrix, it effectively captures inter-modal correlation. The Foreground Event Enhancement (FEE) module emphasizes the significance of event-level boundaries by enlarging the distance between scene events during training through adaptive weights. Furthermore, we introduce weak video-level labels to constrain the cross-modal semantic alignment of audio-visual events and design a weakly supervised cross-modal contrastive learning loss (WCCL Loss) function, which enhances the quality of fusion representation in the dual-branch contrastive learning framework. Extensive experiments conducted on the AVE dataset for both fully supervised and weakly supervised event localization, as well as Cross-Modal Localization (CML) tasks, demonstrate the superior performance of our model compa
Most existing Salient Object Detection (SOD) methods focus on achieving better performance, often resulting in models with a large number of parameters. However, there is limited research on lightweight models in this...
Detailed system operations are recorded in logs. To ensure system reliability, developers can detect system anomalies through log anomaly detection. Log parsing, which converts semi-structured log messages into struct...
Malware detection is a critical issue in software engineering, as malware directly threatens user information security. Existing approaches often focus on an individual modality (either source code or binary code) for the dete...
Because servers are untrusted in outsourced encrypted database systems, it is vital for data owners to keep their data confidential under complex query operations performed by database servers. Trusted...
Continuous cognitive diagnosis models (CDMs) are vital tools for assessing students’ mastery of knowledge points. However, traditional probability-based CDMs are prone to falling into local optima due to their u...
Deep-learning-based super-resolution (SR) methods for a single hyperspectral image have made significant progress in recent years and have become an important research direction in remote sensing. Existing methods perform ...