Search Results - Inner Mongolia University Library

16th International Conference on Modelling, Identification and Control, ICMIC 2024

Authors: Liu, Kecheng; Li, Jiangyun; Yuan, Li. Affiliations: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China; Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, Beijing 100083, China

ISBN: (print) 9789819617760

In iron and steel smelting, steel slag is inevitably produced as a byproduct. Accurately identifying steel slag is a prerequisite for controlling its content. Conventional vision-based steel slag processing solutions are constrained by limited real-time performance or poor robustness against disturbances, which prevents them from being applied directly at industrial sites. To address these issues, this paper designs SlagNet, a steel slag segmentation network with real-time performance and strong disturbance resistance. By designing a lightweight branch to extract rich context information and combining it with edge branches that supplement the slag edge information, SlagNet achieves 77.69 mIoU and 13.25 FPS on a field-collected slagging dataset. The results indicate that SlagNet achieves the best balance between real-time performance and accuracy among the compared segmentation methods, thereby meeting the requirements of industrial sites. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Slags
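For readers unfamiliar with the mIoU figure quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed for segmentation masks (reported values are usually this quantity expressed as a percentage). The function name `mean_iou`, the flat label lists, and the two-class toy data are illustrative assumptions, not taken from the SlagNet paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class labels
    (a hypothetical input format chosen for this sketch).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 0 = background, 1 = slag (label meanings are assumptions)
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target, num_classes=2))
```

Classes that appear in neither mask are skipped so that they do not distort the average; other conventions exist and may give slightly different numbers.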

KAN v.s. MLP for Offline Reinforcement Learning

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Guo, Haihong; Li, Fengxin; Li, Jiao; Liu, Hongyan. Affiliations: School of Information, Renmin University of China, China; Institute of Medical Information, Medical Library, Chinese Academy of Medical Sciences, Peking Union Medical College, China; Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, China; School of Economics and Management, Tsinghua University, China

ISBN: (print) 9798350368741

Kolmogorov-Arnold Networks (KAN) are an emerging neural network architecture in machine learning. The research community has taken great interest in whether KAN can be a promising alternative to the commonly used Multi-Layer Perceptrons (MLP). Experiments in various fields have demonstrated that KAN-based machine learning can achieve performance comparable to, if not better than, MLP-based methods, with much smaller parameter scales and better explainability. In this paper, we explore the incorporation of KAN into the actor and critic networks for offline reinforcement learning (RL). We evaluate the performance, parameter scales, and training efficiency of various KAN-based and MLP-based conservative Q-learning (CQL) variants on the classical D4RL benchmark for offline RL. Our study demonstrates that KAN can achieve performance close to that of the commonly used MLP with significantly fewer parameters. This allows us to choose the base networks according to the offline RL task requirements. © 2025 IEEE.

Keywords: KAN; Kolmogorov-Arnold networks; MLP; multilayer perceptrons; offline reinforcement learning

Rethinking-based Code Summarization with Chain of Comments

31st International Conference on Computational Linguistics, COLING 2025

Authors: Cao, Liuwen; He, Hongkui; Huang, Hailin; Wang, Jiexin; Cai, Yi. Affiliations: School of Software Engineering, South China University of Technology, China; Key Laboratory of Big Data and Intelligent Robot, South China University of Technology, Ministry of Education, China

ISBN: (print) 9798891761964

Automatic code summarization aims to generate concise natural language descriptions (summaries) for source code, which can free software developers from the heavy burden of manual commenting and software maintenance. Existing methods focus on learning a direct mapping from pure code to summaries, overlooking the significant heterogeneity gap between code and summaries. Moreover, existing methods lack a human-like re-check process to evaluate whether the generated summaries match the code well. To address these two limitations, we introduce RBCoSum, a novel framework that incorporates the generated Chain Of Comments (COC) as auxiliary intermediate information for the model to bridge the gap between code and summaries. We also propose a rethinking process in which a learned ranker, trained on our constructed ranking dataset, scores how well each generated summary matches the code and selects the highest-scoring summary, thereby providing the re-check step. We conduct extensive experiments to evaluate our approach and compare it with other automatic code summarization models as well as multiple code Large Language Models (LLMs). The experimental results show that RBCoSum is effective and outperforms baselines by a large margin. The human evaluation also indicates that the summaries generated with RBCoSum are more natural, informative, useful, and truthful. © 2025 Association for Computational Linguistics.

Keywords: Computational linguistics

Listen With Seeing: Cross-Modal Contrastive Learning for Audio-Visual Event Localization

IEEE Transactions on Multimedia, 2025, vol. 27, pp. 2650-2665

Authors: Sun, Chao; Chen, Min; Zhu, Chuanbo; Zhang, Sheng; Lu, Ping; Chen, Jincai. Affiliations: Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Key Laboratory of Information Storage System, Engineering Research Center of Data Storage Systems and Technology, Ministry of Education of China, School of Computer Science and Technology, Wuhan 430074, China; South China University of Technology, School of Computer Science and Engineering, Guangzhou 510640, China; Pazhou Laboratory, Guangzhou 510640, China

In real-world physiological and psychological scenarios, there often exists a robust complementary correlation between audio and visual signals. Audio-Visual Event Localization (AVEL) aims to identify segments with Audio-Visual Events (AVEs) that contain both audio and visual tracks in unconstrained videos. Prior studies have predominantly focused on audio-visual cross-modal fusion methods, overlooking the fine-grained exploration of the cross-modal information fusion mechanism. Moreover, due to the inherent heterogeneity of multi-modal data, new noise is inevitably introduced during the audio-visual fusion process. To address these challenges, we propose a novel Cross-modal Contrastive Learning Network (CCLN) for AVEL, comprising a backbone network and a branch network. In the backbone network, drawing inspiration from physiological theories of sensory integration, we elucidate the process of audio-visual information fusion, interaction, and integration from an information-flow perspective. Notably, the Self-constrained Bi-modal Interaction (SBI) module is a bi-modal attention structure integrated with audio-visual fusion information, and through gated processing of the audio-visual correlation matrix, it effectively captures inter-modal correlation. The Foreground Event Enhancement (FEE) module emphasizes the significance of event-level boundaries by increasing the distance between scene events during training through adaptive weights. Furthermore, we introduce weak video-level labels to constrain the cross-modal semantic alignment of audio-visual events and design a weakly supervised cross-modal contrastive learning loss (WCCL Loss) function, which enhances the quality of fusion representation in the dual-branch contrastive learning framework. Extensive experiments conducted on the AVE dataset for both fully supervised and weakly supervised event localization, as well as Cross-Modal Localization (CML) tasks, demonstrate the superior performance of our model compa

Keywords: Information fusion

Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network

31st International Conference on Multimedia Modeling, MMM 2025

Authors: Liu, Jiajie; Zhang, Zhibin. Affiliations: Engineering Research Center of Ecological Big Data, Ministry of Education, Beijing, China; Key Laboratory of Wireless Networks and Mobile Computing, Inner Mongolia University, Hohhot 010021, China

ISBN: (print) 9789819620609

Most existing Salient Object Detection (SOD) methods focus on achieving better performance, often resulting in models with a large number of parameters. However, there is limited research on lightweight models in this field. To address this gap, our goal is to maintain performance while reducing the number of network parameters. Building on the development of large-kernel convolutions in recent years, we improve the U2Net base network by adding our lightweight dual large-kernel fusion module. Our module makes better use of the depth information of U2Net and, owing to the large receptive field of large-kernel convolutions, better captures the relationships between image elements. This allows our network to remain lightweight while maintaining performance. We design a dual large-kernel (DLK) fusion module and a lightweight dual grouped large-kernel Unet network (DGLKUNET). Our network uses SRUS (an improved version of the Residual U-blocks module, RUS) to construct DGLKUNET, which predicts image contours and labels. Compared to the base RUS network (U2Net), our network reduces the number of parameters by up to 70% and the computational cost by 40% while maintaining performance. Evaluation results on five datasets demonstrate the performance of our lightweight network. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Object recognition

TCMS: A Multi-Sequence Log Parsing Method Based on Token Conversion

IEEE Transactions on Dependable and Secure Computing, 2025, vol. 22, no. 3, pp. 3028-3045

Authors: Wei, Mingkuan; Wen, Jigang; He, Shiming; Xie, Kun; Liang, Wei; Xie, Gaogang; Li, Kenli; Zhu, Ziyu. Affiliations: Hunan University, Ministry of Education Key Laboratory of "Fusion Computing of Supercomputing and Artificial Intelligence", College of Computer Science and Electronics Engineering, Changsha 410012, China; Hunan University of Science and Technology, School of Computer Science and Engineering, Xiangtan 411201, China; Changsha University of Science and Technology, School of Computer and Communication Engineering, Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha 410114, China; Beijing 100045, China; Columbia University, Electrical Engineering, New York, NY 10027, United States

Detailed system operations are recorded in logs. To ensure system reliability, developers can detect system anomalies through log anomaly detection. Log parsing, which converts semi-structured log messages into structured data, is a crucial step in log anomaly detection and in advanced program analysis and verification. Although various log parsing tools are available, they generally suffer from low parsing accuracy and low efficiency because they ignore variable characteristics and rely on costly pairwise comparison methods. In this paper, we propose the TCMS framework for parsing logs, which consists of two main techniques. First, by studying 16 public log datasets, we find that most log variable tokens are structured variable tokens. Based on this finding, we propose a token conversion algorithm to improve parsing accuracy. This algorithm converts the changed parts in structured variable tokens into wildcards (‘’), preventing these tokens from being directly identified as constant tokens. Second, to improve efficiency, we propose the LogMLCS algorithm, which constructs a graph to extract the common parts of multiple log messages at once instead of using pairwise comparisons. Comprehensive experiments conducted on 16 log datasets reveal that TCMS outperforms seven other parsing methods, achieving the highest parsing accuracy at the fastest speed. Furthermore, experimental results from running a log anomaly detection algorithm in conjunction with different log parsing methods demonstrate that TCMS significantly boosts detection accuracy. For instance, on the OpenStack dataset, our TCMS-facilitated log anomaly detection algorithm achieves a perfect F1-score, precision, and recall of 100% each, surpassing the best peer method by 32.2, 0.8, and 19.5 percentage points, respectively. © 2004-2012 IEEE.

Keywords: Accuracy; Anomaly Detection; Clustering Algorithms; Peer To Peer Computing; Electronic Mail; Source Coding; Particle Separators; Codes; Tokenization; Itemsets; Log Parsing; Log Analysis; Token Conversion; Log MLCS; Pairwise Comparisons; Low Accuracy; Public datasets; Common Part; Multiple Messages; Long Short Term Memory; Point Source; Target Domain; Logarithm Of The Number; Captive Animals; Multi Objective Optimization Problem; Regular Expressions; Biological Sequences; Rule Based Methods; Point In The Graph; Kinds Of datasets; Template Generation; Number Of Templates; Similar Template; Parse Tree; Syntax Errors; Harmonic Average Of Precision; Log Length; Earliest Methods; Common Subsequence; Precision And Recall; Leaf Node; Part Of The Message
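The token conversion idea in the abstract, replacing the changing parts of structured variable tokens with wildcards, can be illustrated roughly as follows. This is a simplified sketch and not the TCMS algorithm itself: it assumes that the changing parts are runs of digits, and it uses `<*>` as the wildcard symbol because the abstract does not show the symbol legibly. The function names are hypothetical.

```python
import re

WILDCARD = "<*>"  # assumed wildcard symbol, not confirmed by the abstract


def convert_token(token):
    """Replace each run of digits inside a token with the wildcard."""
    return re.sub(r"\d+", WILDCARD, token)


def convert_message(message):
    """Apply the conversion to every whitespace-separated token."""
    return " ".join(convert_token(tok) for tok in message.split())


print(convert_message("Connection from 10.0.0.7:5432 closed after 31ms"))
```

After conversion, the fixed structure of a token such as an address or a duration (separators, unit suffix) remains visible while the varying digits are masked, which is the property the abstract relies on to keep such tokens from being treated as constants.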

Multimodal Fusion for Android Malware Detection Based on Large Pre-Trained Models

IEEE Transactions on Software Engineering, 2025, vol. 51, no. 5, pp. 1569-1590

Authors: Li, Xun; Liu, Lei; Liu, Yuzhou; Zhao, Yu; Zhang, Peng; Liu, Huaxiao. Affiliations: Jilin University, College of Computer Science and Technology, Changchun 130012, China; Northeast Electric Power University, School of Computer Science, Jilin 132012, China; Jilin University, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Changchun 130012, China

Malware detection is a critical issue in software engineering, as malware directly threatens user information security. Existing approaches often focus on a single modality (either source code or binary code) for detection and fail to exploit the complementary information between the two effectively. This limits detection performance, especially in complex and evasive malware scenarios. In this paper, we take Android applications written in Java as our subjects and provide a novel fine-grained multimodal fusion method that uses large pre-trained models to combine features from source and binary code for malware detection. For the source code modality, we employ the graphical user interface (GUI) as a framework to segment the source code into snippets, and use a pre-trained programming language model to extract feature representations. For the binary code modality, we convert binary code into grayscale images and fine-tune a pre-trained vision model to extract features indirectly. We then implement cross-modal attention and devise a contrastive loss to align features across modalities, supplementing this with a supervised classification loss to refine the multimodal fusion process specifically for malware detection. Our experiments, conducted using the data-MD and data-MC benchmarks, demonstrate that our approach achieves a precision of 0.977 and a recall of 0.984 in detecting malware. This underscores the advantages of using large pre-trained models for feature representation and of fusing information across different modalities for effective malware detection. © 1976-2012 IEEE.

Keywords: Android malware
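The binary-to-grayscale step mentioned in the abstract can be pictured with a minimal sketch: each byte becomes one pixel intensity in the range 0-255, and the byte stream is wrapped into rows. The fixed `width`, the zero padding of the last row, and the function name are assumptions made for illustration; the abstract does not describe the paper's actual preprocessing.

```python
def bytes_to_grayscale(data, width):
    """Arrange raw bytes into rows of `width` pixels (values 0-255).

    The final row is zero-padded so that all rows have equal length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = list(data)  # iterating over bytes yields ints in 0..255
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([0] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


image = bytes_to_grayscale(b"\x00\x7f\xff\x10\x20", width=2)
print(image)
```

The resulting nested list can be handed to any image library as a single-channel image, at which point a vision model can be applied to it.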

Secure Multi-way Join Query and Its Bench-Marking with Trusted Execution Environments

2nd International Conference on Data Security and Privacy Protection, DSPP 2024

Authors: Zhao, Yi; Zhao, Sen; Lv, Siyi. Affiliations: School of Cyber Engineering, Xidian University, Xi’an, China; State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, China; Key Laboratory of Data and Intelligent System Security, Ministry of Education, Nankai University, Tianjin, China

ISBN: (print) 9789819785452

Because outsourced encrypted database systems operate under an untrusted-server model, it is vital for data owners to keep their data confidential while database servers perform complex query operations. Trusted Execution Environments (TEEs) provide a hardware-based secure space for private computation that improves the efficiency of the database system and makes it possible to compute directly on sensitive data in untrusted outsourced database servers. However, while various query schemes have been designed that combine encrypted databases with TEEs, designing a practical and secure scheme for join queries over multiple data tables with TEEs remains an open problem. Moreover, previous work has paid little attention to benchmarking multi-way joins on TEEs. In our work, we extend the binary join query scheme to multi-way forms in a TEE-based encrypted database system. The scheme combines various join algorithms based on hash functions (hash-based, radix-based, etc.) with a parallelized optimization mechanism. We also benchmark these multi-way join schemes in three-way form with the TEEBench framework, using both cache-fit and cache-exceed datasets, and evaluate them on two TEE hardware platforms (Intel SGX and AMD SEV) to show their availability. We then propose a multi-way join scheme with access pattern protection by combining our previous scheme with oblivious I/O methods. By evaluating its efficiency and memory leakage, we verify the feasibility of our access-pattern-protected multi-way join scheme. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Outsourcing
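As a plain, non-secure illustration of how a binary hash join can be extended to a three-way join, the sketch below chains two hash joins over in-memory tables. It omits encryption, TEEs, parallelism, and access pattern protection entirely, so it shows only the join composition, not the paper's scheme. The table layouts and function names are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Binary equi-join: build a hash table on `left`, probe with `right`."""
    buckets = defaultdict(list)
    for row in left:
        buckets[row[left_key]].append(row)
    joined = []
    for row in right:
        for match in buckets.get(row[right_key], []):
            joined.append({**match, **row})  # merge the matching rows
    return joined


def three_way_join(r, s, t, rs_key, st_key):
    """Join R with S on rs_key, then join the result with T on st_key."""
    return hash_join(hash_join(r, s, rs_key, rs_key), t, st_key, st_key)


R = [{"a": 1, "x": "r1"}, {"a": 2, "x": "r2"}]
S = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
T = [{"b": 10, "z": "t1"}]
print(three_way_join(R, S, T, "a", "b"))
```

Chaining binary joins materializes the intermediate result of the first join; whether the paper's multi-way scheme does the same is not stated in the abstract.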

Enhancing Continuous Cognitive Diagnosis with Fuzzy Strategy-Based Hybrid Genetic Algorithm

3rd International Conference on Cyberspace Simulation and Evaluation, CSE 2024

Authors: He, Chenlong; Hu, Xuegang; Cao, Zhiyong; Bu, Chenyang; Luo, Wenjian. Affiliations: Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China

ISBN: (print) 9789819645053

Continuous cognitive diagnosis models (CDMs) are vital tools for assessing students’ mastery of knowledge points. However, traditional probability-based CDMs are prone to falling into local optima due to their use of single-point search methods, which can affect the accuracy of the models. To address this issue, we propose a hybrid genetic algorithm (HGA) enhanced with a fuzzy strategy to improve continuous cognitive diagnosis. This approach introduces the multidimensional item response theory (MIRT) as a local search operator to boost diagnostic precision. Additionally, considering the limitation on the number of local searches within a finite time, we introduce a fuzzy strategy that dynamically adjusts the number of local searches by evaluating the similarity between the current population and the elite set, thus balancing global and local search. Experimental results on three real-world datasets demonstrate that our method significantly outperforms six existing comparison models, validating the effectiveness of the fuzzy strategy and continuous CDM. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Continuous cognitive diagnosis; Educational data mining; Evolutionary algorithm; Local search

CASSNet: Cross-Attention Enhanced Spectral–Spatial Interaction Network for Hyperspectral Image Super-Resolution

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025, vol. 18, pp. 11716-11730

Authors: Zhang, Zhanxu; Yang, Linzi; Zhang, Guanglian; Deng, Jiangwei; Bian, Lifeng; Yang, Chen. Affiliations: Guizhou University, Power Systems Engineering Research Center, Ministry of Education, College of Big Data and Information Engineering, Guiyang 550025, China; Fudan University, Frontier Institute of Chip and System, Shanghai 200433, China; Guizhou University, China; State Key Laboratory of Public Big Data, Guiyang 550025, China

Deep-learning-based super-resolution (SR) methods for a single hyperspectral image have made significant progress in recent years and become an important research direction in remote sensing. Existing methods perform well in extracting spatial features, but challenges remain in integrating spectral and spatial features when modeling global relationships. In order to take full advantage of the higher spectral resolution of hyperspectral images, this article proposes a novel hyperspectral image SR method (CASSNet), which integrates convolutional neural networks and cross-attention mechanisms into a unified framework. This approach achieves comprehensive integration of spectral and spatial information, with extensive exploration at both local and global levels. In the local feature extraction stage, parallel 3-D/2-D convolutions work in tandem to efficiently capture detail information from both spectral and spatial dimensions. In addition, a spectral–spatial dual-branch module employing the cross-attention mechanism is designed to capture the global dependencies within the features, where the reconstructed spectral–spatial module and the spectral–spatial interaction unit can effectively promote the interaction and complementarity of spectral–spatial features. The experiments on three publicly available datasets demonstrated that the proposed method obtained superior SR results, outperforming state-of-the-art SR algorithms. © 2008-2012 IEEE.

Keywords: Convolutional neural networks
