Federated learning (FL) is widely applied in edge computing scenarios to protect data privacy, but the assumption of independent and identically distributed (IID) data across clients often does not hold in practice, l...
In recent years, the Transformer model has made remarkable progress in machine translation tasks and has become the mainstream translation model. However, the computational complexity of the Transformer model is high, especiall...
Considering the challenges posed by the dung beetle optimization algorithm, which often converges to local optima and suffers an imbalance between global exploration and local exploitation, a fusion strategy d...
In this opinion piece, we question the efficacy of students conducting systematic reviews (SRs) at the very start of their PhDs, especially now that we are riding, or drowning in, the Generative AI wave. How would the...
In recent years, the increasing interest in ontologies has resulted in the development and publication of many ontologies in the same or different domains. When users try to reuse the existing ontologies in their applicatio...
The detection of anomalies in streaming data is crucial in enterprise operations, employing statistical and machine learning methods to identify irregularities. This enhances service stability and reduces operational ...
To address the problem that the acoustic input network of the Conformer encoder is insufficient for Fbank speech feature extraction, an end-to-end speech recognition modeling method, DM-Conformer, is proposed. Firstly, th...
Medical image segmentation is pivotal in computer-aided diagnosis systems, demanding high precision and contextual understanding. Vision Transformer-based approaches have gained much attention recently due to their ex...
In the era of rapid development of artificial intelligence technologies, traditional teaching models are unable to meet the employment needs of enterprises, and talent cultivation in universities faces more challenges...
ISBN: 9798350330663 (print)
Code summarization aims to generate natural language descriptions of source code, helping programmers understand and maintain it quickly. While previous code summarization efforts have predominantly focused on the method level, this paper studies file-level code summarization, which can assist programmers in understanding and maintaining large source code projects. Unlike method-level code summarization, file-level code summarization typically involves long source code within a single file, which makes it challenging for Transformer-based models to capture the code semantics: because computational complexity scales quadratically with input sequence length, the maximum input length of these models is difficult to set large enough to handle long code input well. To address this challenge, we propose SparseCoder, an identifier-aware sparse Transformer for effectively handling long code sequences. Specifically, SparseCoder employs a sliding-window mechanism for self-attention to model short-term dependencies, and leverages the structural information of code to capture long-term dependencies among source code identifiers by introducing two types of sparse attention patterns, named global attention and identifier attention. To evaluate the performance of SparseCoder, we construct FILE-CS, a new dataset for file-level code summarization in Python. Experimental results show that SparseCoder achieves state-of-the-art performance compared with other pre-trained models, including full self-attention and sparse models. Additionally, our model has low memory overhead while achieving performance comparable to models using the full self-attention mechanism. Furthermore, we verify the generality of SparseCoder on other code understanding tasks, i.e., code clone detection and code search; the results show that our model outperforms baseline models in both tasks, demonstrating that it can generate better code representations for various downstream tasks. Our
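To make the combination of sliding-window, global, and identifier attention described in the abstract more concrete, here is a minimal NumPy sketch of how such a sparse attention mask could be assembled. The function name, the window size, and the choice of global and identifier positions are illustrative assumptions for exposition, not the actual SparseCoder implementation.

```python
import numpy as np

def build_sparse_attention_mask(seq_len, window, global_positions, identifier_positions):
    """Boolean mask combining three sparse attention patterns (True = attention allowed):
    - sliding window: each token attends to its local neighbourhood (short-term dependencies),
    - global: designated tokens attend to, and are attended by, all tokens,
    - identifier: identifier tokens attend to one another regardless of distance
      (long-term dependencies among repeated identifiers)."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)

    # Sliding-window pattern for local context.
    for i in range(seq_len):
        lo = max(0, i - window)
        hi = min(seq_len, i + window + 1)
        mask[i, lo:hi] = True

    # Global pattern: global tokens see everything and are seen by everything.
    for g in global_positions:
        mask[g, :] = True
        mask[:, g] = True

    # Identifier pattern: pairwise links among source-code identifier positions.
    ids = np.asarray(identifier_positions, dtype=int)
    if ids.size:
        mask[np.ix_(ids, ids)] = True

    return mask

# Toy usage: a 12-token sequence, window of 2, token 0 treated as global,
# and occurrences of the same identifier at positions 3, 7, and 11.
m = build_sparse_attention_mask(12, window=2, global_positions=[0],
                                identifier_positions=[3, 7, 11])
print(m.astype(int))
```

With a mask of this shape, each row allows only on the order of (window size + number of global tokens + number of identifier tokens) positions, which is what keeps attention cost roughly linear in sequence length rather than quadratic, as the abstract argues for long file-level inputs.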