ISBN (Digital): 9798331535100
ISBN (Print): 9798331535117
Pre-training a language model and then fine-tuning it has been shown to be an efficient and effective technique for a wide range of code intelligence tasks, such as code generation, code summarization, and vulnerability detection. However, pre-training language models on a large-scale code corpus is computationally expensive. Fortunately, many off-the-shelf Pre-trained Code Models (PCMs), such as CodeBERT, CodeT5, CodeGen, and Code Llama, have been released publicly. These models acquire general code understanding and generation capability during pre-training, which enhances their performance on downstream code intelligence tasks. With an increasing number of these public pre-trained models, selecting the most suitable one to reuse for a specific task is essential. In this paper, we systematically investigate the reusability of PCMs. We first explore three intuitive model selection methods that select by size, training data, or brute-force fine-tuning. Experimental results show that these straightforward techniques either perform poorly or suffer high costs. Motivated by these findings, we explore learning-based model selection strategies that utilize pre-trained models without altering their parameters. Specifically, we train proxy models to gauge the performance of pre-trained models, and we measure the deviation between a model's latent feature distribution and the task's label distribution, using their closeness as an indicator of model transferability. We conduct experiments on 100 widely used open-source PCMs for code intelligence tasks, with sizes ranging from 42.5 million to 3 billion parameters. The results demonstrate that learning-based selection methods reduce selection time to 100 seconds, compared to 2,700 hours with brute-force fine-tuning, with less than 6% performance degradation across related tasks.
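A minimal sketch of the learning-based selection idea, assuming frozen features from each candidate PCM: a lightweight linear probe is fitted per candidate and its validation accuracy stands in as a transferability score, so no PCM is fine-tuned. The abstract does not specify the exact proxy-model architecture or distribution-deviation metric, so the probe, the function names, and the synthetic feature matrices below are illustrative assumptions.

```python
# Hypothetical sketch: rank candidate PCMs by a cheap proxy score computed on
# frozen features, without updating any pre-trained parameters.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def proxy_score(features: np.ndarray, labels: np.ndarray, seed: int = 0) -> float:
    """Score one candidate model: accuracy of a linear probe on its frozen features."""
    x_tr, x_va, y_tr, y_va = train_test_split(
        features, labels, test_size=0.3, random_state=seed, stratify=labels)
    probe = LogisticRegression(max_iter=1000).fit(x_tr, y_tr)
    return probe.score(x_va, y_va)

def rank_candidates(candidate_features: dict, labels: np.ndarray):
    """Rank candidate PCMs by proxy score, highest first."""
    scores = {name: proxy_score(feats, labels) for name, feats in candidate_features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=500)                     # toy binary task labels
    candidates = {                                            # placeholder frozen features
        "codebert-like": rng.normal(size=(500, 768)) + labels[:, None] * 0.5,
        "codet5-like": rng.normal(size=(500, 768)) + labels[:, None] * 0.2,
    }
    for name, score in rank_candidates(candidates, labels):
        print(f"{name}: proxy score = {score:.3f}")
```

In practice the feature matrices would come from encoding the downstream task's inputs with each candidate's frozen encoder; the ranking step itself takes seconds, which is what makes this kind of selection far cheaper than brute-force fine-tuning.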
ISBN (Digital): 9798350308365
ISBN (Print): 9798350308372
Path planning is the core of autonomous robot navigation, helping the robot find a collision-free path to its destination based on environment information. Most current path planning methods consider only the path length, but the optimal path may deviate from the shortest one when other environmental factors, such as uneven terrain or regions with varying traversal costs, are taken into account. Similarly, in scenarios prioritizing energy efficiency, a sole focus on path length may lead to suboptimal solutions. In this paper, an improved Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) with an adaptive weight vector, an external archive, and a constrained update strategy, termed MOEA/D-EAWA, is proposed. The algorithm considers not only the path length but also four additional objectives: smoothness, traveling time, terrain (elevation), and speed limit (expected delay). In addition, MOEA/D-EAWA is better suited to such many-objective path planning problems, which have irregular, discrete, and sparse Pareto fronts. Simulation results on 90 map instances demonstrate that the proposed method outperforms existing approaches.
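As a minimal sketch of the decomposition step that MOEA/D-style algorithms rest on, the snippet below scalarizes the five path objectives named in the abstract with a weighted Tchebycheff function and picks the best candidate per subproblem. The adaptive weight-vector adjustment, external archive, and constrained update strategy of MOEA/D-EAWA are omitted, and all objective values are illustrative placeholders.

```python
# Hypothetical sketch: weighted Tchebycheff decomposition over the five path
# objectives (length, smoothness, time, elevation, expected delay).
import numpy as np

OBJECTIVES = ["length", "smoothness", "time", "elevation", "delay"]

def tchebycheff(objs: np.ndarray, weights: np.ndarray, ideal: np.ndarray) -> float:
    """Scalarize one candidate path's objective vector for a given weight vector."""
    return float(np.max(weights * np.abs(objs - ideal)))

def best_for_subproblem(population: np.ndarray, weights: np.ndarray, ideal: np.ndarray) -> int:
    """Pick the population member that best solves this weight vector's subproblem."""
    scores = [tchebycheff(p, weights, ideal) for p in population]
    return int(np.argmin(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    population = rng.uniform(0.0, 1.0, size=(20, len(OBJECTIVES)))  # candidate paths' objectives
    ideal = population.min(axis=0)                                  # current ideal point z*
    # One random weight vector per subproblem; MOEA/D-EAWA would adapt these over time.
    weights = rng.dirichlet(np.ones(len(OBJECTIVES)), size=5)
    for i, w in enumerate(weights):
        best = best_for_subproblem(population, w, ideal)
        print(f"subproblem {i}: best path index {best}, weights {np.round(w, 2)}")
```

Adapting the weight vectors, as the EAWA variant does, matters precisely because an irregular, discrete, and sparse Pareto front leaves many fixed weight vectors mapping to no useful solution.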
Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite its promising prospect, the vulnerabilities of SAM, especially to universal adversar...
Large kernels make standard convolutional neural networks (CNNs) great again over transformer architectures in various vision tasks. Nonetheless, recent studies meticulously designed around increasing kernel size have...
The ability to autonomously explore and resolve tasks with minimal human guidance is crucial for the self-development of embodied intelligence. Although reinforcement learning methods can largely ease human effort, it...
Deep learning has achieved great success in various areas and its success is closely linked to the availability of massive data. But in general, a large dataset could include sensitive data and therefore the model sho...
Modern storage systems typically replicate data on multiple servers to provide high reliability and availability. However, most commercially-deployed datastores often fail to offer low latency, high throughput, and strong consistency at the same time. This paper presents Whale, a Remote Direct Memory Access (RDMA) based primary-backup replication system for in-memory datastores. Whale achieves both low latency and strong consistency by decoupling metadata multicasting from data replication for all backup nodes, and using an optimistic commitment mechanism to respond to client write requests earlier. Whale achieves high throughput by propagating writes from the primary node to backup nodes asynchronously via RDMA-optimized chain replication. To further reduce the cost of data replication, we design a log-structured datastore to fully exploit the advantages of one-sided RDMA and Persistent Memory (PM). We implement Whale on a cluster equipped with PM and InfiniBand RDMA networks. Experimental results show that Whale achieves much higher throughput and lower latency than state-of-the-art replication protocols.
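The following is a control-flow sketch of the write path the abstract describes: the primary appends to a log-structured store, multicasts only metadata to the backups, optimistically acknowledges the client, and then propagates the data along the replication chain. It is not Whale's implementation; the class and method names are hypothetical, and the one-sided RDMA and persistent-memory mechanics are not modeled.

```python
# Hypothetical, RDMA-free sketch of decoupled metadata multicast + optimistic
# commitment + chain-style data replication.
from dataclasses import dataclass, field

@dataclass
class Backup:
    name: str
    metadata: list = field(default_factory=list)   # (seq, key) records seen via multicast
    log: list = field(default_factory=list)        # fully replicated (seq, key, value) entries

    def receive_metadata(self, seq: int, key: str) -> None:
        self.metadata.append((seq, key))

    def receive_data(self, seq: int, key: str, value: bytes) -> None:
        self.log.append((seq, key, value))

@dataclass
class Primary:
    backups: list
    log: list = field(default_factory=list)
    seq: int = 0

    def write(self, key: str, value: bytes) -> str:
        self.seq += 1
        self.log.append((self.seq, key, value))           # append to log-structured store
        for b in self.backups:                            # metadata multicast to all backups
            b.receive_metadata(self.seq, key)
        ack = f"optimistically committed seq={self.seq}"  # reply before data replication finishes
        self._replicate_chain(self.seq, key, value)       # data then flows down the chain
        return ack

    def _replicate_chain(self, seq: int, key: str, value: bytes) -> None:
        for b in self.backups:                            # stand-in for RDMA-optimized chain hops
            b.receive_data(seq, key, value)

if __name__ == "__main__":
    backups = [Backup("b1"), Backup("b2")]
    primary = Primary(backups=backups)
    print(primary.write("user:42", b"hello"))
    print(backups[0].metadata, backups[0].log)
```

The point of the split is latency: the client-visible acknowledgement depends only on the cheap metadata step, while the heavier data propagation proceeds asynchronously.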
With the advancement of deep learning, object detectors (ODs) with various architectures have achieved significant success in complex scenarios like autonomous driving. Previous adversarial attacks against ODs have be...
Federated Learning enables collaborative model training among a number of distributed devices with the coordination of a centralized server, where each device alternately performs local gradient computation and co...
Authors:
Gu, Qiliang; Lu, Qin
Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Jinan, China
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Jinan, China; Shandong Fundamental Research Center for Computer Science
Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Jinan, China
The legal judgement prediction (LJP) of judicial texts represents a multi-label text classification (MLTC) problem, which in turn involves three distinct tasks: the prediction of charges, legal articles, and terms of ...