The inclusion-exclusion principle, together with Legendre-type theorems for the number of distinct restricted partitions weighted by the parity of their length, is used to give several recurrence relations for restricted p...
Model quantization represents the weight matrices of an existing model with low bit-width values, a promising approach to reducing both the storage and computational overheads of deploying highly ...
With the prevalence of the pre-training-fine-tuning paradigm, how to efficiently adapt a pre-trained model to downstream tasks has become an intriguing issue. Parameter-Efficient Fine-Tuning (PEFT) methods have been proposed for low-cost adaptation. Although PEFT has demonstrated its effectiveness and been widely applied, the underlying principles are still unclear. In this paper, we adopt the PAC-Bayesian generalization error bound, viewing pre-training as a shift of the prior distribution that leads to a tighter bound on the generalization error. We validate this shift from the perspectives of oscillations in the loss landscape and quasi-sparsity in the gradient distribution. Based on this, we propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT), and validate its effectiveness on a range of tasks including the GLUE Benchmark and instruction tuning. The code is accessible at https://***/song-wx/SIFT. Copyright 2024 by the author(s)
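The abstract does not spell out how SIFT selects which parameters to update, so the following is only a generic sketch of a gradient-based sparse update, not the authors' algorithm. It assumes a flat parameter list and a top-k rule by gradient magnitude; the function name `sparse_increment_step` and the parameters `lr` and `density` are hypothetical and chosen for illustration.

```python
def sparse_increment_step(params, grads, lr=0.1, density=0.25):
    """Apply one gradient step to only the largest-|gradient| coordinates.

    Illustrative assumption: a top-k selection rule. The abstract only
    says the method is gradient-based and sparse, not how entries are chosen.
    """
    if len(params) != len(grads):
        raise ValueError("params and grads must have the same length")
    # Number of coordinates allowed to change (at least one).
    k = max(1, int(len(params) * density))
    # Indices of the k gradient entries with the largest magnitude.
    top = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)[:k]
    updated = list(params)
    for i in top:
        updated[i] -= lr * grads[i]  # plain SGD step on the selected entries
    return updated, sorted(top)


# Usage: with density=0.5, only two of the four coordinates are updated.
new_params, touched = sparse_increment_step(
    [1.0, 2.0, 3.0, 4.0], [0.01, -2.0, 0.02, 0.5], lr=0.1, density=0.5
)
```

Because most coordinates are left untouched, only the increments for the selected indices would need to be stored, which is the usual appeal of a sparse update when gradients are quasi-sparse.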
Radio-Frequency (RF)-based Human Activity Recognition (HAR) has emerged as a promising solution for applications unamenable to techniques that require computer vision. However, the scarcity of labeled RF data due to their non...
In transport and supply chain management (SCM), accurate prediction of waiting times is pivotal for optimizing resource allocation, minimizing costs, and improving operational efficiency. However, this task is fraught with chall...
We present SketchGPT, a flexible framework that employs a sequence-to-sequence autoregressive model for sketch generation and completion, along with an interpretation case study for sketch recognition. By mapping complex sk...
Box office (BO) income declined significantly, by up to 80%, in 2020 as the COVID-19 pandemic emerged. To minimize further financial risks, multiplex (multiple cinema complexes) owners need to analyze their potent...
Examining topic-level variability in modeling Twitter data can potentially yield more comprehensive insights into public perception during critical periods, thereby enhancing natural disaster mitigation and surveillan...
The projected increase in PayLater utilization reaches up to five million people by 2025. To optimize the yearly profit from their PayLater service, fintech companies must examine all possible risks before a unanimous...
Air pollution is one of the most serious problems in many regions of the world. Thailand has also had to face this problem, especially in its northern region, an area that has been highly cont...