This paper introduces distribution-flexible subset quantization (DFSQ), a post-training quantization method for super-resolution networks. DFSQ is motivated by the distinctive activation distributions of current super-resolution models, which vary significantly across samples and channels. To address this, DFSQ normalizes the activations channel-wise and then applies distribution-flexible subset quantization (SQ), in which the quantization points are selected from a universal set of multi-word additive log-scale values. To speed up the selection of quantization points in SQ, we propose a fast selection strategy that runs K-means clustering and picks the quantization points closest to the centroids. Compared with the common iterative exhaustive search, this strategy avoids enumerating all possible combinations in the universal set, reducing the time complexity from exponential to linear. Consequently, time cost places a much weaker constraint on the size of the universal set. Extensive evaluations on various super-resolution models show that DFSQ effectively improves performance even without fine-tuning. For example, for 4-bit EDSR×2 on the Urban benchmark, DFSQ achieves a PSNR gain of 0.242 dB.
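The selection strategy described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`universal_set`, `kmeans_1d`, `select_points`, `quantize`, `normalize_channels`), the two-word power-of-two construction of the universal set, and the quantile-based K-means initialization are all assumptions made for illustration.

```python
import numpy as np

def universal_set(num_words=2, max_exp=4):
    # Assumed construction: each "word" is 0 or a power of two 2^-e,
    # and a candidate is a signed sum of `num_words` such words.
    words = np.array([0.0] + [2.0 ** -e for e in range(max_exp + 1)])
    sums = np.array([0.0])
    for _ in range(num_words):
        sums = np.unique((sums[:, None] + words[None, :]).ravel())
    return np.unique(np.concatenate([-sums, sums]))

def normalize_channels(act, eps=1e-8):
    # Channel-wise normalization of activations shaped (N, C, H, W).
    mean = act.mean(axis=(0, 2, 3), keepdims=True)
    std = act.std(axis=(0, 2, 3), keepdims=True)
    return (act - mean) / (std + eps), mean, std

def kmeans_1d(x, k, iters=20):
    # Plain Lloyd iterations on 1-D data, initialized at evenly spaced quantiles.
    centroids = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        assign = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids

def select_points(x, candidates, bits):
    # Snap each centroid to its nearest candidate instead of enumerating
    # every subset of the universal set. Duplicates are merged, so the
    # result may contain fewer than 2**bits points.
    centroids = kmeans_1d(x, 2 ** bits)
    idx = np.abs(centroids[:, None] - candidates[None, :]).argmin(axis=1)
    return np.unique(candidates[idx])

def quantize(x, points):
    # Map every value to its nearest selected quantization point.
    idx = np.abs(x[..., None] - points).argmin(axis=-1)
    return points[idx]

# Usage on synthetic activations with very different per-channel scales.
rng = np.random.default_rng(0)
act = rng.normal(size=(4, 3, 8, 8)) * np.array([0.1, 1.0, 10.0]).reshape(1, 3, 1, 1)
norm, mean, std = normalize_channels(act)
points = select_points(norm.ravel(), universal_set(), bits=4)
quantized = quantize(norm, points)
```

In this sketch the cost of selection is one K-means run plus one nearest-candidate lookup per centroid, so it grows linearly with the size of the candidate set rather than with the number of possible subsets.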
Incorporating human feedback to optimize text-to-image models has demonstrated significant effectiveness. However, the process of collecting high-quality human preference labels is both resource-intensive and time-con...
Due to the spectral range mismatch between the images, building an efficient infrared (IR) image super-resolution algorithm suitable for embedded devices remains a significant challenge. Given that visible images poss...
With the boom in maritime activities, the need for highly reliable maritime communication is becoming urgent, which is an important component of 5G/6G communication ***, the bandwidth reuse characteristic of 5G/6G networks will inevitably lead to severe interference, resulting in degradation in the communication performance of maritime *** this paper, we propose a safe deep reinforcement learning based interference coordination scheme to jointly optimize the power control and bandwidth allocation in maritime communication systems, and exploit the quality-of-service requirements of users as the risk value references to evaluate the communication *** particular, this scheme designs a deep neural network to select the communication policies through the evaluation network and update the parameters using the target network, which improves the communication performance and speeds up the convergence ***, the Nash equilibrium of the interference coordination game and the computational complexity of the proposed scheme are *** and experimental results verify the performance gain of the proposed scheme compared with benchmarks.
Despite the great success, most models in image captioning (IC) are still stuck in the dilemma of generating simple and non-discriminative captions. In this paper, we study this problem from the perspective of data au...
Jailbreak attacks craft specific prompts or append adversarial suffixes to prompts, thereby inducing language models to generate harmful or unethical content and bypassing the model's safety guardrails. With the r...
Existing camera motion-controlled video generation methods face computational bottlenecks in fine-tuning and inference. This paper proposes LightMotion, a light and tuning-free method for simulating camera motion in v...
Speech-to-text translation (ST) is a cross-modal task that involves converting spoken language into text in a different language. Previous research primarily focused on enhancing speech translation by facilitating kno...
Representing 3D scenes from multiview images is a core challenge in computer vision and graphics, which requires both precise rendering and accurate reconstruction. Recently, 3D Gaussian Splatting (3DGS) has garnered ...
Bridging natural language and 3D geometry is a crucial step toward flexible, language-driven scene understanding. While recent advances in 3D Gaussian Splatting (3DGS) have enabled fast and high-quality scene reconstr...