Condition monitoring and fault diagnosis are critical for the optimal scheduling of machines, improving system reliability and reducing maintenance cost. In recent years, various methods based on deep learning have made great progress in the field of mechanical fault diagnosis. However, there is a conflict between the massive parameters of fault diagnosis networks and the limited computing resources of embedded platforms. It is difficult to deploy a trained network on small-scale embedded platforms, such as field-programmable gate arrays (FPGAs), in actual industrial settings, which seriously hinders the practical adoption of intelligent fault diagnosis methods. To address this problem, a new neural network compression method based on knowledge distillation (KD) and parameter quantization is proposed in this paper. In the proposed method, a large-scale deep neural network with multiple convolutional layers and fully-connected layers is designed and trained as the teacher network. A small-scale network with just one convolutional layer and one fully-connected layer is then designed as the student network. When training the student network, knowledge distillation is applied to improve its accuracy. After training, parameter quantization is applied to further compress the student network. Experimental results on an FPGA demonstrate the effectiveness of the proposed method: it compresses the fault diagnosis networks by more than 10 times at the cost of a minimal loss of accuracy.
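To make the distillation step concrete, below is a minimal PyTorch sketch. The layer sizes, temperature T, and blending weight alpha are illustrative assumptions, not the paper's settings, and `teacher_logits` is assumed to come from the already-trained teacher network.

```python
# Minimal knowledge-distillation sketch (PyTorch); hyperparameters are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Student(nn.Module):
    """Small student: one convolutional layer and one fully-connected layer."""
    def __init__(self, in_ch=1, n_classes=10, sig_len=1024):
        super().__init__()
        self.conv = nn.Conv1d(in_ch, 8, kernel_size=16, stride=4)
        self.fc = nn.Linear(8 * ((sig_len - 16) // 4 + 1), n_classes)

    def forward(self, x):
        h = F.relu(self.conv(x))
        return self.fc(h.flatten(1))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend soft-target KL loss (scaled by T^2) with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```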
ISBN (Print): 9781479981311
Most deep neural networks (DNNs) require complex models to achieve high performance. Parameter quantization is widely used to reduce implementation complexity. Previous studies on quantization were mostly based on extensive simulation using training data on a specific model. We choose a different approach and attempt to measure the per-parameter capacity of DNN models and interpret the results to obtain insights into the optimum quantization of parameters. This research uses artificially generated data and generic forms of fully-connected DNNs, convolutional neural networks, and recurrent neural networks. We conduct memorization and classification tests to study the effects of the number and precision of the parameters on performance. The model and per-parameter capacities are assessed by measuring the mutual information between the input and the classified output. To gain insight into parameter quantization when performing real tasks, the training and test performances are compared.
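The capacity measurement hinges on estimating the mutual information between the input's class and the classified output. A minimal sketch, assuming the joint distribution is estimated empirically from label/prediction counts:

```python
# Estimate I(input label; classified output) from a confusion-count matrix.
import numpy as np

def mutual_information(true_labels, pred_labels, n_classes):
    """I(X;Y) in bits, estimated from empirical joint counts."""
    joint = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, pred_labels):
        joint[t, p] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log2(joint / (px * py))
    return np.nansum(terms)  # zero-probability cells contribute 0
```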
The most recent studies on deep learning based speech enhancement (SE) are focused on improving denoising performance. However, successful SE applications require striking a desirable balance between denoising performance and computational cost in real scenarios. In this study, we propose a novel parameter pruning (PP) technique, which removes redundant channels in a neural network. In addition, parameter quantization (PQ) and feature-map quantization (FQ) techniques were integrated to generate even more compact SE models. The experimental results show that the integration of PP, PQ, and FQ can produce a compact SE model whose size is only 9.76% of the original model, with minor performance losses of 0.01 (from 0.85 to 0.84) in STOI and 0.03 (from 2.55 to 2.52) in PESQ. These promising results confirm that the PP, PQ, and FQ techniques can effectively reduce the storage footprint of an SE system on edge devices.
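As a rough illustration of how PP and PQ could operate, the following NumPy sketch ranks channels by the L1 norm of their filters and uniformly quantizes the surviving weights; the keep ratio and bit-width are assumptions, not the paper's values.

```python
# Channel pruning by filter L1 norm plus uniform parameter quantization.
import numpy as np

def prune_channels(conv_weight, keep_ratio=0.5):
    """conv_weight: (out_ch, in_ch, k, k). Returns indices of kept channels."""
    scores = np.abs(conv_weight).reshape(conv_weight.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(conv_weight.shape[0] * keep_ratio))
    return np.sort(np.argsort(scores)[-n_keep:])

def quantize_uniform(w, n_bits=8):
    """Symmetric uniform parameter quantization to n_bits."""
    scale = max(np.abs(w).max(), 1e-12) / (2 ** (n_bits - 1) - 1)
    q = np.round(w / scale).astype(np.int32)
    return q, scale  # dequantize with q * scale
```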
Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks with reduced training costs, the substantial memory requirements during fine-tuning present a barrier to broader adoption. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged as solutions to these challenges by optimizing memory usage and computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing memory footprints during fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. The investigation uncovers several critical findings: (1) larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) the effectiveness of balancing factors depends more on their specific values than on layer type or depth; (3) in quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized models. Moreover, for the same reduction in trainable parameters, shrinking the trainable parameters of a larger layer preserves fine-tuning accuracy better than doing so in a smaller layer. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM ...
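The rank-sensitivity findings can be probed with a plain LoRA layer such as the sketch below; in QLoRA the frozen base layer would additionally be 4-bit quantized, which this sketch omits. Rank r and scaling alpha are illustrative.

```python
# Minimal LoRA sketch (PyTorch): frozen base linear plus a trainable
# low-rank update B @ A, scaled by alpha / r.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # base weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
```

Because B starts at zero, the wrapped layer initially behaves exactly like the frozen base, and only the 2 * r * d adapter parameters are trained.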
The precise identification of drone modulation schemes serves as a fundamental basis for intelligent drone recognition. To address the limitations of existing algorithms that rely too heavily on single features, resulting in low recognition rates and excessive model complexity, this paper proposes a drone signal modulation recognition algorithm based on joint features. The algorithm begins by employing Maximum Likelihood Estimation (MLE) to compensate for phase noise, mitigating its adverse effects on subsequent modulation recognition. Next, it combines the signal's time-frequency representation with IQ data derived from higher-order cumulants as joint features, which are then input into a recognition network composed of 2D convolutional layers (Conv2D) and Long Short-Term Memory (LSTM) networks. Additionally, dynamic fixed-point parameter quantization is applied to the model's weights and biases, reducing resource consumption during practical deployment. Experimental results demonstrate that when the Signal-to-Noise Ratio (SNR) exceeds 2 dB, the proposed algorithm achieves a recognition accuracy of up to 90% for nine common drone modulation schemes, substantially outperforming comparable models. After quantization, recognition performance remains nearly unaffected while computational resource requirements are greatly reduced, making the algorithm well suited to deployment in resource-constrained environments.
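A minimal sketch of per-tensor dynamic fixed-point quantization, assuming a 16-bit word and a fractional length chosen from each tensor's dynamic range; the paper's exact scheme may differ.

```python
# Per-tensor dynamic fixed-point quantization: the fractional bit-length
# adapts to each tensor's range, sharing one exponent per tensor.
import numpy as np

def dynamic_fixed_point(w, word_bits=16):
    """Quantize w to signed fixed-point with a per-tensor fractional length."""
    max_abs = np.abs(w).max()
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
    frac_bits = word_bits - 1 - int_bits   # 1 bit reserved for the sign
    step = 2.0 ** (-frac_bits)
    q = np.clip(np.round(w / step),
                -(2 ** (word_bits - 1)), 2 ** (word_bits - 1) - 1)
    return q * step, frac_bits             # dequantized tensor + format
```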
By truncating the weights and activations of a deep neural network, conventional binary quantization limits the representation capability of the network parameters, which deteriorates the detection performance of the network. In this paper, a joint-guided distillation binary neural network via dynamic channel-wise diversity enhancement for object detection (JDBNet) is proposed to narrow the gap caused by quantization errors. JDBNet includes a dynamic channel-wise diversity scheme and real-valued joint-guided teacher assistance to enhance the representation capability of the binary neural network in object detection tasks. In the dynamic diversity scheme, the learning channel-wise bias (LCB) layer adjusts the magnitude of the parameters so that their sensitivity to an arbitrary quantization method is reduced, thereby improving the diversity of the feature representations. In the joint-guided strategy, single-precision implicit knowledge from the guiding teacher at multiple levels is used to supervise and penalize the binary quantized model, enhancing the fit of its parameters. Extensive experiments on the PASCAL VOC, MS COCO, and VisDrone-DET datasets demonstrate that JDBNet outperforms state-of-the-art binary object detection networks in terms of mean Average Precision.
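The channel-wise idea can be illustrated with weight binarization through a straight-through estimator plus a learnable per-channel bias, as in the following PyTorch sketch; shapes and the bias placement are assumptions in the spirit of LCB, not the authors' implementation.

```python
# Binary convolution with a straight-through estimator (STE) and a
# learnable channel-wise bias added to the output feature maps.
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()  # clip grads outside [-1, 1]

class BinaryConv2d(nn.Conv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # learnable per-output-channel bias on the features (LCB-style, assumed)
        self.channel_bias = nn.Parameter(torch.zeros(self.out_channels, 1, 1))

    def forward(self, x):
        scale = self.weight.abs().mean(dim=(1, 2, 3), keepdim=True)
        w_bin = BinarizeSTE.apply(self.weight) * scale  # {-s, +s} weights
        out = self._conv_forward(x, w_bin, self.bias)
        return out + self.channel_bias
```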
Federated learning is a distributed machine learning technique that ensures user privacy and enables multiple clients to jointly train a shared global model without transmitting local data. However, the frequent exchange of model parameters between numerous clients and the server causes heavy network delay and bandwidth limitations in federated learning. In view of this, we propose an efficient algorithm for federated learning using sparse ternary compression based on layer variation classification (LVC). First, layer variation is used as a metric to assess the significance of each layer of the model parameters; after client training, the model parameters are categorized into levels by layer variation and sensitivity analysis. Then, during the upstream and downstream transmission of model parameters, corresponding sparsity and ternary quantization ratios are assigned to the different levels, maximizing compression efficiency while preserving crucial parameters. Finally, on the server side, a majority-layer aggregation strategy is adopted to further reduce the communication cost. Experimental results from image classification tasks on the MNIST and Fashion-MNIST datasets demonstrate that the proposed LVC algorithm achieves high accuracy with minimal communication cost, striking an optimal balance between communication efficiency and accuracy.
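A minimal sketch of the sparse ternary step applied to one layer's update, assuming magnitude top-k selection and a shared magnitude mu for the kept entries; the per-level ratios assigned by LVC are omitted.

```python
# Sparse ternary compression of a model update: keep the top-k entries by
# magnitude and replace them with +/- mu, the mean magnitude of the kept set.
import numpy as np

def sparse_ternary(update, keep_frac=0.01):
    flat = update.ravel()
    k = max(1, int(flat.size * keep_frac))
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # indices of top-k entries
    mu = np.abs(flat[idx]).mean()
    compressed = np.zeros_like(flat)
    compressed[idx] = np.sign(flat[idx]) * mu      # values in {-mu, 0, +mu}
    return compressed.reshape(update.shape)
```

Only the kept indices, their signs, and the single scalar mu need to be transmitted, which is where the communication savings come from.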
As neural networks become an increasingly popular technique in the field of aeromagnetic compensation, there is an increasing demand for hardware systems with more computing power. Compared with the linear regression method, applying a neural network to the task of real-time compensation is difficult because of insufficient computing resources on the unmanned aerial vehicle (UAV) flight detection platform. To perform real-time compensation with limited computing resources, we optimize the back-propagation neural network (OBPNN) through model compression and acceleration. In this study, we found that the most time-consuming part of network training is the iterative updating of the weights in the BPNN interference model. Using transfer learning, we replace the randomly initialized weights (RWs) with pretrained weights, thereby greatly reducing the number of iterations required. We also apply other model compression and acceleration algorithms. In a case study of the new technique, we implement fast training of the OBPNN on a Raspberry Pi 4B system. The network processes approximately 316 samples per 0.1 s, fast enough to complete aeromagnetic compensation in real time.
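The transfer-learning step amounts to a warm start: copying pretrained weights into the new network in place of random initialization. A minimal PyTorch sketch, with the two-network setup assumed for illustration:

```python
# Warm-start a network from pretrained weights instead of random init,
# so far fewer training iterations are needed on new data.
import copy
import torch.nn as nn

def warm_start(new_net: nn.Module, pretrained: nn.Module) -> nn.Module:
    """Replace the randomly initialized weights (RWs) with pretrained ones."""
    state = copy.deepcopy(pretrained.state_dict())
    new_net.load_state_dict(state, strict=False)
    return new_net  # layers absent from `state` keep their random init
```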
This paper introduces a novel source coding method for voltage and current signals, called the fundamental, harmonic, and transient coding method (FHTCM), which is a generalization of the enhanced disturbance compression method (EDCM). The proposed method uses the notch filtering-warped discrete Fourier transform (NF-WDFT) technique to estimate the parameters (amplitude, frequency, and phase) of the fundamental and harmonic components acquired from power lines, so that only the transient components are compressed with a wavelet transform (WT) coding technique. For the WT-based compression of the transient components, we formulate a minimum description length (MDL) criterion that accounts for the selection of wavelet bases from a dictionary, the wavelet decomposition structure, and quantization. Computational simulations have verified that the proposed method outperforms the EDCM as well as traditional WT-based compression techniques.
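The shape of the pipeline can be sketched as follows, with a plain least-squares sinusoid fit standing in for the NF-WDFT estimator and hard thresholding standing in for the MDL-driven wavelet coder; all parameters here are assumptions.

```python
# Estimate and subtract the fundamental, then wavelet-compress the residual.
import numpy as np
import pywt  # PyWavelets

def fundamental_fit(x, fs, f0=60.0):
    """Least-squares amplitude/phase fit of the fundamental at f0 Hz."""
    t = np.arange(len(x)) / fs
    basis = np.column_stack([np.cos(2 * np.pi * f0 * t),
                             np.sin(2 * np.pi * f0 * t)])
    (a, b), *_ = np.linalg.lstsq(basis, x, rcond=None)
    return basis @ np.array([a, b])        # reconstructed fundamental

def compress_residual(residual, wavelet="db4", level=4, thresh=1e-3):
    """Hard-threshold the wavelet coefficients of the transient residual."""
    coeffs = pywt.wavedec(residual, wavelet, level=level)
    coeffs = [np.where(np.abs(c) > thresh, c, 0.0) for c in coeffs]
    return coeffs  # sparse coefficients; invert with pywt.waverec
```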
ISBN (Print): 9781605603162
We present PocketSUMMIT, a small-footprint version of our SUMMIT continuous speech recognition system. With portable devices becoming smaller and more powerful, speech is increasingly becoming an important input modality on these devices. PocketSUMMIT is implemented as a variable-rate continuous density hidden Markov model with diphone context-dependent models. We explore various Gaussian parameter quantization schemes and find 8:1 compression or more is achievable with little reduction in accuracy. We also show how the quantized parameters can be used for rapid table lookup. We explore first-pass language model pruning in a finite-state transducer (FST) framework, as well as FST and n-gram weight quantization and bit packing, to further reduce memory usage. PocketSUMMIT is currently able to run a moderate vocabulary conversational speech recognition system in real time in a few MB on current PDAs and smart phones.
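A sketch of the scalar-quantization-plus-lookup idea, assuming 4-bit per-dimension codebooks (8:1 versus float32) and a per-frame table of log-likelihood terms; this is an illustration of the general technique, not PocketSUMMIT's actual scheme.

```python
# Quantize Gaussian means with per-dimension codebooks, then score frames
# by table lookup instead of per-Gaussian subtract-square-multiply.
import numpy as np

def build_codebook(values, n_bits=4):
    """Uniform scalar codebook over the observed range of one dimension."""
    levels = 2 ** n_bits
    edges = np.linspace(values.min(), values.max(), levels)
    codes = np.abs(values[:, None] - edges[None, :]).argmin(axis=1)
    return codes.astype(np.uint8), edges   # 4-bit indices + codeword table

def score_table(x_d, edges, inv_var_d):
    """Per-frame table: entry c holds -0.5 * (x_d - edges[c])^2 * inv_var_d."""
    return -0.5 * (x_d - edges) ** 2 * inv_var_d

# At decode time, a Gaussian's contribution for dimension d is
# table[codes[g]] -- one lookup per Gaussian per dimension.
```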