The relationships between objects in a network are typically diverse and complex, leading to heterogeneous edges with different semantic information. In this paper, we focus on exploring the heterogeneous edges fo...
ISBN (digital): 9781728160245
ISBN (print): 9781728160252
FPGAs are considered a promising solution for accelerating Convolutional Neural Networks (CNNs) because of their energy efficiency and programmability. However, prior designs usually target inference only, since designers can map pre-trained models to the hardware very efficiently, and those approaches may not be suitable for training CNN models. In this paper, we propose FConv, in which the CPU and FPGA work together in a fine-grained manner. The FPGA accelerator in FConv uses a single Winograd-based convolver, which reduces design complexity and improves performance. We apply double buffering to the output routine to overlap computation and data transfer effectively, and we integrate multiple PEs to improve data parallelism. We propose an analytical model for prediction and use it to guide task scheduling; based on this model, we find the upper limit of performance under the current design. We evaluate our design with VGG-16 and DenseNet-40 on ImageNet and CIFAR-10. We achieve 262.43 GOP/s on the VGG-16 model, which is 2.13× the performance of an FFT-based implementation on the same platform. We also achieve up to a 4×+ performance improvement over MKL with 20 threads running on 10-core Intel processors.
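To illustrate the idea behind a Winograd-based convolver, the sketch below shows the standard one-dimensional minimal filtering case F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not FConv's hardware design: the function names `winograd_f23` and `direct_f23` are hypothetical, and the tile size is an assumption chosen for brevity.

```python
def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap filter g over a 4-element tile d.

    Illustrative only; tile size and names are assumptions, not FConv's design.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed once per filter).
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    # Four element-wise multiplications on the transformed data.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * g2
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference: the same two outputs computed with six multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    tile, taps = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]
    print(winograd_f23(tile, taps), direct_f23(tile, taps))
```

Both functions return the same values for any tile and filter, up to floating-point rounding; the saving comes from trading multiplications for additions, which is what makes the approach attractive on multiplier-limited hardware.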
Cross-silo federated learning (FL) enables multiple institutions (clients) to collaboratively build a global model without sharing private data. To prevent privacy leakage during aggregation, homomorphic encryption (H...
ISBN (digital): 9781728160344
ISBN (print): 9781728160351
Recently, a promising research direction in statistical learning has been advocated, namely optimal margin distribution learning, whose central idea is to optimize the margin distribution. As the most representative approach of this new learning paradigm, the optimal margin distribution machine (ODM) maximizes the margin mean and minimizes the margin variance simultaneously. The standard ODM exploits the ℓ_2-norm penalty, which gives rise to a dense decision boundary. However, in some situations a model with a parsimonious representation is preferred, because of redundant noisy features or limited computing resources. In this paper, we propose the sparse optimal margin distribution machine (Sparse ODM), which aims to achieve better generalization performance with moderate model size. For optimization, we extend an efficient coordinate descent method to solve the final problem, since the variables are decoupled. In each iteration, we propose a modified Newton method to solve the one-variable sub-problem. Experimental results on both synthetic and real data sets show the superiority of the proposed method.
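As a small illustration of the two quantities ODM trades off, the sketch below computes the margin mean and margin variance of a linear classifier. It assumes a linear model with labels in {+1, -1} and a per-sample margin of y·(w·x); the function name `margin_stats` is hypothetical, and the code does not reproduce the Sparse ODM objective or its coordinate descent solver.

```python
def margin_stats(w, X, y):
    """Return (mean, variance) of the margins y_i * (w . x_i).

    Assumes a linear model and labels in {+1, -1}; illustrative only.
    """
    margins = [
        yi * sum(wj * xj for wj, xj in zip(w, xi))
        for xi, yi in zip(X, y)
    ]
    mean = sum(margins) / len(margins)
    # Population variance of the margins.
    var = sum((m - mean) ** 2 for m in margins) / len(margins)
    return mean, var


if __name__ == "__main__":
    w = [1.0, -1.0]
    X = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1, -1, 1]
    print(margin_stats(w, X, y))
```

A larger mean with a smaller variance corresponds to the margin distribution that the abstract describes as the optimization target.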
Timetabling and vehicle scheduling are two important activities in public transport (PT) operations planning. Traditionally, the timetabling problem is solved first before proceeding to the vehicle scheduling problem....
World models significantly enhance hierarchical understanding, improving data integration and learning efficiency. To explore the potential of the world model in the remote sensing (RS) field, this paper proposes a la...
Nowadays, web servers often face the threat of distributed denial of service attacks and their intrusion prevention systems cannot detect those attacks effectively. Many existing intrusion prevention systems detect at...
To avoid data loss, data centers adopt disk failure prediction (DFP) technology to raise warnings ahead of actual disk failures, and process the warnings in the order they are raised, i.e., a first-in-first-out (FIFO)...
In traditional database systems, data anonymization has been extensively studied as an effective solution for data privacy preservation, and multidimensional anonymization schemes in particular are widely used. H...
It is now popular for people to share their feelings and activities, tagged with geographic and temporal information, in Online Social Networks (OSNs). The spatial and temporal interactions that occur in OSNs contain a wealt...