Mobile edge computing enables users to offload computation tasks to edge servers to meet their stringent delay requirements. Previous works mainly explore task offloading when system-side information is given ...
Due to its open-source nature, the Android operating system has been a main target for attackers to exploit. Malware creators routinely apply different code obfuscations to their apps to hide malicious activities. Feature...
ISBN (digital): 9781728160245
ISBN (print): 9781728160252
FPGA has been considered a promising solution for accelerating Convolutional Neural Networks (CNNs), owing to its excellent energy efficiency and programmability. However, prior designs usually target inference only, since pre-trained models can be mapped to the hardware very efficiently; those approaches may not be suitable for training CNN models. In this paper, we propose FConv, in which the CPU and FPGA work together in a fine-grained manner. The FPGA accelerator in FConv uses a single Winograd-based convolver, which reduces design complexity and improves performance. We apply double buffering to the output routine to effectively overlap computation and data transfer, and we integrate multiple PEs to improve data parallelism. We propose an analytical model for performance prediction and use it to guide task scheduling; based on this model, we also derive the performance upper bound of the current design. We evaluate our design on VGG-16 and DenseNet-40 with ImageNet and CIFAR-10. We achieve 262.43 GOP/s on the VGG-16 model, 2.13× the performance of an FFT-based implementation on the same platform, and as much as 4× improvement over MKL with 20 threads running on 10-core Intel processors.
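The Winograd-based convolver mentioned in the abstract builds on transforms such as F(2,3), which computes two outputs of a 1-D convolution with a 3-tap filter using four multiplications instead of six. A minimal NumPy sketch of that standard transform follows (the textbook Winograd matrices, not FConv's actual hardware design):

```python
import numpy as np

# Standard Winograd F(2,3) matrices: two outputs of a 3-tap 1-D
# convolution from a 4-element input tile, with only 4 element-wise
# multiplications instead of 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Compute [y0, y1] = conv(d, g) for a 4-element tile d, 3-tap filter g."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([1.0, 2.0, 3.0])        # filter
print(winograd_f23(d, g))            # matches direct convolution: [14. 20.]
```

In a 2-D convolver the same idea is applied on both axes, which is where the multiplication savings that simplify the FPGA datapath come from.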
ISBN (print): 9781665416597
The modern network consists of thousands of network devices from different suppliers that perform distinct, co-dependent functions, such as routing, switching, modifying header fields, and access control across physical and virtual networks. Because of this complexity, the network is prone to a wide range of errors, such as faulty configuration, software bugs, or unexpected interactions across protocols. These errors can lead to loops, sub-optimal routing, path leaks, black holes, and access-control violations that make services unavailable, vulnerable to exploitation, or prone to attacks (e.g., DDoS attacks). To mitigate these problems, network operators deploy many different stateful network functions, such as firewalls, NATs, load balancers, and intrusion-prevention boxes. These have become an important part of today's networks, so it is critical to verify that deployed network functions behave as expected. Static network verification tools rigorously check network software or configuration for bugs before deployment, but they usually rely on handwritten models or models derived in limited ways, which are error-prone and ignore the fact that even network functions of the same type (from different vendors) differ in implementation details. In this paper, we propose a tool that automatically synthesizes more realistic, high-fidelity models of stateful network functions with non-field attributes. We design an inference algorithm, implement the transformation between data packets and symbolic packets, and obtain a finite state machine that accurately expresses the behavior of a black-box network function under a given configuration.
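As a toy illustration of the black-box inference idea (not the paper's actual algorithm), the sketch below probes a hypothetical stateful firewall and recovers a finite state machine by identifying each state with the firewall's responses to a fixed set of probe packets; all class and function names here are invented for illustration:

```python
class ToyFirewall:
    """Hypothetical stateful firewall: inbound packets on a flow are
    dropped unless an outbound packet for that flow was seen first."""
    def __init__(self):
        self.seen = set()

    def step(self, pkt):
        direction, flow = pkt
        if direction == "out":
            self.seen.add(flow)
            return "forward"
        return "forward" if flow in self.seen else "drop"

def run(trace):
    """Replay a packet trace against a fresh firewall instance."""
    fw = ToyFirewall()
    return [fw.step(p) for p in trace]

def infer_fsm(alphabet, probes, max_states=10):
    """BFS over input sequences; identify a state by the black box's
    responses to the probe packets (a crude characterization set)."""
    def signature(prefix):
        return tuple(run(list(prefix) + [p])[-1] for p in probes)

    start = ()
    states = {signature(start): start}       # signature -> access sequence
    transitions = {}                          # (state, symbol) -> state
    frontier = [start]
    while frontier and len(states) <= max_states:
        prefix = frontier.pop(0)
        src = signature(prefix)
        for sym in alphabet:
            nxt = prefix + (sym,)
            dst = signature(nxt)
            transitions[(src, sym)] = dst
            if dst not in states:
                states[dst] = nxt
                frontier.append(nxt)
    return states, transitions

states, transitions = infer_fsm(
    alphabet=[("out", "f1"), ("in", "f1")],
    probes=[("in", "f1")])
# two states: "flow not yet opened" vs. "flow opened from inside"
```

A real tool must additionally map concrete packets to symbolic packet classes and handle resets, timeouts, and non-field attributes, but the state-identification loop above captures the core of FSM extraction from a black box.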
ISBN (digital): 9781728160344
ISBN (print): 9781728160351
Recently, a promising research direction of statistical learning has been advocated, i.e., optimal margin distribution learning, whose central idea is optimizing the margin distribution. As the most representative approach of this new learning paradigm, the optimal margin distribution machine (ODM) maximizes the margin mean and minimizes the margin variance simultaneously. The standard ODM exploits the ℓ_2-norm penalty, which gives rise to a dense decision boundary. However, in some situations a model with a parsimonious representation is preferred, due to redundant noisy features or limited computing resources. In this paper, we propose the sparse optimal margin distribution machine (Sparse ODM), which aims to achieve better generalization performance with moderate model size. For optimization, since the variables are decoupled, we extend an efficient coordinate descent method to solve the final problem; in each iteration, a modified Newton method solves the one-variable sub-problem. Experimental results on both synthetic and real data sets show the superiority of the proposed method.
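The coordinate-descent pattern the abstract relies on (cycle through decoupled variables, solving a one-variable sub-problem each time) can be illustrated on a simpler ℓ1-regularized least-squares objective. This is a generic sketch of the optimization pattern, not the ODM formulation itself:

```python
import numpy as np

def soft_threshold(x, lam):
    """Closed-form solution of the one-variable l1 sub-problem."""
    return np.sign(x) * max(abs(x) - lam, 0.0)

def coordinate_descent_lasso(X, y, lam, n_iter=100):
    """Minimize 0.5*||y - Xw||^2 + lam*||w||_1, one coordinate at a time."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)             # precomputed ||x_j||^2
    for _ in range(n_iter):
        for j in range(d):
            # residual with coordinate j removed from the fit
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
true_w = np.array([2.0, 0.0, -3.0, 0.0, 0.0])
y = X @ true_w
w = coordinate_descent_lasso(X, y, lam=0.5)
# the l1 penalty drives the irrelevant coordinates (near) to zero
```

In Sparse ODM the one-variable sub-problem has no such closed form, which is why the authors substitute a modified Newton step for the soft-thresholding update; the outer sweep over coordinates is the same idea.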
GPUs are essential to accelerating latency-sensitive deep neural network (DNN) inference workloads in cloud datacenters. To fully utilize GPU resources, spatial sharing of GPUs among co-located DNN inference workl...
Nowadays, web servers often face the threat of distributed denial-of-service (DDoS) attacks, and their intrusion prevention systems cannot detect those attacks effectively. Many existing intrusion prevention systems detect at...
To avoid data loss, data centers adopt disk failure prediction (DFP) technology to raise warnings ahead of actual disk failures, and process the warnings in the order they are raised, i.e., a first-in-first-out (FIFO)...
Data anonymization has been extensively studied in traditional database systems; it provides an effective solution for data privacy preservation, and multidimensional anonymization schemes among them are widely used. H...
ISBN (digital): 9781728168876
ISBN (print): 9781728168883
Serverless computing, also known as "Function as a Service" (FaaS), is emerging as an event-driven paradigm of cloud computing. In the FaaS model, applications are programmed as functions that are executed and managed separately. Functions are triggered by cloud users and are provisioned dynamically through containers or virtual machines (VMs). The startup delays of containers or VMs usually lead to rather high response latency for cloud users. Moreover, communication between different functions generally relies on virtual network devices or shared memory and may incur extremely high performance overhead. In this paper, we propose Unikernel-as-a-Function (UaaF), a much more lightweight approach to serverless computing. Applications are abstracted as combinations of different functions, and each function is built as a unikernel in which the function is linked with a specified minimum-sized library operating system (LibOS). UaaF offers extremely low startup latency for executing functions and an efficient communication model to speed up inter-function interactions. We exploit a hardware feature (namely VMFUNC) to invoke functions in other unikernels seamlessly, much like inter-process communication, without suffering the performance penalty of VM exits. We implement a proof-of-concept prototype based on KVM and deploy UaaF with three unikernels (MirageOS, IncludeOS, and Solo5). Experimental results show that UaaF can significantly reduce the startup latency and memory usage of serverless cloud applications. Moreover, the VMFUNC-based communication model can also significantly improve the performance of function invocations between different unikernels.
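As a purely illustrative model of the call path described above (Python can show only the control flow, none of the hardware mechanics), the sketch below registers each "unikernel" under an EPT index and models a cross-unikernel invocation as switching the active index and calling into the callee, rather than trapping to the hypervisor; all names here are invented:

```python
class ToyUaaF:
    """Toy model of VMFUNC-style cross-unikernel calls: switching the
    active EPT index stands in for the hardware EPTP switch that avoids
    a costly VM exit. Illustration only, not systems code."""
    def __init__(self):
        self.ept_views = {}   # EPT index -> function "unikernel"
        self.active = None

    def register(self, idx, fn):
        self.ept_views[idx] = fn

    def call(self, idx, *args):
        prev = self.active
        self.active = idx              # models the EPTP switch (VMFUNC)
        try:
            return self.ept_views[idx](*args)
        finally:
            self.active = prev         # switch back to the caller's view

vmm = ToyUaaF()
vmm.register(0, lambda x: x + 1)                # "unikernel" holding f
vmm.register(1, lambda x: vmm.call(0, x) * 2)   # g invokes f cross-unikernel
print(vmm.call(1, 3))                           # (3 + 1) * 2 = 8
```

The point of the real mechanism is that the index switch happens in guest mode, so chained function invocations stay on the fast path instead of exiting to KVM on every call.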