In traditional database systems, data anonymization has been extensively studied as an effective solution for data privacy preservation, and the multidimensional anonymization scheme is widely used among such approaches. H...
ISBN (digital): 9781728168876
ISBN (print): 9781728168883
Serverless computing, also known as “Function as a Service (FaaS)”, is emerging as an event-driven paradigm of cloud computing. In the FaaS model, applications are programmed as functions that are executed and managed separately. Functions are triggered by cloud users and provisioned dynamically through containers or virtual machines (VMs). The startup delays of containers or VMs usually lead to high response latency for cloud users. Moreover, communication between functions generally relies on virtual network devices or shared memory and may cause extremely high performance overhead. In this paper, we propose Unikernel-as-a-Function (UaaF), a much more lightweight approach to serverless computing. Applications are abstracted as a combination of functions, and each function is built as a unikernel in which the function is linked with a specified minimum-sized library operating system (LibOS). UaaF offers extremely low startup latency for executing functions and an efficient communication model that speeds up inter-function interactions. We exploit a new hardware technique (namely VMFUNC) to invoke functions in other unikernels seamlessly (much like inter-process communication), without incurring the performance penalty of VM exits. We implement a proof-of-concept prototype based on KVM and deploy UaaF in three unikernels (MirageOS, IncludeOS, and Solo5). Experimental results show that UaaF can significantly reduce the startup latency and memory usage of serverless cloud applications. Moreover, the VMFUNC-based communication model can also significantly improve the performance of function invocations between different unikernels.
The graph is a well-known data structure for representing associated relationships in a variety of applications, e.g., data science and machine learning. Despite a wealth of existing efforts on developing graph processing systems for improving performance and/or energy efficiency on traditional architectures, dedicated hardware solutions, also referred to as graph processing accelerators, are essential and emerging to provide benefits significantly beyond what pure software solutions can offer. In this paper, we conduct a systematic survey of the design and implementation of graph processing accelerators. Specifically, we review the relevant techniques in three core components of a graph processing accelerator: preprocessing, parallel graph computation, and runtime scheduling. We also examine the benchmarks and results in existing studies for evaluating a graph processing accelerator. Interestingly, we find that there is no absolute winner for all three aspects in graph acceleration due to the diverse characteristics of graph processing and the complexity of hardware configurations. We finally present and discuss several challenges in detail, and further explore the opportunities for future research.
ISBN (digital): 9783981926347
ISBN (print): 9781728144689
Persistent memory (PM) requires maintaining crash consistency and encrypting data to ensure data recoverability and data confidentiality. Enforcing these two goals not only puts more burden on programmers but also degrades performance. To address this issue, we propose a hardware-assisted encrypted persistent memory system. Specifically, logging and data encryption are assisted by hardware. Furthermore, we apply counter-based encryption to data and cipher feedback (CFB) mode encryption to the log, reducing the encryption overhead. Our primary experimental results show that the transaction throughput of the proposed design outperforms the baseline design by up to 34.4%.
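As a rough illustration of why counter-based encryption can reduce overhead, the sketch below derives a one-time pad from a line's address and a per-line write counter, then XORs it with the data. Because the pad does not depend on the data, it can be computed before the data arrive. This is only a software analogy under stated assumptions: the function names are hypothetical, and HMAC-SHA256 stands in for the block cipher (typically AES) that a hardware design would use. It is not the system described in the abstract.

```python
import hashlib
import hmac


def derive_pad(key: bytes, address: int, counter: int, length: int) -> bytes:
    """Derive a pad from (address, counter). HMAC-SHA256 is a stand-in for AES."""
    pad = b""
    block_index = 0
    while len(pad) < length:
        seed = (
            address.to_bytes(8, "little")
            + counter.to_bytes(8, "little")
            + block_index.to_bytes(4, "little")
        )
        pad += hmac.new(key, seed, hashlib.sha256).digest()
        block_index += 1
    return pad[:length]


def crypt_line(key: bytes, address: int, counter: int, data: bytes) -> bytes:
    """Encrypt or decrypt one memory line; XOR with the pad is its own inverse."""
    pad = derive_pad(key, address, counter, len(data))
    return bytes(d ^ p for d, p in zip(data, pad))


# Hypothetical usage: the counter is bumped on every write so a pad is never reused.
key = b"0123456789abcdef"
ciphertext = crypt_line(key, address=0x1000, counter=1, data=b"account=42")
plaintext = crypt_line(key, address=0x1000, counter=1, data=ciphertext)
```

Decryption needs the same address and counter, which is why counter-based schemes must keep the counters consistent with the data across crashes.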
ISBN (digital): 9781728168760
ISBN (print): 9781728168777
Resistive random access memory (ReRAM) addresses the high memory bandwidth requirement of graph analytics by integrating computing logic in the memory. Due to the matrix-structured crossbar architecture, existing ReRAM-based accelerators, when handling real-world graphs that often have a skewed degree distribution, suffer from a severe sparsity problem arising from zero fillings and activation nondeterminism, incurring substantial ineffectual computations. In this paper, we observe that the sparsity sources lie in the consecutive mapping of source and destination vertex indices onto the wordlines and bitlines of a crossbar. Although exhaustive graph reordering reduces the sparsity-induced inefficiency, its totally random (source and destination) vertex mapping leads to expensive overheads. This work exploits the insight of a mid-point vertex mapping with random wordlines and consecutive bitlines. A cost-effective preprocessing step is proposed to exploit this insight by rapidly exploring crossbar-fit vertex reorderings, but it ignores the sparsity arising from activation dynamics. We present a novel ReRAM-based graph analytics accelerator, named Spara, which can maximize the workload density of crossbars dynamically by using a tightly coupled bank-parallel architecture. Results on real-world and synthesized graphs show that Spara outperforms GraphR and GraphSAR by 8.21× and 5.01× in terms of performance, and by 8.97× and 5.68× in terms of energy savings (on average), while incurring a reasonable (
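The zero-filling problem mentioned above can be made concrete with a small sketch. Assuming the consecutive mapping described in the abstract, an edge (src, dst) lands in the crossbar tile (src // size, dst // size), and every unused cell in an occupied tile must be filled with zeros. The function names and the toy edge list are hypothetical, and the code only measures density; it does not reproduce Spara's reordering.

```python
from collections import Counter


def crossbar_densities(edges, crossbar_size):
    """Return, per occupied crossbar tile, the fraction of cells holding a real edge."""
    tiles = Counter(
        (src // crossbar_size, dst // crossbar_size) for src, dst in set(edges)
    )
    cells = crossbar_size * crossbar_size
    return {tile: count / cells for tile, count in tiles.items()}


def average_density(densities):
    """Mean density over occupied tiles; 0.0 when no tile is occupied."""
    if not densities:
        return 0.0
    return sum(densities.values()) / len(densities)


# Toy graph: four edges spread over three 4x4 crossbars.
edges = [(0, 1), (0, 5), (1, 6), (4, 0)]
densities = crossbar_densities(edges, crossbar_size=4)
```

In this toy case each occupied crossbar holds only one or two real edges out of 16 cells, so most of the crossbar work is spent on zeros. A reordering that packs the same edges into fewer tiles would raise the average density.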
ISBN (digital): 9781728168760
ISBN (print): 9781728168777
Processing-In-Memory (PIM) is an emerging technology that addresses the memory bottleneck of graph processing. In general, analog memristor-based PIM promises high parallelism provided that the underlying matrix-structured crossbar can be fully utilized, while digital CMOS-based PIM has faster single-edge execution but its parallelism can be low. In this paper, we observe that there is no absolute winner between these two representative PIM technologies for graph applications, which often exhibit irregular workloads. To reap the best of both worlds, we introduce a new heterogeneous PIM hardware, called Hetraph, to facilitate energy-efficient graph processing. Hetraph incorporates memristor-based analog computation units (for high-parallelism computing) and CMOS-based digital computation cores (for efficient computing) on the same logic layer of a 3D die-stacked memory device. To maximize hardware utilization, our software design offers a hardware heterogeneity-aware execution model and a workload offloading mechanism. For performance speedups, this hardware-software co-design outperforms the state of the art by 7.54× (CPU), 1.56× (GPU), 4.13× (memristor-based PIM), and 3.05× (CMOS-based PIM), on average. For energy savings, Hetraph reduces energy consumption by 57.58× (CPU), 19.93× (GPU), 14.02× (memristor-based PIM), and 10.48× (CMOS-based PIM), on average.
Graphs are among the most important data structures for modeling social networks and have become popular for finding interesting relationships between individuals. Since graphs may contain sensitive information, data curators usua...
It is now popular for people to share their feelings and activities, tagged with geographic and temporal information, in Online Social Networks (OSNs). The spatial and temporal interactions that occur in OSNs contain a wealt...
ISBN (digital): 9781728173832
ISBN (print): 9781728173849
Graph mining is becoming increasingly important due to the ever-increasing demand for analyzing complex structures in graphs. Existing graph accelerators typically hold most of the randomly accessed data in on-chip memory to avoid off-chip communication. However, graph mining exhibits substantial random accesses in not only the vertex dimension but also the edge dimension (with the latter being considerably more complex than the former), leading to significant degradation in both performance and energy efficiency. We observe that most random memory requests arising in graph mining come from accessing a small fraction of valuable (vertex and edge) data when handling real-world graphs. To exploit this extension locality with maximum parallelism, we architect GRAMER, the first graph mining accelerator. GRAMER contains a specialized memory hierarchy, where the valuable data (precisely identified through a cost-efficient heuristic) is permanently resident in a high-priority memory while other data is maintained in a cache-like memory under a lightweight replacement policy. The pipelined processing units are carefully designed to maximize computational parallelism. GRAMER is also equipped with a work-stealing mechanism to reduce load imbalance. We have implemented GRAMER on a Xilinx Alveo U250 accelerator card. Compared with two state-of-the-art CPU-based graph mining systems, Fractal and RStream, running on a 14-core Intel E5-2680 v4 processor, GRAMER achieves not only considerable speedups (1.11× ~ 129.95×) but also significant energy savings (5.79× ~ 678.34×).
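The two-level memory idea in this abstract can be sketched in software as a pinned set plus a small cache. The abstract does not say which heuristic identifies the valuable data or which replacement policy is used, so the sketch assumes a simple highest-degree heuristic and LRU replacement. Both choices, and all names below, are illustrative assumptions rather than GRAMER's actual design.

```python
from collections import OrderedDict


def pick_hot_vertices(degrees, k):
    """Assumed heuristic: treat the k highest-degree vertices as the valuable data."""
    return sorted(degrees, key=degrees.get, reverse=True)[:k]


class TwoTierMemory:
    """Pinned high-priority store plus an LRU cache for everything else."""

    def __init__(self, pinned_ids, cache_capacity):
        self.pinned = set(pinned_ids)
        self.cache = OrderedDict()
        self.capacity = cache_capacity
        self.hits = 0
        self.misses = 0

    def access(self, item_id):
        if item_id in self.pinned:
            self.hits += 1  # pinned data is never evicted
            return "pinned"
        if item_id in self.cache:
            self.cache.move_to_end(item_id)  # mark as most recently used
            self.hits += 1
            return "cache"
        self.misses += 1  # would go off-chip in a real accelerator
        self.cache[item_id] = True
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return "miss"


# Toy example: two high-degree vertices are pinned, the rest share a 2-entry cache.
degrees = {0: 50, 1: 3, 2: 40, 3: 1, 4: 2}
memory = TwoTierMemory(pick_hot_vertices(degrees, k=2), cache_capacity=2)
for vertex in [0, 1, 2, 3, 0, 1, 4, 3]:
    memory.access(vertex)
```

Pinning keeps the frequently accessed items out of the replacement policy entirely, so bursts of accesses to cold data cannot evict them.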
Android has become the most popular mobile operating system. Correspondingly, an increasing amount of Android malware has been developed and spread to steal users' private information. There exists one type of malware whose benign behaviors are developed to camouflage malicious behaviors. The malicious component occupies a small part of the entire code of the application (app for short), and the malicious part is strongly coupled with the benign part. In this case, the malware may cause false negatives when malware detectors extract features from entire apps to conduct classification, because the malicious features of these apps may be hidden among benign features. Moreover, some previous work aims to divide the entire app into several parts to discover the malicious part. However, these methods rely on the premise that the connections between the normal part and the malicious part are weak (e.g., repackaged malware). In this paper, we call this type of malware Android covert malware and generate the first dataset of covert malware. To detect covert malware samples, we first conduct static analysis to extract function call graphs. Through deep analysis of the call graphs, we observe that although the correlations between the normal part and the malicious part in these graphs are high, the degree of these correlations has a unique range of distribution. Based on this observation, we design a novel system, HomDroid, to detect covert malware by analyzing the homophily of call graphs. We identify the ideal threshold of correlation to distinguish the normal part and the malicious part based on the evaluation results on a dataset of 4,840 benign apps and 3,385 covert malicious apps. According to our evaluation results, HomDroid is capable of detecting 96.8% of covert malware, while the false negative rates of four other state-of-the-art systems (i.e., PerDroid, Drebin, MaMaDroid, and IntDroid) are 30.7%, 16.3%, 15.2%, and 10.4%, respectively.
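To give a feel for what analyzing the homophily of a call graph can mean, the sketch below computes the common edge-homophily ratio: the fraction of call edges whose caller and callee carry the same label. The abstract does not specify HomDroid's exact measure or threshold, so this ratio, the function name, and the toy call graph with its labels are all illustrative assumptions.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share the same label."""
    if not edges:
        return 0.0
    same = sum(1 for caller, callee in edges if labels[caller] == labels[callee])
    return same / len(edges)


# Hypothetical call graph: three benign functions and two sensitive ones.
labels = {
    "onCreate": "benign",
    "loadUI": "benign",
    "render": "benign",
    "sendSms": "sensitive",
    "readContacts": "sensitive",
}
call_edges = [
    ("onCreate", "loadUI"),
    ("loadUI", "render"),
    ("onCreate", "sendSms"),
    ("sendSms", "readContacts"),
    ("render", "readContacts"),
]
ratio = edge_homophily(call_edges, labels)
```

Here two of the five calls cross between the benign and sensitive parts, which mirrors the abstract's point that in covert malware the two parts are strongly coupled rather than cleanly separable.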