ISBN (print): 9781728111315
Deep packet inspection via regular expression (RE) matching is a crucial task of network intrusion detection systems (IDSes), which secure Internet connections against attacks and suspicious network traffic. Monitoring high-speed computer networks (100 Gbps and faster) in a single-box solution demands that the RE matching, traditionally based on finite automata (FAs), be accelerated in hardware. In this paper, we describe a novel FPGA architecture for RE matching that is able to process network traffic beyond 100 Gbps. The key idea is to reduce the required FPGA resources by leveraging approximate nondeterministic FAs (NFAs). The NFAs are compiled into a multi-stage architecture starting with the least precise stage with a high throughput and ending with the most precise stage with a low throughput. To obtain the reduced NFAs, we propose new approximate reduction techniques that take into account the profile of the network traffic. Our experiments showed that using our approach, we were able to perform matching of large sets of REs from SNORT, a popular IDS, at unprecedented network speeds.
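The multi-stage idea in this abstract can be illustrated in software with a minimal sketch. It is not the authors' FPGA design: it assumes that the imprecise stage over-approximates the precise rule (it may raise false positives but never misses a true match), and the two patterns and the sample packets are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical precise rule (last stage: exact, but assumed expensive).
PRECISE = re.compile(rb"GET /admin/[a-z]{4,8}\.php\?cmd=")
# Hypothetical over-approximation (first stage: cheap). By construction it
# accepts every packet the precise rule accepts, plus some false positives.
APPROX = re.compile(rb"/admin/")


def multi_stage_match(packets):
    """Return (true matches, number of packets that reached the precise stage)."""
    # Stage 1: the fast, imprecise filter discards most of the traffic.
    survivors = [p for p in packets if APPROX.search(p)]
    # Stage 2: the slow, precise matcher only sees the survivors.
    matches = [p for p in survivors if PRECISE.search(p)]
    return matches, len(survivors)


packets = [
    b"GET /index.html HTTP/1.1",
    b"GET /admin/login.php?user=x HTTP/1.1",
    b"GET /admin/shell.php?cmd=ls HTTP/1.1",
    b"POST /api/v1/items HTTP/1.1",
]
matches, reached = multi_stage_match(packets)
print(matches, reached)
```

In this toy run only two of the four packets reach the precise stage, which is the effect that lets the precise stage run at a lower throughput.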
ISBN (print): 9798350371291; 9798350371284
Register allocation is a crucial step in the compilation pipeline that decides which program values occupy which physical registers. Single-path code's use of predicated instructions instead of branching control flow means register allocation must also allocate predicate registers. In this paper, we improve the original single-path transformation to allow generic register allocators to allocate predicate registers. Our improved transformation splits register allocation into two phases. First, the general-purpose registers are allocated as usual using a generic register allocator. Then, the main steps of the single-path transformation are performed while still using virtual predicate registers. Lastly, register allocation is rerun using the generic allocator to allocate the predicate registers. Our results show the improved single-path transformation increasing performance by up to 80% and reducing code size by up to 43% compared to the original transformation that uses a custom predicate allocator.
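The two-phase scheme described above can be sketched as running one generic allocator twice, once per register class. The allocator below is a simple greedy live-range allocator standing in for "a generic register allocator"; the live ranges, the register names, and the virtual predicates `vp0`/`vp1` are hypothetical, and the single-path transformation itself is not modelled.

```python
def allocate(intervals, registers):
    """Greedy generic allocator.

    intervals: dict mapping a virtual register to its inclusive (start, end)
    live range. registers: list of physical register names of one class.
    """
    assignment = {}
    active = []            # (end, vreg) pairs currently holding a register
    free = list(registers)
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose live range ended before this one starts.
        for item in [a for a in active if a[0] < start]:
            active.remove(item)
            free.append(assignment[item[1]])
        if not free:
            raise RuntimeError(f"spill needed for {vreg}")
        assignment[vreg] = free.pop(0)
        active.append((end, vreg))
    return assignment


# Phase 1: general-purpose registers are allocated as usual.
gpr = allocate({"v0": (0, 4), "v1": (1, 3), "v2": (5, 8)}, ["r0", "r1"])

# (Assumed) the transformation then runs and leaves virtual predicates behind.
# Phase 2: the same allocator is rerun for the predicate register class.
pred = allocate({"vp0": (0, 2), "vp1": (3, 8)}, ["p0"])
print(gpr, pred)
```

The point of the sketch is only that no predicate-specific allocator is needed: the second call reuses the same routine with a different register file.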
The proceedings contain 267 papers. The topics discussed include: network delay-aware load balancing in selfish and cooperative distributed systems; an analysis framework for investigating the trade-offs between system performance and energy consumption in a heterogeneous computing environment; scheduling tightly-coupled applications on heterogeneous desktop grids; an on-chip heterogeneous implementation of a general sparse linear solver; seeds for a heterogeneous interconnect; architecture exploration of high-performance floating-point fused multiply-add units and their automatic use in high-level synthesis; a flexible memory controller supporting deep belief networks with fixed-point arithmetic; hardware supported adaptive data collection for networks on chip; automated partitioning for partial reconfiguration design of adaptive systems; and cross-architectural study of custom reconfigurable devices using crowdsourcing.
ISBN (print): 9781538677698
Fog computing extends the Cloud computing paradigm to the edge of the network, developing a decentralized infrastructure in which services are distributed to locations that best meet the needs of the applications, such as low communication latency, data caching or confidentiality. P2P-based platforms are good candidates to host Fog computing, but they usually lack important elements such as controlling where the data is stored and who will handle the computing tasks. As a consequence, controlling where the data is stored becomes as important as controlling who handles it. In this paper we propose different techniques to reinforce data locality for P2P-based middleware, and study how these techniques can be implemented. Experimental results demonstrate the benefit of data locality for data access performance.
ISBN (print): 9781467380119
The proceedings contain 26 papers. The topics discussed include: towards seismic wave modeling on heterogeneous many-core architectures using task-based runtime system; GPU-accelerated high-speed eye pupil tracking system; performance and energy efficient hardware-based scheduler for symmetric/asymmetric CMPs; analysis and optimization of engines for dynamically typed languages; progressive co-design of an architecture and compiler using a proxy application; unifying router power gating with data placement for energy-efficient NoC; fusion of calling sites; watt watcher: fine-grained power estimation for emerging workloads; performance characterization of modern databases on out-of-order CPUs; and non-stationary simulation of computer systems and dynamic performance evaluation: a concern-based approach and case study on cloud computing.
ISBN (print): 9781728114446
Cloud multi-tenancy is typically constrained to a single interactive service colocated with one or more batch, low-priority services, whose performance can be sacrificed when deemed necessary. Approximate computing applications offer the opportunity to enable tighter colocation among multiple applications whose performance is important. We present Pliant, a lightweight cloud runtime that leverages the ability of approximate computing applications to tolerate some loss in their output quality to boost the utilization of shared servers. During periods of high resource contention, Pliant employs incremental and interference-aware approximation to reduce contention in shared resources, and prevent QoS violations for co-scheduled interactive, latency-critical services. We evaluate Pliant across different interactive and approximate computing applications, and show that it preserves QoS for all co-scheduled workloads, while incurring a 2.1% loss in output quality, on average.
The current trace-driven simulation approach to determine superscalar processor performance is widely used but has some shortcomings. Modern benchmarks generate extremely long traces, resulting in problems with data s...
ISBN (print): 9781728161495
Large-scale data centers run latency-critical jobs with quality-of-service (QoS) requirements, and throughput-oriented background jobs, which need to achieve high performance. Previous works have proposed methods which cannot co-locate multiple latency-critical jobs with multiple background jobs while: (1) meeting the QoS requirements of all latency-critical jobs, and (2) maximizing the performance of the background jobs. This paper proposes CLITE, a Bayesian Optimization-based, multi-resource partitioning technique which achieves these goals.
ISBN (print): 9781728199245
Silicon-photonics architectures have enabled high-speed hardware implementations of reservoir computing (RC). With a delayed feedback reservoir (DFR) model, only one non-linear node can be used to perform RC. However, the delay is often provided by off-chip fiber optics, which is not only space-inefficient but also becomes an architectural bottleneck and hinders scalability. In this paper, we propose a completely on-chip photonic RC architecture for high-performance computing, employing multiple electronically tunable delay lines and a micro-ring resonator (MRR) switch for multi-tasking. The proposed architecture provides 84% less error compared to the state-of-the-art standalone architecture in [8] for executing the NARMA task. For multi-tasking, the proposed architecture shows 80% better performance than [8]. The architecture outperforms all other proposed architectures as well. The on-chip area and power overhead of the proposed architecture due to the delay lines and MRR switch are 0.0184 mm² and 26 mW, respectively.
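For readers unfamiliar with the NARMA task used as the benchmark above, the following is a minimal sketch of a target-series generator. It assumes the widely used tenth-order variant (NARMA-10) with inputs drawn uniformly from [0, 0.5]; the abstract does not state which variant or input range the authors used, and the function name `narma10` is hypothetical.

```python
import random


def narma10(length, seed=0):
    """Generate an input sequence u and the NARMA-10 target sequence y.

    Recurrence (standard NARMA-10 form):
      y[t+1] = 0.3*y[t] + 0.05*y[t]*sum(y[t-9..t]) + 1.5*u[t-9]*u[t] + 0.1
    """
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 0.5) for _ in range(length)]
    y = [0.0] * length  # the first ten targets are left at zero
    for t in range(9, length - 1):
        window = sum(y[t - 9 : t + 1])  # the ten most recent outputs
        y[t + 1] = 0.3 * y[t] + 0.05 * y[t] * window + 1.5 * u[t - 9] * u[t] + 0.1
    return u, y


u, y = narma10(200)
print(y[10:15])
```

A reservoir is then trained to predict `y` from `u`; the error figures quoted in the abstract refer to how closely the architecture's output tracks such a target.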
ISBN (print): 9798350371291; 9798350371284
With advances in technology, the problem of object detection and recognition has gained significant attention in the field of computer vision. There are currently several algorithms that address this growing demand, notably region-based convolutional neural networks (R-CNN) and the You Only Look Once (YOLO) technique. The R-CNN technique encompasses a range of methodologies designed to address object localization and recognition tasks. In addition, the YOLO technique is a distinct set of methodologies that focuses primarily on real-time object recognition and fast performance. The R-CNN and YOLO techniques, in particular, have undergone subsequent improvements, resulting in higher levels of accuracy and performance than their predecessors. The aim of this article is to review these various object detection methods based on CNN.