ISBN: (print) 9781467375894
The proceedings contain 35 papers. The topics discussed include: side effect mitigation algorithm for cache maintenance in opportunistic networks; simple floor plan construction via cloudsourcing; using Open edX forum for supporting collaborative learning; towards an automatic prediction of image processing algorithms performances on embedded heterogeneous architectures; multi-core approach towards efficient biometric cryptosystems; sparse matrix sparse vector multiplication - a novel approach; thread-level value speculation for image-processing applications; a special sorting method for neighbor search procedure in smoothed particle hydrodynamics on GPUs; the design and implementation of embedded online laboratory; and spatial join query processing in cloud: analyzing design choices and performance comparisons.
ISBN: (print) 9789897581397
In this paper we study the traffic characteristics of parallel and high-performance computing applications. Applications that use multiple cores are increasingly common due to the emergence of multicore processors. However, the design of single-threaded and multi-threaded applications can differ significantly. Furthermore, the on-chip communication profile of multicore systems should be analysed and modelled for characterization and simulation purposes. We investigate several applications running in a full-system simulation environment. The on-chip communication traces are gathered and analysed, and we study the detailed low-level profiles of these applications. The applications are categorized into groups according to their parallel programming paradigms. We find that the trace data follow a power-law model with different parameters for different applications. The fitting problem is solved by applying least-squares linear regression. We propose a generic synthetic traffic model based on the analysis results.
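The regression step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a power law of the form y = a * x^k, which becomes a straight line after taking logarithms of both sides; the abstract does not state the exact model form, and the function name `fit_power_law` and the sample data are hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**k by ordinary least squares on (log x, log y).

    Assumed model form; all x and y values must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope and intercept of the least-squares line in log-log space.
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    k = sxy / sxx
    a = math.exp(mean_y - k * mean_x)
    return a, k

# Synthetic, noise-free data generated from a = 3.0, k = -1.5 (illustrative only).
xs = [1, 2, 4, 8, 16]
ys = [3.0 * x ** -1.5 for x in xs]
a, k = fit_power_law(xs, ys)
```

On noise-free data the fit recovers the generating parameters up to floating-point error; on real traces the residuals in log-log space indicate how well a power law describes the data.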
ISBN: (print) 9781467394734
GPGPU (General-Purpose computing on Graphics Processing Units) has marked a revolution in the field of parallel computing, making it possible to achieve computational performance that was unimaginable until a few years ago. This hardware has proven to be extremely reliable and well suited to simulating Cellular Automata (CA) models of complex systems whose evolution can be described in terms of local interactions. Starting from previous GPGPU implementations of CA models with CUDA, this paper presents an effective implementation of a well-known numerical model for simulating lava flows on Graphics Processing Units (GPUs) based on the OpenCL (Open Computing Language) standard. In addition, a preliminary Civil Defence application related to hazard maps of an area located at Mt. Etna volcano (southern Italy) confirms the validity of OpenCL and of both low-cost and high-end graphics hardware as an alternative to expensive solutions for the simulation of CA models.
ISBN: (print) 9781479919253
FPGA-based soft processors customized for operations on sparse graphs can deliver significant performance improvements over conventional organizations (ARMv7 CPUs) for bulk synchronous sparse graph algorithms. We develop a stripped-down soft processor ISA to implement specific repetitive operations on graph nodes and edges that are commonly observed in sparse graph computations. In the processing core, we provide hardware support for rapidly fetching and processing the state of local graph nodes and edges through spatial address generators and zero-overhead loop iterators. We interconnect a 2D array of these lightweight processors with a packet-switched network-on-chip to enable fine-grained operand routing along the graph edges and provide custom send/receive instructions in the soft processor. We develop the processor RTL using Vivado High-Level Synthesis and also provide an assembler and compilation flow to configure the processor instruction and data memories. We outperform a MicroBlaze (100 MHz on Zedboard) and a Nios II/f (100 MHz on DE2-115) by 6x (single processor design), as well as the ARMv7 dual-core CPU on the Zynq SoCs by as much as 10x on the Xilinx ZC706 board (100 processor design), across a range of matrix datasets.
ISBN: (print) 9781450340212
Detecting stateful complex event patterns with parallel programming features is a challenging task because the event detection operators are stateful. Parallelization of event detection tasks needs to be implemented in a way that keeps track of state changes caused by newly arriving events. In this paper, we describe our implementation of a customized complex event detection engine using Open Multi-Processing (OpenMP), a shared-memory programming model. In our system, event detection is implemented using Deterministic Finite Automata (DFAs). We implemented a data stream aggregator that merges 4 given event streams into a sequence of C++ objects in a buffer, which is used as the source event stream for event detection in the next processing step. We describe implementation details and 3 architectural variations for stream aggregation and parallelization of event processing. We conducted performance experiments with each of the variations and report some of our experimental results. A comparison of our performance results shows that, for event processing on a single machine with multiple cores and limited memory, using multiple threads with a shared buffer gives better stream processing performance than an implementation with multiple processes and shared memory.
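The two stages named in this abstract, stream aggregation followed by DFA-based detection, can be sketched as below. This is a sequential illustration only, not the authors' OpenMP/C++ engine: the pattern "A, then B, then C", the transition table, and the sample streams are all hypothetical, and events that do not advance the automaton are treated as implicit self-loops.

```python
import heapq

# Hypothetical pattern: an "A" event, later a "B", later a "C".
# Missing (state, event) pairs are implicit self-loops.
TRANSITIONS = {(0, "A"): 1, (1, "B"): 2, (2, "C"): 3}
ACCEPT = 3

def merge_streams(*streams):
    """Merge timestamp-sorted streams of (timestamp, event_type) tuples."""
    return heapq.merge(*streams)

def detect(events):
    """Run the DFA over the merged stream; return timestamps where the pattern completes."""
    state = 0
    matches = []
    for ts, kind in events:
        state = TRANSITIONS.get((state, kind), state)
        if state == ACCEPT:
            matches.append(ts)
            state = 0  # restart detection after a match
    return matches

# Four illustrative input streams, each already sorted by timestamp.
s1 = [(1, "A"), (5, "A")]
s2 = [(2, "B"), (6, "X")]
s3 = [(3, "C"), (7, "B")]
s4 = [(4, "X"), (8, "C")]
matches = detect(merge_streams(s1, s2, s3, s4))
```

The `state` variable is the statefulness the abstract refers to: each event must be applied to the state left by its predecessor, which is why the detection loop cannot simply be split across threads without coordination.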
ISBN: (print) 9781467394734
Cloud computing technologies are bringing new scales of computational processing power and storage capacity to meet the very demanding requirements of today's applications. One such family of applications is analytics based on processing big data. More specifically, there is a large family of analytics applications based on processing log data files. Indeed, log data files are commonplace in many Internet-based systems and applications, comprising system logs, server logs, application logs, database logs, user activity logs, etc. These are analytics-oriented applications based on processing the various types of log files. While log data file processing has recently been investigated by many researchers and developers, the new feature is that of scale: Cloud-based systems can enable processing of unlimited amounts of data, either off-line or online in streaming mode. In this work we evaluate the performance of a MapReduce Hadoop-based implementation for processing large log data files of a Virtual Campus. The study aims to reveal the potential of using such implementations as a basis for learning analytics for use by a variety of users in a Virtual Campus.
ISBN: (print) 9781479918768
The current development of high-performance parallel supercomputing infrastructures is pushing the boundaries of applications of science and is bringing new paradigms into engineering practices and simulations. Earthquake engineering is one of the major fields that benefits from this development by looking for solutions in grid computing and cloud computing techniques. Generally, earthquake simulations involve the analysis of petabytes of data. Analyzing these large amounts of data in parallel on thousands of nodes in computer clusters yields high performance. Open source cloud solutions such as Hadoop MapReduce, which is highly scalable and capable of processing large amounts of data rapidly in parallel on large clusters, provide a better solution than an RDBMS. Both GPUs and MapReduce are designed to support vast data parallelism. For performance considerations, GPU computing could be adopted over low-performing CPU systems. This paper discusses MapReduce systems using Hadoop and Mars. Mars is a MapReduce framework on graphics processors. Hence, the proposition is to use GPU-based systems for earthquake simulations in which digital elevation model 3D data sets are fully materialized, so that scientists can make use of these data for various analyses and simulations.
ISBN: (print) 9783319250878; 9783319250861
Dynamic programming techniques are well established and employed by various practical algorithms that are used as similarity measures, for instance the edit-distance algorithm or the dynamic time warping algorithm. These algorithms usually operate in an iteration-based fashion, where new values are computed from the values of the previous iteration, so they cannot be processed by simple data-parallel approaches. In this paper, we propose a way to utilize the computational power of massively parallel GPUs to compute dynamic programming algorithms effectively and efficiently. We address both the problem of computing one distance on large inputs concurrently and the problem of computing a large number of distances simultaneously (e.g., when a similarity query is being resolved).
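The dependency structure described in this abstract can be made concrete with a minimal sketch of the edit distance computed along anti-diagonals. This wavefront ordering is a common way to expose parallelism in such recurrences and is shown here as an assumption; the abstract does not say which scheme the authors use, and the function name `edit_distance_wavefront` is hypothetical. The code is sequential Python, so the inner loop only marks the cells that could be computed concurrently.

```python
def edit_distance_wavefront(a, b):
    """Levenshtein distance, filling the table one anti-diagonal at a time."""
    n, m = len(a), len(b)
    # d[i][j] is the distance between the prefixes a[:i] and b[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    # Anti-diagonal k holds the cells with i + j == k. Each cell depends only
    # on diagonals k - 1 and k - 2, so cells within one diagonal are independent.
    for k in range(2, n + m + 1):
        for i in range(max(1, k - m), min(n, k - 1) + 1):
            j = k - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[n][m]

distance = edit_distance_wavefront("kitten", "sitting")
```

On a GPU, each iteration of the outer loop would correspond to one synchronization step, with the inner loop distributed across threads.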
ISBN: (print) 9783319271613; 9783319271606
Hierarchical identity-based cryptography is an efficient technology for addressing the security issues in cloud storage. However, the inherent key escrow problem hinders the widespread adoption of this cryptosystem in practice. To address the key escrow problem, this paper proposes an escrow-free hierarchical identity-based signature model, in which a user signs messages with a user-selected secret and a PKG signing factor in addition to the private key. To prove full security, we formulate three security games with respect to our signature model. We instantiate the escrow-free model as a specific scheme based on the SHER-IBS scheme and prove that our scheme is secure against adaptive chosen ID and message attacks.
ISBN: (print) 9783319271613; 9783319271606
To realize conveniently deployed video surveillance applications, this paper designs a cloud service system employing ubiquitously available IoT nodes. Considering the limited capacity of each IoT node, this paper first describes the system architecture and the operation procedure for application requests, and introduces the design of the scheduler's function and typical video processing algorithms. Further, to decrease transmission conflicts among video/image processor nodes, this paper proposes a scheduling method based on a Genetic Algorithm to rationally utilize the cooperative IoT nodes. Simulation results show that, compared with common methods such as random scheduling and opportunity-balanced scheduling, this method yields much smaller processing delay and transmission delay, together with a higher packet delivery ratio.