ISBN: 9781450359337 (print)
Serverless computing has introduced unprecedented levels of scalability and parallelism for the execution of High Throughput Computing tasks. This represents both a challenge and an opportunity for scientific workloads to be adapted to upcoming programming models that simplify the usage of such platforms. In this paper we introduce a serverless model for highly-parallel file-processing applications. We also describe a middleware implementation that supports the execution of customized execution environments based on Docker images on AWS Lambda, the leading serverless computing platform. Moreover, this middleware offers tools to manage the input/output of the serverless infrastructure and the creation of HTTP endpoints in a way that is transparent to the user. To test the proposed programming model and the middleware, this paper describes two case studies. The first one analyzes medical images with a high degree of parallelism. The second one presents an architecture to process video keyframes. The results from both case studies are discussed, and a cost analysis of the medical image architecture comparing different Cloud options is carried out. The results show that the combination of a high-level programming model with the scalable capabilities of AWS Lambda makes it easy for end users to efficiently exploit serverless computing for the optimized and cost-effective execution of loosely-coupled tasks.
ISBN: 9781581134094 (print)
A new block algorithm for triangularization of regular or singular matrices of dimension m × n is proposed. Taking advantage of fast block multiplication algorithms, it achieves the best known sequential complexity $O(m^{\omega-1} n)$ for any sizes and any rank. Moreover, the block strategy improves locality with respect to previous algorithms, as shown by practical performance.
ISBN: 9781450357999 (print)
The proceedings contain 47 papers. The topics discussed include: parallel minimum cuts in near-linear work and low depth; trees for vertex cuts, hypergraph cuts and minimum hypergraph bisection; dynamic representations of sparse distributed networks: a locality-sensitive approach; constant-depth and subcubic-size threshold circuits for matrix multiplication; integrated model, batch, and domain parallelism in training neural networks; brief announcement: on approximating PageRank locally with sublinear query complexity; brief announcement: coloring-based task mapping for dragonfly systems; brief announcement: parallel transitive closure within 3D crosspoint memory; and lock-free contention adapting search trees.
ISBN: 9781450345934 (print)
The proceedings contain 45 papers. The topics discussed include: randomized composable coresets for matching and vertex cover; almost optimal streaming algorithms for coverage problems; bicriteria distributed submodular maximization in a few rounds; on energy conservation in data centers; asymptotically optimal approximation algorithms for coflow scheduling; online flexible job scheduling for minimum span; minimizing total weighted flow time with calibrations; brief announcement: scheduling parallelizable jobs online to maximize throughput; brief announcement: a new improved bound for coflow scheduling; and a communication-avoiding parallel algorithm for the symmetric eigenvalue problem.
ISBN: 9781450344869 (print)
High-level parallel programming is a de facto standard approach to developing parallel software with reduced development time. High-level abstractions are provided by existing frameworks as pragma-based annotations in the source code, or through pre-built parallel patterns that recur frequently in parallel algorithms and that can be easily instantiated by the programmer to add structure to the development of parallel software. In this paper we focus on this second approach and we propose P3ARSEC, a benchmark suite for parallel pattern-based frameworks consisting of a representative subset of PARSEC applications. We analyse the programmability advantages and the potential performance penalty of using such a high-level methodology with respect to hand-made parallelisations using low-level mechanisms. The results are obtained on the new Intel Knights Landing multicore, and show a significantly reduced code complexity with comparable performance. Copyright 2017 ACM.
The demand for computational power is constantly increasing, which requires financial investments and know-how for companies. The answer to this challenge is two-fold. First, companies can rely on cloud providers to p...
ISBN: 9781450342100 (print)
The proceedings contain 52 papers. The topics discussed include: randomized approximate nearest neighbor search with limited adaptivity; encoding short ranges in TCAM without expansion: efficient algorithm and applications; extending the nested parallel model to the nested dataflow model with provably efficient schedulers; latency-hiding work stealing: scheduling interacting parallel computations with work stealing; provably good and practically efficient parallel race detection for fork-join programs; dynamic determinacy race detection for task parallelism with futures; RUBIC: online parallelism tuning for co-located transactional memory applications; investigating the performance of hardware transactions on a multi-socket machine; parallel algorithms for asymmetric read-write costs; general profit scheduling and the power of migration on heterogeneous machines; the power of migration in online machine minimization; fair online scheduling for selfish jobs on heterogeneous machines; scheduling parallelizable jobs online to minimize the maximum flow time; clairvoyant dynamic bin packing for job scheduling with minimum server usage time; parallel equivalence class sorting: algorithms, lower bounds, and distribution-based analysis; and parallel approaches to the string matching problem on the GPU.
Neuromorphic computing is a non-von Neumann architecture that mimics how the brain performs neural network types of computation in real hardware. It has been shown that this class of computing can execute data classif...
It was experimentally observed that the majority of real-world networks are scale-free and follow power law degree distribution. The aim of this paper is to study the algorithmic complexity of such "typical"...
ISBN: 9781450335881 (print)
The proceedings contain 42 papers. The topics discussed include: sorting with asymmetric read and write costs; myths and misconceptions about threads; new streaming algorithms for parameterized maximal matching & beyond; efficient approximation algorithms for computing k disjoint restricted shortest paths; fast and better distributed MapReduce algorithms for k-center clustering; fair adaptive parallelism for concurrent transactional memory applications; towards a universal approach for the finite departure problem in overlay networks; a compiler-runtime application binary interface for pipe-while loops; the Cilkprof scalability profiler; efficiently detecting races in Cilk programs that use reducer hyperobjects; ThreadScan: automatic and scalable memory reclamation; temporal fairness of round robin: competitive analysis for $L_k$-norms of flow time; and space and time efficient parallel graph decomposition, clustering, and diameter approximation.