The proceedings contain 14 papers from the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). Topics discussed include: reference idempotency analysis: a framework for optimizing speculative execution; pointer and escape analysis for multithreaded programs; language support for Morton-order matrices; efficient load balancing for wide-area divide-and-conquer applications; scalable queue-based spin locks with timeout; contention elimination by replication of sequential sections in distributed shared memory programs; and accurate data redistribution cost estimation in software distributed shared memory systems.
ISBN: 0897915895 (print)
The proceedings contain 26 papers. The topics discussed include: LogP: towards a realistic model of parallel computation; exploiting task and data parallelism on a multicomputer; ActorSpace: an open distributed programming paradigm; experiences using the ParaScope editor: an interactive parallel programming tool; perturbation analysis of high level instrumentation for SPMD programs; integrating message-passing and shared-memory: early experience; using scheduler information to achieve optimal barrier synchronization performance; and a concurrent copying garbage collector for languages that distinguish (im)mutable data.
ISBN: 9781450362252 (print)
The proceedings contain 58 papers. The topics discussed include: beyond human-level accuracy: computational challenges in deep learning; throughput-oriented GPU memory allocation; SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU; incremental flattening for nested data parallelism; modular transactions: bounding mixed races in space and time; processing transactions in a predefined order; data-flow/dependence profiling for structured transformations; lightweight hardware transactional memory profiling; provably and practically efficient granularity control; semantics-aware scheduling policies for synchronization determinism; and a round-efficient distributed betweenness centrality algorithm.
ISBN: 9781450326568 (print)
The proceedings contain 43 papers. The topics discussed include: predator: predictive false sharing detection; concurrency testing using schedule bounding: an empirical study; trace driven dynamic deadlock detection and reproduction; efficient search for inputs causing high floating-point errors; portable, MPI-interoperable coarray Fortran; eliminating global interpreter locks in Ruby through hardware transactional memory; leveraging hardware message passing for efficient thread synchronization; well-structured futures and cache locality; time-warp: lightweight abort minimization in transactional memory; beyond parallel programming with domain specific languages; a decomposition for in-place matrix transposition; in-place transposition of rectangular matrices on accelerators; and parallelizing dynamic programming through rank convergence.
ISBN: 9781450332057 (print)
The proceedings contain 44 papers. The topics discussed include: predicate RCU: an RCU for scalable concurrent updates; automatic scalable atomicity via semantic locking; a framework for practical parallel fast matrix multiplication; PLUTO+: near-complete modeling of affine transformations for parallelism and locality; distributed memory code generation for mixed irregular/regular computations; performance implications of dynamic memory allocators on transactional memory systems; low-overhead software transactional memory with progress guarantees and strong semantics; barrier elision for production parallel programs; scalable and efficient implementation of 3D unstructured meshes computation: a case study on matrix assembly; and diagnosing the causes and severity of one-sided message contention.
ISBN: 9781595939609 (print)
The proceedings contain 42 papers. The topics discussed include: automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories; type inference for locality analysis of distributed data structures; quasi-static scheduling for safe futures; scalable packet classification using interpreting: a cross-platform multi-core solution; FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue; matrix product on heterogeneous master-worker platforms; high performance dense linear algebra on a spatially distributed processor; optimization principles and application performance evaluation of a multithreaded GPU using CUDA; a case study in SIMD text processing with parallel bit streams: UTF-8 to UTF-16 transcoding; programming with tiles; design and implementation of a high-performance MPI for C# and the common language infrastructure; and a portable runtime interface for multi-level memory hierarchies.
ISBN: 9781450319225 (print)
The proceedings contain 45 papers. The topics discussed include: a peta-scalable CPU-GPU algorithm for global atmospheric simulations; adoption protocols for fanout-optimal fault-tolerant termination detection; betweenness centrality: algorithms and implementations; complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU; fast concurrent queues for x86 processors; FASTLANE: improving performance of software transactional memory for low thread counts; Ligra: a lightweight graph processing framework for shared memory; ownership passing: efficient distributed memory programming on multi-core systems; parallel suffix array and least common prefix for the GPU; Streamscan: fast scan algorithms for GPUs without global barrier synchronization; using hardware transactional memory to correct and simplify a readers-writer lock algorithm; and exploring different automata representations for efficient regular expression matching on GPUs.
ISBN: 9781450301190 (print)
The proceedings contain 39 papers. The topics discussed include: ordered vs. unordered: a comparison of parallelism and work-efficiency in irregular algorithms; programming the memory hierarchy revisited: supporting irregular parallelism in Sequoia; compact data structure and scalable algorithms for the sparse grid technique; a domain-specific approach to heterogeneous parallelism; Copperhead: compiling an embedded data parallel language; OoOJava: software out-of-order execution; SpiceC: scalable parallelism via implicit copying and explicit commit; inferring ownership transfer for efficient message passing; all-window profiling and composable models of cache sharing; ULCC: a user-level facility for optimizing shared cache performance on multicores; ScalaExtrap: trace-based communication extrapolation for SPMD programs; and GRace: a low-overhead mechanism for detecting data races in GPU programs.
ISBN: 9781450311601 (print)
The proceedings contain 57 papers. The topics discussed include: scalable framework for mapping streaming applications onto multi-GPU systems; efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors; extending a C-like language for portable SIMD programming; DOJ: dynamically parallelizing object-oriented programs; GPU-based NFA implementation for memory efficient high speed regular expression matching; concurrent tries with efficient non-blocking snapshots; deterministic parallel random-number generation for dynamic-multithreading platforms; algorithm-based fault tolerance for dense matrix factorizations; revisiting the combining synchronization technique; FlexBFS: a parallelism-aware implementation of breadth-first search on GPU; optimizing remote accesses for offloaded kernels: application to high-level synthesis for FPGA; and the boat hull model: adapting the roofline model to enable performance prediction for parallel computing.