The proceedings contain 27 papers from the 1997 International Conference on Parallel Architectures and Compilation Techniques. Topics discussed include: locality analysis for parallel C programs; heap analysis; limited network bandwidth; register pressure sensitive instruction scheduler for dynamic issue processors; path profile guided partial dead code elimination; fine-grain communication; buffer-safe communication optimization; shared memory multiprocessors; compiler algorithms; vector registers in advanced vector architectures; cache management; Monte Carlo photon transport codes; stream-oriented processing; and tiling techniques.
The proceedings contain 26 papers from the 2001 International Conference on Parallel Architectures and Compilation Techniques. The topics discussed include: basic block distribution analysis to find periodic behavior and simulation; modeling superscalar processors via statistical simulation; filtering techniques to improve trace-cache efficiency; reactive-associative caches; recovery mechanism for latency misprediction; and compiling for the impulse memory controller.
The proceedings contain 23 papers from the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004). The topics discussed include: code generation in the polyhedral model is easier than you think; adding limited reconfigurability to superscalar processors; architectural support for enhanced SMT job scheduling; the energy impact of aggressive loop fusion; scalable high performance cross-module inlining; and fast paths in concurrent programs.
ISBN: 9780769537719 (print)
The proceedings contain 34 papers. The topics discussed include: adaptive locks: combining transactions and locks for efficient concurrency; Anaphase: a fine-grain thread decomposition scheme for speculative multithreading; characterizing the TLB behavior of emerging parallel workloads on chip multiprocessors; interprocedural load elimination for dynamic optimization of parallel programs; algorithmic skeletons within an embedded domain specific language for the CELL processor; SHIP: scalable hierarchical power control for large-scale data centers; exploring phase change memory and 3D die-stacking for power/thermal friendly, fast and durable memory architectures; Chainsaw: using binary matching for relative instruction mix comparison; Stealthtest: low overhead online software testing using transactional memory; Flextream: adaptive compilation of streaming applications for heterogeneous architectures; and soft-OLP: improving hardware cache performance through software-controlled object-level partitioning.
ISBN: 9781450301787 (print)
The proceedings contain 68 papers. The topics discussed include: towards a science of parallel programming; raising the level of many-core programming with compiler technology - meeting a grand challenge; power and thermal characterization of POWER6 system; system-level max power (SYMPO) - a systematic approach for escalating system-level power consumption using synthetic benchmarks; scalable thread scheduling and global power management for heterogeneous many-core architectures; dynamically managed multithreaded reconfigurable architectures for chip multiprocessors; accelerating multicore reuse distance analysis with sampling and parallelization; simple and fast biased locks; avoiding deadlock avoidance; DAFT: decoupled acyclic fault tolerance; WAYPOINT: scaling coherence to 1000-core architectures; and subspace snooping: filtering snoops with operating system support.
ISBN: 9798400706318 (print)
The proceedings contain 27 papers. The topics discussed include: SZKP: a scalable accelerator architecture for zero-knowledge proofs; recompiling QAOA circuits on various rotational directions; MIREncoder: multi-modal IR-based pretrained embeddings for performance optimizations; a parallel hash table for streaming applications; PipeGen: automated transformation of a single-core pipeline into a multicore pipeline for a given memory consistency model; NavCim: comprehensive design space exploration for analog computing-in-memory architectures; optimizing tensor computation graphs with equality saturation and Monte Carlo tree search; chimera: leveraging hybrid offsets for efficient data prefetching; toast: a heterogeneous memory management system; and a transducers-based programming framework for efficient data transformation.
ISBN: 9781665442787 (print)
The proceedings contain 25 papers. The topics discussed include: a flexible approach to autotuning multi-pass machine learning compilers; program lifting using gray-box behavior; HERTI: a reinforcement learning-augmented system for efficient real-time inference on heterogeneous embedded systems; X-Layer: building composable pipelined dataflows for low-rank convolutions; precision batching: bitserial decomposition for efficient neural network inference on GPUs; Google neural network models for edge devices: analyzing and mitigating machine learning inference bottlenecks; and ultra efficient acceleration for de novo genome assembly via near-memory computing.
ISBN: 9780769545660 (print)
The proceedings contain 68 papers. The topics discussed include: dynamic fine-grain scheduling of pipeline parallelism; SPATL: honey, I shrunk the coherence directory; an OpenCL framework for homogeneous manycores with no hardware cache coherence; compiling dynamic data structures in Python to enable the use of multi-core and many-core libraries; PEPSC: a power-efficient processor for scientific computing; improving throughput of power-constrained GPUs using dynamic voltage/frequency and core scaling; phase-based application-driven hierarchical power management on the single-chip cloud computer; DeNovo: rethinking the memory hierarchy for disciplined parallelism; building retargetable and efficient compilers for multimedia instruction sets; understanding the behavior of Pthread applications on non-uniform cache architectures; and parameterized micro-benchmarking: an auto-tuning approach for complex applications.
ISBN: 9781479910212 (print)
The proceedings contain 43 papers. The topics discussed include: parallel flow-sensitive pointer analysis by graph-rewriting; interprocedural strength reduction of critical sections in explicitly-parallel programs; coordinated power-performance optimization in manycores; exploring hybrid memory for GPU energy efficiency through software-hardware co-design; writeback-aware bandwidth partitioning for multi-core systems with PCM; a unified view of non-monotonic core selection and application steering in heterogeneous chip multiprocessors; memory-centric system interconnect design with hybrid memory cubes; SMT-centric power-aware thread placement in chip multiprocessors; an empirical model for predicting cross-core performance interference on multicore processors; and managing shared last-level cache in a heterogeneous multicore processor.