Welcome to the 6th IEEE International Workshop on Multi-/Many-Core Computing Systems (MuCoCoS), held on 7 September 2013 in Edinburgh, Scotland, UK, in conjunction with the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT-2013).
ISBN (print): 9781450325066
The proceedings contain 10 papers. The topics discussed include: SDAFT: a novel scalable data access framework for parallel BLAST; large data and computation in a hazard map workflow using Hadoop and Netezza architectures; a framework for an in-depth comparison of scale-up and scale-out; BDMPI: conquering big data with small clusters using MPI; on the core affinity and file upload performance of Hadoop; enhancing both fairness and performance using rate-aware dynamic storage cache partitioning; design of an active storage cluster file system for DAG workflows; and a novel parallel method for mining frequent patterns on multi-core shared memory systems.
ISBN (print): 9781450325004
Visualizing large molecular data requires efficient means of rendering millions of data elements that combine glyphs, geometry, and volumetric techniques. The geometric and volumetric loads challenge traditional rasterization-based visualization methods. Ray casting presents a scalable and memory-efficient alternative, but modern techniques typically rely on GPU-based acceleration to achieve interactive rendering rates. In this paper, we present bnsView, a molecular visualization ray tracing framework that delivers fast volume rendering and ball-and-stick ray casting on both multi-core CPUs and many-core Intel® Xeon Phi™ co-processors, implemented in an SPMD language that generates efficient SIMD vector code for multiple platforms without source modification. We show that our approach running on co-processors is competitive with similar techniques running on GPU accelerators, and we demonstrate large-scale parallel remote visualization from TACC's Stampede supercomputer to large-format display walls using this system. Copyright 2013 ACM.
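As an illustration of the kind of kernel such an SPMD compiler vectorizes, the plain C++ sketch below tests one pixel's ray against the atom spheres of a ball-and-stick model; the names (Atom, trace_pixel) are illustrative, not bnsView's API. In the SPMD setting, per-pixel calls like this run one per SIMD lane while an outer pixel loop is distributed across cores.

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
struct Atom { Vec3 center; float radius; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Distance along the ray to the nearest sphere hit, or +infinity on a miss.
// dir is assumed normalized; an acceleration structure is omitted for brevity.
float trace_pixel(Vec3 origin, Vec3 dir, const std::vector<Atom>& atoms) {
    float nearest = std::numeric_limits<float>::infinity();
    for (const Atom& a : atoms) {
        Vec3  oc   = sub(origin, a.center);
        float b    = dot(oc, dir);
        float c    = dot(oc, oc) - a.radius * a.radius;
        float disc = b * b - c;                  // quadratic discriminant
        if (disc < 0.0f) continue;               // ray misses this sphere
        float t = -b - std::sqrt(disc);          // nearer of the two roots
        if (t > 0.0f && t < nearest) nearest = t;
    }
    return nearest;
}
```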
ISBN (print): 9781479907298
Spin glasses - theoretical models used to capture several physical properties of real glasses - are mostly studied by Monte Carlo simulations. The associated algorithms have a very large and easily identifiable degree of available parallelism that can also be easily cast in SIMD form. State-of-the-art multi- and many-core processors and accelerators are therefore a promising computational platform to support these Grand Challenge applications. In this paper we port to many-core processors and optimize a Monte Carlo code for the simulation of the 3D Edwards-Anderson spin glass, focusing on a dual eight-core Sandy Bridge processor and on a Xeon Phi co-processor based on the new Many Integrated Core architecture. We present performance results, discuss bottlenecks preventing further performance gains, and compare with the corresponding figures for GPU-based implementations and for application-specific dedicated machines.
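For concreteness, the sketch below shows the single-site Metropolis step at the core of such a simulation, written in C++ under illustrative assumptions (a cubic lattice with precomputed neighbor and ±1 coupling tables; none of this is the paper's code). The parallelism the abstract refers to comes from updating all sites of one checkerboard color simultaneously and from packing many independent replicas per machine word (multi-spin coding).

```cpp
#include <cmath>   // std::exp

// Illustrative layout: L^3 sites; spins[] in {-1,+1}; J[6*i+d] holds the
// quenched +/-1 coupling on bond d of site i; neigh[6*i+d] is the index of
// the d-th neighbor. rnd01 is a uniform random number in [0,1).
void metropolis_site(int i, signed char* spins, const signed char* J,
                     const int* neigh, float beta, float rnd01) {
    int h = 0;
    for (int d = 0; d < 6; ++d)                   // local field from 6 neighbors
        h += J[6 * i + d] * spins[neigh[6 * i + d]];
    int dE = 2 * spins[i] * h;                    // energy change of a flip
    if (dE <= 0 || rnd01 < std::exp(-beta * dE))  // Metropolis acceptance rule
        spins[i] = -spins[i];
}
```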
ISBN (print): 9781450325066
The MapReduce programming model is introduced for big-data processing, where the data nodes perform both data storing and computation. Thus, we need to understand the different resource requirements of data storing and computation tasks and schedule them efficiently over multi-core processors. Core affinity defines a mapping between a set of cores and a given task. Core affinity can be decided based on the resource requirements of a task, because affinity largely affects the efficiency of computation, memory, and I/O resource utilization. In this paper, we analyze the impact of core affinity on the file upload performance of the Hadoop Distributed File System (HDFS). Our study can provide insight into process scheduling issues on big-data processing systems. We also suggest a framework for dynamic core affinity based on our observations and show that a preliminary implementation can improve the throughput by more than 40% compared with the default Linux system. Copyright 2013 ACM.
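A dynamic core-affinity framework of this kind ultimately reduces to calls into the OS affinity API. The sketch below is a minimal example assuming Linux/glibc, pinning the calling thread to one core; the described framework would invoke something like this as it classifies tasks as compute- or I/O-bound.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        // CPU_ZERO/CPU_SET are glibc extensions
#endif
#include <sched.h>          // sched_setaffinity, cpu_set_t
#include <cstdio>           // perror

// Pin the calling thread to a single core; returns true on success.
// With pid 0, sched_setaffinity applies to the calling thread itself.
bool pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return false;
    }
    return true;
}
```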
ISBN (print): 9781450325073
The proceedings contain 8 papers. The topics discussed include: many-core architectures boost the pricing of basket options on adaptive sparse grids; heterogeneous COS pricing of rainbow options; accounting for secondary uncertainty: efficient computation of portfolio risk measures on multi- and many-core architectures; system architecture for on-line optimization of automated trading strategies; pricing American options with least squares Monte Carlo on GPUs; calibration of stochastic volatility models on a multi-core CPU cluster; Intel version of STAC-A2 benchmark: toward better performance with less effort; and optimizing IBM Algorithmics' mark-to-future engine for real-time counterparty credit risk scoring.
ISBN (print): 9781479929276
Packet classification matches a packet header against the predefined rules in a rule set; it is a kernel function that has been studied for decades. A recent trend in packet classification is to match a large number of packet header fields. For example, the flow table lookup in Software Defined Networking (SDN) requires 15 fields of the packet header to be examined. Another trend in packet classification is to use software-based solutions employing multi-core general purpose processors and virtual machines. Although packet classification has been widely studied, most existing solutions on multi-core systems target the classic 5-field packet classification; their performance cannot be easily scaled up for a larger number of packet header fields. In this paper, we propose a decomposition-based packet classification approach; it supports large rule sets consisting of a large number of packet header fields. We first use range-tree and hashing to search each field of the input packet header individually in parallel. The partial results from all the fields are represented by bit vectors; they are merged in parallel to produce the final packet header match. We also balance the search and merge latencies, and employ software pipelining to further enhance the overall performance. We implement our approach on state-of-the-art multi-core processors; we evaluate its performance with respect to throughput and latency for rule set sizes ranging from 1K to 32K. Experimental results show that, for a 32K rule set, our algorithms can achieve an average processing latency of 2000 ns per packet and an overall throughput of 30 million packets per second on a state-of-the-art 16-core platform.
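The merge phase described above can be made concrete in a few lines of C++: each per-field lookup produces a bit vector over the rule set, the vectors are intersected, and the lowest surviving bit identifies the highest-priority match (assuming rules are stored in priority order). This is an illustrative sketch, not the paper's implementation.

```cpp
#include <cstdint>
#include <vector>

// field_bv[f][w] is word w of the bit vector produced by the lookup on
// field f; bit r is set iff rule r matched that field. Rules are assumed
// stored in priority order, so the lowest surviving bit wins.
int merge_match(const std::vector<std::vector<uint64_t>>& field_bv) {
    const size_t words = field_bv[0].size();
    for (size_t w = 0; w < words; ++w) {
        uint64_t acc = ~0ULL;
        for (const auto& bv : field_bv)           // intersect across fields
            acc &= bv[w];
        if (acc)                                   // first word with a survivor
            return static_cast<int>(w * 64 + __builtin_ctzll(acc)); // GCC/Clang builtin
    }
    return -1;                                     // no rule matched all fields
}
```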
ISBN (print): 9781479948970
This paper presents the implementation of ray-tracing-based algorithms for multi-objective geospatial optimization targeting various many-core processing technologies such as graphics processing units, x86 multi-cores, and ARM processors. High performance is achieved through highly parallel core algorithms, executed on multiple compute devices across a heterogeneous architecture using low-level OpenCL kernels. Algorithms for calculating line-of-sight ballistic threat, visual observability, ground plane extraction, and Markov chain Monte Carlo optimization provide augmented geospatial intelligence and situational awareness in three-dimensional urban environments.
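As a rough illustration of the line-of-sight kernel underlying visual-observability and ballistic-threat calculations, the C++ sketch below marches along the observer-target segment over a 2.5D height field; in an OpenCL formulation each work-item would evaluate one such test. The height() sampler and the step count are assumptions, not the paper's interface.

```cpp
// height(x, y) samples the terrain elevation; the observer is at
// (ax, ay, az) and the target at (bx, by, bz).
// Returns true if no terrain blocks the segment between them.
bool line_of_sight(float ax, float ay, float az,
                   float bx, float by, float bz,
                   float (*height)(float, float), int steps = 256) {
    for (int i = 1; i < steps; ++i) {
        float t = static_cast<float>(i) / steps;  // walk along the segment
        float x = ax + t * (bx - ax);
        float y = ay + t * (by - ay);
        float z = az + t * (bz - az);
        if (height(x, y) > z) return false;       // terrain occludes the ray
    }
    return true;                                   // target is visible
}
```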
ISBN (print): 9781479909087; 9781479909094
The stock market index is a valuable financial tool for measuring the state of a segment of the stock market. With high input data rates, real-time index computation is a challenging task. It cannot be done in real time even with today's reasonably high-end computers with many CPU cores. Doing so with CPU-based systems would require server farms with substantial computing power and would therefore be costly. Thus, this index value is currently computed periodically (non-real-time). In this paper we describe our attempt at fast index computation using Graphics Processing Units (GPUs), which usually have several hundred processing cores and are much less expensive than CPU-based solutions. The computation itself is data parallel and therefore suitable for GPU processing. Preliminary results indicate our approach is promising, as we can compute much faster using GPUs than using multi-core CPUs.
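The data parallelism the authors exploit is easy to see in code: a capitalization-weighted index is a single reduction over its constituents, which maps directly onto a GPU reduction kernel. The C++/OpenMP sketch below (compile with -fopenmp) stands in for such a kernel; the names and the divisor convention are illustrative assumptions, not the paper's method.

```cpp
#include <cstddef>   // std::ptrdiff_t
#include <vector>

// Capitalization-weighted index over n constituents:
// sum(price * shares) / divisor, computed as a parallel reduction.
double index_value(const std::vector<double>& price,
                   const std::vector<double>& shares, double divisor) {
    double cap = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(price.size());
    #pragma omp parallel for reduction(+ : cap)
    for (std::ptrdiff_t i = 0; i < n; ++i)
        cap += price[i] * shares[i];               // constituent market cap
    return cap / divisor;
}
```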