Search Results - Inner Mongolia University Library

Search criteria: "subject term = data-parallel execution"

4 records in total

Revisiting Thread Execution Methods for GPU-oriented OpenCL ...

6th International Symposium on Computing and Networking (CANDAR) - Across Practical Development and Theoretical Research

Authors: Miyazaki, Takafumi; Hidari, Hayato; Hojo, Naohisa; Taniguchi, Ittetsu; Tomiyama, Hiroyuki (Ritsumeikan Univ, Dept Elect & Comp Engn, Kyoto, Japan; Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka, Japan)

ISBN: (print) 9781538691847

OpenCL is one of the most popular frameworks for parallel computing. OpenCL is platform independent in principle, and OpenCL programs can be executed on various hardware platforms such as GPUs, multicore processors and FPGAs. However, OpenCL programs written for GPUs are often poorly executed on multicore processors in terms of performance due to the granularity of threads. This paper addresses efficient execution of GPU-oriented OpenCL programs on multicore processors. This paper solves a couple of drawbacks in an existing OpenCL framework and shows the effectiveness of this work through experiments.

Keywords: OpenCL; thread execution; data-parallel execution; multicore processors
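
The thread-granularity issue named in the abstract can be illustrated with a small sketch. This is not the paper's method, and it is Python rather than OpenCL; `kernel`, `run_fine_grained`, and `run_coarsened` are hypothetical names. It only contrasts scheduling one task per work-item with letting each worker loop over a contiguous chunk of work-items.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(gid, a, b, out):
    """Hypothetical GPU-style kernel body: one work-item handles one element."""
    out[gid] = a[gid] + b[gid]


def run_fine_grained(n, a, b, out, workers=4):
    """Submit one task per work-item, mirroring GPU-style granularity."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(kernel, gid, a, b, out) for gid in range(n)]
        for future in futures:
            future.result()  # re-raise any exception from the task


def run_coarsened(n, a, b, out, workers=4):
    """Submit one task per chunk; each task loops over its work-items."""
    chunk = max(1, (n + workers - 1) // workers)

    def run_chunk(start):
        for gid in range(start, min(start + chunk, n)):
            kernel(gid, a, b, out)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so exceptions surface here
        list(pool.map(run_chunk, range(0, n, chunk)))
```

Both functions compute the same result; the coarsened variant creates at most `workers` tasks instead of `n`, which is the kind of granularity change a CPU runtime may apply to GPU-oriented kernels.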

In-Memory Data Parallel Processor (2018)

23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)

Authors: Fujiki, Daichi; Mahlke, Scott; Das, Reetuparna (Univ Michigan, Ann Arbor, MI 48109, USA)

ISBN: (print) 9781450349116

Recent developments in Non-Volatile Memories (NVMs) have opened up a new horizon for in-memory computing. Despite the significant performance gain offered by computational NVMs, previous works have relied on manual mapping of specialized kernels to the memory arrays, making it infeasible to execute more general workloads. We combat this problem by proposing a programmable in-memory processor architecture and data-parallel programming framework. The efficiency of the proposed in-memory processor comes from two sources: massive parallelism and reduction in data movement. A compact instruction set provides generalized computation capabilities for the memory array. The proposed programming framework seeks to leverage the underlying parallelism in the hardware by merging the concepts of data-flow and vector processing. To facilitate in-memory programming, we develop a compilation framework that takes a TensorFlow input and generates code for our in-memory processor. Our results demonstrate 7.5x speedup over a multi-core CPU server for a set of applications from Parsec and 763x speedup over a server-class GPU for a set of Rodinia benchmarks.

Keywords: ReRAM; in-memory computing; compilation support; data-parallel execution; spatial architecture

Emma in Action: Declarative Dataflows for Scalable Data Analysis (2016)

ACM SIGMOD International Conference on Management of Data

Authors: Alexandrov, Alexander; Salzmann, Andreas; Krastev, Georgi; Katsifodimos, Asterios; Markl, Volker (TU Berlin, Berlin, Germany)

ISBN: (print) 9781450335317

Parallel dataflow APIs based on second-order functions were originally seen as a flexible alternative to SQL. Over time, however, their complexity increased due to the number of physical aspects that had to be exposed by the underlying engines in order to facilitate efficient execution. To retain a sufficient level of abstraction and lower the barrier of entry for data scientists, projects like Spark and Flink currently offer domain-specific APIs on top of their parallel collection abstractions. This demonstration highlights the benefits of an alternative design based on deep language embedding. We showcase Emma, a programming language embedded in Scala. Emma promotes parallel collection processing through native constructs like Scala's for-comprehensions, a declarative syntax akin to SQL. In addition, Emma also advocates quoting the entire data analysis algorithm rather than its individual dataflow expressions. This allows for decomposing the quoted code into (sequential) control flow and (parallel) dataflow fragments, optimizing the dataflows in context, and transparently offloading them to an engine like Spark or Flink. The proposed design promises increased programmer productivity due to avoiding an impedance mismatch, thereby reducing the lag times and cost of data analysis.

Keywords: parallel dataflows; Emma; Scala macros; large-scale data analysis; monad comprehensions; data-parallel execution; MapReduce
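
The contrast the abstract draws between second-order-function APIs and comprehension syntax can be sketched as follows. This is plain Python rather than Emma or Scala, and it runs locally instead of on Spark or Flink; `flat_map`, `join_with_combinators`, `join_with_comprehension`, and the sample data are all hypothetical and serve only to show the two styles side by side.

```python
from itertools import chain


def flat_map(f, xs):
    """Apply f to each element and concatenate the resulting sequences."""
    return list(chain.from_iterable(map(f, xs)))


def join_with_combinators(orders, customers):
    """Equi-join written with explicit second-order functions."""
    return flat_map(
        lambda o: map(
            lambda c: (c[1], o[1]),
            filter(lambda c: c[0] == o[0], customers),
        ),
        orders,
    )


def join_with_comprehension(orders, customers):
    """The same join as a comprehension, which reads closer to SQL."""
    return [
        (city, amount)
        for (cust_id, amount) in orders
        for (cid, city) in customers
        if cid == cust_id
    ]


orders = [(1, 30), (2, 5), (1, 12)]          # (customer id, amount)
customers = [(1, "Berlin"), (2, "Kyoto")]    # (customer id, city)
```

Both functions return the same list of `(city, amount)` pairs; the comprehension states what is joined without spelling out the nesting of `flat_map`, `filter`, and `map`.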

Implicit Parallelism through Deep Language Embedding (2015)

ACM SIGMOD International Conference on Management of Data

Authors: Alexandrov, Alexander; Kunft, Andreas; Katsifodimos, Asterios; Schueler, Felix; Thamsen, Lauritz; Kao, Odej; Herb, Tobias; Markl, Volker (TU Berlin, Berlin, Germany)

ISBN: (print) 9781450327589

The appeal of MapReduce has spawned a family of systems that implement or extend it. In order to enable parallel collection processing with User-Defined Functions (UDFs), these systems expose extensions of the MapReduce programming model as library-based dataflow APIs that are tightly coupled to their underlying runtime engine. Expressing data analysis algorithms with complex data and control flow structure using such APIs reveals a number of limitations that impede programmer's productivity. In this paper we show that the design of data analysis languages and APIs from a runtime engine point of view bloats the APIs with low-level primitives and affects programmer's productivity. Instead, we argue that an approach based on deeply embedding the APIs in a host language can address the shortcomings of current data analysis languages. To demonstrate this, we propose a language for complex data analysis embedded in Scala, which (i) allows for declarative specification of dataflows and (ii) hides the notion of data parallelism and distributed runtime behind a suitable intermediate representation. We describe a compiler pipeline that facilitates efficient data-parallel processing without imposing runtime engine-bound syntactic or semantic restrictions on the structure of the input programs. We present a series of experiments with two state-of-the-art systems that demonstrate the optimization potential of our approach.

Keywords: data-parallel execution; Scala macros; control flow; MapReduce; monad comprehensions; large-scale data analysis
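
For readers unfamiliar with the MapReduce model and UDFs mentioned in the abstract, a minimal single-process sketch follows. It is not the language proposed in the paper and not the API of any particular engine; `map_reduce`, `tokenize`, and `count` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_reduce(records, map_udf, reduce_udf):
    """Run a map UDF, group the emitted pairs by key, then run a reduce UDF."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_udf(record):   # map phase
            groups[key].append(value)        # shuffle: group values by key
    # reduce phase: one call per distinct key
    return {key: reduce_udf(key, values) for key, values in groups.items()}


def tokenize(line):
    """Map UDF: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def count(word, ones):
    """Reduce UDF: sum the occurrences collected for one word."""
    return sum(ones)


counts = map_reduce(["a b a", "b c"], tokenize, count)
```

Here the user supplies only the two UDFs; in a real engine the map and reduce phases would run in parallel across partitions, which this sequential sketch does not attempt to model.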
