The tremendous number of sensors and smart objects being deployed in the Internet of Things (IoT) gives IT systems the potential to detect and react to live situations. To tap this potential, complex event processing (CEP) systems offer means to efficiently detect event patterns (complex events) in the sensor streams and thereby help realize a "distributed intelligence" in the IoT. With the increasing number of data sources and the increasing volume at which data is produced, parallelizing event detection is crucial to limit the time events must be buffered before they can actually be processed. In this paper, we propose a pattern-sensitive partitioning model for data streams that achieves a high degree of parallelism in detecting event patterns which formerly could be consistently detected only sequentially or at a low parallelization degree. Moreover, we propose methods to dynamically adapt the parallelization degree to limit the buffering imposed on event detection in the presence of dynamic changes to the workload. Extensive evaluations of the system behavior show that the proposed partitioning model allows for a high degree of parallelism and that the proposed adaptation methods are able to meet a buffering limit for event detection under high and dynamic workloads.
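A minimal sketch of the idea behind key-based stream partitioning for parallel pattern detection: events that can jointly form a pattern instance are routed to the same partition, so partitions can be processed independently. The function and key choice are illustrative assumptions, not the paper's actual partitioning model.

```python
from collections import defaultdict

def partition_stream(events, num_workers, key_fn):
    """Assign each event to a partition based on a pattern-relevant key,
    so all events that may jointly form one pattern instance land in the
    same partition and partitions can be processed in parallel."""
    partitions = defaultdict(list)
    for event in events:
        partitions[hash(key_fn(event)) % num_workers].append(event)
    return partitions

# Example: per-sensor threshold patterns, so the sensor id is the key.
events = [{"sensor": "s1", "value": 10}, {"sensor": "s2", "value": 99},
          {"sensor": "s1", "value": 42}]
parts = partition_stream(events, num_workers=2, key_fn=lambda e: e["sensor"])
```

Patterns that correlate events across keys cannot be split this way, which is why pattern-*sensitive* partitioning (choosing the key from the pattern definition) matters.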
In the Big data era, workflow systems must embrace data parallel computing techniques for efficient data analysis and analytics. Here, an easy-to-use, scalable approach is presented to build and execute Big data applications using actor-oriented modeling in data parallel computing. Two bioinformatics use cases for next-generation sequencing data analysis demonstrate the approach's feasibility.
ISBN:
(Print) 9781479956661
Complex Event Processing (CEP) systems enable applications to react to live situations by detecting event patterns (complex events) in data streams. With the increasing number of data sources and the increasing volume at which data is produced, parallelizing event detection is becoming tremendously important to limit the time events must be buffered before they can actually be processed by an event detector, called an event processing operator. In this paper, we propose a pattern-sensitive partitioning model for data streams that achieves a high degree of parallelism for event patterns which formerly could be consistently detected only sequentially or at a low parallelization degree. Moreover, we propose methods to dynamically adapt the parallelization degree to limit the buffering imposed on event detection in the presence of dynamic changes to the workload. Extensive evaluations of the system behavior show that the proposed partitioning model allows for a high degree of parallelism and that the proposed adaptation methods are able to meet a buffering limit for event detection under high and dynamic workloads.
Reducing the effects of off-chip memory access latency is key to efficiently exploiting embedded multi-core platforms. We consider architectures with a multi-core computation fabric that has its own fast, small memory, to which the data blocks to be processed are fetched from external memory by a DMA (direct memory access) engine, employing a double- or multiple-buffering scheme to avoid processor idling. In this paper we focus on application programs that process two-dimensional data arrays, and we automatically determine the size and shape of the portions of the data array that are subject to a single DMA call, based on hardware and application parameters. When the computations on different array elements are completely independent, the asymmetry of the memory structure always favors one-dimensional horizontal pieces of memory, while when the computation on a data element shares some data with its neighbors, there is pressure toward more "square" shapes to reduce the amount of redundant data transfers. We provide an analytic model for this optimization problem and validate our results by running a mean filter application on the au. simulator. (C) 2013 Elsevier B.V. All rights reserved.
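The shape trade-off can be illustrated with a toy cost model (an assumption for illustration, not the paper's analytic model): a block with a halo of shared neighbor data re-fetches the halo that adjacent blocks also transfer, and for a fixed block area a square shape minimizes that redundancy.

```python
def transfer_overhead(w, h, r):
    """Words transferred per useful word when a w-by-h block needs an
    r-wide halo of neighboring elements; the halo is fetched redundantly
    by adjacent blocks."""
    return (w + 2 * r) * (h + 2 * r) / (w * h)

# For a fixed block area of 1024 elements and a halo of r=1, a square
# block transfers less redundant data than a flat horizontal strip:
flat = transfer_overhead(1024, 1, 1)    # 1024x1 horizontal strip
square = transfer_overhead(32, 32, 1)   # 32x32 "square" block
```

With no neighbor sharing (r = 0) the overhead is 1 for any shape, and horizontal strips win purely on memory-layout grounds, matching the abstract's two cases.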
In this paper, we present an FPGA-based fast image warping method that applies data parallelization schemes. Parallelizing accesses to pixels relieves not only the latency problem of warping but also the bandwidth requirements on off-chip memory. The LUT data parallelization scheme efficiently replaces parallel arithmetic operations without increasing either the memory size for LUT entries or the clock frequency. Two implementations with different characteristics demonstrate the effectiveness and efficiency of the proposed method.
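A minimal sketch of LUT-based warping in general (the function names and the trivial mirror warp are illustrative assumptions, not the paper's FPGA design): the per-pixel coordinate arithmetic is precomputed into a lookup table, so the runtime warp reduces to table lookups, which are easy to replicate in parallel.

```python
def build_warp_lut(width, height, warp_fn):
    """Precompute the source coordinate for every destination pixel, so
    the per-pixel warp needs only a table lookup, not arithmetic."""
    return [[warp_fn(x, y) for x in range(width)] for y in range(height)]

def apply_warp(src, lut):
    """Gather source pixels through the LUT (each lookup is independent,
    so rows or pixels can be processed in parallel)."""
    h, w = len(lut), len(lut[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = lut[y][x]
            out[y][x] = src[sy][sx]
    return out

# Example: a horizontal mirror as a trivial "warp".
src = [[1, 2, 3], [4, 5, 6]]
lut = build_warp_lut(3, 2, lambda x, y: (2 - x, y))
mirrored = apply_warp(src, lut)  # [[3, 2, 1], [6, 5, 4]]
```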
In this paper we investigate a general approach to automating some deployment decisions for a certain class of applications on multi-core computers. We consider data-parallelizable programs that use the well-known double-buffering technique to bring data from slow off-chip memory to the local memory of the cores via a DMA (direct memory access) mechanism. Based on the computation time and size of elementary data items, as well as DMA characteristics, we derive optimal and near-optimal values for the number of blocks that should be clustered in a single DMA command. We then extend the results to the case where the computation for one data item needs some data in its neighborhood. In this setting we characterize the performance of several alternative mechanisms for data sharing. Our models are validated experimentally using a cycle-accurate simulator of the Cell Broadband Engine architecture.
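The double-buffering schedule itself can be sketched as follows. This is a sequential schematic of the *ordering* only (on the Cell, `fetch` would be an asynchronous DMA that genuinely overlaps `compute`); the function names are illustrative assumptions.

```python
def process_double_buffered(blocks, fetch, compute):
    """Double buffering: the fetch for block i+1 is issued before
    computing on block i, so on real hardware the DMA transfer and the
    computation overlap and the core never idles waiting for data."""
    results = []
    if not blocks:
        return results
    current = fetch(blocks[0])                # fill the first buffer
    for i in range(len(blocks)):
        # Issue the next fetch early (async DMA on real hardware)...
        nxt = fetch(blocks[i + 1]) if i + 1 < len(blocks) else None
        # ...then compute on the buffer that is already resident.
        results.append(compute(current))
        current = nxt
    return results

results = process_double_buffered([1, 2, 3],
                                  fetch=lambda b: b * 10,
                                  compute=lambda d: d + 1)
```

The paper's question is then how many elementary blocks to cluster per DMA command so that per-transfer overhead is amortized without making buffers too large for local memory.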
ISBN:
(Print) 9783642157592
The feed-forward multi-layer neural networks have significant importance in speech recognition. A new parallel-training tool TNet was designed and optimized for multiprocessor computers. The training acceleration rates are reported on a phoneme-state classification task.
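A minimal sketch of the data-parallel training idea in general (a toy stand-in, not TNet's implementation): each worker computes the gradient on its own shard of the mini-batch, and the averaged gradient updates the shared weights. Here workers run sequentially for clarity; in a real tool they would be threads or processes.

```python
def data_parallel_step(weights, batches, grad_fn, lr):
    """One data-parallel SGD step: each "worker" computes the gradient
    on its own mini-batch shard; the gradients are averaged and applied
    once to the shared weights."""
    grads = [grad_fn(weights, b) for b in batches]      # one per worker
    avg = [sum(g) / len(grads) for g in zip(*grads)]    # all-reduce mean
    return [w - lr * g for w, g in zip(weights, avg)]

# Toy example: two shards whose gradients cancel leave weights unchanged.
new_w = data_parallel_step([1.0], [0.0, 2.0],
                           grad_fn=lambda w, b: [w[0] - b], lr=0.5)
```

Averaging shard gradients makes the step equivalent to one large-batch update, which is what lets the speedup scale with the number of workers.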
ISBN:
(Print) 9781424437566
Multiprocessor platforms are gaining market share as a solution to boost processor performance beyond the technological limitations present in single-processor chips. Multiprocessors in embedded systems also have a future, in particular with applications like SDR (Software Defined Radio), where both high performance and high adaptability are required. Implementing cryptographic algorithms on embedded systems is also a hot topic for the rapidly developing wireless communication networks. In this paper we examine the implementation of the computation-intensive block ciphers AES and TDES on a 16-processor platform implemented on an FPGA. We implemented the CBC operation mode, which suits mass encryption, on the platform and obtained a linear speedup in computation time.
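A sketch of why CBC suits this setting: the chaining c[i] = E(p[i] XOR c[i-1]) is inherently sequential within one stream, so mass encryption parallelizes *across* independent streams, one per core. The XOR "cipher" below is a toy stand-in (NOT real AES/TDES), used only to show the chaining structure.

```python
def cbc_encrypt(blocks, iv, encrypt_block):
    """CBC chaining: c[i] = E(p[i] XOR c[i-1]). Each block depends on
    the previous ciphertext, so one stream cannot be parallelized, but
    independent streams can each be assigned to a different core."""
    out, prev = [], iv
    for p in blocks:
        c = encrypt_block(p ^ prev)
        out.append(c)
        prev = c
    return out

# Toy stand-in block cipher (NOT AES/TDES): XOR with a fixed key byte.
toy_e = lambda b: b ^ 0xA5
streams = [([1, 2, 3], 0), ([7, 8, 9], 0)]
# Independent (stream, IV) pairs could be dispatched one per processor:
ciphertexts = [cbc_encrypt(blocks, iv, toy_e) for blocks, iv in streams]
```

With many independent streams the per-core work is identical and embarrassingly parallel, consistent with the linear speedup the paper reports.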
This master's thesis focuses on parallelizing neural network training for speech recognition. Two parallelization strategies were implemented and compared. The first is data parallelization, splitting the training across several POSIX threads. The second is node parallelization using CUDA, the platform for general-purpose computing on graphics cards. The first strategy achieved a 4x speedup; with CUDA, nearly a 10x speedup was reached. Training used the Stochastic Gradient Descent algorithm with error backpropagation. After a short introduction, the second chapter motivates the work and places the problem in the context of speech recognition. The third chapter is theoretical and discusses neural networks and the training method. The following chapters focus on design and implementation and describe the iterative development of the project. The last, extensive chapter describes the test system and presents the results of the experiments. The conclusion briefly evaluates the achieved results and outlines prospects for further development of the project.