Search results - Inner Mongolia University Library

Scalable multi-dimensional RNN query processing

Concurrency and Computation: Practice & Experience, 2015, vol. 27, no. 16, pp. 4156-4171

Authors: Ji, Changqing; Qu, Wenyu; Li, Zhiyang; Xu, Yujie; Li, Yuanyuan; Wu, Junfeng

Affiliations: Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian, Peoples R China; Dalian Univ, Sch Phys Sci & Technol, Dalian 116012, Peoples R China; Dalian Jiaotong Univ, Sch Software, Dalian, Peoples R China; Dalian Ocean Univ, Sch Educ Technol, Dalian, Peoples R China; Dalian Ocean Univ, Ctr Comp, Dalian, Peoples R China

Reverse nearest neighbor (RNN) queries, the complementary problem to nearest neighbor queries, have attracted particular interest in the past few years in applications such as location-based services, profile-based marketing, resource allocation, and traffic monitoring systems. One major drawback of existing RNN methods is that they are inherently sequential and use in-memory algorithms, which limits their applicability to large-scale spatial data queries. This paper proposes scalable algorithms for RNN queries in a distributed environment. Firstly, we investigate the Basic-scalable reverse nearest neighbor (SRNN) initialization query method based on the inverted grid index. Secondly, two optimization methods, Lazy-SRNN and Eager-SRNN, are proposed to effectively process scalable multi-dimensional RNN queries. Among them, Lazy-SRNN prunes the search space when all RNN objects are discovered in one pass; Eager-SRNN attempts to prune spatial objects incrementally as soon as they are visited. In addition, the SRNN algorithm is proved to be the first attempt at exact scalable RNN algorithms in a distributed environment on multi-dimensional data sets. An extensive experimental evaluation on real-world and synthetic data shows the scalability and the performance of our novel approach. Copyright (c) 2015 John Wiley & Sons, Ltd.
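To make the query definition concrete, the following is a minimal brute-force sketch of a (monochromatic) RNN query in Python. It is not the paper's SRNN method: there is no grid index, no pruning, and no distributed execution. The function name `reverse_nearest_neighbors` and the sample points are hypothetical and chosen only for illustration.

```python
from math import dist


def reverse_nearest_neighbors(points, query):
    """Return the data points that have `query` as their nearest neighbor.

    A point p qualifies when no other data point is strictly closer
    to p than the query is. Brute force: O(n^2) distance computations.
    """
    result = []
    for i, p in enumerate(points):
        d_query = dist(p, query)
        if all(dist(p, o) >= d_query for j, o in enumerate(points) if j != i):
            result.append(p)
    return result


# Hypothetical sample data: two clusters of two points each.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
print(reverse_nearest_neighbors(points, (0.4, 0.0)))
```

The quadratic cost of this baseline is the kind of overhead that index-based pruning, as described in the abstract, is meant to avoid.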

Keywords: reverse nearest neighbor; big data; MapReduce; cloud computing

Parallel Numerical Algorithms for Simulation of Rectangular Waveguides by Using GPU

10th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Ciegis, Raimondas; Bugajev, Andrej; Kancleris, Zilvinas; Slekas, Gediminas

Affiliation: Vilnius Gediminas Tech Univ, LT-10223 Vilnius, Lithuania

ISBN: (digital) 9783642551956

ISBN: (print) 9783642551956

In this article we consider parallel numerical algorithms to solve the 3D mathematical model that describes wave propagation in a rectangular waveguide. The main goal is to formulate and analyze a minimal algorithmic template to solve this problem by using the CUDA platform. This template is based on explicit finite difference schemes obtained after approximation of systems of differential equations on the staggered grid. The parallelization of the discrete algorithm is based on the domain decomposition method. The theoretical complexity model is derived and the scalability of the parallel algorithm is investigated. Results of numerical simulations are presented.

Keywords: parallel algorithms; numerical simulation; wave propagation; GPU; CUDA; scalability analysis

Scalable and Efficient Parallel Selection

10th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Siebert, Christian

Affiliation: Rhein Westfal TH Aachen, Dept Comp Sci, Lab Parallel Programming, Aachen, Germany

ISBN: (print) 9783642552243

Selection algorithms find the kth smallest element from a set of elements. Although there are optimal parallel selection algorithms available for theoretical machines, these algorithms are not only difficult to implement but also inefficient in practice. Consequently, scalable applications can only use a few special cases such as minimum and maximum, where efficient implementations exist. To overcome such limitations, we propose a general parallel selection algorithm that scales even on today's largest supercomputers. Our approach is based on an efficient, unbiased median approximation method, recently introduced as median-of-3 reduction, and Hoare's sequential QuickSelect idea from 1961. The resulting algorithm scales with a time complexity of O(log^2 n) for n distributed elements while needing only O(1) space. Furthermore, we prove it to be a practical solution by explaining implementation details and showing performance results for up to 458,752 processor cores.
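The combination of the two ideas named in the abstract can be sketched sequentially in Python. This is only an illustration under stated assumptions, not the paper's MPI implementation: the reduction here groups elements deterministically in list order, the partitioning copies lists instead of working in O(1) space, and the function names `median_of_3_reduction` and `quickselect` are hypothetical.

```python
import random


def median_of_3_reduction(values):
    """Approximate the median by repeatedly replacing each group of
    (up to) three elements with the median of that group."""
    level = list(values)
    while len(level) > 1:
        reduced = []
        for i in range(0, len(level), 3):
            group = sorted(level[i:i + 3])
            reduced.append(group[len(group) // 2])
        level = reduced
    return level[0]


def quickselect(values, k):
    """Return the k-th smallest element (k is 0-based), using the
    approximate median above as the pivot in each round."""
    items = list(values)
    while True:
        pivot = median_of_3_reduction(items)  # always an element of items
        smaller = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(smaller):
            items = smaller
        elif k < len(smaller) + equal_count:
            return pivot
        else:
            k -= len(smaller) + equal_count
            items = [x for x in items if x > pivot]


random.seed(0)
data = random.sample(range(1000), 101)
print(quickselect(data, 50) == sorted(data)[50])
```

Because the pivot is always one of the remaining elements, every round discards at least that element, so the loop terminates for any valid `k`.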

Keywords: selection; QuickSelect; median; parallel algorithms; MPI

GPU-Accelerated Verification of the Collatz Conjecture

14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Honda, Takumi; Ito, Yasuaki; Nakano, Koji

Affiliation: Hiroshima Univ, Dept Informat Engn, Higashihiroshima 7398527, Japan

ISBN: (print) 9783319111971; 9783319111964

The main contribution of this paper is to present an implementation that performs the exhaustive search to verify the Collatz conjecture using a GPU. Consider the following operation on an arbitrary positive number: if the number is even, divide it by two, and if the number is odd, triple it and add one. The Collatz conjecture asserts that, starting from any positive number m, repeated iteration of the operations eventually produces the value 1. We have implemented it on an NVIDIA GeForce GTX TITAN and evaluated the performance. The experimental results show that our GPU implementation can verify 5.01 × 10^11 64-bit numbers per second, while the CPU implementation on an Intel Xeon X7460 can verify 1.80 × 10^9 64-bit numbers per second. Thus, our implementation on the GPU attains a speed-up factor of 278 over the single CPU implementation.
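The operation described above can be sketched as a plain sequential check in Python. This is a minimal illustration, not the paper's GPU implementation; the function name `verify_range`, the `max_steps` cap, and the early-exit shortcut are assumptions of this sketch.

```python
def verify_range(limit, max_steps=10_000):
    """Check every m in [2, limit].

    Returns the first m that could not be verified within max_steps
    iterations, or None if every number was verified.
    """
    for m in range(2, limit + 1):
        n, steps = m, 0
        # Shortcut (an assumption of this sketch): numbers are checked in
        # increasing order, so once n drops below m it was already verified.
        while n >= m:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
            if steps > max_steps:
                return m
    return None


print(verify_range(100_000))
```

Each starting value is independent of the others apart from the shortcut, which is why exhaustive verification of this kind maps naturally onto many GPU threads.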

Keywords: Collatz conjecture; GPGPU; parallel processing; exhaustive verification

KLA: A New Algorithmic Paradigm for Parallel Graph Computations

23rd International Conference on Parallel Architectures and Compilation Techniques (PACT)

Authors: Harshvardhan; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence

Affiliation: Texas A&M Univ, Department Comp Sci & Engn, Parasol Lab, College Stn, TX 77843, USA

ISBN: (print) 9781450328098

This paper proposes a new algorithmic paradigm - k-level asynchronous (KLA) - that bridges level-synchronous and asynchronous paradigms for processing graphs. The KLA paradigm enables the level of asynchrony in parallel graph algorithms to be parametrically varied from none (level-synchronous) to full (asynchronous). The motivation is to improve execution times through an appropriate trade-off between the use of fewer, but more expensive global synchronizations, as in level-synchronous algorithms, and more, but less expensive local synchronizations (and perhaps also redundant work), as in asynchronous algorithms. We show how common patterns in graph algorithms can be expressed in the KLA paradigm and provide techniques for determining k, the number of asynchronous steps allowed between global synchronizations. Results of an implementation of KLA in the STAPL Graph Library show excellent scalability on up to 96K cores and improvements of 10x or more over level-synchronous and asynchronous versions for graph algorithms such as breadth-first search, PageRank, k-core decomposition and others on certain classes of real-world graphs.
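The role of k can be illustrated with a small sequential simulation of breadth-first search in Python. This is a sketch under stated assumptions, not the STAPL implementation: it runs on one thread, counts a "superstep" as the work between two simulated global synchronizations, and uses the hypothetical name `kla_bfs`.

```python
from collections import deque


def kla_bfs(graph, source, k):
    """Simulate a k-level-asynchronous BFS on an adjacency-list graph.

    Updates may propagate up to k hops before the next (simulated)
    global synchronization. Returns (distances, supersteps).
    """
    dist = {source: 0}
    frontier = [source]  # vertices to expand after the next synchronization
    supersteps = 0
    while frontier:
        supersteps += 1
        work = deque((v, 0) for v in frontier)  # (vertex, hops since sync)
        frontier = []
        while work:
            v, hops = work.popleft()
            for w in graph[v]:
                new_d = dist[v] + 1
                if new_d < dist.get(w, float("inf")):
                    dist[w] = new_d
                    if hops + 1 < k:
                        work.append((w, hops + 1))  # continue asynchronously
                    else:
                        frontier.append(w)  # wait for the next global sync
    return dist, supersteps


# Hypothetical example: a path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
for k in (1, 2, 100):
    print(k, kla_bfs(path, 0, k)[1])
```

With k = 1 every newly reached vertex waits for a synchronization (level-synchronous behavior), while a large k lets the whole traversal finish in a single superstep, mirroring the two extremes the abstract describes.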

Keywords: parallel algorithms; asynchronous graph algorithms; graph analytics; big data; distributed computing

Parallel Object-Oriented Implementation of the TestU01 Statistical Test Suites

IEEE 10th International Conference on Intelligent Computer Communication and Processing (ICCP)

Authors: Suciu, Alin; Toma, Radu Alexandru; Marton, Kinga

Affiliation: Tech Univ Cluj Napoca, Dept Comp Sci, Cluj Napoca, Romania

ISBN: (print) 9781479965694

Evaluation of the randomness quality of a random number generator requires an efficient suite of statistical tests that takes advantage of today's multi-core processors in order to cope with the large amount of data to be processed. While, in theory, most complex processing algorithms can be tuned for concurrent execution, the solution will eventually reach a state in which a compromise needs to be made between the overall performance and the configurability and usability of the application. Our solution is based on completely re-designing the TestU01 architecture to include the notion of parallel computing as part of the general requirements, and not as a tool used for increasing performance. Implementation of this design is done using concepts from the object-oriented paradigm and uses the .NET Task Parallel Library. Experimental results show that the parallel OOP-based implementation of the TestU01 library not only obtains results similar to the previous parallel version, but in some cases achieves a better speedup.

Keywords: random number sequences; statistical tests; parallel implementation; object-oriented paradigm; TestU01

Subsquares Approach - A Simple Scheme for Solving Overdetermined Interval Linear Systems

10th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Horacek, Jaroslav; Hladik, Milan

Affiliation: Charles Univ Prague, Dept Appl Math, Fac Math & Phys, CR-11800 Prague, Czech Republic

ISBN: (digital) 9783642551956

ISBN: (print) 9783642551956

In this work we present a new simple but efficient scheme - Subsquares approach - for development of algorithms for enclosing the solution set of overdetermined interval linear systems. We are going to show two algorithms based on this scheme and discuss their features. We start with a simple algorithm as a motivation, then we continue with an improved algorithm. Both algorithms can be easily parallelized. The features of both algorithms will be discussed and numerically tested.

Keywords: interval linear systems; interval enclosure; overdetermined systems; parallel computing

5 Gbit/s real-time processing using π/4-shift DQPSK for bidirectional radio-over-fibre system

International Conference on Transparent Optical Networks

Authors: Kai Habel; Luz Fernandez del Rosal; Stefan Weide; Jonas Hilt; Volker Jungnickel; Robert Elschner; Colja Schubert; Felix Frey; Johannes Karl Fischer; Ronald Freund

Affiliation: Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany

Future converged fixed-mobile networks need high-speed radio links in deployment scenarios where fibre is not available or too expensive. In this paper, we present a field-programmable gate array (FPGA)-based real-time transmission system using standard 10G Ethernet interfaces. The system comprises two parallel complex-valued data channels in each direction. Standard FPGAs and low-cost multi-channel analogue-to-digital converters (ADCs) and digital-to-analogue converters (DACs) have been used. For enhanced robustness and optimal usage of the power amplifier, π/4-shift differential quaternary phase-shift keying (DQPSK) modulation is used. All digital signal processing routines for synchronization, equalization, forward error correction etc. have been fully implemented and tested. Using a protocol analyzer, error-free bidirectional transmission of Ethernet frames at 5 Gbit/s is verified. Error-vector magnitude (EVM) values below -30 dB indicate that even higher speeds could be realized.
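The modulation named in the abstract can be sketched as a noiseless baseband model in Python. This is an illustration only, not the paper's FPGA design: the bit-pair to phase-step table below is a commonly used Gray-coded mapping and is an assumption here (the paper's mapping is not given), and the names `PHASE_STEP`, `modulate`, and `demodulate` are hypothetical.

```python
import cmath
import math

# Assumed Gray-coded mapping from bit pairs to phase increments.
PHASE_STEP = {
    (0, 0): math.pi / 4,
    (0, 1): 3 * math.pi / 4,
    (1, 1): -3 * math.pi / 4,
    (1, 0): -math.pi / 4,
}


def modulate(bits):
    """Map consecutive bit pairs to unit-magnitude symbols whose phase
    advances by the increment assigned to each pair."""
    symbols, phase = [], 0.0
    for i in range(0, len(bits) - 1, 2):
        phase += PHASE_STEP[(bits[i], bits[i + 1])]
        symbols.append(cmath.exp(1j * phase))
    return symbols


def demodulate(symbols):
    """Recover bit pairs from the phase difference between each symbol
    and its predecessor (differential detection, no noise handling)."""
    bits, prev = [], 1 + 0j
    for s in symbols:
        diff = cmath.exp(1j * cmath.phase(s * prev.conjugate()))
        pair = min(PHASE_STEP,
                   key=lambda b: abs(diff - cmath.exp(1j * PHASE_STEP[b])))
        bits.extend(pair)
        prev = s
    return bits


bits = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
print(demodulate(modulate(bits)) == bits)
```

Every phase step is an odd multiple of π/4, so consecutive symbols never pass through the origin of the constellation, which limits envelope variation and is consistent with the abstract's remark about power amplifier usage.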

Keywords: synchronization; field programmable gate arrays; bit error rate; signal processing algorithms; boards; forward error correction; signal to noise ratio

A Parallel Algorithm of Kirchhoff Pre-stack Depth Migration Based on GPU

14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Wang, Yida; Li, Chao; Tian, Yang; Yan, Haihua; Zhao, Changhai; Zhang, Jianlei

Affiliation: Beihang Univ, Sch Comp Sci & Engn, Beijing 100191, Peoples R China

ISBN: (digital) 9783319111940

ISBN: (print) 9783319111940; 9783319111933

The Kirchhoff pre-stack depth migration (KPSDM) algorithm, as one of the most widely used migration algorithms, plays an important part in getting the real image of the earth. However, this program takes considerable time due to its high computational cost; hence the working efficiency of the oil industry is affected. The general-purpose Graphics Processing Unit (GPU) and the Compute Unified Device Architecture (CUDA) developed by NVIDIA have provided a new solution to this problem. In this study, we have proposed a parallel algorithm of the Kirchhoff pre-stack depth migration and an optimization strategy based on the CUDA technology. Our experiments indicate that for large data computations, the accelerated algorithm achieves a speedup of 8 to 15 times compared with NVIDIA GPU.

Keywords: Kirchhoff pre-stack depth migration; GPU; CUDA; parallel algorithm; optimization

Main memory adaptive indexing for multi-core systems

10th International Workshop on Data Management on New Hardware, DaMoN 2014 - In Conjunction with the ACM SIGMOD/PODS Conference

Authors: Alvarez, Victor; Schuhknecht, Felix Martin; Dittrich, Jens; Richter, Stefan

Affiliation: Information Systems Group, Saarland University, Germany

ISBN: (print) 9781450329712

Adaptive indexing is a concept that considers index creation in databases as a by-product of query processing, as opposed to traditional full index creation where the indexing effort is performed up front before answering any queries. Adaptive indexing has received a considerable amount of attention, and several algorithms have been proposed over the past few years, including a recent experimental study comparing a large number of existing methods. Until now, however, most adaptive indexing algorithms have been designed single-threaded, yet with multi-core systems already well established, the idea of designing parallel algorithms for adaptive indexing is very natural. In this regard, and to the best of our knowledge, only one parallel algorithm for adaptive indexing has recently appeared in the literature: the parallel version of standard cracking. In this paper we describe three alternative parallel algorithms for adaptive indexing, including a second variant of a parallel standard cracking algorithm. Additionally, we describe a hybrid parallel sorting algorithm, and a NUMA-aware method based on sorting. We then thoroughly compare all these algorithms experimentally. Parallel sorting algorithms serve as a realistic baseline for multithreaded adaptive indexing techniques. In total we experimentally compare seven parallel algorithms. The initial set of experiments considered in this paper indicates that our parallel algorithms significantly improve over previously known ones. Our results also suggest that, although adaptive indexing algorithms are a good design choice in single-threaded environments, the rules change considerably in the parallel case. That is, in future highly-parallel environments, sorting algorithms could be serious alternatives to adaptive indexing. Copyright 2014 ACM.
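The idea of building an index as a by-product of queries can be sketched with a single-threaded version of standard cracking in Python. This is a minimal illustration, not one of the paper's parallel algorithms: the class name `CrackedColumn` and its methods are hypothetical, and the partition step copies the affected piece instead of swapping elements in place.

```python
import bisect


class CrackedColumn:
    """A column that is partially reordered by each range query."""

    def __init__(self, values):
        self.data = list(values)
        self.pivots = []     # sorted pivot values seen so far
        self.positions = []  # positions[i]: index of first element >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot` and return the split index."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [x for x in piece if x < pivot]
        self.data[lo:hi] = smaller + [x for x in piece if x >= pivot]
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high (in column order)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
print(sorted(column.range_query(5, 12)))
print(column.pivots)
```

Each query only reorders the pieces its bounds fall into, so frequently queried ranges become progressively cheaper to answer while untouched regions are never sorted.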

Keywords: parallel algorithms
