Search Results - Inner Mongolia University Library

High-Performance Psychometrics: The parallel-E parallel-M Algorithm for Generalized Latent Variable Models

ETS Research Report Series, 2016, Vol. 2016, No. 2

Author: Matthias von Davier (Educational Testing Service, Princeton, NJ)

This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response theory, diagnostic classification models, multitrait–multimethod (MTMM) models, and discrete mixture distribution models. These types of models are frequently applied to the analysis of multidimensional responses of test takers to a set of items, for example, in the context of proficiency testing. The algorithm presented here is based on a direct implementation of massive parallelism using a paradigm that allows the distribution of work among a number of processor cores. Modern desktop computers as well as many laptops are using processors that contain 2–4 cores and potentially twice the number of virtual cores. Many servers use 2, 4, or more multicore central processing units (CPUs), which brings the number of cores to 8, 12, 32, or even 64 or more. The algorithm presented here scales the time reduction in the most calculation-intense part of the program almost linearly for some problems, which means that a server with 32 physical cores executes the parallel-E step algorithm up to 24 times faster than a single-core computer or the equivalent nonparallel algorithm. The overall gain (including parts of the program that cannot be executed in parallel) can reach a reduction in time by a factor of 6 or more for a 12-core machine. The basic approach is to utilize the architecture of modern CPUs, which often involves the design of processors with multiple cores that can run programs simultaneously. The use of this type of architecture for algorithms that produce posterior moments has straightforward appeal: The calculations conducted for each respondent or each distinct response pattern can be split up into simultaneous calculations

Keywords: parallel programming; EM algorithm; high-performance computation (HPC); efficient estimation; modern psychometric models
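The idea in the abstract above, that posterior calculations for each distinct response pattern are independent and can run side by side, can be illustrated with a minimal sketch. It is not the report's implementation: the two-class latent class model, the numbers in `CLASS_PRIORS` and `ITEM_PROBS`, and the function names `posterior` and `e_step` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod

# Hypothetical two-class latent class model; all numbers are illustrative.
CLASS_PRIORS = [0.6, 0.4]
# ITEM_PROBS[c][j] = P(item j answered correctly | latent class c)
ITEM_PROBS = [
    [0.2, 0.3, 0.1],
    [0.8, 0.7, 0.9],
]


def posterior(pattern):
    """Posterior class probabilities for one 0/1 response pattern."""
    joint = [
        prior * prod(p if x == 1 else 1.0 - p for p, x in zip(probs, pattern))
        for prior, probs in zip(CLASS_PRIORS, ITEM_PROBS)
    ]
    total = sum(joint)
    return [value / total for value in joint]


def e_step(patterns, workers=4):
    """Compute posteriors for all patterns, one independent task per pattern."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with patterns.
        return list(pool.map(posterior, patterns))


patterns = [(1, 1, 1), (0, 0, 0), (1, 0, 1)]
posteriors = e_step(patterns)
for pattern, probs in zip(patterns, posteriors):
    print(pattern, [round(p, 4) for p in probs])
```

Threads are used here only to keep the sketch short. In CPython, pure-Python arithmetic does not run faster on threads, so `ProcessPoolExecutor` from the same module would be the usual replacement when an actual speedup on multiple cores is wanted.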

PROVING OPACITY OF TRANSACTIONAL MEMORY WITH EARLY RELEASE

FOUNDATIONS OF COMPUTING AND DECISION SCIENCES, 2015, Vol. 40, No. 4, pp. 317-335

Authors: Siek, Konrad; Wojciechowski, Pawel T.

Transactional Memory (TM) is an alternative way of synchronizing concurrent accesses to shared memory by adopting the abstraction of transactions in place of low-level mechanisms like locks and barriers. TMs usually apply optimistic concurrency control to provide a universal and easy-to-use method of maintaining correctness. However, this approach performs a high number of aborts in high contention workloads, which can adversely affect performance. Optimistic TMs can also cause problems when transactions contain irrevocable operations. Hence, pessimistic TMs were proposed to solve some of these problems. However, an important way of achieving efficiency in pessimistic TMs is to use early release. On the other hand, early release is seemingly at odds with opacity, the gold standard of TM safety properties, which does not allow transactions to make their state visible until they commit. In this paper we propose a proof technique that makes it possible to demonstrate that a TM with early release can be opaque as long as it prevents inconsistent views.

Keywords: Concurrency; parallel programming; Software Transactional Memory; Safety; Early Release

Performance Comparison of OpenMP, MPI, and MapReduce in Practical Problems

ADVANCES IN MULTIMEDIA, 2015, Vol. 2015, No. 1

Authors: Kang, Sol Ji; Lee, Sang Yeon; Lee, Keon Myung (Chungbuk Natl Univ, Dept Comp Sci, Cheongju 361763, Chungbuk, South Korea)

With problem size and complexity increasing, several parallel and distributed programming models and frameworks have been developed to efficiently handle such problems. This paper briefly reviews the parallel computing models and describes three widely recognized parallel programming frameworks: OpenMP, MPI, and MapReduce. OpenMP is the de facto standard for parallel programming on shared memory systems. MPI is the de facto industry standard for distributed memory systems. The MapReduce framework has become the de facto standard for large scale data-intensive applications. Qualitative pros and cons of each framework are known, but quantitative performance indexes help give a good picture of which framework to use for a given application. Two benchmark problems are chosen to compare the frameworks: the all-pairs-shortest-path problem and the data join problem. This paper presents the parallel programs for these problems implemented on each of the three frameworks. It shows the experimental results on a cluster of computers. It also discusses which is the right tool for each job by analyzing the characteristics and performance of the paradigms.

Keywords: parallel programming
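For readers unfamiliar with the first benchmark problem named in the abstract above, a sequential all-pairs-shortest-path solver can be sketched as follows. The choice of the Floyd-Warshall algorithm, the function name `floyd_warshall`, and the example graph are assumptions; the listing does not say which algorithm the paper implements.

```python
INF = float("inf")  # marks a missing edge


def floyd_warshall(weights):
    """Return the matrix of shortest path lengths between all vertex pairs.

    weights[i][j] is the weight of edge i -> j, or INF if there is no edge.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):            # allow vertex k as an intermediate stop
        for i in range(n):        # for a fixed k, rows i are independent
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_warshall(graph)
for row in shortest:
    print(row)
```

For a fixed `k`, the updates of different rows `i` do not depend on each other, which is the part a shared-memory or message-passing version would typically distribute across workers.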

Parallel preconditioned conjugate gradient method for large sparse and highly ill-conditioned systems arising in computational geomechanics

INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2015, Vol. 11, No. 4, pp. 409-419

Authors: Kardani, Omid; Lyamin, Andrei V.; Krabbenhoft, Kristian (Univ Newcastle, Ctr Excellence Geotech Sci & Engn, Callaghan, NSW 2287, Australia)

The efficiency of the parallel preconditioned conjugate gradient (PCG) algorithm for solving large sparse linear systems arising from the application of interior point methods to conic optimisation problems in the context of nonlinear finite element limit analysis (FELA) for computational geomechanics is studied. For large 3D problems, the use of direct solvers in general becomes prohibitively expensive owing to exponentially growing memory requirements and computational time, and the so-called saddle-point systems resulting from the use of an optimisation framework are no exception. On the other hand, although preconditioned iterative methods have moderate storage requirements and can therefore be applied to much larger problems than direct methods, they usually need a high number of iterations to reach convergence. In the present paper, we show that this problem can be effectively tackled using efficient variants of sparse approximate inverse preconditioners along with an elaborate parallel implementation on multicore CPUs, and that significant improvements can be achieved by a parallel implementation on a graphics processing unit (GPU). Furthermore, the efficiency of our proposed implementation is verified by the presented numerical results.

Keywords: approximate inverse preconditioner; limit analysis; preconditioned conjugate gradient method; cone programming; multicore processors; graphic processing unit; GPU; parallel programming; computational geomechanics
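The structure of a preconditioned conjugate gradient iteration, as referred to in the abstract above, can be sketched on a small dense system. This sketch swaps in a simple Jacobi (diagonal) preconditioner rather than the sparse approximate inverse preconditioners the paper studies, and the names `pcg`, `matvec`, and `dot` as well as the 3x3 test matrix are assumptions chosen for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradients with a Jacobi preconditioner (M = diag(A)).
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = pcg(A, b)
print(x, iterations)
```

The matrix-vector product and the inner products dominate the cost of each iteration, so they are the natural candidates for distribution across CPU cores or a GPU.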

Parallel programming applied to the N Scheme for solving FE cases without assembling an A x = b system

Biennial Institute of Electrical and Electronics Engineers Conference on Electromagnetic Field Computation

Authors: J. Eyng; J. P. A. Bastos; N. Sadowski; M. Fischborn; M. A. R. Dantas; D. J. Ferreira (GRUCAD/EEL/CTC, Universidade Federal de Santa Catarina; UTFPR, Universidade Tecnologica Federal do Parana, Campus Medianeira; LaPeSD/INE/CTC, Universidade Federal de Santa Catarina)

ISBN: (print) 9781424470594

The classical solution of electromagnetic problems using the finite element (FE) method needs to assemble, store and solve an Ax = b matrix system. A new technique for solving FE cases, considered much simpler than traditional methods, shows that assembling the matrix A is unnecessary [1]. The two techniques differ in computation and processing time: the new one requires more iterations to converge, although its results are reliable. One possible way to improve its performance is the application of parallelization techniques.

Keywords: Iron GTF2E1 gene assemblies parallel programming

Nested parallelism in transactional memory

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, Vol. 8913, pp. 192-209

Authors: Filipe, Ricardo; Barreto, João (Instituto Superior Técnico, Universidade de Lisboa/INESC-ID, Portugal)

We are witnessing an increase in the parallel power of computers for the foreseeable future, which requires parallel programming tools and models that can take advantage of the higher number of hardware threads. For some applications, reaching up to such high parallelism requires going beyond the typical monolithic parallel model: it calls for exposing fine-grained parallel tasks that might exist in a program, possibly nested within memory *** most current mainstream transactional memory (TM) systems do not yet support nested parallel transactions, recent research has proposed approaches that leverage TM with support for fine-grained parallel transactional nesting. These novel solutions promise to unleash the parallel power of TM to unprecedented levels. This chapter addresses parallel nesting models in transactional memory from two distinct *** start from the programmer’s perspective, studying the spectrum of parallel nested models that are available to programmers, and giving a practical tutorial on the utility of each model, as well as the languages, tools and frameworks that help programmers build nested-parallel programs. We then turn to the perspective of a TM runtime designer, focusing on state-of-the-art algorithms that support nested parallelism. © Springer International Publishing Switzerland 2015.

Keywords: parallel programming

Key agreement under tropical parallels

GROUPS COMPLEXITY CRYPTOLOGY, 2015, Vol. 7, No. 2, pp. 194-198

Authors: Chauvet, Jean-Marie; Mahe, Eric (MassiveRand, 62 Ave Pierre Grenier, F-92100 Boulogne, France)

A semiring is an algebraic structure satisfying the usual axioms for a not necessarily commutative ring, but without the requirement that addition be invertible. Aside from rings, well-studied instances in cryptographic applications include the Boolean semiring and the tropical semiring. The latter, in particular, behaves to a large extent like a field and exhibits interesting properties in the cryptographic context. This short note explores a GPU-based highly parallel implementation of a protocol recently proposed by Grigoriev and Shpilrain [7], in the context of Diffie-Hellman key agreements.

Keywords: Cryptography; Diffie-Hellman key exchange; tropical algebra; GPU; parallel programming
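The tropical semiring mentioned in the abstract above can be made concrete with a short sketch of matrix multiplication in the min-plus convention, where tropical addition is `min` and tropical multiplication is ordinary `+`. The min-plus convention, the function name `trop_matmul`, and the example matrices are assumptions for illustration; this is not the key agreement protocol itself.

```python
INF = float("inf")  # tropical "zero": the identity element for min


def trop_matmul(A, B):
    """Min-plus matrix product: C[i][j] = min over k of (A[i][k] + B[k][j])."""
    inner = len(B)
    return [
        [min(A[i][k] + B[k][j] for k in range(inner)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Tropical identity matrix: 0 on the diagonal, INF elsewhere.
IDENTITY = [[0, INF], [INF, 0]]

A = [[0, 3], [2, 1]]
B = [[1, 4], [0, 2]]
C = trop_matmul(A, B)
print(C)  # [[1, 4], [1, 3]]
```

Each entry of the product is computed independently of the others, which is what makes this operation a natural fit for a highly parallel GPU implementation.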

Parallel programming with Transactional Memory

COMMUNICATIONS OF THE ACM, 2009, Vol. 52, No. 2, pp. 38-43

Author: Drepper, Ulrich (Red Hat)

While still primarily a research project, transactional memory shows promise for making parallel programming easier.

Keywords: parallel programming

Multicore processor — Architecture and programming

International Symposium on VLSI Design and Test (VDAT)

Author: N. Sudha (Senior Development Engineer, XMOS Semiconductor Ltd, Chennai, Tamil Nadu, India)

In the past, speedup was achieved in a processor by increasing clock speed. Multicore processors are the new direction semiconductor companies are focusing on to boost performance. This tutorial first covers the concept of multicore, introducing the need for it and its challenges. The key aspects of multicore architecture design and the detailed architecture of the XMOS multicore microcontroller will be presented. The tutorial then covers parallel programming concepts and introduces the language constructs that exploit the architectural features specific to XMOS processors. A few case studies on application-specific design in the domains of industrial communication and image processing will be presented. Sample programs will be demonstrated to give a clear understanding of programming on multicores. The participants will also try these demos to get hands-on experience in multicore programming.

Keywords: Multicore processing; Tutorials; Microcontrollers; Clocks; parallel programming

Performance Evaluation of Unscented Kalman Filter Using Multi-Core Processors Environment

International Conference on Computer, Communication and Control

Authors: Suresh Kumar Sharma; Manisha J. Nene (Defence Inst. of Adv. Technol., Pune, India)

ISBN: (print) 9781479981656

The Unscented Kalman Filter (UKF) is widely used for nonlinear systems in applications such as submarine tracking, aircraft surveillance, autonomous robotics and mobile systems. One of the typical problems solved using UKF is Bearing-Only Target Motion Analysis (BOTMA) for manoeuvring and non-manoeuvring targets. This paper proposes a methodology for parallel execution of UKF with the aim of enhancing its performance in terms of computational throughput. The parallel UKF algorithm for BOTMA is executed in a multi-core processor environment. The study concentrates on identifying the phases of UKF-enabled BOTMA that can be parallelized to execute on the underlying hardware and improve the response time. The performance is observed and the results are verified.

Keywords: Unscented Kalman Filter (UKF); Bearing-Only Target Motion Analysis (BOTMA); parallel programming; Time complexity; Computational complexity
