Search results - Inner Mongolia University Library

32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Dinda, Peter; Hetland, Conor (Northwestern Univ, Evanston, IL 60208, USA)

ISBN: (print) 9781538643686

Floating point arithmetic, as specified in the IEEE standard, is used extensively in programs for science and engineering. This use is expanding rapidly into other domains, for example with the growing application of machine learning everywhere. While floating point arithmetic often appears to be arithmetic using real numbers, or at least numbers in scientific notation, it actually has a wide range of gotchas. Compiler and hardware implementations of floating point inject additional surprises. This complexity is only increasing as different levels of precision are becoming more common and there are even proposals to automatically reduce program precision (reducing power/energy and increasing performance) when results are deemed "good enough." Are software developers who depend on floating point aware of these issues? Do they understand how floating point can bite them? To find out, we conducted an anonymous study of different groups from academia, national labs, and industry. The participants in our sample did only slightly better than chance in correctly identifying key unusual behaviors of the floating point standard, and poorly understood which compiler and architectural optimizations were nonstandard. These surprising results and others strongly suggest caution in the face of the expanding complexity and use of floating point arithmetic.

Keywords: floating point arithmetic software development user studies correctness IEEE 754
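The kind of "gotchas" the abstract above alludes to can be illustrated with a few well-known IEEE 754 binary64 behaviors. The sketch below is not taken from the study itself; the function name `floating_point_surprises` and the particular checks are assumptions chosen for illustration, and the results assume Python's `float` is an IEEE 754 double with the default rounding mode.

```python
import math


def floating_point_surprises():
    """Evaluate a few expressions whose results often surprise developers.

    Hypothetical helper for illustration only; the checks are standard
    IEEE 754 double-precision facts, not items from the study's survey.
    """
    nan = float("nan")
    return {
        # 0.1, 0.2 and 0.3 are not exactly representable in binary.
        "0.1 + 0.2 == 0.3": 0.1 + 0.2 == 0.3,
        # Addition is not associative because each step rounds.
        "addition is associative": (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3),
        # NaN compares unequal to everything, including itself.
        "nan == nan": nan == nan,
        # Signed zeros compare equal even though their bit patterns differ.
        "-0.0 == 0.0": -0.0 == 0.0,
        # inf - inf is an invalid operation and yields NaN.
        "inf - inf is nan": math.isnan(math.inf - math.inf),
        # Above 2**53 not every integer is representable, so +1 can be lost.
        "1e16 + 1 == 1e16": 1e16 + 1.0 == 1e16,
    }


if __name__ == "__main__":
    for claim, value in floating_point_surprises().items():
        print(f"{claim}: {value}")
```

The first three entries evaluate to False and the last three to True, which is the opposite of what real-number intuition suggests in most of the cases.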

Teaching Parallel Programming with Active Learning

32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Kuhail, Mohammad Amin; Cook, Spencer; Neustrom, Joshua W.; Rao, Praveen (Univ Missouri, Sch Comp & Engn, Kansas City, MO 64110, USA)

ISBN: (print) 9781538655559

Today parallel computing is essential for the success of many real-world applications and software systems. Nonetheless, most computer science undergraduate courses teach students how to think and program sequentially. Further, software professionals have complained about the computer science curriculum's lag behind industry in their failing to cover modern programming technologies such as parallel programming. The emphasis on parallel programming has become even more important due to the increasing adoption of horizontal scaling approaches to cope with massive datasets. In order to help students coming from a serial curriculum comprehend parallel concepts, we used an innovative approach that utilized active learning, visualizations, examples, discussions, and practical exercises. Further, we conducted an experiment to examine the effect of active learning on students' understanding of parallel programming. Results indicate that the students that were actively engaged with the material performed better in terms of understanding parallel programming concepts than other students.

Keywords: parallel programming teaching OpenMP data structures visualizations active learning

Commodity clusters: Performance comparison between PC's and workstations

5th IEEE International Symposium on High Performance Distributed Computing

Authors: Carter, R; Laroco, J; Armstrong, R (Sandia Natl Labs, Livermore, CA 94551)

ISBN: (print) 0818675829

DAISy (Distributed Array of Inexpensive Systems) is a 16-node PC cluster running a full UNIX-compatible operating system. The network media used include standard 10Mb/s (10BASE-2) Ethernet (used for client node NFS mounts and any client node interactive work users find necessary) and switched 100Mb/s (100BASE-TX) Fast Ethernet (used for user program message passing traffic). The DAISy cluster is used to investigate the viability of commodity PC technology for the scientific and engineering computations traditionally performed on 'supercomputers' and, more recently, on high performance RISC workstations and clusters of RISC workstations. Performance analysis of the various single node subsystems was carried out, along with performance analysis of the cluster as a whole on a number of parallel applications. The results show that the performance of the current Pentium 90MHz CPU and motherboards used is well within the range of many low-end workstations offered by traditional workstation vendors.

Keywords: distributed computer systems

Data structures for the distributed iterative solution of non-conventional finite element models

Advances in Engineering Software, 2007, vol. 38, no. 11-12, pp. 750-762

Authors: Cismasiu, Ildi; Moitinho de Almeida, J. P. (Univ Nova Lisboa, Fac Ciencias & Tecnol, Ctr Invest Estruturas & Construcao UNIC, Dept Civil Engn, P-2829 Monte De Caparica, Portugal; Univ Tecn Lisboa, Inst Super Tecn, Dept Engn Civil & Arquitectura, P-1049 Lisbon, Portugal)

A class of specialised data structures designed for the distributed solution of non-conventional finite element formulations, which are equally effective when used in conjunction with conventional formulations, is presented. We begin by briefly discussing how the non-conventional finite element formulations being developed within the structural analysis group at IST [Freitas JAT, Almeida JPM, Pereira EMBR. Non-conventional formulations for the finite element method. Comput Mech 1999;23(5-6):488-501] lead to systems of equations that appear to be naturally suited for parallel processing, but we also recognise that to take full advantage of the characteristics of these systems (large dimension, non-overlapping block structure and sparsity) it is necessary to use appropriate data structures. The approach presented, which references the logical subdivisions of the system matrices, was designed to fulfil these objectives. Examples of parallel performance and efficiency on a homogeneous distributed platform are presented. (c) 2006 Published by Elsevier Ltd.

Keywords: parallel processing matrix handling data structures hybrid finite elements domain decomposition

Optimizing OpenMP programs on software distributed shared memory systems

International Journal of Parallel Programming, 2003, vol. 31, no. 3, pp. 225-249

Authors: Min, SJ; Basumallik, A; Eigenmann, R (Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907, USA)

This paper describes compiler techniques that can translate standard OpenMP applications into code for distributed computer systems. OpenMP has emerged as an important model and language extension for shared-memory parallel programming. However, despite OpenMP's success on these platforms, it is not currently being used on distributed systems. The long-term goal of our project is to quantify the degree to which such a use is possible and develop supporting compiler techniques. Our present compiler techniques translate OpenMP programs into a form suitable for execution on a software DSM system. We have implemented a compiler that performs this basic translation, and we have studied a number of hand optimizations that improve the baseline performance. Our approach complements related efforts that have proposed language extensions for efficient execution of OpenMP programs on distributed systems. Our results show that, while kernel benchmarks can show high efficiency of OpenMP programs on distributed systems, full applications need careful consideration of shared data access patterns. A naive translation (similar to OpenMP compilers for SMPs) leads to acceptable performance in very few applications only. However, additional optimizations, including access privatization, selective touch, and dynamic scheduling, result in a 31% average improvement on our benchmarks.

Keywords: OpenMP applications software distributed shared memory benchmarks performance characteristics optimizations

Byzantine Fault-Tolerant Implementation of a Multi-Writer Regular Register

23rd IEEE International Parallel and Distributed Processing Symposium

Authors: Kanjani, Khushboo; Lee, Hyunyoung; Welch, Jennifer L. (Oracle Corporation, United States; Dept. of Computer Science and Engineering, Texas A and M University, United States)

ISBN: (print) 9781424437511

Distributed storage systems have become popular for handling the enormous amounts of data in network-centric systems. A distributed storage system provides client processes with the abstraction of a shared variable that satisfies some consistency and reliability properties. Typically the properties are ensured through a replication-based implementation. This paper presents an algorithm for a replicated read-write register that can tolerate Byzantine failures of some of the replica servers. The targeted consistency condition is a version of regularity that supports multiple writers. Although regularity is weaker than the more frequently supported condition of atomicity, it is still strong enough to be useful in some important applications. By weakening the consistency condition, the algorithm can support multiple writers more efficiently than the known multi-writer algorithms for atomic consistency.

Keywords: Multiprocessing systems

Porting industrial codes and developing sparse linear solvers on parallel computers

Computing Systems in Engineering, 1995, vol. 6, no. 4-5, pp. 295-305

Authors: Dayde, MJ; Duff, IS (CERFACS, F-31057 Toulouse, France; Rutherford Appleton Lab, Didcot OX11 0QX, Oxon, England)

We address the main issues when porting existing codes from serial to parallel computers and when developing portable parallel software on MIMD multiprocessors (shared memory, virtual shared memory, and distributed memory multiprocessors, and networks of computers). We discuss the use of numerical libraries as a way of developing portable and efficient parallel code. We illustrate this by using examples from our experience in porting industrial codes and in designing parallel numerical libraries. We report in some detail on the parallelization of scientific applications coming from Centre National d'Etudes Spatiales and from Aerospatiale, and we illustrate how it is possible to develop portable and efficient numerical software by considering the parallel solution of sparse linear systems of equations.

Keywords: parallel processing systems

Parallel Computing for Machine Learning in Social Network Analysis

31st IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPS)

Authors: Cybenko, George (Dartmouth Coll, Thayer Sch Engn, Hanover, NH 03755, USA)

ISBN: (print) 9780769561493

Machine learning, especially deep learning, is revolutionizing how many engineering problems are being solved. Three critical ingredients are needed to apply deep machine learning to significant real world problems: i) large data sets; ii) software to implement deep learning; and iii) significant computing cycles. This paper discusses the state of each ingredient with a specific focus on: a) how deep learning can apply to large-scale social network analysis; and b) the computing resources required to make such analyses feasible.

Keywords: Social network analysis machine learning deep learning parallel computing

The Research and Analysis of Hungarian Algorithm in the Structure Index Reduction for DAE

11th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES)

Authors: Zeng, Yan; Wu, Xuesong; Cao, Jianwen (Chinese Acad Sci, Inst Software, Lab Parallel Software & Computat Sci Software, Beijing 100190, Peoples R China)

ISBN: (print) 9780769548180

Modeling of complex physical systems with Modelica usually leads to a high-index differential algebraic equation (DAE) system, and index reduction is an important part of solving a high-index DAE. The structure index reduction algorithm is one of the popular methods, but in special cases it fails. The combinatorial relaxation algorithm can detect and correct the breakdown situation, and the maximum weight matching of a bipartite graph is an important part of that algorithm. In order to choose the proper method for large-scale, dense bipartite graphs, this paper provides three implementations of the Hungarian algorithm. The experimental results and the theory show that the BFS single-augmented method is better than the others.

Keywords: Modelica Bipartite Graph DAE Hungarian Algorithm Augmenting Path
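The maximum weight bipartite matching step mentioned in the abstract above can be sketched with a textbook Hungarian (Kuhn-Munkres) routine. This is not one of the paper's three implementations, and in particular it is not the BFS single-augmented variant the authors favor; it is the common O(n^3) potentials-based formulation, assuming a dense square weight matrix. The function name `max_weight_assignment` is hypothetical.

```python
def max_weight_assignment(weights):
    """Maximum weight perfect matching on a dense n x n bipartite graph.

    Textbook O(n^3) Hungarian algorithm with row/column potentials.
    Returns (total_weight, match) where match[i] is the column assigned
    to row i. Illustrative sketch only, not the paper's implementation.
    """
    n = len(weights)
    # The classic formulation minimizes cost, so negate the weights.
    cost = [[-w for w in row] for row in weights]
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (n + 1)      # column potentials (1-indexed)
    p = [0] * (n + 1)      # p[j] = row matched to column j, 0 if free
    way = [0] * (n + 1)    # way[j] = previous column on the augmenting path

    for i in range(1, n + 1):
        p[0] = i           # virtual column 0 holds the row being inserted
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            # Relax reduced costs from row i0 to every unvisited column.
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so the cheapest unvisited column becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    total = sum(weights[i][match[i]] for i in range(n))
    return total, match


if __name__ == "__main__":
    print(max_weight_assignment([[7, 5], [8, 1]]))
```

For the 2 x 2 example in the `__main__` block, the best assignment pairs row 0 with column 1 and row 1 with column 0, for a total weight of 13. Rectangular or sparse graphs would need padding or a different data layout, which this sketch does not attempt.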

High-performance parallel computation of flows past a space plane using NWT

IEICE Transactions on Information and Systems, 1997, vol. E80D, no. 4, pp. 524-530

Authors: Matsushima, K; Takanashi, S (HPC Systems Engineering Div., Fujitsu Ltd., Chiba-shi 261, Japan; National Aerospace Laboratory, Chofu-shi 182, Japan)

Compressible viscous flows past a space plane have been elucidated by parallel computation on the NWT. The NWT is a vector-parallel architecture computer system which achieves remarkably high performance in processing speed and memory storage. We have examined the advantages of the NWT in order to simulate realistic flow problems in engineering, such as the investigation of global and local aerodynamic characteristics of a space plane. The accuracy of the computational results has been verified by comparison with experimental data. The simplified domain-decomposition technique introduced here is easy to apply for parallel implementation to significantly improve the acceleration rate of computations. The larger available memory storage enables us to conduct a grid refinement study through which several points concerning CFD simulation of a space plane are obtained.

Keywords: computational fluid dynamics (CFD) Navier-Stokes equations space plane vector-parallel computation
