Search results - Inner Mongolia University Library

Contextual contracts for component-oriented resource abstraction in a cloud of high performance computing services

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, Vol. 33, No. 18, p. e6225

Authors: de Carvalho Junior, Francisco Heron; Al-Alam, Wagner Guimaraes; de Oliveira Dantas, Allberson B. Affiliations: Univ Fed Ceara, Posgrad Ciencia Comp, Fortaleza, Ceara, Brazil; Univ Fed Ceara, Campus Quixada, Quixada, Brazil; Univ Integracao Int Lusofonia Afro Brasileira, Inst Educ Distancia, Redencao, Brazil

Efforts to support high performance computing (HPC) applications' requirements in the context of cloud computing have motivated us to design HPC Shelf, a cloud computing services platform to build and deploy large-scale parallel computing systems. We introduce Alite, the contextual contract system of HPC Shelf, to select component implementations according to requirements of the host application, target parallel computing platform characteristics (e.g., clusters and MPPs), quality of service (QoS) properties, and cost restrictions. It is evaluated through a small-scale case study employing two complementary component-based frameworks. The first one aims to represent components that implement linear algebra computations based on the BLAS interface. In turn, the second one aims to represent parallel computing platforms on the IaaS cloud offered by Amazon EC2 Service.

Keywords: component-based software engineering; high performance computing; parallel computing; parallel programming

Matrix bidiagonalization on the Trident processor

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: M.I. Soliman; S.G. Sedukhin. Affiliation: Graduate School of Computer Science and Engineering, University of Aizu, Fukushima, Japan

This paper discusses the implementation and evaluation of the reduction of a dense matrix to bidiagonal form on the Trident processor. The standard Golub and Kahan Householder bidiagonalization algorithm, which is rich in matrix-vector operations, and the LAPACK subroutine GEBRD, which is rich in a mixture of vector, matrix-vector, and matrix operations, are simulated on the Trident processor. We show how to use the Trident parallel execution units, ring, and communication registers to effectively perform the vector, matrix-vector, and matrix operations needed for bidiagonalizing a matrix. The number of clock cycles per FLOP is used as a metric to evaluate the performance of the Trident processor. Our results show that increasing the number of Trident lanes proportionally decreases the number of cycles needed per FLOP. On a 32 K × 32 K matrix and 128 Trident lanes, using matrix-vector operations in the standard Golub and Kahan algorithm gives a speedup of around 1.5 times over using vector operations. However, using matrix operations in the GEBRD subroutine gives a speedup of around 3 times over vector operations, and 2 times over using matrix-vector operations in the standard Golub and Kahan algorithm.

Keywords: Registers; Architecture; Matrix decomposition; Parallel processing; Algorithms; Hardware; Parallel programming; Computer science; Cities and towns; Clocks

MapReduce programming with Apache Hadoop

International Symposium on Parallel and Distributed Processing (IPDPS)

Author: Milind Bhandarkar. Affiliation: Hadoop Solutions Architect, Yahoo Inc., USA

Summary form only given: Apache Hadoop has become the platform of choice for developing large-scale data-intensive applications. In this tutorial, we will discuss the design philosophy of Hadoop and describe how to design and develop Hadoop applications and higher-level application frameworks to crunch several terabytes of data, using anywhere from four to 4,000 computers. We will discuss solutions to common problems encountered in maximizing Hadoop application performance. We will also describe several frameworks and utilities developed using Hadoop that increase programmer productivity and application performance.

Keywords: Application software; Large-scale systems; Biographies; Parallel programming; Computational modeling; Rockets

ARV-based Array Grouping and Data Distribution in OpenMP/JIAJIA

IEEE International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Authors: Zeng Lifang; Yang Xuejun; H. Huangchun. Affiliation: National Laboratory for Parallel and Distributed Processing, China

To improve the performance of applications on OpenMP/JIAJIA, we present a new abstraction, the Array Relation Vector (ARV), which describes the relation between the data elements of two consistent shared arrays accessed in one computation phase. Based on ARV, we use array grouping to eliminate the pseudo data distribution of small shared data and improve page locality. Experimental results show that ARV-based array grouping can greatly improve the performance of applications with non-contiguous data access and strict access affinity on an OpenMP/JIAJIA cluster. For applications with small shared arrays, array grouping noticeably improves performance when the number of processors is small.

Keywords: Phased arrays; Concurrent computing; Distributed computing; Application software; Laboratories; Distributed processing; Parallel programming; Parallel architectures; Programming profession; Emulation

A performance comparison between stop-the-world and multithreaded concurrent generational garbage collection for Java

IEEE International Conference on Performance, Computing and Communications (IPCCC)

Authors: C.-T.D. Lo; W. Srisa-an; J.M. Chang. Affiliation: Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA

ISBN: (print) 0780373715

The recent popularity of the Java programming language has brought automatic dynamic memory management (a.k.a. garbage collection) into the mainstream. Traditional garbage collectors suffer from long garbage collection pauses (stop-the-world mark-sweep algorithm) or an inability to collect cyclic garbage (reference counting approach). Generational garbage collection, however, is based only on the weak generational hypothesis that most objects die young. In this paper, the performance evaluation of a new multithreaded concurrent generational garbage collector (MCGC) based on mark-sweep with the assistance of reference counting is reported. The MCGC can take advantage of multiple CPUs in an SMP system and the merits of lightweight processes. Furthermore, the long garbage collection pause can be reduced and the garbage collection efficiency can be enhanced. Measurement results indicate that the MCGC improves the garbage collection pause time by up to 96.75% over the traditional stop-the-world mark-sweep garbage collector. Moreover, the MCGC incurs minimal time and space penalties, as shown by the reported total execution time, memory footprint, and sticky reference count rate.

Keywords: Java; Dynamic programming; Memory management; Object oriented programming; Parallel programming; Virtual machining; Computer science; Computer languages; Technology management; Time measurement

A Simulator For Reconfigurable Massively Parallel Architectures

Euromicro Workshop on Parallel and Distributed Processing

Authors: P. Baglietto; M. Maresca; M. Migliardi. Affiliation: DIST-University of Genoa, Genoa, Italy

Cyclic networks: A family of versatile fixed-degree interconnection architectures

International Symposium on Parallel Processing

Authors: C.-H. Yeh; B. Parhami. Affiliation: Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA, USA

In this paper, we propose a new family of interconnection networks, called cyclic networks (CNs), in which an intercluster connection is defined on a set of nodes whose addresses are cyclic shifts of one another. The node degrees of basic CNs are independent of system size, but can vary from a small constant (e.g., 3) to as large as required, thus providing flexibility and an effective tradeoff between cost and performance. The diameters of suitably constructed CNs can be asymptotically optimal within their lower bounds, given the degrees. We show that packet routing and ascend/descend algorithms can be performed in $\Theta(\log_d N)$ communication steps on some CNs with $N$ nodes of degree $\Theta(d)$. Moreover, CNs can also efficiently emulate homogeneous product networks (e.g., hypercubes and high-dimensional meshes). As a consequence, we obtain a variety of efficient algorithms on such networks, thus proving the versatility of CNs.

Keywords: Costs; Hypercubes; Multiprocessor interconnection networks; Computer architecture; Routing; Network topology; Algorithm design and analysis; Parallel programming; Scalability; Hardware

EFFICIENT IMPLEMENTATION OF INSAR TIME-CONSUMING ALGORITHM KERNELS ON GPU ENVIRONMENT

IEEE International Geoscience and Remote Sensing Symposium

Authors: Andrea Guerriero; Vito Walter Anelli; Alessandro Pagliara; Raffaele Nutricato; Davide Oscar Nitti. Affiliations: Dipartimento di Ingegneria Elettrica e dell'Informazione, Politecnico di Bari; Geophysical Applications Processing s.r.l.

ISBN: (print) 9781479979301

Satellite remote sensing radar technologies provide powerful tools for geohazard monitoring and risk management at synoptic scale. In particular, advanced Multi-Temporal SAR Interferometric algorithms are capable of detecting ground deformations and structural instabilities with millimetric precision, but impose strong requirements in terms of hardware resources. Recent advances in GPU computing and programming hold promise for time-efficient implementation of imaging algorithms, thus enhancing the development of advanced Emergency Management Services based on Earth Observation technologies. In this study, a preliminary assessment of the potential of GPU processing is carried out by comparing CPU (single- and multi-thread) and GPU implementations of InSAR time-consuming algorithm kernels. In particular, the study focuses on the fine coregistration of SAR interferometric pairs, a crucial step in the interferogram generation process. Experimental results are discussed.

Keywords: SAR interferometry; Image Matching; Parallel programming; GPU; Heterogeneous Computing; Green Computing

Formal description techniques in robot programming

IEE Colloquium on Application of CASE Tools

Authors: D.R.W. Holton; J.D.M. McKeever; R.M. McKeag. Affiliation: Department of Computer Science, Queen's University Belfast, Belfast, UK

A study is reported whose aim was to produce a system to facilitate offline programming of robots and to provide a testbed for alternative algorithms for the services provided. The system was specified using the formal description technique LOTOS (language of temporal ordering specification). LOTOS is best known for its use in the description of OSI protocols and is supported by an ISO standard. LOTOS consists of a process algebra for specifying the structure of the system and the interactions between components of the system, and an algebraic data typing mechanism for specifying the operations the system carries out. The description of the system was heavily influenced by techniques used in the design of operating systems. Concurrency was introduced at the initial design stage, there was an explicit separation of concerns, and the specification was structured hierarchically, with actions at one level appearing atomic to the next higher level. Each level in the hierarchy provides an increasingly abstract view of the robot. The resulting description was executed, or animated, using the SEDOS tool, to help determine that the correct behaviour had been encapsulated by the description. The specification was then implemented on a network of transputers, using 3L Parallel Pascal.

Keywords: Data structures; Software requirements and specifications; Open systems; Parallel programming; Robot programming; Specification languages

Using the Parsec environment to implement a high-performance processor farm

Annual Hawaii International Conference on System Sciences (HICSS)

Authors: D. Feldcamp; A. Wagner. Affiliation: Department of Computer Science, University of British Columbia, Vancouver, BC, Canada

Parsec is a parallel programming environment whose goal is to simplify the development of multicomputer programs without, as is often the case, sacrificing performance. We have reconciled these objectives by "compiling" the structure of parallel applications into information to configure each of a small set of communication primitives on a context-sensitive basis. In this paper, we show how Parsec can be used to implement a high-performance processor farm and compare Parsec and hand-optimized implementations to demonstrate that Parsec can achieve a similar level of performance. Extensive static analysis and optimization are necessary to achieve these results. We discuss both the tools that perform these tasks and the user interface that provides the necessary declarative structural information. Using the processor farm, we show how Parsec simplifies the task of specifying the structure of a parallel application and improves the result by supporting abstraction, reuse and scalability.

Keywords: Libraries; Message passing; Programming environments; Parallel programming; Optimizing compilers; Computer science; Electronic mail; Context; User interfaces; Scalability
