ISBN (digital): 9798331527891
ISBN (print): 9798331527907
Performance modeling of parallel applications is essential for optimizing resource usage in high-performance computing (HPC) systems. However, some scientific applications exhibit irregular performance behaviors, which complicates creating accurate characteristic models. This irregularity is mainly due to these applications' nondeterministic computational and communication patterns. Tools such as PAS2P (Parallel Application Signatures for Performance Prediction) are used to extract detailed information about parallel applications. PAS2P exploits the repetitive behavior of the application to analyze and predict its performance, using the same resources on which the parallel application executes. This paper presents a characterization model, based on the PAS2P methodology, for irregular applications that groups the repeatability patterns of all the processes running the application into a single characteristic model. To achieve this, we consolidate the characterizations performed independently by each process, using metrics such as the number of instructions, the execution time of relevant sections, and the topological characteristics of the application. By grouping the repeatability patterns of all processes, we obtain a concise and accurate representation of the behavior of irregular applications, thus improving predictability and performance optimization in HPC systems.
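A minimal sketch of the kind of per-process consolidation described above: each process contributes a small feature vector (instruction count, execution time of relevant sections, and a simple topology descriptor), and processes whose vectors are similar are merged into one group of the characteristic model. The field names, tolerance, and greedy grouping rule below are illustrative assumptions, not PAS2P's actual data structures.

    from dataclasses import dataclass

    @dataclass
    class ProcessProfile:
        """Per-process characterization (hypothetical fields, not PAS2P's real schema)."""
        rank: int
        instructions: float      # instructions executed in relevant sections
        exec_time: float         # execution time of relevant sections (seconds)
        neighbors: frozenset     # relative ranks this process communicates with

    def similar(a, b, tol=0.05):
        """Group two processes if their metrics differ by less than tol (relative)
        and they share the same communication topology."""
        close = lambda x, y: abs(x - y) <= tol * max(abs(x), abs(y), 1e-12)
        return (close(a.instructions, b.instructions)
                and close(a.exec_time, b.exec_time)
                and a.neighbors == b.neighbors)

    def group_profiles(profiles):
        """Greedily merge per-process profiles into behavior groups;
        returns (representative_profile, member_ranks) pairs."""
        groups = []
        for p in profiles:
            for rep, members in groups:
                if similar(rep, p):
                    members.append(p.rank)
                    break
            else:
                groups.append((p, [p.rank]))
        return groups

    if __name__ == "__main__":
        ring = frozenset({-1, 1})        # nearest-neighbor (ring) pattern
        profiles = [
            ProcessProfile(0, 1.00e9, 2.01, ring),
            ProcessProfile(1, 1.02e9, 2.03, ring),
            ProcessProfile(2, 3.50e9, 7.80, ring),   # heavier, irregular processes
            ProcessProfile(3, 3.47e9, 7.75, ring),
        ]
        for rep, members in group_profiles(profiles):
            print(f"ranks {members}: ~{rep.instructions:.2e} instr, ~{rep.exec_time:.2f} s")

With these invented numbers, ranks 0-1 and ranks 2-3 collapse into two groups instead of four independent characterizations.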
The performance modeling of a parallel application is crucial for making better use of HPC resources. However, certain scientific applications exhibit irregular performance characteristics, posing challenges in accurately modeling their behavior. This irregularity primarily arises from these applications' non-deterministic computation and communication patterns. This article introduces a performance modeling methodology for irregular parallel applications based on the PAS2P methodology. The PAS2P tool generates an application signature and uses it to analyze and predict performance. Our approach relies on process-based data analysis to characterize these applications according to the behavior of individual processes, and proposes a model that groups processes at signature-construction time. This model yields a reduced number of phases and weights in a limited time, allowing us to characterize the application.
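To make the "phases and weights" concrete: in PAS2P-style prediction the signature measures the execution time of each representative phase on the target machine, and the predicted application time is the weight-scaled sum of those phase times, the weight counting how often a phase repeats. A tiny worked sketch with invented numbers follows; the actual phases and weights come from PAS2P's signature construction.

    # Hypothetical phase table: (phase_id, measured_time_seconds, weight).
    phases = [
        ("phase_0", 0.42, 250),   # dominant compute/communication pattern
        ("phase_1", 0.10, 250),
        ("phase_2", 1.30,  12),   # irregular phase with few repetitions
    ]

    # Predicted execution time = sum over phases of (phase time x weight).
    predicted_time = sum(t * w for _, t, w in phases)
    print(f"predicted execution time: {predicted_time:.1f} s")   # 145.6 s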
Currently, there are benchmark sets that measure the performance of HPC systems under specific computing and communication properties. These benchmarks represent the kernels of applications that measure specific hardw...
ISBN (digital): 9781728144849
ISBN (print): 9781728144856
When performance tools are used to analyze an application with thousands of processes, the data generated can be larger than the memory of the cluster node, forcing the data into swap. In HPC systems, moving data to swap is not always an option. This problem causes scalability limitations that affect the user experience and imposes serious restrictions on large-scale executions. To obtain knowledge about the application's performance, performance tools usually instrument the application to generate the data. When the instrumented parallel application is executed with thousands of processes, the data generated may exceed the memory of the compute node used to analyze it. Performance tools such as PAS2P predict the execution time on target machines. To do so, PAS2P analyzes the data produced by each application process. This data is analyzed sequentially, which results in an inefficient use of system resources. To solve this, we propose a parallel method for handling high volumes of data, decreasing the analysis time and increasing scalability, thus improving the PAS2P toolkit's ability to generate performance knowledge defined by the application's behavior phases.
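A minimal sketch of the data-parallel analysis idea: each per-process trace is reduced independently by a worker, so only compact summaries, never all of the raw data, stay resident on the analysis node. The file naming, line format, and pool size are assumptions for illustration; PAS2P's actual analysis operates on its own trace and signature structures.

    import glob
    from multiprocessing import Pool

    def analyze_trace(path):
        """Reduce one per-process trace file to a small summary
        (placeholder logic: count events and accumulate their durations)."""
        events, total_time = 0, 0.0
        with open(path) as f:
            for line in f:
                parts = line.split()          # assumed format: "<event> <seconds>"
                if len(parts) == 2:
                    events += 1
                    total_time += float(parts[1])
        return path, events, total_time

    if __name__ == "__main__":
        trace_files = sorted(glob.glob("traces/rank_*.trc"))   # one file per rank
        with Pool(processes=8) as pool:
            for path, events, total_time in pool.imap_unordered(analyze_trace, trace_files):
                print(f"{path}: {events} events, {total_time:.2f} s")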
Nowadays, rapid progress in next generation sequencing (NGS) technologies has drastically decreased the cost and time required to obtain genome sequences. A series of powerful computing accelerators, such as GPUs and the Xeon Phi MIC, are becoming a common platform for reducing the computational cost of the most demanding steps of genomic data analysis. GPUs have received more attention in the literature so far. However, the Xeon Phi constitutes a very attractive approach to improving performance because applications do not need to be rewritten in a programming language specific to the accelerator. Sequence alignment is a fundamental step in any variant analysis study, and many tools address this problem. We have selected BWA, one of the most popular sequence aligners, and studied different data management strategies to improve its execution time on hybrid systems made of multicore CPUs and Xeon Phi accelerators. Our main contributions focus on new strategies that combine data splitting and index replication in order to achieve a better balance in the use of system memory and reduce latency penalties. Our experimental results show significant speed-up improvements when such strategies are executed on our hybrid platform, taking advantage of the combined computing power of a standard multicore CPU and a Xeon Phi accelerator.
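One way to picture the data-splitting side of such strategies (a sketch under stated assumptions, not the strategies evaluated in the paper): with the aligner index replicated on each device, a batch of reads can be split proportionally to each device's measured alignment throughput so that CPU and accelerator finish at roughly the same time. Device names and throughput figures below are hypothetical.

    def split_reads(total_reads, device_throughput):
        """Split a batch of reads across devices proportionally to their
        alignment throughput (reads/second); each device holds its own
        replica of the index, so chunks are aligned independently."""
        total_tp = sum(device_throughput.values())
        devices = list(device_throughput)
        chunks, assigned = {}, 0
        for dev in devices[:-1]:
            n = int(total_reads * device_throughput[dev] / total_tp)
            chunks[dev], assigned = n, assigned + n
        chunks[devices[-1]] = total_reads - assigned   # remainder to last device
        return chunks

    # Hypothetical throughputs for a host CPU and one Xeon Phi card.
    print(split_reads(10_000_000, {"cpu": 1500.0, "xeon_phi": 900.0}))
    # -> {'cpu': 6250000, 'xeon_phi': 3750000}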
Wind field calculation is a common problem in environmental applications ranging from the design of wind farms to forest fire propagation prediction. Calculating the wind field is a complex problem that involves solving huge linear systems. Solving such systems requires iterative methods, such as the Preconditioned Conjugate Gradient (PCG), which in most cases require long execution times. The PCG solver with different preconditioners has been analyzed, and the performance and scalability of this solver have been determined. The most time-consuming operations have been identified, and a new method has been developed to improve the parallelization, reducing the execution time and increasing the scalability. The new method has been applied to a wind field simulator, called WindNinja, usually coupled to forest fire propagation models. The results are very promising, and the new parallelization method appears to be a key component for integration into other approaches.
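For reference, the solver under analysis is standard: below is a compact, generic Preconditioned Conjugate Gradient with a Jacobi (diagonal) preconditioner, the simplest of the preconditioners usually compared. This is a textbook sketch on a synthetic symmetric positive-definite matrix, not WindNinja's parallel implementation.

    import numpy as np

    def pcg(A, b, M_inv, tol=1e-8, max_iter=1000):
        """Preconditioned Conjugate Gradient for a symmetric positive-definite A;
        M_inv applies the inverse of the preconditioner to a vector."""
        x = np.zeros_like(b)
        r = b - A @ x
        z = M_inv(r)
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol:
                break
            z = M_inv(r)
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    if __name__ == "__main__":
        n = 200
        A = np.random.rand(n, n)
        A = A @ A.T + n * np.eye(n)          # diagonally dominant SPD test matrix
        b = np.random.rand(n)
        diag = np.diag(A)
        x = pcg(A, b, lambda v: v / diag)    # Jacobi preconditioner: M = diag(A)
        print("residual norm:", np.linalg.norm(b - A @ x))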
In this paper we propose a methodology that allows us to predict the application's scalability behavior on a specific system, providing information to select the most appropriate resources to run the application. We explain the general methodology, focusing on a novel method to model the logical application trace for a large number of processes. This method is based on the projection of a set of executions of the application signature for a small number of processes. The generated traces are validated by comparing them with the real traces obtained with the PAS2P tool. We present the experimental validation for the BT NAS Parallel Benchmark. The signatures for 16, 36, 64, 81 and 100 processes were executed and used to model and project the logical trace for 1024 processes. The results obtained show the accuracy of the method: the communication pattern was predicted without error, while the prediction error is less than 10% for the communication volume and less than 5% for the number of instructions.
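One simple way to picture a projection from small-scale signature executions to a larger process count (an illustration only; the paper's method models the logical trace itself rather than fitting a single metric): measure a per-process metric at the small process counts, fit how it scales with the number of processes, and evaluate the fit at the target count. The measurements below are invented.

    import numpy as np

    # Per-process instruction counts measured with the small-scale signatures
    # (hypothetical values, roughly following strong scaling ~ 1/P).
    procs  = np.array([16, 36, 64, 81, 100], dtype=float)
    instrs = np.array([8.0e9, 3.6e9, 2.1e9, 1.6e9, 1.3e9])

    # Fit a power law instr(P) = c * P**a in log-log space.
    a, log_c = np.polyfit(np.log(procs), np.log(instrs), 1)
    c = np.exp(log_c)

    # Project the per-process instruction count to 1024 processes.
    projected = c * 1024.0 ** a
    print(f"exponent a = {a:.2f}, projected instructions at P = 1024: {projected:.2e}")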