I am excited to share with you the DoD High Performance Computing Modernization Program (HPCMP) Annual Highlights for 2019 and 2020. As you will see, the HPCMP has had another outstanding two years of accomplishments and growth. Using the 2018 National Defense Strategy as a guidepost, we have focused on supporting our customers from the government, industry, and academia with state-of-the-art high-performance computing (HPC) services and resources to help them find solutions to the most challenging science and technology (S&T), test and evaluation (T&E), and acquisition engineering problems facing our Department of Defense. With national defense priorities shifting to potential conflict with near-peer adversaries like China and Russia, the HPCMP ecosystem that has been developed for the past 25 years has proven it is ideally suited to these challenges.
From aircraft design to certification, a significant volume of aerodynamic data is required to ensure optimal performance, meet regulatory standards, and maintain structural integrity. These data must span the entire ...
The Department of Defense (DoD) High Performance Computing Modernization Program (HPCMP) Computational Research and Engineering Acquisition Tools and Environments (CREATE) program is developing and deploying a suite o...
Wide vector units in Intel's Xeon Phi accelerator cards can significantly boost application performance when used effectively. However, there is a lack of performance tools that provide programmers accurate information about the level of vectorization in their codes. This paper presents VecMeter, an easy-to-use tool to measure vectorization on the Xeon Phi. VecMeter utilizes binary instrumentation and therefore does not require source code modifications. This paper describes the design of VecMeter, demonstrates its accuracy, defines a metric for quantifying vectorization, and provides an example where the tool can guide code optimization to improve performance by up to 33%.
Deploying large numbers of small, low-power cores has been gaining traction recently as a system design strategy in high performance computing (HPC). The ARM platform that dominates the embedded and mobile computing s...
Accelerators are becoming prevalent in high performance computing as a way of achieving increased computational capacity within a smaller power budget. Effectively utilizing the raw compute capacity made available by these systems, however, remains a challenge because it can require a substantial investment of programmer time to port and optimize code to effectively use novel accelerator hardware. In this paper we present a methodology for isolating and modeling the performance of common performance-critical patterns of code (so-called idioms) and other relevant behavioral characteristics from large-scale HPC applications which are likely to perform favorably on Intel Xeon Phi. The benefits of the methodology are twofold: (1) it directs programmer efforts toward the regions of code most likely to benefit from porting to the Xeon Phi and (2) it provides speedup estimates for porting those regions of code. We then apply the methodology to the stencil idiom, showing performance improvements of up to a factor of 4.7× on stencil-based benchmark codes.
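For readers unfamiliar with the term, the stencil idiom mentioned above updates each grid point from a fixed pattern of its neighbors. The following is a minimal illustrative sketch of a 5-point stencil sweep in plain Python; the function name `jacobi_step` and the averaging weights are assumptions chosen for illustration and are not taken from the paper's benchmark codes.

```python
def jacobi_step(grid):
    """Apply one 5-point stencil sweep to the interior of a 2D grid.

    Hypothetical example: each interior point becomes the average of its
    four direct neighbors; boundary values are carried over unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


if __name__ == "__main__":
    grid = [[0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0],
            [0.0, 1.0, 0.0]]
    print(jacobi_step(grid)[1][1])
```

The regular, neighbor-only access pattern of the inner loop is what makes loops of this shape natural candidates for wide vector units, although actual speedups depend on the hardware and the implementation.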
ISBN: 9781450311397 (print)
The U.S. Department of Defense High Performance Computing Modernization Program (HPCMP) has implemented sustained systems performance (SSP) testing on high performance computing systems in use at DoD Supercomputing Resource Centers. The intent is to monitor performance improvements by updates to the operating system, compiler suites, and numerical and communications libraries, and to monitor penalties arising from security patches. In practice, each system's workload is simulated by appropriate choices of user application codes representative of the HPCMP computational technical areas. Past successes include surfacing an imminent failure of an OST in a Cray XT3, incomplete configuration of a scheduler update on an SGI Altix 4700, performance issues associated with a communications library update for a Linux Networx Advanced Technology Cluster, and intermittent resetting of Intel Nehalem cores to standard mode from turbo mode. This history demonstrates that SSP testing is critical to deliver the highest quality of service to the HPCMP users. Copyright 2011 ACM.
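The kind of monitoring described above can be pictured as comparing benchmark run times before and after a system change. The sketch below is only an illustration under stated assumptions: the function name `flag_regressions`, the benchmark names, and the 5% tolerance are hypothetical and do not describe the HPCMP's actual procedure.

```python
def flag_regressions(baseline, current, tolerance=0.05):
    """Return {benchmark: relative_slowdown} for runs slower than baseline.

    baseline, current: dicts mapping a benchmark name to wall-clock seconds.
    tolerance: hypothetical threshold; slowdowns at or below it are ignored.
    """
    flagged = {}
    for name, base_time in baseline.items():
        if name not in current:
            continue  # benchmark was not rerun after the change
        change = (current[name] - base_time) / base_time
        if change > tolerance:
            flagged[name] = change
    return flagged


if __name__ == "__main__":
    before = {"app_a": 100.0, "app_b": 200.0}  # hypothetical timings
    after = {"app_a": 120.0, "app_b": 202.0}
    print(sorted(flag_regressions(before, after)))
```

Running the same representative codes on a fixed schedule and after every update is what lets a drift in these numbers point to a specific change.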
In order to achieve a high level of performance, data-intensive applications such as the real-time processing of surveillance feeds from unmanned aerial vehicles will require the strategic application of multi/many-core processors and coprocessors using a hybrid of inter-process message passing (e.g. MPI and SHMEM) and intra-process threading (e.g. pthreads and OpenMP). To facilitate program design decisions, memory traces gathered through binary instrumentation can be used to understand the low-level interactions between a data-intensive code and the memory subsystem of a multi-core processor or many-core coprocessor. Toward this end, this paper introduces the addition of threading support for PMaC's Efficient Binary Instrumentation Toolkit for Linux/x86 (PEBIL) and compares PEBIL's threading model to the threading models of two other popular Linux/x86 binary instrumentation platforms, Pin and Dyninst, on both theoretical and empirical grounds. The empirical comparisons are based on experiments which collect memory address traces for the OpenMP-threaded implementations of the NASA Advanced Supercomputing Parallel Benchmarks (NPBs). This work shows that the overhead of collecting full memory address traces for multithreaded programs is higher in PEBIL (7.7x) than in Pin (4.7x), both of which are significantly lower than Dyninst (897x). This work also shows that PEBIL, uniquely, is able to take advantage of interval-based sampling of a memory address trace by rapidly disabling and re-enabling instrumentation at the transitions into and out of sampling periods in order to achieve significant decreases in the overhead of memory address trace collection. For collecting the memory address streams of each of the NPBs at a 10% sampling rate, PEBIL incurs an average slowdown of 2.9x compared to 4.4x with Pin and 897x with Dyninst.
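Interval-based sampling, as described above, records addresses only during periodic windows of the access stream. The sketch below illustrates just that selection logic in plain Python. It is not PEBIL code: the function name `sample_trace` and the parameters `period` and `sample_len` are assumptions made for illustration, and a real instrumentation tool gains its speedup by switching instrumentation off outside the window rather than by filtering afterward.

```python
def sample_trace(addresses, period=1000, sample_len=100):
    """Yield only the addresses that fall inside a sampling window.

    Out of every `period` consecutive accesses, the first `sample_len`
    are recorded and the rest are skipped. The defaults give a 10%
    sampling rate; both parameter names are hypothetical.
    """
    if not 0 < sample_len <= period:
        raise ValueError("sample_len must be in the range 1..period")
    for index, addr in enumerate(addresses):
        if index % period < sample_len:
            yield addr


if __name__ == "__main__":
    trace = range(0x1000, 0x1000 + 10_000)  # stand-in for a real address stream
    sampled = list(sample_trace(trace))
    print(len(sampled))
```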
The Computational Research and Engineering Acquisition Tools and Environments (CREATE) program was established as a new 12-year program in FY 2008 by the Department of Defense (DoD). The CREATE goal is to enable major improvements in DoD's acquisition engineering design and analysis processes by developing and deploying scalable, multi-disciplinary, physics-based computational engineering software products for the design and analysis of DoD Ships, Air Vehicles, and Radio Frequency Antennas. Meshing and Geometry (MG) generation is being provided by a fourth project, MG. CREATE is a multi-institutional, multi-service, multi-agency and multi-disciplinary program with participation by the Navy, Air Force, Army, the Office of the Secretary of Defense, industry, and academia. The CREATE products are being developed and released on an annual cycle. In 2010, the program released five new products: SENTRI 1.0 - RF antenna design; NESM 0.1 - Ship Shock Analysis; IHDE 1.0 - Ship Hydrodynamic Design and Analysis; Kestrel 1.0 - Fixed-wing air vehicle analysis; and Helios 1.0 - Rotorcraft analysis. Enhanced versions of these products will be released every year starting in 2011. In 2011, five additional products will begin annual releases: DaVinci - a tool for the rapid physics-based design of air vehicles; RDI - an integrated suite of tools to enable rapid physics-based design of naval ships; Firebolt - components to provide models for gas turbine propulsion systems for Kestrel and Helios; NavyFoam - a high-fidelity hydrodynamics analysis tool for predicting drag and resistance, sea-keeping and seaway loads; and Capstone - components to enable the generation of geometries and meshes for all of the other products. The CREATE products are designed to be modular, maintainable, extensible, and scalable. To accomplish this, the CREATE team has developed a set of software engineering and software project management practices and processes that strike the appropriate balance between