Many high-performance computing applications are of high consequence to society. Global climate modeling is a historic example of this. In 2020, the societal issue of greatest concern, the still-raging COVID-19 pandemic, saw a legion of computational scientists turning their efforts to new research projects in this direction. Applications of such high consequence highlight the need for building trustworthy computational models.
ISBN: 9781509001729 (print)
High-throughput computing (HTC) applications have become a major new class of workload with the rapid rise of web services. In HTC applications, as we observed, a significant proportion of memory accesses are of small granularity, such as 1B or 2B. However, in traditional NoCs the link width is usually designed as 128 bits or even larger to achieve high throughput, and the entire link bandwidth is occupied no matter how small the flit is. Therefore, using traditional NoCs for HTC applications wastes bandwidth. In this paper, to address this problem, we propose the High-Density NoC (HD-NoC). In HD-NoC, a traditional link is split into several narrow channels, such as 8 or 16 bits wide. If each slice is 16 bits wide, there will be 8 or more independently controlled small channels running simultaneously in one direction. In cooperation with our Greedy Transfer Mechanism (GTM), flits travelling in the same direction can be transferred in parallel, which alleviates congestion and improves the effective utilization of bandwidth. Experiments show that for HTC applications, our proposed HD-NoC improves throughput by 22.2% on average, and by 32.4% for the Grep application, with little extra hardware. HD-NoC also improves throughput by 13.5% for the traditional SPLASH-2 benchmarks.
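The bandwidth argument in this abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, header and control overhead are ignored, and the packing rule is a simple first-fit stand-in, not the paper's Greedy Transfer Mechanism, whose details the abstract does not give.

```python
def link_utilization(payload_bits: int, link_bits: int = 128) -> float:
    """Fraction of a monolithic link's width that carries payload in one cycle."""
    return min(payload_bits, link_bits) / link_bits


def channels_needed(payload_bits: int, channel_bits: int = 16) -> int:
    """Number of narrow channels one flit occupies when the link is sliced."""
    return -(-payload_bits // channel_bits)  # ceiling division


def sliced_utilization(payloads, link_bits: int = 128, channel_bits: int = 16) -> float:
    """Pack same-direction flits onto free narrow channels for one cycle.

    Assumption: first-fit packing; flits that do not fit wait for a later cycle.
    """
    free = link_bits // channel_bits
    used_bits = 0
    for bits in payloads:
        need = channels_needed(bits, channel_bits)
        if need > free:
            continue  # not enough free channels this cycle
        free -= need
        used_bits += bits
    return used_bits / link_bits


if __name__ == "__main__":
    print(link_utilization(8))           # one 1B flit on a 128-bit link
    print(sliced_utilization([16] * 8))  # eight 2B flits on eight 16-bit channels
```

Under these assumptions a single 1B flit uses 8 of 128 bits of a monolithic link, while eight 2B flits sent in parallel fill a link sliced into eight 16-bit channels.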
ISBN: 9781479907298 (print)
This paper presents the underlying theory and the performance of a cluster using a new 2-hop network topology. This topology, called SymSig, is constructed using a symmetric equation and Singer Difference Sets. The degree of connections at each node with SymSig is about half that of previous methods using Singer Difference Sets. A comparison with a cluster of Clos topology shows significant advantages. The worst-case congestion in the SymSig topology for unicast permutation is 2, whereas in Clos it is proportional to the radix of the building-block switches used. For SymSig, the number of switches required is smaller by about 25%, the size of the cluster is larger by about 15%, and the worst-case bandwidth is better by about 50%. These advantages are retained for peta- and exascale systems. Its performance on a set of collectives such as exchange-all, shift-all, broadcast-all and all-to-all send/receive shows improvements ranging from 39% to 83%. Its performance on the molecular dynamics application GROMACS shows an improvement of up to 33%. This network is particularly suitable for applications that require global all-to-all communications. The low latency of this network makes it scalable and an attractive alternative for building peta- and exascale systems.
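The 2-hop property of topologies built from Singer difference sets follows from a simple fact: in a perfect difference set modulo n = q^2 + q + 1, every nonzero residue is a difference of exactly one ordered pair of elements. The sketch below illustrates that background fact only. It is not the SymSig construction, which the abstract does not specify; the function names are hypothetical, and the brute-force search is practical only for small q.

```python
from itertools import combinations


def is_perfect_difference_set(ds, n):
    """True if every nonzero residue mod n is a difference of exactly one ordered pair."""
    seen = [0] * n
    for a in ds:
        for b in ds:
            if a != b:
                seen[(a - b) % n] += 1
    return all(count == 1 for count in seen[1:])


def find_perfect_difference_set(q):
    """Brute-force search for a (q+1)-element perfect difference set mod q*q + q + 1."""
    n = q * q + q + 1
    for rest in combinations(range(1, n), q):
        ds = (0,) + rest
        if is_perfect_difference_set(ds, n):
            return ds, n
    return None, n


def all_nodes_within_two_hops(ds, n):
    """Check the graph that links node i to (i + d) % n and (i - d) % n for d in ds.

    The graph looks the same from every node, so checking from node 0 suffices.
    """
    offsets = {(sign * d) % n for d in ds for sign in (1, -1)} - {0}
    two_hop = {(x + y) % n for x in offsets for y in offsets}
    return len(offsets | two_hop | {0}) == n


if __name__ == "__main__":
    ds, n = find_perfect_difference_set(3)
    print(ds, n, all_nodes_within_two_hops(ds, n))
```

Any target offset j can be written as d_a - d_b, so a message reaches it by adding d_a on the first hop and subtracting d_b on the second.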
The most important factor affecting the reliability of environmental simulations is the uncertainty in the parameter settings that describe the environmental conditions, which may introduce significant biases between simulation and reality. To reduce this uncertainty, a two-stage prediction method was developed, based on adjusting the input parameters according to the real observed evolution. This method enhances the quality of the predictions, but it is very demanding in terms of time and computational resources. In this work, we describe a methodology developed for response time assessment in the case of fire spread prediction, based on evolutionary computation. In addition, one of the most widely used fire spread simulators, FARSITE, was parallelized to take advantage of multicore architectures. This allows us to design proper allocation policies that significantly reduce simulation time and reach successful predictions much faster. A multi-platform performance study is reported to analyze the benefits of the methodology.
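The two-stage idea described here (first calibrate the input parameters against the observed evolution, then predict forward with the calibrated values) can be sketched with a toy model. Everything below is an assumption made for illustration: the one-parameter `simulate` function stands in for a real simulator such as FARSITE, and the small elitist evolutionary loop is a generic example, not the authors' algorithm.

```python
import random


def simulate(rate, minutes):
    """Toy stand-in for a fire spread simulator: front distance in metres."""
    return rate * minutes


def calibrate(observed, minutes, generations=30, pop_size=20, seed=0):
    """Stage 1: evolve the spread-rate parameter to match the observed front."""
    rng = random.Random(seed)

    def error(rate):
        return abs(simulate(rate, minutes) - observed)

    population = [rng.uniform(0.0, 10.0) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        elite = population[: pop_size // 4]
        # Keep the elite unchanged and refill with mutated copies of elite members.
        mutants = [rng.choice(elite) + rng.gauss(0.0, 0.5)
                   for _ in range(pop_size - len(elite))]
        population = elite + mutants
    best = min(population, key=error)
    history.append(error(best))
    return best, history


def predict(observed, t_observed, t_future):
    """Stage 2: run the simulator forward with the calibrated parameter."""
    rate, _ = calibrate(observed, t_observed)
    return simulate(rate, t_future)


if __name__ == "__main__":
    # Hypothetical observation: the front has advanced 42 m after 10 minutes.
    print(predict(observed=42.0, t_observed=10, t_future=30))
```

Each candidate in the calibration stage requires a full simulator run, which is why the abstract stresses response time and parallel execution: the evaluations within one generation are independent and can be distributed across cores.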