In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, including remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, and user-level message passing services. Measured hardware performance results are also presented.
This article describes the IBM Blue Gene/Q interconnection network and message unit. Blue Gene/Q is the third generation in the IBM Blue Gene line of massively parallel supercomputers and can be scaled to 20 petaflops and beyond. For better application scalability and performance, Blue Gene/Q has new routing algorithms and techniques to parallelize the injection and reception of packets in the network interface.
ISBN: 9780889866959 (print)
Cluster computing is still the most cost-effective solution to meet the increasing demand for computing power. Clusters are typically based on commodity computing hardware with specialized interconnection networks (INs). These cluster interconnects differ from commodity networks in offering higher bandwidth, lower latency, lower CPU utilization, and improved scalability. But even with these sophisticated INs, the latency of a message transfer between two nodes is still orders of magnitude higher than that of a local memory access. Message transfer latency is especially crucial for fine-grained communication. An analysis of the latency shows that its main component originates from the I/O system. The goal of this paper is to present a new mechanism called Ultra Low Latency Message Transfer (ULTRA), which allows message passing with the lowest possible latencies. Besides using well-known techniques such as user-level communication, this work focuses on improving the network interface through optimized, highly efficient use of the I/O system. The ULTRA mechanism and architecture presented here show a highly optimized approach to low latency, limited only by the standard I/O system used. With it, a much closer coupling of the cluster nodes is possible, and fine-grained communication schemes become more suitable for cluster computing.