An array antenna design based on a gridded parasitic patch stacked microstrip antenna element is presented. The radiating part in the top layer of one antenna element has nine rectangular metal patches placed in a grid of three rows and three columns, with a separation of 0.1 mm between the patches. Two means of realising the array are investigated. In a regular 2 × 2 concept, the parasitic patches of individual elements are placed next to each other, forming on the top layer a uniform grid of parasitic patches with six rows and six columns. In an alternative concept, called the shared array, neighbouring antenna elements share one row and one column of parasitic patches, forming on the top layer a uniform grid of parasitic patches with five rows and five columns. The antenna arrays are designed for the 60 GHz band. The measured bandwidth is 13.8 GHz for the single antenna, 15.3 GHz for the 2 × 2 array, and 12.4 GHz for the shared array. The measured realised gain at 60 GHz is 6 dBi for the single antenna, 11 dBi for the 2 × 2 array, and 10.5 dBi for the shared array. The measured radiation patterns agree well with simulations.
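The grid sizes and gain figures in this abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the "ideal" array gain uses the common rule of thumb that N identical, uncoupled elements with a lossless feed add 10·log10(N) dB to the element gain.

```python
import math


def grid_side(elements_per_side, patches_per_side, shared_lines):
    """Patches along one side of the top-layer grid.

    shared_lines is the number of patch rows (or columns) that two
    neighbouring elements have in common: 0 for the regular array,
    1 for the shared array described in the abstract.
    """
    return elements_per_side * patches_per_side - shared_lines * (elements_per_side - 1)


def ideal_array_gain_dbi(element_gain_dbi, num_elements):
    """Rule-of-thumb gain of an array of identical, uncoupled elements (assumption)."""
    return element_gain_dbi + 10 * math.log10(num_elements)


regular_side = grid_side(2, 3, shared_lines=0)  # 6 rows and 6 columns
shared_side = grid_side(2, 3, shared_lines=1)   # 5 rows and 5 columns

# Measured values quoted in the abstract.
single_gain_dbi = 6.0
regular_gain_dbi = 11.0
shared_gain_dbi = 10.5

ideal = ideal_array_gain_dbi(single_gain_dbi, 4)
print(f"regular grid: {regular_side} x {regular_side}, shared grid: {shared_side} x {shared_side}")
print(f"ideal 2 x 2 gain: {ideal:.2f} dBi")
print(f"regular array shortfall: {ideal - regular_gain_dbi:.2f} dB")
print(f"shared array shortfall: {ideal - shared_gain_dbi:.2f} dB")
```

Under these assumptions the measured arrays fall roughly 1 to 1.5 dB short of the idealised figure. The abstract does not attribute that difference to any particular cause.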
ISBN: 9781618397881 (print)
As the core counts on modern multiprocessor systems increase, so does the memory contention with all the processes/threads trying to access the main memory simultaneously. This is typical of UMA (Uniform Memory Access) architectures with a single physical memory bank leading to poor scalability in multithreaded applications. To alleviate this problem, modern systems are moving increasingly towards Nonuniform Memory Access (NUMA) architectures, in which the physical memory is split into several (typically two or four) banks. Each memory bank is associated with a set of cores enabling threads to operate from their own physical memory banks while retaining the concept of a shared virtual address space. However, accessing shared data structures from the remote memory banks may become increasingly slow. This paper proposes a way to determine and pin certain parts of the shared data to specific memory banks, thus minimizing remote accesses. To achieve this, the existing application code may be supplied with the proposed interface to set up and distribute shared data appropriately among memory banks. Experiments with the NAS CG benchmark as well as with a realistic large-scale application calculating ab initio nuclear structure have been performed. Speedups of up to 3.5 times were observed with the proposed approach compared with the default memory placement policy.
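To make the idea of distributing shared data among memory banks concrete, the following sketch splits a shared buffer into contiguous, page-aligned byte ranges, one per bank. It is not the interface proposed in the paper, which the abstract does not describe: `partition_pages`, its parameters, and the 4096-byte default page size are all assumptions made for illustration. On Linux, the actual binding of each range to a bank would be done by an operating-system facility such as `mbind`, which is not called here.

```python
def partition_pages(total_bytes, num_banks, page_size=4096):
    """Split [0, total_bytes) into contiguous, page-aligned ranges, one per bank.

    Returns a list of (bank, start_byte, end_byte) tuples with end exclusive.
    Pages are spread as evenly as possible; earlier banks receive the extra
    pages when the page count is not divisible by num_banks.
    """
    if num_banks <= 0 or page_size <= 0 or total_bytes < 0:
        raise ValueError("num_banks and page_size must be positive, total_bytes non-negative")

    num_pages = -(-total_bytes // page_size)  # ceiling division
    base, extra = divmod(num_pages, num_banks)

    ranges = []
    start_page = 0
    for bank in range(num_banks):
        count = base + (1 if bank < extra else 0)
        # Clamp to total_bytes because the last page may be only partly used.
        start = min(start_page * page_size, total_bytes)
        end = min((start_page + count) * page_size, total_bytes)
        ranges.append((bank, start, end))
        start_page += count
    return ranges


# Example: a 10-page shared array distributed over four banks.
for bank, start, end in partition_pages(10 * 4096, num_banks=4):
    print(f"bank {bank}: bytes [{start}, {end})")
```

Page alignment matters in a scheme like this because memory placement is decided per page, so a range boundary inside a page could not be honoured. Which parts of the data belong on which bank depends on the application's access pattern, and determining that is the part the paper addresses.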
ISBN: 9781618397881 (print)
As the core counts on modern multi-processor systems increase, so does the memory contention with all the processes/threads trying to access the main memory simultaneously. This is typical of UMA (Uniform Memory Access) architectures with a single physical memory bank leading to poor scalability in multi-threaded applications. To alleviate this problem, modern systems are moving increasingly towards Non-Uniform Memory Access (NUMA) architectures, in which the physical memory is split into several (typically two or four) banks. Each memory bank is associated with a set of cores enabling threads to operate from their own physical memory banks while retaining the concept of a shared virtual address space. However, accessing shared data structures from the remote memory banks may become increasingly slow. This paper proposes a way to determine and pin certain parts of the shared data to specific memory banks, thus minimizing remote accesses. To achieve this, the existing application code has to be supplied with the proposed interface to set up and distribute the shared data appropriately among memory banks. Experiments with the NAS benchmark as well as with a realistic large-scale application calculating ab initio nuclear structure have been performed. Speedups of up to 3.5 times were observed with the proposed approach compared with the default memory placement policy.