As systems continue to scale up, power consumption in high performance computing (HPC) systems becomes an increasingly severe problem. Heterogeneous systems, which integrate two or more kinds of processors, can adapt better to heterogeneity in applications and, in theory, provide much higher energy efficiency. Many studies have shown that heterogeneous systems consume less energy than homogeneous systems in a multi-programmed computing environment. However, how to exploit the energy efficiency (Flops/Watt) of a heterogeneous system for a single application, or even for a single phase of an application, has not been well studied. This paper proposes a power-efficient work distribution method for a single application on a CPU-GPU heterogeneous system. The proposed method coordinates inter-processor work distribution and per-processor frequency scaling to minimize energy consumption under a given scheduling length constraint. We conduct our experiments on a real system equipped with a multi-core CPU and a multi-threaded GPU. Experimental results show that, by distributing work reasonably over the CPU and GPU, the method reduces energy consumption by 14% compared with static mappings for several typical benchmarks. We also demonstrate that our method adapts to changes in the scheduling length constraint and in the hardware configuration.
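The coupling between work distribution and frequency scaling can be illustrated with a small brute-force search. This is a minimal sketch under stated assumptions, not the method of the paper: the processor parameters, the cubic dynamic-power model, and the names `CPU`, `GPU`, `run_time`, `power` and `best_split` are all hypothetical, and idle power is ignored.

```python
from itertools import product

# Hypothetical processor models (all numbers are illustrative assumptions).
# rate:   work units completed per second per GHz
# static: frequency-independent power in watts
# dyn:    coefficient of the cubic dynamic-power term (watts per GHz^3)
CPU = {"freqs": [1.0, 2.0, 3.0], "rate": 10.0, "static": 20.0, "dyn": 5.0}
GPU = {"freqs": [0.5, 1.0, 1.5], "rate": 60.0, "static": 40.0, "dyn": 30.0}


def run_time(work, freq, proc):
    """Seconds needed to finish `work` units at `freq` GHz."""
    return work / (proc["rate"] * freq)


def power(freq, proc):
    """Assumed power model: static part plus a cubic dynamic part."""
    return proc["static"] + proc["dyn"] * freq ** 3


def best_split(total_work, deadline, cpu=CPU, gpu=GPU, steps=100):
    """Exhaustively search the CPU share and both frequencies.

    Returns (energy, cpu_share, f_cpu, f_gpu) for the lowest-energy setting
    whose makespan fits the deadline, or None if no setting is feasible.
    """
    best = None
    for k, f_cpu, f_gpu in product(range(steps + 1), cpu["freqs"], gpu["freqs"]):
        share = k / steps
        t_cpu = run_time(total_work * share, f_cpu, cpu)
        t_gpu = run_time(total_work * (1 - share), f_gpu, gpu)
        # Both processors run in parallel, so the slower one sets the makespan.
        if max(t_cpu, t_gpu) > deadline:
            continue
        # Each processor is charged only while busy (idle power is ignored).
        energy = power(f_cpu, cpu) * t_cpu + power(f_gpu, gpu) * t_gpu
        if best is None or energy < best[0]:
            best = (energy, share, f_cpu, f_gpu)
    return best
```

Because the search space includes the two static mappings (a CPU share of 0 or 1), the returned energy can never exceed that of a feasible static mapping under the same model. Tightening `deadline` forces higher frequencies or a more balanced split, which is the trade-off the abstract describes.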
This paper investigates the influence of on-chip metal interconnections, power grids, the heat sink together with packaging, and metal dummy fills on the transmission characteristics of a 2 mm-long integrated dipole antenna pair. These metal structures and their placements are classified, and dedicated simulations are performed to explore the interference effects of various neighboring metal structures on the transmission gain, phase, impedance and radiation pattern of the on-chip dipole antenna pair. Based on the experimental results and analyses, several empirical linear expressions for antenna pair gain and phase under interference are obtained by numerical fitting. A set of design rules is derived accordingly to guide on-chip antenna layout and design for wireless interconnects.
Many systems, such as Synthetic Aperture Radar (SAR) processing, two-dimensional image processing and 2D FFT calculation, need to access the rows and columns of a matrix alternately. Because of the large volume of data involved, these systems must use DRAM. To improve memory bandwidth utilization in such systems, this paper theoretically analyzes the optimal window size that minimizes the total number of page open/close operations by balancing the number of physical pages touched by row and column accesses. The paper presents a window-based optimal memory access method, and we implemented an FPGA-based SDRAM controller with eight simple ports that is built on the window access mechanism and supports commercial SDRAM. Experimental results show that the effective I/O bandwidth of external SDRAM with our window layout increases from 114.2 MB/s for a naive implementation to 730.2 MB/s, a speedup of more than 6x. In addition, we implemented two SAR processing systems on an FPGA chip, each with four FFT processing elements, one using our window-based SDRAM controller and the other using the Corner Turn method. Results show that the window-based method achieves a speedup of 2.6 over the Corner Turn method.
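The balancing argument can be made concrete with a simplified model. The sketch below assumes a single bank with one open page at a time and assumes that each window (a tile of `win_rows` x `win_cols` matrix elements) fills exactly one page. The function names are hypothetical, and the model is only an illustration of the idea, not the analysis or the controller from the paper.

```python
def count_page_opens(n, win_rows, win_cols, by_rows):
    """Count page activations while scanning an n x n matrix once.

    Each win_rows x win_cols tile is assumed to occupy exactly one DRAM
    page, and only one page can be open at a time.
    """
    tiles_per_row = -(-n // win_cols)  # ceiling division
    opens, open_page = 0, None
    for outer in range(n):
        for inner in range(n):
            # Row scan: outer index is the row. Column scan: outer is the column.
            r, c = (outer, inner) if by_rows else (inner, outer)
            page = (r // win_rows) * tiles_per_row + (c // win_cols)
            if page != open_page:
                opens += 1
                open_page = page
    return opens


def total_opens(n, win_rows, win_cols):
    """Activations for one full row-wise scan plus one full column-wise scan."""
    return (count_page_opens(n, win_rows, win_cols, True)
            + count_page_opens(n, win_rows, win_cols, False))


def best_window(n, page_elems):
    """Try every tile shape that fills one page and keep the cheapest."""
    shapes = [(h, page_elems // h)
              for h in range(1, page_elems + 1) if page_elems % h == 0]
    return min(shapes, key=lambda shape: total_opens(n, *shape))
```

Under this model, a 64 x 64 matrix with 64-element pages needs 4160 activations in the row-major layout (`total_opens(64, 1, 64)`) but only 1024 with 8 x 8 windows, and `best_window(64, 64)` selects the square window, where row and column scans touch the same number of pages.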
ISBN: 9781424472796 (print)
Insects build architecturally complex nests and search for remote food collaboratively despite their limited sensors, minimal individual intelligence and lack of a central control system. Insects' collaboration emerges as the response of individual insects to Stigmergy. This paper proposes a sign-based model of Stigmergy for discussing collaboration, taking the "sign" as the key notion for understanding it: the sign is the link between all the components of a Stigmergic complex adaptive system. Based on this understanding, we propose a definition that reveals the nature of signs and explore the significations and relationships carried by the notion of sign. A sign-based model of Stigmergy that captures the essentials of Stigmergy is then derived. A basic architecture of Stigmergy and its constituents are presented and discussed. Finally, some applications of the model are discussed.
The multi-island single electron transistor (MISET) is a kind of single electron transistor (SET) that has the advantage of operating at room temperature. A novel semi-empirical compact model for the MISET is proposed. The new approach combines the orthodox theory of single electron tunneling for a single Coulomb island with a novel empirical analysis for a chain of Coulomb islands. The model is verified against the Monte Carlo method in the SIMON simulator and is much faster than the traditional multi-island SET simulator, which makes it well suited to large-scale multi-island SET circuit simulation.
ISBN: 9781424477050 (print)
Encryption has become an important mechanism for securing data stored in outsourced databases. However, querying encrypted data efficiently is difficult, and the problem has attracted the attention of many researchers. To address it, an encrypted schema based on the PostgreSQL DBMS is proposed. Using a security dictionary and extended SQL, the approach implements encrypted storage and efficient queries over encrypted data in outsourced databases. Experimental results validate the efficiency and feasibility of our approach.
ISBN: 9781424435432 (print)
Single-electron transistors (SETs) are considered attractive candidates for post-CMOS VLSI because of their ultra-small size and low power consumption. Because SETs with a single island cannot normally work at room temperature, more and more researchers are studying SETs with one-dimensional multiple islands. This paper introduces a new simulation method, nSET, which can simulate SET devices with one-dimensional multiple islands with high speed and accuracy. A comparison with the classical Monte Carlo (MC) simulator shows that nSET is both accurate and fast, and that it is very useful for the ASIC design of SET devices.
We present a high-performance, memory-efficient hardware implementation of matrix multiplication for dense matrices of any size on FPGA devices. By applying a series of transformations and optimizations to the original serial algorithm, we obtain an I/O- and memory-optimized block algorithm for matrix multiplication on FPGAs. A linear array of processing elements (PEs) is proposed to implement this block algorithm. We show a significant reduction in hardware resource consumption compared with related work, along with an increase in clock frequency. Moreover, the memory requirement is reduced from O(S^2) to O(S), where S is the block size. Therefore, more PEs can be integrated into the same FPGA device.
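For readers unfamiliar with block algorithms, the following software sketch shows generic blocked matrix multiplication for matrices of any size. It only illustrates the blocking transformation. It does not model the linear PE array or the O(S) storage scheme of the paper, and the function name `blocked_matmul` is hypothetical.

```python
def blocked_matmul(a, b, block):
    """Multiply dense matrices a (m x k) and b (k x n) block by block.

    Matrices are lists of rows. Edge blocks may be smaller than `block`,
    so the dimensions need not be multiples of the block size.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            for k0 in range(0, k, block):
                # Accumulate the product of one block pair into a block of c.
                for i in range(i0, min(i0 + block, m)):
                    for kk in range(k0, min(k0 + block, k)):
                        a_ik = a[i][kk]
                        for j in range(j0, min(j0 + block, n)):
                            c[i][j] += a_ik * b[kk][j]
    return c
```

The three outer loops walk over blocks and the three inner loops work within one block pair, so at any moment the computation touches only one block each of `a`, `b` and `c`. That locality is what a block algorithm exploits to reduce I/O.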
In this paper, we introduce a generic model for the event matching problem in content-based publish/subscribe systems over structured P2P overlays. In this model, we claim that there are three methods (event-oriented, subscription-oriented and hybrid) for making all matched (event, subscription) pairs meet in a system. By theoretically analyzing the inherent problems of the event-oriented and subscription-oriented methods, we propose PEM (Popularity-based Event Matching), a variant of the hybrid method. PEM achieves a better trade-off between the event processing load and the subscription storage load of a system. PEM has been verified through both mathematical and simulation-based evaluation.