The influence of on-chip metal interconnections, power grids, heat sink together with packaging, and metal dummy fills on the transmission characteristics of a 2mm-long integrated dipole antenna pair has been investig...
In this paper, we explore a parallel block multigrid preconditioner based on a factorization of the coefficient matrix arising from three-dimensional unstructured grid systems. This preconditioner is robust with respect...
ISBN: 9781424435432 (print)
Single-electron transistors (SETs) are considered attractive candidates for post-CMOS VLSI because of their ultra-small size and low power consumption. Because single-island SETs cannot normally operate at room temperature, a growing number of researchers are studying SETs with one-dimensional multiple islands. This paper introduces a new simulation method, nSET. nSET can simulate SET devices with one-dimensional multiple islands with high speed and accuracy. A comparison with the classical Monte Carlo (MC) simulator shows that nSET is both accurate and fast, which makes it very useful for the ASIC design of SET devices.
Encryption technology has become an important mechanism for securing data stored in outsourced databases. However, querying encrypted data efficiently is difficult, and many researchers take it into conside...
According to Moore's law, the complexity of VLSI circuits has doubled approximately every two years, making simulation the major bottleneck in the circuit design process. Parallel and distributed sim...
ISBN: 9781424459889 (print)
Existing routing protocols for Wireless Mesh Networks (WMNs) are generally optimized using statistical link measures and do not address the intrinsic uncertainty of wireless links. We show evidence that, given the transient link uncertainties at the PHY and MAC layers, a pseudo-deterministic routing protocol that relies on average or historical statistics can hardly exploit the full potential of a multi-hop wireless mesh. We study optimal WMN routing using probing-based online anypath forwarding, with explicit consideration of transient link uncertainties. We show the underlying connection between WMN routing and the classic Canadian Traveller Problem (CTP) [1]. Inspired by a stochastic recoverable version of CTP (SRCTP), we develop a practical SRCTP-based online routing algorithm for uncertain links. We study how dynamic next-hop selection can be done at low cost, and derive a systematic selection order that minimizes transmission delay. We conduct simulation studies to verify the effectiveness of the SRCTP algorithms under diverse network configurations. In particular, compared with deterministic routing, we observe a reduction in end-to-end delay (51.15% to 73.02%) and an improvement in packet delivery ratio (99.76%).
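The idea of a selection order for probing next hops can be illustrated with a textbook rule; this is not the SRCTP algorithm from the paper, whose details the abstract does not give. The sketch assumes each candidate next hop has a fixed probe cost and an independent probability of being usable, and that probing stops at the first usable hop. Under those assumptions, sorting by cost divided by success probability minimizes the expected probing cost. All names and numbers below are hypothetical.

```python
from itertools import permutations

def expected_probe_cost(order):
    """Expected total probing cost when candidates are tried in `order`
    until the first one succeeds. Each candidate is (cost, success_prob)."""
    total = 0.0
    reach_prob = 1.0  # probability that every earlier probe failed
    for cost, p in order:
        total += reach_prob * cost
        reach_prob *= 1.0 - p
    return total

def ratio_order(candidates):
    """Sort candidates by cost / success probability, smallest ratio first."""
    return sorted(candidates, key=lambda c: c[0] / c[1])

# Hypothetical next hops: (probe cost in ms, probability the link is usable).
next_hops = [(4.0, 0.9), (1.0, 0.3), (2.0, 0.8), (6.0, 0.95)]

best = ratio_order(next_hops)
# Brute-force check over all orders (only feasible for a handful of hops).
brute_force_min = min(expected_probe_cost(p) for p in permutations(next_hops))

print(best)
print(expected_probe_cost(best), brute_force_min)
```

The ratio rule follows from an exchange argument on two adjacent candidates; a real protocol would also have to account for the remaining path delay behind each next hop, which this sketch ignores.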
In order to improve the efficiency of the communication networks, we used the Kruskal algorithm and the Prim algorithm through algorithm comparison and analysis methods of data structure. A dynamic framework for the c...
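As a reminder of what Kruskal's algorithm computes, here is a minimal sketch on a small, made-up network; the node numbering, link weights, and function name are illustrative assumptions and are not taken from the paper.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning tree.

    edges: list of (weight, u, v) with nodes numbered 0..num_nodes-1.
    Assumes the graph is connected.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for weight, u, v in sorted(edges):  # cheapest links first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # link joins two separate components
            parent[root_u] = root_v
            total += weight
            chosen.append((u, v, weight))
    return total, chosen

# Hypothetical 4-node network: (link cost, endpoint, endpoint).
links = [(4, 0, 1), (1, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst_weight, mst_edges = kruskal(4, links)
print(mst_weight, mst_edges)
```

Prim's algorithm produces a spanning tree of the same total weight by growing a single tree from a start node instead of merging components.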
Reputation systems provide a promising way to build trust relationships between users in distributed cooperation systems, such as file sharing, streaming, distributed computing and social network, through which a user...
ISBN: 9781424456789; 9780769539584 (print)
With the fast development of transistor technology, Graphics Processing Units (GPUs) are increasingly used in non-graphics applications, and major GPU hardware vendors have introduced software stacks for their own GPUs, such as Brook+ for AMD GPUs. Compared with traditional parallel systems, heterogeneous systems integrating stream-based multi-threaded GPUs provide higher parallel computing capability at lower cost. However, porting traditional applications to heterogeneous systems places new demands on application optimization for the GPU. Based on AMD's Brook+ platform, we explored application optimization features on an AMD GPU by implementing and optimizing the LBM benchmark from SPEC2006. To improve program locality, we optimized the original data layout of LBM. Using the short vector data types provided by Brook+, we also optimized the GPU's bandwidth utilization and the efficiency of its thread processors. Through branch elimination, we reduced the performance loss caused by branch divergence in the kernel, which results from the GPU's SIMD execution mode. The experimental results show that data layout, memory bandwidth, branch paths, and other factors strongly affect the performance of program execution on the GPU. With all optimizations applied, we obtained a speedup of 22x (single-precision) and 19x (double-precision) over the original serial benchmark code on a quad-core CPU, and a speedup of 4x (single-precision) and 8.7x (double-precision) over the original OMP benchmark code on an 8-core CPU.
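Branch elimination can be sketched as replacing an if/else with an arithmetic select, so that every SIMD lane executes the same instruction stream. The example below is a hypothetical illustration in plain Python, not the authors' Brook+ kernel; the two cell updates are simple stand-ins for LBM's obstacle and fluid cases, and Python itself gains nothing from the transformation.

```python
def update_branchy(cells, is_obstacle):
    """Per-cell update with a data-dependent branch (diverges on SIMD hardware)."""
    out = []
    for value, blocked in zip(cells, is_obstacle):
        if blocked:
            out.append(-value)        # stand-in for a bounce-back step
        else:
            out.append(0.5 * value)   # stand-in for a collision/relaxation step
    return out

def update_branchless(cells, is_obstacle):
    """Same result, but both paths are computed and blended with a 0/1 mask."""
    out = []
    for value, blocked in zip(cells, is_obstacle):
        mask = float(blocked)         # 1.0 for obstacle cells, 0.0 otherwise
        out.append(mask * (-value) + (1.0 - mask) * (0.5 * value))
    return out

cells = [1.0, 2.0, -4.0, 8.0]
flags = [False, True, False, True]
print(update_branchy(cells, flags))
print(update_branchless(cells, flags))
```

The branchless form does more arithmetic per cell, so it pays off only where divergence is the larger cost, as on SIMD-style GPU hardware.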
This paper quantitatively studies the effects of trace generation on the performance and accuracy of the BigSim Emulator, a scalable parallel emulator for large-scale computers. To assess the effect on accuracy, we modify the emulator code to collect the predicted computation time. Four MPI programs with different computation-to-communication ratios are used as benchmarks. The emulation time and the predicted computation time, with trace generation both enabled and disabled, are collected on two parallel host machines. The results show that although the BigSim Emulator traces only communication events and dependencies, trace generation still clearly degrades emulation performance for programs with high communication-to-computation ratios. Trace generation also significantly affects the accuracy of the predicted computation time for communication-intensive programs, an issue that cannot be overlooked.