There are many researches use peer-to-peer model to organize the Grid Information Service (GIS) and have been testified which be able to improve scalability and reliability of Grid environment. However, Data Grid Info...
详细信息
The paper presents a novel framework for scalable model checking of concurrent C programs. With the idea of verification reuse, it shows an integrated approach to efficient reduction of state space by abstraction, sym...
详细信息
The advancement in the process leads to more concern about the Single Event(SE) sensitivity of the Differential Cascade Voltage Switch Logic(DCVSL) circuits. The simulation results indicate that the Single Event Trans...
详细信息
The advancement in the process leads to more concern about the Single Event(SE) sensitivity of the Differential Cascade Voltage Switch Logic(DCVSL) circuits. The simulation results indicate that the Single Event Transient(SET) generated at the DCVSL gate is much larger than that at the ordinary CMOS gate, and their SET variation is different. Based on charge collection, in this paper, the effective collection time theory is proposed to set forth the SET pulse generated at the DCVSL gate. Through 3D TCAD mixed-mode simulation in 65 nm twin-well bulk CMOS process, the effects on SET variation of device parameters such as well contact size and environment parameters such as voltage are investigated.
Recent years,the hardening of combinational circuits is becoming a common *** the transistor-level hardening technique,the cell-level hardening technique,a divide and conquer strategy,can substantially make use of som...
详细信息
Recent years,the hardening of combinational circuits is becoming a common *** the transistor-level hardening technique,the cell-level hardening technique,a divide and conquer strategy,can substantially make use of some typical character in the cell-circuit module to mitigate single event transient(SET)*** mirror image(MI)technique proposed in this paper can adequately enhance the charge sharing in those cell-circuits with stage-by-stage inverter-like structure.3D TCAD mixed-mode simulation have been performed in 65 nm twinwell bulk CMOS process,the results indicate that the MI technique can almost reduce the SET pulse width from the anterior-stage PMOS over 25%,and can mitigate the SET pulse width from the posterior-stage PMOS about 10%.The MI technique,a represent of the cell-level technique,may be the future of the hardening of combinational circuits.
Recently, GPU has been widely used in High Performance Computing (HPC). In order to improve computational performance, several GPUs are integrated into one computer node in practical system. However, power consumption...
详细信息
ISBN:
(纸本)9783642283079;9783642283086
Recently, GPU has been widely used in High Performance Computing (HPC). In order to improve computational performance, several GPUs are integrated into one computer node in practical system. However, power consumption of GPUs is very high and becomes as bottleneck to its further development. In doing so, optimizing power consumption have been draw broad attention in the research area and industry community. In this paper, we present an energy optimization model considering performance constraint for homogeneous multi-GPUs, and propose a performance prediction model when task partitioning policy is specified. Experiment results validate that the model can accurately predict the execution of program for single or multiple GPUs, and thus reduce static power consumption by the guide of task partition.
On virtualization platforms, peak memory de- mand caused by hotspot applications often triggers page swapping in guest OS, causing performance degradation in- side and outside of this virtual machine (VM). Even thou...
详细信息
On virtualization platforms, peak memory de- mand caused by hotspot applications often triggers page swapping in guest OS, causing performance degradation in- side and outside of this virtual machine (VM). Even though host holds sufficient memory pages, guest OS is unable to utilize free pages in host directly due to the semantic gap between virtual machine monitor (MM) and guest operat- ing system (OS). Our work aims at utilizing the free memory scattered in multiple hosts in a virtualization environment to improve the performance of guest swapping in a transparent and implicit way. Based on the insightful analysis of behav- ioral characteristics of guest swapping, we design and im- plement a distributed and scalable framework HybridSwap. It dynamically constructs virtual swap pools using various policies, and builds up a synthetic swapping mechanism in a peer-to-peer way, which can adaptively choose different vir- tual swap pools. We implement the prototype of HybridSwap and evaluate it with some benchmarks in different scenar- ios. The evaluation results demonstrate that our solution has the ability to promote the guest swapping efficiency indeed and shows a double performance promotion in some cases. Even in the worst case, the system overhead brought by Hy- bridSwap is acceptable.
The Internet-based Virtual Computing Environment (iVCE) provides on-demand aggregation and autonomic collaboration mechanisms to facilitate the utilization of autonomous and dynamic Internet resources. Load balancing ...
详细信息
On June 17, 2013, MilkyWay-2 (Tianhe-2) supercomputer was crowned as the fastest supercomputer in the world on the 41th TOP500 list. This paper provides an overview of the MilkyWay-2 project and describes the design...
详细信息
On June 17, 2013, MilkyWay-2 (Tianhe-2) supercomputer was crowned as the fastest supercomputer in the world on the 41th TOP500 list. This paper provides an overview of the MilkyWay-2 project and describes the design of hardware and software systems. The key architecture features of MilkyWay-2 are highlighted, including neo-heterogeneous compute nodes integrating commodity- off-the-shelf processors and accelerators that share similar instruction set architecture, powerful networks that employ proprietary interconnection chips to support the massively parallel message-passing communications, proprietary 16- core processor designed for scientific computing, efficient software stacks that provide high performance file system, emerging programming model for heterogeneous systems, and intelligent system administration. We perform extensive evaluation with wide-ranging applications from LINPACK and Graph500 benchmarks to massively parallel software deployed in the system.
Interconnection network plays an important role in scalable high performance computer (HPC) systems. The TH Express-2 interconnect has been used in MilkyWay-2 system to provide high-bandwidth and low-latency interpr...
详细信息
Interconnection network plays an important role in scalable high performance computer (HPC) systems. The TH Express-2 interconnect has been used in MilkyWay-2 system to provide high-bandwidth and low-latency interprocessot communications, and continuous efforts are devoted to the development of our proprietary interconnect. This paper describes the state-of-the-art of our proprietary interconnect, especially emphasizing on the design of network interface. Several key features are introduced, such as user-level communication, remote direct memory access, offload collective operation, and hardware reliable end-to-end communication, etc. The design of a low level message passing infrastructures and an upper message passing services are also proposed. The preliminary performance results demonstrate the efficiency of the TH interconnect interface.
Edge extraction is an indispensable task in digital image processing. With the sharp increase in the image data, real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSo...
详细信息
Edge extraction is an indispensable task in digital image processing. With the sharp increase in the image data, real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSobel, a novel quantum image edge extraction algorithm is designed based on the flexible representation of quantum image(FRQI) and the famous edge extraction algorithm Sobel. Because FRQI utilizes the superposition state of qubit sequence to store all the pixels of an image, QSobel can calculate the Sobel gradients of the image intensity of all the pixels simultaneously. It is the main reason that QSobel can extract edges quite fast. Through designing and analyzing the quantum circuit of QSobel, we demonstrate that QSobel can extract edges in the computational complexity of O(n2) for a FRQI quantum image with a size of2 n × 2n. Compared with all the classical edge extraction algorithms and the existing quantum edge extraction algorithms, QSobel can utilize quantum parallel computation to reach a significant and exponential ***, QSobel would resolve the real-time problem of image edge extraction.
暂无评论