Search Results - Inner Mongolia University Library

The influences of model parameters on the characteristics of memristors

Chinese Physics B, 2012, Vol. 21, No. 4, pp. 576-585

Authors: 周静, 黄达; National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha 410073, China

As the fourth passive circuit component, a memristor is a nonlinear resistor that can "remember" the amount of charge passing through it. This ability to "remember" charge, together with non-volatility, makes memristors promising candidates in many fields. At present, only a few groups are able to fabricate memristors, and most researchers study them through theoretical analysis and simulation. In this paper, we first analyse the theoretical basis and characteristics of memristors, then use a simulation program with integrated circuit emphasis (SPICE) as our tool to simulate the theoretical model of memristors, changing the parameters in the model to see the influence of each parameter on the characteristics. Our work supplies researchers engaged in memristor-based circuits with advice on how to choose proper parameters.

Keywords: memristor; I-V characteristics; simulation program with integrated circuit emphasis


SPICE modeling of memristors with multilevel resistance states


Chinese Physics B, 2012, Vol. 21, No. 9, pp. 594-600

Authors: 方旭东, 唐玉华, 吴俊杰; National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology; Department of Computer Science and Technology, School of Computer, National University of Defense Technology

With CMOS technologies approaching the scaling ceiling, novel memory technologies have thrived in recent years, among which the memristor is a rather promising candidate for future resistive memory (RRAM). The memristor's potential to store multiple bits of information as different resistance levels allows its application in multilevel cell (MCL) technology, which can significantly increase memory capacity. However, most existing memristor models are built for binary or continuous memristance switching. In this paper, we propose simulation program with integrated circuits emphasis (SPICE) modeling of charge-controlled and flux-controlled memristors with multilevel resistance states based on the memristance versus state map. In our model, the memristance switches abruptly between neighboring resistance states. The proposed model allows users to easily set the number of resistance levels as a parameter, and makes the resistance switching time predictable if the input current/voltage waveform is given. The functionality of our models has been validated in HSPICE. The models can be used in multilevel RRAM modeling as well as in artificial neural network simulations.

Keywords: memristor; multilevel cell; SPICE model
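The abstract above describes a memristance that jumps abruptly between neighboring resistance levels, with a user-set number of levels and predictable switching times for a known input. The paper's SPICE netlist is not reproduced in this record, so the following is only a minimal Python sketch of that idea for a charge-controlled device. The evenly spaced levels, the evenly spaced charge thresholds, and all names (`make_multilevel_memristor`, `switching_times`, `r_on`, `r_off`, `q_max`) are illustrative assumptions, not the authors' model.

```python
from bisect import bisect_right


def make_multilevel_memristor(r_on, r_off, levels, q_max):
    """Build a staircase memristance-versus-charge map (hypothetical model).

    Assumptions: `levels` resistance states evenly spaced from r_off down
    to r_on, and charge thresholds evenly spaced over [0, q_max].
    Returns (memristance_fn, thresholds).
    """
    if levels < 2:
        raise ValueError("need at least two resistance levels")
    step_q = q_max / levels
    # Charge values at which the device jumps to the next state.
    thresholds = [k * step_q for k in range(1, levels)]
    # One resistance value per plateau, from r_off (state 0) to r_on.
    resistances = [
        r_off + (r_on - r_off) * k / (levels - 1) for k in range(levels)
    ]

    def memristance(q):
        # Index of the plateau containing q; saturates at both ends.
        return resistances[bisect_right(thresholds, q)]

    return memristance, thresholds


def switching_times(thresholds, current):
    """Times at which a constant current reaches each charge threshold."""
    return [q / current for q in thresholds]


if __name__ == "__main__":
    m, th = make_multilevel_memristor(1000.0, 4000.0, levels=4, q_max=4.0)
    print([m(q) for q in (0.5, 1.5, 2.5, 3.5)])
    print(switching_times(th, current=2.0))
```

With a constant input current the accumulated charge grows linearly, so each switching time is simply the threshold charge divided by the current. That is the sense in which switching times become predictable in this sketch.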


A Dynamic Replication Mechanism to Reduce Response-Time of I/O Operations in High Performance Computing Clusters


IEEE International Conference on Social Computing (SocialCom)

Authors: Ehsan Mousavi Khaneghah, Seyedeh Leili Mirtaheri, Lucio Grandinetti, Amir Saman Memaripour, Mohsen Sharifi; Center of High Performance Computing for Parallel and Distributed Processing, University of Calabria, Rende, Italy; School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

ISBN: (print) 9781479915194

Extraordinarily large datasets of high performance computing applications require improvement in existing storage and retrieval mechanisms. Moreover, the widening gap between data processing and I/O throughput will bind system performance to storage and retrieval operations and markedly reduce the overall performance of high performance computing clusters. File replication is a way to improve the performance of I/O operations and increase network utilization by storing several copies of every file. Furthermore, it leads to a more reliable and fault-tolerant storage cluster. In order to improve the response time of I/O operations, we have proposed a mechanism that estimates the required number of replicas for each file based on its popularity. In addition, the remaining space of the storage cluster is considered in the evaluation of replication factors, and the number of replicas is adapted to the storage state. We have implemented the proposed mechanism using HDFS and evaluated it using the MapReduce framework. Evaluation results demonstrate its capability to improve the response time of read operations and increase network utilization. Consequently, this mechanism reduces the overall response time of read operations by considering files' popularity in the replication process, and adapts the replication factor to the cluster state.

Keywords: Time factors; High performance computing; Bandwidth; Reliability; Throughput; System performance; History


A mathematical model for empowerment of Beowulf clusters for exascale computing


International Conference on High Performance Computing & Simulation (HPCS)

Authors: Seyedeh Leili Mirtaheri, Ehsan Mousavi Khaneghah, Lucio Grandinetti, Mohsen Sharifi; Center of High Performance Computing for Parallel and Distributed Processing, University of Calabria, Rende, Italy; School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran

High-performance computing (HPC) clusters currently face two major challenges if they are to be useful for exascale computing: the dynamic nature of the new generation of applications and the heterogeneity of platforms. Processes running these applications may well make unpredictable demands and require changes to system configuration and capabilities at runtime, thereby requiring fast system response without sacrificing the transparency and integrity of the reconfigured empowered system that is running on a heterogeneous platform. While a challenge in and of itself, platform heterogeneity is both useful and instrumental in the handling of unpredictable requests. The realization of such a dynamically reconfigurable and heterogeneous HPC cluster system for exascale computing requires a model to guide running processes to determine whether they need empowerment of the current cluster and, if so, by how much. To show the feasibility of empowering traditional HPC clusters for exascale computing, we have selected Beowulf as a noble candidate cluster and present a mathematical model for the empowerment of Beowulf clusters for exascale computing (EBEC). We have developed the model in line with Beowulf's cluster approach and by using vector space algebra. In contrast to traditional hardware-oriented approaches to improving the performance of clusters, we take a software approach to the development of the proposed model by emphasizing processes, which act as the creators of the cluster and thus should decide on system (re)configuration, as the principal building blocks of the system. We have also adopted a new approach to heterogeneity by considering heterogeneity at different levels, including hardware, system software, application software, and system functionality. In addition to support for heterogeneity and dynamic reconfiguration, the proposed model includes support for scalability, which is also crucial to exascale computing.

Keywords: Vectors; Mathematical model; Hardware; Runtime; Computers; Computational modeling


PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs


Journal of Computer Science & Technology, 2012, Vol. 27, No. 2, pp. 240-255

Authors: 徐新海, 杨学军, 薛京灵, 林宇斐, 林一松; National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology; Programming Languages and Compilers Group, School of Computer Science and Engineering, University of New South Wales

GPGPUs are increasingly being used as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world's fastest supercomputer in the TOP500 list, built at NUDT (National University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerant mechanisms to offer the reliability guarantees required by many HPC applications. By analyzing the SIMT (single-instruction, multiple-thread) characteristics of programs running on GPGPUs, we have developed PartialRC, a new checkpoint-based compiler-directed partial recomputing method, for achieving efficient fault recovery by leveraging the phenomenal computing power of GPGPUs. In this paper, we introduce our PartialRC method, which recovers from errors detected in a code region by partially re-computing the region, describe a checkpoint-based fault-tolerance framework developed on PartialRC, and discuss an implementation on the CUDA platform. Validation using a range of representative CUDA programs on NVIDIA GPGPUs against FullRC (a traditional full-recomputing Checkpoint-Rollback-Restart fault recovery method for CPUs) shows that PartialRC significantly reduces the fault recovery overheads incurred by FullRC, by 73.5% when errors occur earlier during execution and 74.6% when errors occur later, on average. In addition, PartialRC also reduces the error detection overheads incurred by FullRC during fault recovery while incurring negligible performance overheads when no fault happens.

Keywords: GPGPU; partial recomputing; fault tolerance; CUDA; checkpointing


MPtoStream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems


Science China (Information Sciences), 2012, Vol. 55, No. 9, pp. 1961-1971

Authors: YANG XueJun, TANG Tao, WANG GuiBin, JIA Jia & XU XinHai; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

In light of GPUs’ powerful floating-point operation capacity, heterogeneous parallel systems incorporating general purpose CPUs and GPUs have become a highlight in the research field of high performance computing (HPC). However, due to the complexity of programming on GPUs, porting a large number of existing scientific computing applications to the heterogeneous parallel systems remains a big *** OpenMP programming interface is widely adopted on multi-core CPUs in the field of scientific *** effectively inherit existing OpenMP applications and reduce the transplant cost, we extend OpenMP with a group of compiler directives, which explicitly divide tasks among the CPU and the GPU, and map time-consuming computing fragments to run on the GPU, thus dramatically simplifying the *** have designed and implemented MPtoStream, a compiler of the extended OpenMP for AMD’s stream processing *** experimental results show that programming with the extended directives deviates from programming with OpenMP by less than 11% modification and achieves significant speedup ranging from 3.1 to 17.3 on a heterogeneous system, incorporating an Intel Xeon E5405 CPU and an AMD FireStream 9250 GPU, over the execution on the Xeon CPU alone.

Keywords: GPGPU; stream; OpenMP; compiler


Latency-Aware Dynamic Voltage and Frequency Scaling on Many-Core Architectures for Data-Intensive Applications


International Conference on Cloud Computing and Big Data (CloudCom-Asia)

Authors: Zhiquan Lai, King Tin Lam, Cho-Li Wang, Jinshu Su, Youliang Yan, Wangbin Zhu; National University of Defense Technology, China; The University of Hong Kong, Hong Kong; National Key Laboratory of Parallel and Distributed Processing (PDL), China; Huawei Technologies Co. Ltd., Shenzhen, China

Low power is a first-class design requirement for HPC systems. Dynamic voltage and frequency scaling (DVFS) has become a commonly used and efficient technology for achieving a trade-off between power consumption and system performance. However, most of the prior work using DVFS did not take into account the latency of voltage/frequency scaling, which is a critical factor in real hardware that determines the power efficiency of the power management algorithm. In this paper, we first investigate the latency features of DVFS on a real many-core hardware platform. Second, we propose a latency-aware DVFS algorithm for profile-based power management to avoid aggressive power state transitions. Finally, we evaluate our algorithm on the Intel SCC platform using a data-intensive benchmark, the Graph 500 benchmark. The experimental results not only show impressive potential for energy saving in data-intensive applications (up to 31% energy saving and 60% EDP reduction), but also demonstrate the efficiency of our latency-aware DVFS algorithm, which achieves 12.0% extra energy saving and 5.0% extra EDP reduction and, moreover, increases execution performance by 22.4%.

Keywords: Benchmark testing; Runtime; Hardware; Heuristic algorithms; Time-frequency analysis; Voltage measurement


A fast successive over-relaxation algorithm for force-directed network graph drawing


Science China (Information Sciences), 2012, Vol. 55, No. 3, pp. 677-688

Authors: WANG YongXian & WANG ZhengHua; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

The force-directed approach is one of the most widely used methods in graph drawing research. There are two main problems with traditional force-directed algorithms. First, there is no mature theory to ensure the convergence of the iteration sequence used in the algorithm and, further, it is hard to estimate the rate of convergence even if convergence is guaranteed. Second, the running time increases intolerably when drawing large-scale graphs, and therefore the advantages of the force-directed approach are limited in practice. This paper focuses on these problems and presents a sufficient condition for ensuring the convergence of iterations. We then develop a practical heuristic algorithm for speeding up the iteration in the force-directed approach using a successive over-relaxation (SOR) strategy. The results of computational tests on several benchmark graph datasets widely used in graph drawing research show that our algorithm can dramatically improve the performance of the force-directed approach by decreasing both the number of iterations and the running time, and is 1.5 times faster than the latter on average.

Keywords: graph drawing; graph layout; successive over-relaxation; force-directed algorithm
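The abstract above names successive over-relaxation (SOR) as the acceleration strategy but does not give the layout-specific update rule. As a reminder of the general technique only, here is a minimal sketch of textbook SOR applied to a small linear system A x = b. The function name `sor_solve` and its default parameters are assumptions made for illustration, and this is not the authors' graph-drawing heuristic.

```python
def sor_solve(a, b, omega=1.25, tol=1e-10, max_iter=10_000):
    """Solve a x = b with successive over-relaxation (generic sketch).

    Each component is moved from its old value toward the Gauss-Seidel
    value, scaled by the relaxation factor omega. omega = 1 gives plain
    Gauss-Seidel; 1 < omega < 2 over-relaxes.
    Returns (solution, iterations_used).
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Uses already-updated entries of x for j < i.
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / a[i][i]
            new_value = x[i] + omega * (gauss_seidel - x[i])
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x, iteration
    raise RuntimeError("SOR did not converge within max_iter sweeps")


if __name__ == "__main__":
    matrix = [[4.0, 1.0], [1.0, 3.0]]
    rhs = [1.0, 2.0]
    solution, sweeps = sor_solve(matrix, rhs)
    print(solution, sweeps)
```

For a symmetric positive definite matrix such as the one in the usage example, SOR converges for any relaxation factor strictly between 0 and 2. How many sweeps it takes depends on the choice of omega.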


Static Power Optimization for Homogeneous Multiple GPUs Based on Task Partition


2nd International Congress on Computer Applications and Computational Science (CACS 2011)

Authors: Lin, Yisong; Tang, Tao; Wang, Guibin. National Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

ISBN: (print) 9783642283079; 9783642283086

Recently, GPUs have been widely used in High Performance Computing (HPC). To improve computational performance, several GPUs are integrated into one computer node in practical systems. However, the power consumption of GPUs is very high and has become a bottleneck to their further development. As a result, optimizing power consumption has drawn broad attention in the research and industry communities. In this paper, we present an energy optimization model considering performance constraints for homogeneous multi-GPUs, and propose a performance prediction model for when the task partitioning policy is specified. Experimental results validate that the model can accurately predict the execution of programs for single or multiple GPUs, and thus reduce static power consumption under the guidance of task partitioning.

Keywords: Electric power utilization


Speculative symbolic execution


2012 IEEE 23rd International Symposium on Software Reliability Engineering, ISSRE 2012

Authors: Zhang, Yufeng; Chen, Zhenbang; Wang, Ji. National Laboratory for Parallel and Distributed Processing, Department of Computing Science, National University of Defense Technology, Changsha, China

ISBN: (print) 9780769548883

Symbolic execution is an effective path-oriented and constraint-based program analysis technique. Recently, there has been significant development in the research and application of symbolic execution. However, symbolic execution still suffers from a scalability problem in practice, especially when applied to large-scale or very complex programs. In this paper, we propose a new form of symbolic execution, named Speculative Symbolic Execution (SSE), to speed up symbolic execution by reducing the number of constraint solver invocations. In SSE, when encountering a branch statement, the search procedure may speculatively explore the branch without regard to its feasibility. The constraint solver is invoked only when the speculated branches have accumulated to a specified number. In addition, we present a key optimization technique that enhances SSE greatly. We have implemented SSE and the optimization technique on Symbolic Pathfinder (SPF). Experimental results on six programs show that our method can reduce the number of constraint solver invocations by 20.7% to 48.7% (with an average of 29.9%), and reduce the search time by 23.6% to 43.6% (with an average of 30%). © 2012 IEEE.

Keywords: Java programming language

