检索结果-内蒙古大学图书馆

SPICE modeling of memristors with multilevel resistance states

Chinese Physics B 2012年第9期21卷 594-600页

作者：方旭东唐玉华吴俊杰 National Laboratory for Parallel and Distributed Processing School of ComputerNational University of Defense Technology Department of Computer Science and Technology School of ComputerNational University of Defense Technology

With CMOS technologies approaching the scaling ceiling, novel memory technologies have thrived in recent years, among which the memristor is a rather promising candidate for future resistive memory （RRAM）. Memristor＇s potential to store multiple bits of information as different resistance levels allows its application in multilevel cell （MCL） tech- nology, which can significantly increase the memory capacity. However, most existing memristor models are built for binary or continuous memristance switching. In this paper, we propose the simulation program with integrated circuits emphasis （SPICE） modeling of charge-controlled and flux-controlled memristors with multilevel resistance states based on the memristance versus state map. In our model, the memristance switches abruptly between neighboring resistance states. The proposed model allows users to easily set the number of the resistance levels as parameters, and provides the predictability of resistance switching time if the input current/voltage waveform is given. The functionality of our models has been validated in HSPICE. The models can be used in multilevel RRAM modeling as well as in artificial neural network simulations.

关键词： memristor multilevel cell SPICE model

来源：评论

学校读者我要写书评

暂无评论

Adaptive Linearized Alternating Direction Method of Multipliers for Non-Convex Compositely Regularized Optimization Problems

引用

Tsinghua Science and Technology 2017年第3期22卷 328-341页

作者： Linbo Qiao Bofeng Zhang Xicheng Lu Jinshu Su College of Computer National University of Defense Technologyand National Laboratory for Parallel and Distributed ProcessingNational University of Defense TechnologyChangsha 410073China College of Computer National University of Defense TechnologyChangsha 410073China

We consider a wide range of non-convex regularized minimization problems, where the non-convex regularization term is composite with a linear function engaged in sparse learning. Recent theoretical investigations have demonstrated their superiority over their convex counterparts. The computational challenge lies in the fact that the proximal mapping associated with non-convex regularization is not easily obtained due to the imposed linear composition. Fortunately, the problem structure allows one to introduce an auxiliary variable and reformulate it as an optimization problem with linear constraints, which can be solved using the Linearized Alternating Direction Method of Multipliers （LADMM）. Despite the success of LADMM in practice, it remains unknown whether LADMM is convergent in solving such non-convex compositely regularized optimizations. In this research, we first present a detailed convergence analysis of the LADMM algorithm for solving a non-convex compositely regularized optimization problem with a large class of non-convex penalties. Furthermore, we propose an Adaptive LADMM （AdaLADMM） algorithm with a line-search criterion. Experimental results on different genres of datasets validate the efficacy of the proposed algorithm.

关键词： adaptive linearized alternating direction method of multipliers non-convex compositely regularizedoptimization cappled-ll regularized logistic regression

来源：评论

学校读者我要写书评

暂无评论

QSobel:A novel quantum image edge extraction algorithm

引用

Science China(Information Sciences) 2015年第1期58卷 107-119页

作者： ZHANG Yi LU Kai GAO YingHui Science and Technology on Parallel and Distributed Processing Laboratory National University of Defense Technology College of Computer National University of Defense Technology College of Electronic Science and Engineering National University of Defense Technology

Edge extraction is an indispensable task in digital image processing. With the sharp increase in the image data, real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSobel, a novel quantum image edge extraction algorithm is designed based on the flexible representation of quantum image(FRQI) and the famous edge extraction algorithm Sobel. Because FRQI utilizes the superposition state of qubit sequence to store all the pixels of an image, QSobel can calculate the Sobel gradients of the image intensity of all the pixels simultaneously. It is the main reason that QSobel can extract edges quite fast. Through designing and analyzing the quantum circuit of QSobel, we demonstrate that QSobel can extract edges in the computational complexity of O(n2) for a FRQI quantum image with a size of2 n × 2n. Compared with all the classical edge extraction algorithms and the existing quantum edge extraction algorithms, QSobel can utilize quantum parallel computation to reach a significant and exponential ***, QSobel would resolve the real-time problem of image edge extraction.

关键词： edge extraction quantum image processing FRQI Sobel computational complexity

来源：评论

学校读者我要写书评

暂无评论

Cost-effective SET-tolerant clock distribution network design by mitigating single event transient propagation

引用

Science China(Information Sciences) 2017年第10期60卷 261-263页

作者： Peipei HAO Shuming CHEN Pengcheng HUANG Jianjun CHEN Bin LIANG Institute of Microelectronics College of ComputerNational University of Defense Technology National Laboratory for Paralleling and Distributed Processing College of ComputerNational University of Defense Technology

It has been shown that clock distribution networks(CDNs)are becoming increasingly vulnerable to transient faults known as single event transients(SETs),owing to technology scaling[1].In the deep submicron regime,CDNs ... 详细信息

关键词： SET In

来源：评论

学校读者我要写书评

暂无评论

Fast image matching algorithm based on affine invariants

引用

Journal of Central South University 2014年第5期21卷 1907-1918页

作者：张毅卢凯高颖慧 National Laboratory for Parallel and Distributed Processing(National University of Defense Technology) College of Computer National University of Defense Technology College of Electronic Science and Engineering National University of Defense Technology

Feature-based image matching algorithms play an indispensable role in automatic target recognition （ATR）. In this work, a fast image matching algorithm （FIMA） is proposed which utilizes the geometry feature of extended centroid （EC） to build affine invariants. Based on at-fine invariants of the length ratio of two parallel line segments, FIMA overcomes the invalidation problem of the state-of-the-art algorithms based on affine geometry features, and increases the feature diversity of different targets, thus reducing misjudgment rate during recognizing targets. However, it is found that FIMA suffers from the parallelogram contour problem and the coincidence invalidation. An advanced FIMA is designed to cope with these problems. Experiments prove that the proposed algorithms have better robustness for Gaussian noise, gray-scale change, contrast change, illumination and small three-dimensional rotation. Compared with the latest fast image matching algorithms based on geometry features, FIMA reaches the speedup of approximate 1.75 times. Thus, FIMA would be more suitable for actual ATR applications.

关键词： affine invariants image matching extended centroid robustness performance

来源：评论

学校读者我要写书评

暂无评论

MilkyWay-2 supercomputer： system and application

引用

Frontiers of computer Science 2014年第3期8卷 345-356页

作者： Xiangke LIAO Liquan XIAO Canqun YANG Yutong LU Science and Technology on Parallel and Distributed Processing Laboratory National University of Defense Technology Changsha 410073 China College of Computer National University of Defense Technology Changsha 410073 China

On June 17, 2013, MilkyWay-2 （Tianhe-2） supercomputer was crowned as the fastest supercomputer in the world on the 41th TOP500 list. This paper provides an overview of the MilkyWay-2 project and describes the design of hardware and software systems. The key architecture features of MilkyWay-2 are highlighted, including neo-heterogeneous compute nodes integrating commodity- off-the-shelf processors and accelerators that share similar instruction set architecture, powerful networks that employ proprietary interconnection chips to support the massively parallel message-passing communications, proprietary 16- core processor designed for scientific computing, efficient software stacks that provide high performance file system, emerging programming model for heterogeneous systems, and intelligent system administration. We perform extensive evaluation with wide-ranging applications from LINPACK and Graph500 benchmarks to massively parallel software deployed in the system.

关键词： MilkyWay-2 supercomputer petaflops computing neo-heterogeneous architecture interconnect network heterogeneous programing model system management benchmark optimization performance evaluation

来源：评论

学校读者我要写书评

暂无评论

The TH Express high performance interconnect networks

引用

Frontiers of computer Science 2014年第3期8卷 357-366页

作者： Zhengbin PANG Min XIE Jun ZHANG Yi ZHENG Guibin WANG Dezun DONG Guang SUO Science and Technology on Parallel and Distributed Processing Laboratory National University of Defense Technology Changsha 410073 China College of Computer National University of Defense Technology Changsha 410073 China

Interconnection network plays an important role in scalable high performance computer （HPC） systems. The TH Express-2 interconnect has been used in MilkyWay-2 system to provide high-bandwidth and low-latency interprocessot communications, and continuous efforts are devoted to the development of our proprietary interconnect. This paper describes the state-of-the-art of our proprietary interconnect, especially emphasizing on the design of network interface. Several key features are introduced, such as user-level communication, remote direct memory access, offload collective operation, and hardware reliable end-to-end communication, etc. The design of a low level message passing infrastructures and an upper message passing services are also proposed. The preliminary performance results demonstrate the efficiency of the TH interconnect interface.

关键词： HPC network interface chip (NIC) TH Express nterconnect offload collective operation

来源：评论

学校读者我要写书评

暂无评论

Internet-based virtual computing environment(iVCE):Concepts and architecture

引用

Science in China(Series F) 2006年第6期49卷 681-701页

作者： LU Xicheng WANG Huaimin WANG Ji National Laboratory for Parallel and Distributed Processing Changsha 410073 China College of Computer National University of Defense Technology Changsha 410073 China

Resources over Internet have such intrinsic characteristics as growth, autonomy and diversity, which have brought many challenges to the efficient sharing and comprehensive utilization of these resources. This paper presents a novel approach for the construction of the Internet-based Virtual Computing Environment （iVCE）, whose sig- nificant mechanisms are on-demand aggregation and autonomic collaboration. The iVCE is built on the open infrastructure of the Internet and provides harmonious, transparent and integrated services for end-users and applications. The concept of iVCE is presented and its architectural framework is described by introducing three core concepts, i.e., autonomic element, virtual commonwealth and virtual executor. Then the connotations, functions and related key technologies of each components of the architecture are deeply analyzed with a case study, iVCE for Memory.

关键词： resource aggregation collaboration virtual computing environment Internet

来源：评论

学校读者我要写书评

暂无评论

GPU acceleration of subgraph isomorphism search in large scale graph

引用

Journal of Central South University 2015年第6期22卷 2238-2249页

作者：杨博卢凯高颖慧王小平徐凯 Science and Technology on Parallel and Distributed Processing Laboratory National University of Defense Technology College of Computer National University of Defense Technology Department of Electronic Science and Engineering National University of Defense Technology

A novel framework for parallel subgraph isomorphism on GPUs is proposed, named GPUSI, which consists of GPU region exploration and GPU subgraph matching. The GPUSI iteratively enumerates subgraph instances and solves the subgraph isomorphism in a divide-and-conquer fashion. The framework completely relies on the graph traversal, and avoids the explicit join operation. Moreover, in order to improve its performance, a task-queue based method and the virtual-CSR graph structure are used to balance the workload among warps, and warp-centric programming model is used to balance the workload among threads in a warp. The prototype of GPUSI is implemented, and comprehensive experiments of various graph isomorphism operations are carried on diverse large graphs. The experiments clearly demonstrate that GPUSI has good scalability and can achieve speed-up of 1.4–2.6 compared to the state-of-the-art solutions.

关键词： parallel graph isomorphism GPU backtrack paradigm

来源：评论

学校读者我要写书评

暂无评论

Combining model and iterative compilation for program performance optimization

Journal of Software

引用

Journal of Software 2009年第3期4卷 240-246页

作者： Lu, Pingjing Che, Yonggang Wang, Zhenghua National Laboratory for Parallel and Distributed Processing School of Computer National University of Defense Technology Changsha 410073 China

The performance gap for high performance applications has been widening over time. High level program transformations are critical to improve applications' performance, many of which concern the determination of optimal values for transformation parameters, such as loop unrolling and blocking. Static approaches achieve these values based on analytical models that are hard to achieve because of increasing architecture complexity and code structures. Recent iterative compilation approaches achieve it by executing different versions of the program on actual platforms and select the one that renders best performance, outperforming static compilation approaches significantly. But the expensive compilation cost has limited their application scope to embedded applications and a small group of math kernels. This paper proposes a combinative approach--Combining Model and Iterative Compilation for Program Performance Optimization (CMIC). Such an approach first constructs a program optimization transformation model based on hardware performance counters to decide how and when to apply transformations, and then selects the optimal transformation parameters using Nelder-Mead simplex algorithm. Experimental results show that our approach can effectively improve programs' floating-point performance, reducing programs' runtime, therefore, lessening the performance gap for high-performance applications. © 2009 ACADEMY PUBLISHER.

关键词： Digital arithmetic

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：