Search Results - Inner Mongolia University Library

Green challenges to system software in data centers

Frontiers of Computer Science in China, 2011, Vol. 5, No. 3, pp. 353-368

Authors: Yuzhong SUN, Yiqiang ZHAO, Ying SONG, Yajun YANG, Haifeng FANG, Hongyong ZANG, Yaqiong LI, Yunwei GAO. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Graduate University of Chinese Academy of Sciences, Beijing 100190, China

With the increasing demand and the wide application of high performance commodity multi-core processors, both the quantity and scale of data centers grow dramatically and they bring heavy energy *** and engineers have applied much effort to reducing hardware energy consumption, but software is the true consumer of power and another key in making better use of *** software is critical to better energy utilization, because it is not only the manager of hardware but also the bridge and platform between applications and *** this paper, we summarize some trends that can affect the efficiency of data ***, we investigate the causes of software *** on these studies, major technical challenges and corresponding possible solutions to attain green system software in programmability, scalability, efficiency and software architecture are ***, some of our research progress on trusted energy efficient system software is briefly introduced.

Keywords: green software multi-core data center power efficient system software


The Godson Processors: Its Research, Development, and Contributions

Journal of Computer Science & Technology, 2011, Vol. 26, No. 3, pp. 363-372

Authors: 胡伟武, 高燕萍, 陈天石, 肖俊华. Affiliations: Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences; Loongson Technologies Corporation Limited; Graduate University of Chinese Academy of Sciences

The Godson project, with an R&D history of 10 years, is an independent national program of China that aims at developing advanced microprocessor technologies based on fundamental research and commercialization of the chip technology. We give a comprehensive presentation of the Godson project, including its history, technical roadmaps, and several unique technical merits.

Keywords: IT industry CPU research and development Godson microprocessor XPU system on chip


Revisiting Multiple Pattern Matching Algorithms for Multi-Core Architecture

Journal of Computer Science & Technology, 2011, Vol. 26, No. 5, pp. 866-874

Authors: 谭光明, 刘萍, 卜东波, 刘燕兵. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Key Laboratory of Network Technology, Institute of Computing Technology, Chinese Academy of Sciences

Due to the huge size of patterns to be searched, multiple pattern searching remains a challenge to several newly-arising applications like network intrusion *** this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core *** observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of *** multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is *** formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization *** results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.

Keywords: parallel algorithm multi-core multiple pattern matching


Design for Testability Features of Godson-3 Multicore Microprocessor

Journal of Computer Science & Technology, 2011, Vol. 26, No. 2, pp. 302-313

Authors: 齐子初, 刘慧, 李向库, 胡伟武. Affiliations: Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences; Institute of Computing Technology, Chinese Academy of Sciences; Loongson Technologies Corporation Limited

This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, which is a scalable multicore processor based on the scalable mesh of crossbar (SMOC) on-chip network and targets high-end applications. Advanced techniques are adopted to make the DFT design scalable and achieve low-power and low-cost test with limited IO resources. To achieve scalable and flexible test access, a highly elaborate test access mechanism (TAM) is implemented to support multiple test instructions and test modes. Taking advantage of the multiple identical cores embedded in the processor, scan partition and on-chip comparisons are employed to reduce test power and test time. A test compression technique is also utilized to decrease test time. To further reduce test power, clock control logic is designed with the ability to turn off the clocks of non-testing partitions. In addition, scan collars of CACHEs are designed to perform functional test with low-speed ATE for speed-binning purposes, which poses low complexity and has good correlation results.

Keywords: DFT (design for testability) TAM (test access mechanism) multicore processor low power test


Dawning Nebulae: A PetaFLOPS Supercomputer with a Heterogeneous Structure

Journal of Computer Science & Technology, 2011, Vol. 26, No. 3, pp. 352-362

Authors: 孙凝辉, 邢晶, 霍志刚, 谭光明, 熊劲, 李波, 马灿. Affiliations: Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences; Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences

Dawning Nebulae is a heterogeneous system composed of 9280 multi-core x86 CPUs and 4640 NVIDIA Fermi GPUs. With a Linpack performance of 1.271 petaFLOPS, it was ranked second in the TOP500 List released in June 2010. In this paper, key issues in the system design of Dawning Nebulae are introduced. System tuning methodologies aiming at a petaFLOPS Linpack result are presented, including algorithmic optimization and communication improvement. The design of its file I/O subsystem, including HVFS and the underlying DCFS3, is also described. Performance evaluations show that the Linpack efficiency of each node reaches 69.89%, and 1024-node aggregate read and write bandwidths exceed 100 GB/s and 70 GB/s respectively. The success of Dawning Nebulae has demonstrated the viability of the CPU/GPU heterogeneous structure for future designs of supercomputers.

Keywords: supercomputer heterogeneous systems performance evaluation


Characterization and capacity evaluation of body-to-body channels using MIMO antennas

European Conference on Antennas and Propagation, EuCAP

Authors: Khalida Ghanem, H. AlQuwaiee, R. Fouad, N. Abu Khamis. Affiliations: System Architecture Laboratory, Center of development of advanced techniques (CDTA), Algiers, Algeria; College of Electrical Engineering, King Abdullah University of Science and Technology, Jeddah, KSA; College of Computer Engineering, Prince Mohammed Bin Fahd University, Al Khobar, KSA

A comparison of on-body and body-to-body channels in an indoor, highly scattered environment is performed through the characterization and the evaluation of the achievable capacity when using a MIMO PIFA array system. For the on-body channels, the belt-head channel offers a better capacity than the belt-chest channel at high SNR because of its richer scattering quality. However, the presence of a high LOS signal compensates for such a limitation and allows the belt-chest channel to yield a capacity similar to that of the belt-head channel at low SNR. The body-to-body belt-belt and belt-head channels yield the same capacity values because they exhibit the same statistical parameters. Their average capacity is comparable to the on-body belt-chest channel, which is viable in high-data communications.

Keywords: Portable document format IEEE Xplore


Using Data-Level Parallelism to Accelerate Instruction-Level Redundancy

World Automation Congress

Authors: Yu Hu, Zhongliang Chen, Xiaowei Li. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, CAS, Beijing 100190, China; Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA

ISBN: (print) 9781467344975

Instruction-level redundancy is an effective scheme to reduce the susceptibility of microprocessors to soft errors, offering high error detection and recovery capability; however, it usually incurs significant performance degradation due to resource racing. Motivated by the fact that narrow-width operands are commonly seen in applications, we exploit data-level parallelism to accelerate instruction-level redundancy. For the instructions within the sphere of replication (SoR) of data-level redundancy, normal and redundant versions of the narrow-width operand of the instruction are folded into one register to share the same functional unit during execution, hence alleviating resource racing. The other instructions are all protected by instruction-level redundancy. We run SPECint2000 benchmarks on a modified version of the SimpleScalar simulator, and synthesize the extra hardware to evaluate the area overhead of the proposed pipeline. Experimental results show that our acceleration scheme outperforms conventional instruction-level redundancy by 13% in IPC. Besides, the extra area overhead is negligible.

Keywords: Data-level redundancy; Instruction-level redundancy; Narrow-width value; Sphere of replication


Computation pattern driven reuse of manual optimizations for GPGPUs

2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011

Authors: Xu, Shixiong; Han, Dongni; Chen, Li. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, University of Chinese Beijing China

ISBN: (print) 9780769545646

The wide application of General Purpose Graphic Processing Units (GPGPUs) results in large manual efforts on porting and optimizing algorithms on them. However, most existing automatic ways of generating GPGPU code fail to apply optimization strategies tailored to a specific computation and to reuse constantly evolving manual optimizations. In this paper, we present a computation pattern driven approach for computation-specific GPGPU code generation and optimization, which in turn reuses manual optimizations to a certain extent. We suggest language extensions to OpenMP, high-level data structure attributes, in order to assist the process of computation pattern matching and to give users intuitive performance tuning parameters in terms of data structure attributes. We illustrate the feasibility of this approach through three important computation dwarfs, which are dense matrix, sparse matrix, and structured mesh computation in scientific computing. We also build a prototype OpenMP-to-CUDA translator that consists of computation pattern recognition and code template instantiation. The experimental results demonstrate the performance benefits of the computation pattern driven method. To the best of our knowledge, it is the first work on reusing manual optimizations for GPGPUs with a computation pattern driven approach. © 2011 IEEE.

Keywords: Data structures


Virtual resource monitoring in cloud computing

Journal of Shanghai University (English Edition), 2011, Vol. 15, No. 5, pp. 381-385

Authors: 韩芳芳, 彭俊杰, 张武, 李青, 李建敦, 江钦龙, 袁勤. Affiliations: School of Computer Engineering and Science, Shanghai University; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Science

Cloud computing is a new computing model. Its resource monitoring tools are immature compared with those of traditional distributed computing and grid computing. In order to better monitor the virtual resource in cloud computing, a periodically and event-driven push (PEP) monitoring model is proposed. Taking advantage of the push and event-driven mechanism, the model can provide comparatively adequate information about usage and status of the resources. It can simplify the communication between Master and Work Nodes without missing important events that occur during the push interval. In addition, we develop "mon" to make up for the deficiency of Libvirt in monitoring virtual CPU and memory.

Keywords: virtual resource monitoring cloud computing virtualization


New methodologies for parallel architecture

Authors: Fan, Dong-Rui; Li, Xiao-Wei; Li, Guo-Jie. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Moore's law continues to grant computer architects ever more transistors in the foreseeable future, and parallelism is the key to continued performance scaling in modern microprocessors. In this paper, the achievements of our research project on parallel architecture, which is supported by the National Basic Research 973 Program of China, are systematically presented. The innovative approaches and techniques to solve the significant problems in parallel architecture design are summarized, including architecture level optimization, compiler and language-supported technologies, reliability, power-performance efficient design, test and verification challenges, and platform building. Two prototype chips, a multi-heavy-core Godson-3 and a many-light-core Godson-T, are described to demonstrate the highly scalable and reconfigurable parallel architecture designs. We also present some of our achievements appearing in ISCA, MICRO, ISSCC, HPCA, PLDI, PACT, IJCAI, Hot Chips, DATE, IEEE Trans. VLSI, IEEE Micro, IEEE Trans. Computers, etc. © 2011 Springer Science+Business Media, LLC & Science Press, China.

Keywords: Parallel architectures

