Search Results - Inner Mongolia University Library

SimK: A Large-Scale Parallel Simulation Engine

Journal of Computer Science & Technology, 2009, Vol. 24, No. 6, pp. 1048-1060

Authors: 许建卫, 陈明宇, 郑规, 曹政, 吕慧伟, 孙凝晖. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Graduate University of Chinese Academy of Sciences

Simulation is an important method to evaluate future computer systems. Currently microprocessor architecture has switched to parallel, but almost all simulators remained at sequential stage, and the advantages brought by multi-core or many-core processors cannot be utilized. This paper presents a parallel simulator engine (SimK) towards the prevalent SMP/CMP platform, aiming at large-scale fine-grained computer system simulation. In this paper, highly efficient synchronization, communication and buffer management policies used in SimK are introduced, and a novel lock-free scheduling mechanism that avoids using any atomic instructions is presented. To deal with the load fluctuation at light load case, a cooperated dynamic task migration scheme is proposed. Based on SimK, we have developed large-scale parallel simulators HppSim and HppNetSim, which simulate a full supercomputer system and its interconnection network respectively. Results show that HppSim and HppNetSim both gain sound speedup with multiple processors, and the best normalized speedup reaches 14.95X on a two-way quad-core server.
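The abstract does not say how SimK's lock-free scheduling works. As a loose illustration of how data can pass between two threads without atomic read-modify-write instructions, here is a minimal sketch of a Lamport-style single-producer/single-consumer ring buffer. This is a well-known technique chosen for illustration, not the mechanism from the paper, and the class name `SpscRing` and its methods are hypothetical.

```python
class SpscRing:
    """Single-producer/single-consumer ring buffer (Lamport-style sketch).

    Each index has exactly one writer: the producer advances `_tail`,
    the consumer advances `_head`. No compare-and-swap is needed.
    """

    def __init__(self, capacity):
        # One slot is left unused so that "full" and "empty" can be told apart.
        self._buf = [None] * (capacity + 1)
        self._head = 0  # written only by the consumer
        self._tail = 0  # written only by the producer

    def push(self, item):
        """Producer side: return False if the ring is full."""
        nxt = (self._tail + 1) % len(self._buf)
        if nxt == self._head:
            return False
        self._buf[self._tail] = item
        self._tail = nxt  # publish the slot only after it has been filled
        return True

    def pop(self):
        """Consumer side: return None if the ring is empty."""
        if self._head == self._tail:
            return None
        item = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        return item
```

In a C or C++ version on weakly ordered hardware, the index stores would additionally need release/acquire ordering. The Python sketch shows only the ownership discipline for the two indices.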

Keywords: large scale system simulation fine-grained synchronization simulation framework lock-free synchronization

Revisiting Multiple Pattern Matching Algorithms for Multi-Core Architecture

Journal of Computer Science & Technology, 2011, Vol. 26, No. 5, pp. 866-874

Authors: 谭光明, 刘萍, 卜东波, 刘燕兵. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Key Laboratory of Network Technology, Institute of Computing Technology, Chinese Academy of Sciences

Due to the huge size of patterns to be searched, multiple pattern searching remains a challenge to several newly-arising applications like network intrusion *** this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core *** observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of *** multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is *** formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization *** results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.

Keywords: parallel algorithm multi-core multiple pattern matching

TRainbow: a new trusted virtual machine based platform

Frontiers of Computer Science in China, 2010, Vol. 4, No. 1, pp. 47-64

Authors: Yuzhong SUN, Yongbing HUANG, Yunwei GAO, Haifeng FANG, Ying SONG, Lei DU, Kai ZHANG, Hongyong ZANG, Yaqiong LI, Yajun YANG, Ran AO. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Graduate University of Chinese Academy of Sciences, Beijing 100190, China; Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China

Currently, with the evolution of virtualization technology, the cloud computing mode has become more and more popular. However, people are still concerned about the runtime integrity and data security of cloud computing platforms, as well as the service efficiency on such computing platforms. At the same time, to our knowledge, the design theory of the trusted virtual computing environment and its core system software for such network-based computing platforms is at the exploratory stage. In this paper, we believe that efficiency and isolation are the two key properties of the trusted virtual computing environment. To guarantee these two properties, based on the design principle of splitting, customizing, reconstructing, and isolation-based enhancing to the platform, we introduce TRainbow, a novel trusted virtual computing platform developed by our research *** the two creative mechanisms, that is, capacity flowing amongst VMs and VM-based kernel reconstructing, TRainbow provides great improvements (up to 42%) in service performance and isolated reliable computing environment for Internet-oriented, large-scale, concurrent services.

Keywords: Computing platform virtual machine capacity service computing trust chain isolation

New Methodologies for Parallel Architecture

Journal of Computer Science & Technology, 2011, Vol. 26, No. 4, pp. 578-587

Authors: 范东睿, 李晓维, 李国杰. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences

Moore's law continues to grant computer architects ever more transistors in the foreseeable future, and parallelism is the key to continued performance scaling in modern microprocessors. In this paper, the achievements in our research project, which is supported by the National Basic Research 973 Program of China, on parallel architecture, are systematically presented. The innovative approaches and techniques to solve the significant problems in parallel architecture design are summarized, including architecture level optimization, compiler and language-supported technologies, reliability, power-performance efficient design, test and verification challenges, and platform building. Two prototype chips, a multi-heavy-core Godson-3 and a many-light-core Godson-T, are described to demonstrate the highly scalable and reconfigurable parallel architecture designs. We also present some of our achievements appearing in ISCA, MICRO, ISSCC, HPCA, PLDI, PACT, IJCAI, Hot Chips, DATE, IEEE Trans. VLSI, IEEE Micro, IEEE Trans. Computers, etc.

Keywords: architecture multi-core many-core parallelism

Deterministic Circular Self Test Path

Tsinghua Science and Technology, 2007, Vol. 12, No. S1, pp. 20-25

Authors: 文科, 胡瑜, 李晓维. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences

Circular self test path (CSTP) is an attractive technique for testing digital integrated circuits (IC) in the nanometer era, because it can easily provide at-speed test with small test data volume and short test application time. However, CSTP cannot reliably attain high fault coverage because of difficulty of testing random-pattern-resistant faults. This paper presents a deterministic CSTP (DCSTP) structure that consists of a DCSTP chain and jumping logic, to attain high fault coverage with low area overhead. Experimental results on ISCAS’89 benchmarks show that 100% fault coverage can be obtained with low area overhead and CPU time, especially for large circuits.

Keywords: very large scale integration (VLSI) test built-in-self-test (BIST) circular self test path deterministic

Green challenges to system software in data centers

Frontiers of Computer Science in China, 2011, Vol. 5, No. 3, pp. 353-368

Authors: Yuzhong SUN, Yiqiang ZHAO, Ying SONG, Yajun YANG, Haifeng FANG, Hongyong ZANG, Yaqiong LI, Yunwei GAO. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Graduate University of Chinese Academy of Sciences, Beijing 100190, China

With the increasing demand and the wide application of high performance commodity multi-core processors, both the quantity and scale of data centers grow dramatically and they bring heavy energy *** and engineers have applied much effort to reducing hardware energy consumption, but software is the true consumer of power and another key in making better use of *** software is critical to better energy utilization, because it is not only the manager of hardware but also the bridge and platform between applications and *** this paper, we summarize some trends that can affect the efficiency of data ***, we investigate the causes of software *** on these studies, major technical challenges and corresponding possible solutions to attain green system software in programmability, scalability, efficiency and software architecture are ***, some of our research progress on trusted energy efficient system software is briefly introduced.

Keywords: green software multi-core data center power efficient system software

GenerOS: An asymmetric operating system kernel for multi-core systems

24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010

Authors: Yuan, Qingbo; Zhao, Jianbo; Chen, Mingyu; Sun, Ninghui. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

ISBN: (print) 9781424464432

Due to complex abstractions implemented over shared data structures protected by locks, it is hard for a conventional symmetric multithreaded operating system kernel such as Linux to achieve high scalability on the emerging multi-core architectures, which integrate more and more cores on a single die. This paper presents GenerOS - a general asymmetric operating system kernel for multi-core systems. In principle, GenerOS partitions processing cores into application core, kernel core and interrupt core, each of which is dedicated to a specified function. In implementation, we conduct a delicate modification to Linux kernel and provide the same interface as Linux kernel so that GenerOS is compatible with legacy applications. The better performance of GenerOS mainly benefits from: (1) Applications run on their own cores with minimal interrupt and kernel support; (2) Every kernel service is encapsulated into a serial process so that there will be fewer contentions than conventional symmetric kernel; (3) A slim schedule policy is used in the kernel core to support schedule between system calls with low overhead. Experiments with two typical workloads on 16-core AMD machine show that GenerOS behaves better than original Linux kernel when there are more processing cores (19.6% for TPCH using Oracle database management system and 42.8% for httperf using Apache web server). © 2010 IEEE.

Keywords: Linux

Design-for-Testability Features and Test Implementation of a Giga Hertz General Purpose Microprocessor

Journal of Computer Science & Technology, 2008, Vol. 23, No. 6, pp. 1037-1046

Authors: 王达, 胡瑜, 李华伟, 李晓维. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences

This paper describes the design-for-testability (DFT) features and low-cost testing solutions of a general purpose microprocessor. The optimized DFT features are presented in detail. A hybrid scan compression structure was implemented and achieved a compression ratio of more than ten times. Memory built-in self-test (BIST) circuitries were designed with scan collars instead of bitmaps to reduce area overheads and to improve test and debug efficiency. The implemented DFT framework also utilized internal phase-locked loops (PLL) to provide complex at-speed test clock sequences. Since there are still limitations in this DFT design, the test strategies for this case are quite complex, with complicated automatic test pattern generation (ATPG) and debugging flow. The sample testing results are given in the paper. All the DFT methods discussed in the paper are prototypes for a high-volume manufacturing (HVM) DFT plan to meet high quality test goals as well as low test power consumption and cost.

Keywords: microprocessor design-for-testability test generation built-in self-test at-speed testing

Selected Crosstalk Avoidance Code for Reliable Network-on-Chip

Journal of Computer Science & Technology, 2009, Vol. 24, No. 6, pp. 1074-1085

Authors: 张颖, 李华伟, 李晓维. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences

With the shrink of the technology into nanometer scale, network-on-chip (NOC) has become a reasonable solution for connecting plenty of IP blocks on a single chip. But it suffers from both crosstalk effects and single event upset (SEU), especially crosstalk-induced delay, which may constrain the overall performance of NOC. In this paper, we introduce a reliable NOC design using a code with the capability of both crosstalk avoidance and single error correction. Such a code, named selected crosstalk avoidance code (SCAC) in our previous work, joins crosstalk avoidance code (CAC) and error correction code (ECC) together through codeword selection from an original CAC codeword set. It can handle possible error caused by either crosstalk effects or SEU. When designing a reliable NOC, data are encoded to SCAC codewords and can be transmitted rapidly and reliably across NOC. Experimental results show that the NOC design with SCAC achieves higher performance and is reliable to tolerate single errors. Compared with previous crosstalk avoidance methods, SCAC reduces wire overhead, power dissipation and the total delay. When SCAC is used in NOC, it can save 20% area overhead and reduce 49% power dissipation.
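The idea of selecting codewords from a CAC set so that the result also corrects single errors can be sketched roughly as follows. This sketch rests on two assumptions that the abstract does not confirm: the base CAC is a forbidden-pattern-free code (no `010` or `101` inside a codeword), and selection is a simple greedy pass that keeps codewords at Hamming distance 3 or more from those already chosen. All function names are hypothetical, and the actual SCAC construction may differ.

```python
from itertools import product


def is_fpf(word):
    """True if the bit tuple contains neither 010 nor 101 as a substring."""
    return all(
        not (word[i] != word[i + 1] and word[i + 1] != word[i + 2])
        for i in range(len(word) - 2)
    )


def hamming(a, b):
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def fpf_codewords(width):
    """All forbidden-pattern-free words of the given width (assumed base CAC)."""
    return [w for w in product((0, 1), repeat=width) if is_fpf(w)]


def select_codewords(candidates, min_distance=3):
    """Greedily keep words that are at least min_distance from every kept word.

    A minimum distance of 3 is the standard condition for correcting one error.
    Greedy selection is an illustrative choice and is not guaranteed optimal.
    """
    chosen = []
    for w in candidates:
        if all(hamming(w, c) >= min_distance for c in chosen):
            chosen.append(w)
    return chosen
```

For example, `select_codewords(fpf_codewords(6))` returns a subset of the 6-bit forbidden-pattern-free words in which every pair differs in at least three positions.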

Keywords: crosstalk avoidance codeword selection reliable network-on-chip single event upset

Landing Stencil Code on Godson-T

Journal of Computer Science & Technology, 2010, Vol. 25, No. 4, pp. 886-894

Authors: 崔慧敏, 王蕾, 范东睿, 冯晓兵. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences

The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers are faced with a unique opportunity to introduce new architecture features as well as adequate compiler technology -- together they may have profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amendable performance optimization techniques on a many-core architecture Godson-T. Godson-T architecture has several unique features that are chosen for this study: 1) chip-level global addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory based synchronization (e.g., full-empty bit for fine-grain synchronization). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution time speedup and scalability, validates the value of globally accessed SPM and fine-grain synchronization mechanism (full-empty bits) under the Godson-T, and provides some useful guidelines for future compiler technology of many-core chip architectures.
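For readers unfamiliar with the kernel used in this case study, a minimal sequential sketch of a 1-D Jacobi stencil follows. It assumes the common three-point averaging form with fixed boundary values. The abstract does not give the exact stencil, and the function name `jacobi_1d` is hypothetical. The tiling and synchronization optimizations the paper studies are not shown.

```python
def jacobi_1d(values, steps):
    """Run `steps` Jacobi sweeps over a 1-D grid with fixed end points.

    Each interior point is replaced by the average of itself and its two
    neighbours, always reading from the previous sweep's array.
    """
    cur = list(values)
    for _ in range(steps):
        nxt = cur[:]  # boundary cells are copied unchanged
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur
```

Because every sweep reads only the previous sweep's values, points within a sweep can be updated independently. This is the property that parallel and tiled versions of the stencil exploit.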

Keywords: many-core, stencil, Jacobi, compiler SPM, fine-grain synchronization
