Search results - Inner Mongolia University Library

Midwest Symposium on Circuits and Systems (MWSCAS)

Authors: Ge Zhang; Weiwu Hu; Institute of Computing Technology, Key Laboratory of Computer System and Architecture, Beijing, China

This paper presents a methodology for high-level power modeling of cell-based processors. A flexible power model library, which can automatically generate detailed power data for the actual circuits of each part of a given processor, is developed and annotated dynamically for an architecture-level power simulator. With this method, the dynamic power, leakage power, and even area and cell counts can be accurately estimated. A preliminary power validation on a MIPS microprocessor shows the methodology to be effective and highly correlated with gate-level power analysis, with only small errors.

Keywords: Microprocessors; Circuit simulation; Libraries; Adders; Circuit synthesis; Power system modeling; Analytical models; Content addressable storage; Power generation; Flexible printed circuits


HPPNetSim: A parallel simulation of large-scale interconnection networks


42nd Annual Simulation Symposium 2009, ANSS 2009, Part of the 2009 Spring Simulation Multiconference

Authors: Cao, Zheng; Xu, Jianwei; Chen, Mingyu; Zheng, Gui; Lv, Huiwei; Sun, Ninghui; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing, China; Graduate University of Chinese Academy of Sciences, Beijing, China

ISBN: (print) 9781617386404

As the scale of parallel machines grows, the communication network plays a more important role than ever before. Communication affects not only the execution time but also the scalability of parallel applications. A parallel interconnection network simulator is a suitable tool for studying large-scale interconnection networks. However, simulating packet-level communication on detailed cycle-to-cycle network models is really challenging work. We implement a kernel-based parallel simulator, HPPNetSim, to solve these problems. The optimistic PDES mechanism consumes a huge amount of memory to save simulation entities' states in large-scale simulations, so we chose the conservative synchronization approach. The simulation kernel and network models are all carefully designed. To accelerate the simulation, optimizations are introduced, such as block/unblock synchronization, load balancing, and dynamic look-ahead generation. Simulation examples and performance results show that both high accuracy and good performance are obtained in HPPNetSim. It achieves a speedup of 19.8 on 32 processing nodes when simulating a 36-port 3-tree fat-tree network.

Keywords: Discrete event simulation


A case study of improving at-speed testing coverage of a gigahertz microprocessor


2009 16th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2009

Authors: Qi, Zichu; Liu, Hui; Li, Xiangku; Xu, Jun; Hu, Weiwu; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, PO Box 2704-25, Beijing 100190, China; Beijing, China; Loongson Technologies Corporation Limited, China

ISBN: (print) 9781424450916

For a gigahertz microprocessor with multiple clock domains and a large number of embedded RAMs (Random Access Memories), generating at-speed testing patterns is becoming very difficult and time-consuming. This paper presents some novel techniques to improve at-speed testing coverage at low cost. These methods are mainly concerned with preventing the propagation of X states, and include avoiding the capture of X states in registers, sequential bypass of macros, a clock control scheme for inter-clock domains, and accurate analysis of exception paths in intra-clock domains. Functional patterns are utilized to further improve the efficiency of the at-speed testing. A novel optimal flow is presented by carefully selecting these techniques. By using the flow, 90% transition fault coverage is achieved. In addition, both the number of patterns and the test time of the transition test are decreased by 15%. The total area overhead is a few hundred AND cells and has little timing impact on the critical paths. © 2009 IEEE.

Keywords: Random access storage


HPPNetSim: A parallel simulation of large-scale interconnection networks


2009 Spring Simulation Multiconference, SpringSim 2009

Authors: Cao, Zheng; Xu, Jianwei; Chen, Mingyu; Gui, Zheng; Lv, Huiwei; Sun, Ninghui; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing, China; Graduate University of Chinese Academy of Sciences, Beijing, China

As the scale of parallel machines grows, the communication network plays a more important role than ever before. Communication affects not only the execution time but also the scalability of parallel applications. A parallel interconnection network simulator is a suitable tool for studying large-scale interconnection networks. However, simulating packet-level communication on detailed cycle-to-cycle network models is really challenging work. We implement a kernel-based parallel simulator, HPPNetSim, to solve these problems. The optimistic PDES mechanism consumes a huge amount of memory to save simulation entities' states in large-scale simulations, so we chose the conservative synchronization approach. The simulation kernel and network models are all carefully designed. To accelerate the simulation, optimizations are introduced, such as block/unblock synchronization, load balancing, and dynamic look-ahead generation. Simulation examples and performance results show that both high accuracy and good performance are obtained in HPPNetSim. It achieves a speedup of 19.8 on 32 processing nodes when simulating a 36-port 3-tree fat-tree network.

Keywords: Discrete event simulation


A New Post-Silicon Debug Approach Based on Suspect Window


IEEE VLSI Test Symposium (VTS)

Authors: Jianliang Gao; Yinhe Han; Xiaowei Li; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences

ISBN: (print) 9781424437696

Bugs are tending to be unavoidable in the design of complex integrated circuits. It is imperative to identify the bugs as soon as possible by post-silicon debug. The main challenge for post-silicon debug is the observability of the internal signals. This paper exploits the fact that it is not necessary to observe the error free states. Then we introduce "suspect window" and present a method for determining its boundary. Based on suspect window, we propose a debug approach to achieve high observability by reusing scan chain. Since scan dumps take place only in suspect window, debug time is greatly reduced. Experimental results demonstrate the effectiveness of the proposed approach.

Keywords: Post-silicon debug; Suspect window; Trace; Scan; Bug


Mining recent approximate frequent items in wireless sensor networks


6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009

Authors: Ren, Meirui; Guo, Longjiang; School of Computer Science and Technology, Heilongjiang University, Harbin, China; Data Base and Parallel Computing Key Laboratory of Heilongjiang Province, China; School of Computer Science and Technology, Harbin Institute of Technology, China

ISBN: (print) 9780769537351

Mining frequent items from sensory data is a major research problem in wireless sensor networks (WSNs), and it can be widely used in environmental monitoring. The conventional Lossy Counting algorithm can be applied to solve this problem in a centralized manner. However, a centralized algorithm causes severe data collisions in WSNs and leads to inaccurate mining results. In this paper, we present D-FIMA, a distributed frequent items mining algorithm. D-FIMA, running at every sensor node, establishes an items aggregation tree by forwarding the mining request beforehand, and each node maintains local approximate frequent items. The root of the aggregation tree outputs the final global approximate frequent items. Theoretical analysis and simulation results show that the energy consumption of D-FIMA is much lower than that of the centralized algorithm, and that the mining results of D-FIMA are more accurate. © 2009 IEEE.

Keywords: Sensor nodes
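The abstract above names the conventional Lossy Counting algorithm as its centralized baseline. As a rough illustration only, the following is a minimal sketch of the textbook single-stream Lossy Counting procedure; it is not D-FIMA and is not taken from the paper. The function names `lossy_count` and `frequent_items` and the parameters `epsilon` and `support` are assumptions chosen for this example.

```python
import math


def lossy_count(stream, epsilon):
    """Textbook Lossy Counting over one stream (illustrative, not D-FIMA).

    Returns (counts, n), where counts maps item -> [f, delta]:
    f is the count observed since the item was last inserted and
    delta is the maximum possible undercount at insertion time.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, bucket - 1]
        # At each bucket boundary, prune entries that cannot be frequent.
        if n % width == 0:
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for k in stale:
                del counts[k]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Items whose stored count reaches (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return {k: f for k, (f, d) in counts.items() if f >= threshold}


if __name__ == "__main__":
    # Hypothetical readings: two common values and twenty one-off values.
    readings = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
    table, total = lossy_count(readings, epsilon=0.1)
    print(frequent_items(table, total, support=0.2, epsilon=0.1))
```

In this sketch every reading passes through a single counter, which corresponds to the centralized setting the abstract contrasts with per-node local summaries merged along an aggregation tree.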


Three kinds of extraneous factors in Dixon resultants


Science China Mathematics, 2009, Vol. 52, No. 1, pp. 160-172

Authors: ZHAO ShiZhong; FU HongGuang; Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China; Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610041, China

Dixon resultant is a basic elimination method which has been used widely in the high technology fields of automatic control, robotics, etc. But how to remove extraneous factors in Dixon resultants has been a very difficult problem. In this paper, we discover some extraneous factors by expressing the Dixon resultant in a linear combination of original polynomial system. Furthermore, it has been proved that the factors mentioned above include three parts which come from Dixon derived polynomials, Dixon matrix and the resulting resultant expression by substituting Dixon derived polynomials respectively.

Keywords: Dixon resultant; Dixon matrix; extraneous factors; 00A06; 13A50; 13P99; 68W30


Optimizing inter-domain communication


15th International Conference on Parallel and Distributed Systems, ICPADS '09

Authors: Zang, Hongyong; Sun, Yuzhong; Gu, Kuiyan; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; Dongxin Geology Research Institute of Shengli Oilfield, Sinopec, Dongying, Shandong 257094, China

ISBN: (print) 9780769539003

Virtual machine technology has played an important role in data centers. Distributed services deployed in multiple virtual machines may reside on one physical machine. This situation requires an efficient inter-domain communication channel that ensures transparency and security. Although current inter-domain mechanisms perform much better than the traditional inter-domain path offered by the hypervisor, the size limitation of the shared data channel and an additional copy are still two constraints on higher performance. In this paper, we analyze these limitations, overcome them, and achieve a more efficient inter-domain communication channel. By applying a virtual address protection mechanism during channel bootstrap, we enlarge the maximum size of the shared data channel. By pointing the network packet structure to the buffer in the shared data channel, we avoid an extra data copy on the receiver VM side. In our evaluation using a number of standard benchmarks, we reduced the latency by nearly 40%, increased the throughput by approximately 45%, and cut more than 3500 CPU cycles per packet. © 2009 IEEE.

Keywords: Virtual machine


A highly efficient inter-domain communication channel


IEEE 9th International Conference on Computer and Information Technology, CIT 2009

Authors: Zang, Hongyong; Gu, Kuiyan; Li, Yaqiong; Sun, Yuzhong; Meng, Dan; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; Dongxin Geology Research Institute of Shengli Oilfield, Sinopec, Dongying, Shandong 257094, China

ISBN: (print) 9780769538365

With virtual machine technology, distributed services deployed in multiple cooperative virtual machines, such as multi-tier web services, may reside on one physical machine. This situation requires an efficient inter-domain communication channel; meanwhile, transparency and security should be guaranteed, because diverse existing distributed applications are already serving on many machines. In this paper, we implement a highly efficient inter-domain communication channel, called SChannel, with full transparency to both user applications and the network protocol stack, and with security between guest domains on the Xen platform. Between two co-resident domains, SChannel establishes a two-way shared memory channel of elastic size, which is set up using a static shared memory mechanism instead of high-cost dynamic shared memory. Furthermore, SChannel avoids one additional copy from the shared data channel on the receiver domain side. In our evaluation using a number of standard benchmarks, SChannel increases the throughput 5 times compared with the standard inter-domain mechanism offered by the hypervisor. Compared with other typical transparent inter-domain communication mechanisms, SChannel improves throughput by approximately 44.5% and saves more than 3500 CPU cycles per packet. © 2009 IEEE.

Keywords: Transparency


Software and Hardware Cooperate for 1-D FFT Algorithm Optimization on Multicore Processors


International Conference on Computer and Information Technology (CIT)

Authors: Yongbin Zhou; Junchao Zhang; Dongrui Fan; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Multicore architecture is becoming a promising way to keep up with Moore's Law and brings a revolution in both research and industry, which results in a new design space for software and architecture. The fast Fourier transform (FFT), which is both computing intensive and bandwidth intensive, is one of the most popular and important applications in the world. Compared with the computing resources on a multicore architecture, the on-chip memory resources are much more expensive because of the limitation of physical chip size. Efficient implementation of the FFT algorithm on multicore with good scalability is a challenge for both software and hardware developers. In this paper, supported by the Godson-T architecture, an optimized implementation of the 1-D FFT has been developed with matrix transpose concealment and computation/communication overlapping, which achieves more than 30% performance improvement as well as an almost 1/3 reduction in L2 cache consumption compared with the base six-step FFT. The limitation of scalability is also analyzed, and the conclusion is that on Godson-T, when frequent and simultaneous data accesses happen, the limited access bandwidth of the L2 cache is the bottleneck and results in longer on-chip network latency.

Keywords: Hardware; Software algorithms; Multicore processing; Computer architecture; Bandwidth; Scalability; Moore's Law; Aerospace industry; Computer industry; Software design

