Search Results - Inner Mongolia University Library

2008 Asia and South Pacific Design Automation Conference, ASP-DAC

Authors: Hu, Yu; Fu, Xiang; Fan, Xiaoxin; Fujiwara, Hideo. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, CAS, Beijing 100080, China; 8916-5 Takayama, Ikoma, Nara 630-0192, Japan

ISBN: (print) 9781424419227

Conventional random access scan (RAS) designs, although economical in test power dissipation, test application time and test data volume, are expensive in area and routing overhead. In this paper, we present a localized RAS architecture (LRAS) to address this issue. A novel scan cell structure, which has fewer transistors than the multiplexer-type scan cell, is proposed to eliminate the global test enable signal and to localize the row enable and the column enable signals. Experimental results on ISCAS'89 and ITC'99 benchmark circuits demonstrate that LRAS has 54% less area overhead than multiplexer-type scan chain based designs, while significantly outperforming the state-of-the-art RAS scheme in routing overhead. ©2008 IEEE.

Keywords: Multiplexing equipment

Exploiting idle register classes for fast spill destination

22nd ACM International Conference on Supercomputing, ICS'08

Authors: Lu, Fang; Wang, Lei; Feng, Xiaobing; Li, Zhiyuan; Zhang, Zhaoqing. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100191, China; Department of Computer Sciences, Purdue University, West Lafayette, IN 47906, United States

ISBN: (print) 9781605581583

On today's microprocessors, there often exist several different types of registers, e.g. general purpose registers and floating point registers. A given program may use one type of register much more frequently than other types. This creates an opportunity to employ the infrequently used registers as spill destinations for the more frequently used register types. In this paper, we present a code optimization method named idle register exploitation (IRE) to exploit such opportunities. We developed a model, called the IRE model, or IREM, to determine the static performance gains of IRE versus spilling to the stack. On a microprocessor with fast data paths between different types of registers, we find that the IRE method speeds up the execution of the SPECint benchmark suite by 1.7% to 10%. In contrast, on microprocessors with less efficient data transfer paths, the performance gain is limited. In some cases, performance may even degrade. This result argues strongly for the adoption of fast data paths between different types of registers for the purpose of reducing register spills, which is important in view of the increased significance of memory bottlenecks on future microprocessors. Copyright 2008 ACM.

Keywords: Data transfer

A study and implementation of the Huffman algorithm based on condensed Huffman table

International Conference on Computer Science and Software Engineering, CSSE 2008

Authors: Bao, Ergude; Li, Weisheng; Fan, Dongrui; Ma, Xiaoyu. Affiliations: School of Software, Beijing Jiaotong University; Key Laboratory of Computer System and Architecture, Institute of Computer Technology, Chinese Academy of Sciences

ISBN: (print) 9780769533360

Huffman codes are widely used as a very efficient technique for compressing data. To achieve a high compression ratio, some properties of encoding and decoding for the canonical Huffman table are discussed. An implementation of the Huffman algorithm based on a condensed Huffman table is studied. The new condensed Huffman table reduces the cost of the Huffman coding table. Compared with the traditional Huffman coding table and other improved tables, the main advantage of the new condensed Huffman table is that the space requirement is reduced significantly. © 2008 IEEE.

Keywords: Codes (symbols)
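
As a rough illustration of the canonical Huffman table mentioned in the abstract above, the sketch below assigns canonical codes from code lengths alone, which is why such a table only needs to store lengths. The function name and the sample code lengths are assumptions made for illustration; this is the standard canonical assignment, not the condensed table proposed in the paper.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes from a {symbol: code_length} mapping.

    Hypothetical helper: the lengths are assumed to come from an ordinary
    Huffman construction, so they describe a valid prefix code.
    """
    if not lengths:
        return {}
    # Sort by (length, symbol) so codes of equal length are consecutive.
    ordered = sorted(lengths.items(), key=lambda kv: (kv[1], kv[0]))
    codes = {}
    code = 0
    prev_len = ordered[0][1]
    for symbol, length in ordered:
        # Moving to a longer length appends zero bits to the running code.
        code <<= (length - prev_len)
        codes[symbol] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


# Assumed example lengths for four symbols.
table = canonical_codes({"a": 1, "b": 2, "c": 3, "d": 3})
```

With these assumed lengths, `table` maps `a`, `b`, `c`, `d` to `0`, `10`, `110`, `111`; a decoder can rebuild the same codes from the lengths without storing the bit patterns.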

Fetching primary and redundant instructions in turn for a fault-tolerant embedded microprocessor

14th IEEE Pacific Rim International Symposium on Dependable Computing, PRDC 2008

Authors: Zhang, Shijian; Hu, Weiwu. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; Graduate School of the Chinese Academy of Sciences, Beijing 100039, China

ISBN: (print) 9780769534480

With the development of semiconductor technology, microprocessors become more and more susceptible to transient faults. Some proposed schemes support redundant execution of a program in a superscalar processor for fault tolerance. However, they require a huge queue to accommodate interim states, which increases the hardware cost significantly. This paper analyzes the effect of halving a processor's instruction fetch bandwidth on a program's performance. We find that the performance degradation resulting from halving the instruction fetch bandwidth declines when instruction latency is lengthened, branch prediction accuracy deteriorates or the cache miss rate increases. Since an embedded microprocessor is characterized by long instruction latency, a high branch misprediction rate and a high cache miss rate, a fault-tolerant scheme is proposed in which two threads fetch instructions in turn and execute in the same processor core simultaneously without any extra queue. The simulation results from eight embedded applications show that the performance penalty of our solution ranges from 6.5% to 30.1%, with an average of 22.5%, which is lower than that of the other proposed schemes. The experiment also indicates that our scheme can effectively detect faults occurring in the entire pipeline with short fault detection latency and minimal hardware cost. Our solution is well suited to realizing a reliable embedded microprocessor. © 2008 IEEE.

Keywords: Fault detection

Personalized multimedia web summarizer for tourist

17th International Conference on World Wide Web 2008, WWW'08

Authors: Wu, Xiao; Li, Jintao; Zhang, Yongdong; Tang, Sheng; Neo, Shi-Yong. Affiliations: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China; Department of Computer Science, National University of Singapore, Singapore

ISBN: (print) 9781605580852

In this paper, we highlight the use of multimedia technology in generating intrinsic summaries of tourism-related information. The system utilizes an automated process to gather, filter and classify information on various tourist spots on the Web. The end result presented to the user is a personalized multimedia summary, generated with respect to the user's queries, that combines text, image, video and real-time news and is retrievable on mobile devices. Preliminary experiments demonstrate the superiority of our presentation scheme to traditional methods.

Keywords: Automation

Robust appearance-based method for head pose estimation

Ruan Jian Xue Bao/Journal of Software, 2009, Vol. 20, No. 6, pp. 1651-1663

Authors: Ma, Bing-Peng; Shan, Shi-Guang; Chen, Xi-Lin; Gao, Wen. Affiliations: Key Laboratory of Intelligent Information Processing, Chinese Acad. of Sci., Beijing 100190, China; Institute of Computing Technology, Chinese Acad. of Sci., Beijing 100190, China; Graduate University, Chinese Acad. of Sci., Beijing 100049, China; School of Electronic Engineering and Computer Science, Peking University, Beijing 100871, China

This paper proposes a new pose estimation method based on the appearance of 2D head images. First, 1D Gabor filters are used to extract features from the raw images. Compared with the traditional 2D Gabor representations, the 1D Gabor representations are more closely related to the head pose, and their advantages in computation and storage are obvious. Second, a new method, named kernel local Fisher discriminant analysis, is applied to the extracted features to eliminate the multimodal problem while enhancing the discrimination ability. Experimental results show that the proposed method is effective for pose estimation. The generalizability of the proposed method is illustrated by its strong performance when the training dataset and the testing dataset are heterogeneous. © Institute of Software, the Chinese Academy of Sciences. All rights reserved.

Keywords: Gabor filters

Testing content addressable memories using instructions and march-like algorithms

15th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2008

Authors: Lin, Ma; Yunji, Chen; Menghao, Su; Zichu, Qi; Heng, Zhang; Weiwu, Hu. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, P.O. Box 2704-25, Beijing 100080, China; Graduate University, Chinese Academy of Science, Beijing 100049, China

ISBN: (print) 9781424421824

CAM is widely used in microprocessors and SoC TLB modules. It offers a great advantage for software development, and TLB operations become a bottleneck of microprocessor performance. The test cost of the normal BIST approach for the CAM cannot be ignored. The paper analyzes the fault models of the CAM and proposes a march-like algorithm suitable for instruction-level testing. The algorithm requires 14N+2L operations, where N is the number of words of the CAM and L is the width of a word. The algorithm covers 100% of the targeted faults. Instruction-level testing using the algorithm incurs no test cost in area or performance. Moreover, the algorithm can be used in BIST approaches with less performance loss for microprocessors. The paper implements the algorithm in a MIPS-compatible microprocessor with good results. © 2008 IEEE.

Keywords: System-on-chip
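
To make the 14N+2L operation count from the abstract above concrete, the small sketch below evaluates it for a given CAM size. The function name and the 64-word, 32-bit example are assumptions chosen for illustration; the abstract does not state which CAM dimensions were tested.

```python
def march_like_operations(num_words, word_width):
    """Operation count 14N + 2L quoted in the abstract.

    num_words  -- N, the number of words in the CAM
    word_width -- L, the width of one word in bits
    """
    if num_words < 0 or word_width < 0:
        raise ValueError("CAM dimensions must be non-negative")
    return 14 * num_words + 2 * word_width


# Assumed example: a 64-entry CAM with 32-bit words.
ops = march_like_operations(64, 32)  # 14*64 + 2*32 = 960
```

The count grows linearly in both dimensions, with the number of words dominating for typical TLB-sized arrays.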

Verification algorithm of a sequential circuit's equivalence based on state transfer graph

2nd International Symposium on Information Technologies and Applications in Education, ISITAE 2008

Authors: Li, Wei; Yang, Desheng; Lu, Ying; Zhang, Yichao. Affiliations: Institute of Computer Science and Technology, Anhui University, Hefei, China; Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei, China

ISBN: (print) 9780863419133

An equivalence verification algorithm for sequential circuits [1] based on the state transfer graph (STG) is presented in this paper. It establishes the equivalence of two sequential circuits by verifying the isomorphism of their corresponding state transfer graphs. The verification includes two steps: first, find all state pairs of the vertices to be verified; second, verify the equivalence of the state pairs. If all the state pairs can be matched as equal state pairs, we can conclude that the corresponding circuits have the same sequential behavior. The algorithm is mainly evaluated on ISCAS85 circuits and some simple state transfer graphs, and the experimental data show that it obtains better results than BDD (Binary Decision Diagram) and STE (Symbolic Trajectory Evaluation) methods. © 2008 IET.

Keywords: Timing circuits
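
The two steps described in the abstract above (collect state pairs, then check each pair) can be loosely sketched as a breadth-first walk over reachable state pairs. This is a generic product-machine check under stated assumptions, not the paper's algorithm: the machines are assumed to be deterministic and complete Mealy machines, and the dictionary encoding and all names are hypothetical.

```python
from collections import deque


def equivalent(m1, start1, m2, start2, inputs):
    """Check that two machines produce the same outputs for every input sequence.

    m1 and m2 map state -> {input: (next_state, output)}.
    """
    seen = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        a, b = queue.popleft()
        for x in inputs:
            next_a, out_a = m1[a][x]
            next_b, out_b = m2[b][x]
            if out_a != out_b:
                return False  # this state pair is distinguishable
            pair = (next_a, next_b)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True  # every reachable state pair agreed on all outputs


# Two assumed state transfer graphs that differ only in state names.
g1 = {"A": {0: ("A", 0), 1: ("B", 1)}, "B": {0: ("A", 1), 1: ("B", 0)}}
g2 = {"P": {0: ("P", 0), 1: ("Q", 1)}, "Q": {0: ("P", 1), 1: ("Q", 0)}}
same = equivalent(g1, "A", g2, "P", [0, 1])
```

Here `same` is `True` because the two graphs are isomorphic; changing any single output in `g2` makes the check fail at the first disagreeing state pair.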

An interconnect-aware power efficient cache coherence protocol for CMPs

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Hongbo Zeng; Jun Wang; Ge Zhang; Weiwu Hu. Affiliation: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

The continued shrinking of technology enables more and more processor cores to reside on a single chip. However, the power consumption and delay of global wires present a great challenge in designing future chip multiprocessors (CMPs). With these wire overheads properly accounted for, researchers have explored some efficient on-chip network designs in the domain of larger scale caches. In this paper, we attempt to reduce the interconnect power consumption with a novel cache coherence protocol. Conventional coherence protocols are kept independent from the underlying networks for flexibility reasons. But in CMPs, processor cores and the on-chip network are tightly integrated. Exposing features of interconnect networks to protocols will unveil some optimization opportunities for power reduction. Specifically, by utilizing the location information of cores on a chip, the coherence protocol we propose in this work responds to the requester with the data copy in the closest sharer of the desired cache line, rather than fetching it from distant L2 cache banks. This mechanism reduces the number of hops cache lines must travel and eliminates the power that would have been consumed on the corresponding untraveled links. To get accurate and detailed power information for the interconnects, we extract wire power parameters by physical level simulation (HSPICE) and obtain router power by synthesizing RTL with actual ASIC libraries. We conduct experiments on a 16-core CMP simulator with a group of SPLASH2 benchmarks. The results demonstrate that an average of 16.3% of L2 cache accesses could be optimized, resulting in an average 9.3% power reduction of data links, with a maximum of 19.2%. This mechanism also yields a performance speedup of 1.4%.

Keywords: Protocols; Energy consumption; Wires; Network-on-a-chip; Power system interconnection; Delay; Large-scale systems; Switches; Acceleration; Laboratories
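
The closest-sharer idea in the abstract above can be sketched as a simple selection by hop count. The sketch assumes a 2D mesh where distance is the Manhattan distance between tile coordinates and where the L2 home bank is used only when no sharer exists; the abstract does not specify the topology or the fallback rule, and all names here are hypothetical.

```python
def hops(a, b):
    """Manhattan distance between two (x, y) tiles on an assumed 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def pick_data_source(requester, sharers, l2_home):
    """Return the tile that should supply the requested cache line."""
    if not sharers:
        return l2_home  # no on-chip copy: fall back to the L2 home bank
    # The tuple key breaks distance ties deterministically by coordinates.
    return min(sharers, key=lambda tile: (hops(requester, tile), tile))


# Assumed 4x4 layout: requester at (0, 0), L2 home bank at (3, 0).
source = pick_data_source((0, 0), [(3, 3), (1, 0), (0, 2)], (3, 0))
```

In this assumed layout `source` is `(1, 0)`, one hop from the requester, instead of the home bank three hops away; fewer traversed links is where the claimed power saving would come from.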

Location Consistency Model Revisited: Problem, Solution and Prospects

IEEE International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Authors: Guoping Long; Nan Yuan; Dongrui Fan. Affiliation: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Location consistency (LC) is a weak memory consistency model which is defined entirely on partial order execution semantics of parallel programs. Compared with sequential consistency (SC), LC is scalable and provides ample theoretical parallelism. This makes LC an interesting memory model in the upcoming many-core parallel processing era. Previous work has pointed out that LC does not guarantee SC execution behavior for all data race free programs. In this paper, we compare the semantics of LC with PRAM consistency and memory coherence, and prove that LC is strictly weaker than PRAM consistency. For data race free programs, we prove that the semantics of LC is equivalent to memory coherence. In addition, by introducing memory ordering semantics into LC judiciously, we prove that the enhanced model is equivalent to SC for data race free programs. Finally, we discuss possible solutions for adding reasoning rules for LC-like weak memory models.

Keywords: Coherence; Phase change random access memory; Distributed computing; Parallel processing; Programming profession; Application software; Laboratories; Concurrent computing; Computer architecture; Counting circuits
