Search results - Inner Mongolia University Library

High efficient memory race recording scheme for parallel program deterministic replay under multi-core architecture

Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2012, Vol. 49, No. 1, pp. 64-75

Authors: Liu, Lei; Huang, He; Tang, Zhimin. Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China; MIPS Technologies, Shanghai 210021, China

Current shared-memory multi-core and multiprocessor systems are nondeterministic. When these systems execute a multithreaded application, even if supplied with the same input, they can produce a different output each time. This nondeterminism frustrates debugging, limits the ability to properly test multithreaded code, and is becoming a major stumbling block to the much-needed widespread adoption of parallel programming. Support for deterministic replay of multithreaded execution is greatly helpful in finding concurrency bugs. A memory race recording scheme, named Rainbow, is proposed. Its core idea is to make inter-thread communications fully deterministic. The unique feature of Rainbow is that it precisely sets up happens-before relationships between conflicting memory operations among different threads. By using an effective Bloom-filter-based coherence history queue, Rainbow removes redundant happens-before relations implied in the already generated log and enables a compact log. Rainbow adds modest hardware to the base multi-core processor, and the coherence protocol is unmodified. The analysis results show that Rainbow reduces the log size by 17% of a state-of-the-art scheme, that its recording speed is similar to that of release consistency (RC) execution, and that it replays at about 93% of that speed. Determinism can be provided with little performance cost using our architecture proposals on state-of-the-art hardware, and the software-only approaches can be utilized on existing systems without problem.

Keywords: Parallel programming

A unified architecture for speed-binning and circuit failure prediction and detection

IEEE International Conference on Computer Science and Automation Engineering (CSAE)

Authors: Songwei Pei; Zhaolin Li; Huawei Li; Xiaowei Li; Shaojun Wei. Affiliations: Research Institute of Information Technology, Tsinghua University, Beijing, China; State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Institute of Microelectronics, Tsinghua University, Beijing, China

With the continual scaling of semiconductor process technology, circuit timing is increasingly impacted by process variations. It is thus important to categorize high-speed digital circuits into multiple bins of different performance. However, the speed-binning process typically needs a very long test application time. In this paper, we propose a unified architecture that can accomplish performance grading with high confidence and a short test application time. Moreover, the proposed architecture can be used for on-line circuit failure prediction and detection. Experimental results are presented to validate the proposed architecture.

Keywords: Delay; Clocks; Computer architecture; Circuit stability; Aging; Logic gates; System-on-a-chip

In-Field Testing of NAND Flash Storage: Why and How?

Asian Test Symposium (ATS)

Authors: Yu Hu; Xinli Gu; Xiaowei Li. Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, CAS, Beijing, China; North America Network Division, Huawei Technologies Company Limited, Santa Clara, USA

NAND Flash memories have rapidly emerged as a storage class memory in forms such as SSD (Solid State Disk), CF (Compact Flash) Card, and SD (Secure Digital Memory) Card. Due to its distinct operation mechanisms, NAND Flash memory suffers from erase/program endurance, data retention, and program/read disturbance problems. Specifically, erase and program operations keep developing bad blocks during the lifetime of memory chips. Bad blocks are blocks that contain faulty bits that the ECC (Error Correction Code) algorithm cannot correct. Although wear leveling tries to balance the erase/program operations on different blocks so that all blocks wear out at a similar pace, new bad blocks still inevitably *** propose an in-field testing technique which takes some pages in a block as predictors. Because they wear out faster than the other pages, the predictors will become bad before the other pages in the block become bad. The further questions are (1) how to detect those fast-wearing pages so as to use them as predictors, (2) how many predictors are needed to achieve a satisfactory prediction accuracy, and (3) what negative impact misprediction will have on performance and endurance.

Keywords: Flash memory; Error correction codes; Laboratories; Computer architecture; North America; Computers

Revisiting Multiple Pattern Matching Algorithms for Multi-Core Architecture

Journal of Computer Science & Technology, 2011, Vol. 26, No. 5, pp. 866-874

Authors: 谭光明 (Tan Guangming); 刘萍 (Liu Ping); 卜东波 (Bu Dongbo); 刘燕兵 (Liu Yanbing). Affiliations: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Key Laboratory of Network Technology, Institute of Computing Technology, Chinese Academy of Sciences

Due to the huge size of patterns to be searched, multiple pattern searching remains a challenge to several newly-arising applications like network intrusion *** this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core *** observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of *** multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is *** formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which takes advantage of dynamic programming and greedy algorithmic techniques, to solve the optimization *** results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.

Keywords: parallel algorithm; multi-core; multiple pattern matching

Multi-Layer Synthesis of Highly Structured Texture

Wuhan University Journal of Natural Sciences, 2012, Vol. 17, No. 4, pp. 302-308

Authors: TANG Li; YU Rongwei; GU Yuanting; YUAN Jie; DU Gong. Affiliations: School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, Jiangsu, China; State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, Hubei, China; Key Laboratory of Aerospace Information Security and Trust Computing, Ministry of Education, Wuhan 430072, Hubei, China; Institute of Software, Chinese Academy of Sciences, Beijing 100080, Beijing, China

In this paper, a novel concept of multilayer synthesis and a general framework for texture synthesis are presented. Within this framework, we first decompose the texture into a supposed pattern layer and material layer in the frequency domain using an E-texton extracting algorithm, then manipulate and extend each layer according to its own characteristics, and finally merge the newly synthesized pattern layer and material layer to generate the final output. Experimental results show that our method not only greatly improves the synthesis quality for cases that single-layer synthesis cannot handle well but also makes it possible to achieve various special synthesis effects.

Keywords: texture synthesis; texture layer; multilayer synthesis

Selective opening chosen ciphertext security directly from the DDH assumption

6th International Conference on Network and System Security, NSS 2012

Authors: Liu, Shengli; Zhang, Fangguo; Chen, Kefei. Affiliations: Dept. of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China; State Key Laboratory of Information Security, Institute of Software, Chinese Academy of Sciences, China; Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai, China

ISBN: (print) 9783642346002

Chosen-ciphertext security has been well accepted as a standard security notion for public key encryption. But in a multi-user setting, it may not be sufficient, since the adversary may corrupt some users to get the random coins as well as the plaintexts used to generate ciphertexts. This attack is named a "selective opening attack". We study how to achieve full-fledged chosen-ciphertext security in the selective opening setting directly from the DDH assumption. Our construction is free of chameleon hashing, since tags are created for encryptions in a flexible way to serve the security proof. © 2012 Springer-Verlag.

Keywords: Public key cryptography

A clustering-based scheme for concurrent trace in debugging NoC-based multicore systems

Design, Automation and Test in Europe Conference and Exhibition

Authors: Jianliang Gao; Jianxin Wang; Yinhe Han; Lei Zhang; Xiaowei Li. Affiliations: School of Information Science and Engineering, Central South University, China; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China

ISBN: (print) 9783981080186

Concurrent trace is an emerging challenge when debugging multicore systems. In concurrent trace, the trace buffer becomes a bottleneck since all trace sources try to access it simultaneously. In addition, the on-chip interconnection fabric for the distributed trace signals incurs an extremely high hardware cost. In this paper, we propose a clustering-based scheme which implements concurrent trace for debugging Network-on-Chip (NoC) based multicore systems. In the proposed scheme, a unified communication framework eliminates the requirement for an interconnection fabric that is only used during debugging. With the clustering scheme, multiple concurrent trace sources can access the distributed trace buffer via the NoC under a bandwidth constraint. We evaluate the proposed scheme using Booksim, and the results show its effectiveness.

Keywords: Multicore processing; Bandwidth; Fabrics; Debugging; Clustering algorithms; Real time systems; System-on-a-chip

A design of artificial high-rise building evacuation systems under fires

2012 IEEE International Conference on Service Operations and Logistics, and Informatics, SOLI 2012

Authors: Hu, Yu-Ling; Liu, Xi-Wei. Affiliations: School of Automation, Beijing Institute of Technology, Beijing 100081, China; School of Electric and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; Dongguan Research Institute of CASIA, Cloud Computing Center, Chinese Academy of Sciences, Songshan Lake, Dongguan, China

ISBN: (print) 9781467324007

Because the structure and function of a high-rise building are complex, the density of occupants is high, and rescue from outside is very difficult, safe and timely evacuation is an important issue in high-rise building fires. Worse, it is impossible to do experiments for economic, moral, or even legal reasons. In this paper, based on an ACP (artificial systems, computational experiments, parallel execution) approach, a design of an artificial evacuation system for computational experiments research on societies evacuation strategies is proposed. Based on the buildingEXODUS platform, agent technology is used to build the high-rise building architectures and occupant individuals who have autonomous activities. Using CFAST software, numerical calculation technology is used to build several kinds of fire scenarios. Through their interactions and influences, an artificial evacuation system can be built. © 2012 IEEE.

Keywords: Fires

International Conference on Computer Communications and Networks (ICCCN)

Authors: Yuehua Wang; Zhong Zhou; Ling Liu; Liang Cheng; Wei Wu. Affiliations: State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, China; Georgia Institute of Technology, College of Computing, Georgia Institute of Technology, USA; Department of Computer Science and Engineering, Lehigh University, USA

Group communication is essential for multi-user applications. However, due to unpredictable node departures and non-deterministic network partitions, providing reliable and scalable group communication services is challenging when the applications are used on a large scale by users with heterogeneous capacities. To address this challenge, we propose a novel replication scheme to achieve high reliability and low-cost scalability in group communication, with the following three features. First, it introduces a new concept of replication based on topological similarity, which gives each node the ability to measure the similarity between nodes in the topology. By eliminating the topological similarity between the replicas, it intelligently mitigates service interruptions caused by node failures and network partitions. Second, instead of specifying the number of replicas, it provides a technique for nodes to dynamically adapt the replication placement schemes by exploiting the functional importance of the nodes in the group-communication session. It eliminates the bottleneck problem and improves network resource utilization. Third, the scheme is self-converging and can stabilize within a few adaptations even when facing a high churn rate. Extensive simulations show that it yields significant reductions in replication overhead and service interruption compared to existing approaches.

Keywords: Random access memory; Topology; Maintenance engineering; Reliability theory; Computer network reliability

New Methodologies for Parallel Architecture

Journal of Computer Science & Technology, 2011, Vol. 26, No. 4, pp. 578-587

Authors: 范东睿 (Fan Dongrui); 李晓维 (Li Xiaowei); 李国杰 (Li Guojie). Affiliation: Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences

Moore's law continues to grant computer architects ever more transistors in the foreseeable future, and parallelism is the key to continued performance scaling in modern microprocessors. In this paper, the achievements of our research project on parallel architecture, which is supported by the National Basic Research 973 Program of China, are systematically presented. The innovative approaches and techniques to solve the significant problems in parallel architecture design are summarized, including architecture-level optimization, compiler and language-supported technologies, reliability, power-performance efficient design, test and verification challenges, and platform building. Two prototype chips, a multi-heavy-core Godson-3 and a many-light-core Godson-T, are described to demonstrate the highly scalable and reconfigurable parallel architecture designs. We also present some of our achievements appearing in ISCA, MICRO, ISSCC, HPCA, PLDI, PACT, IJCAI, Hot Chips, DATE, IEEE Trans. VLSI, IEEE Micro, IEEE Trans. Computers, etc.

Keywords: architecture; multi-core; many-core; parallelism
