Because of the huge size of the pattern set to be searched, multiple pattern searching remains a challenge for several newly arising applications such as network intrusion detection. In this paper, we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures. We observe an important feature: the multiple pattern matching time depends mainly on the number and minimal length of the patterns. The multi-core algorithm proposed in this paper leverages this feature to decompose the pattern set so that the parallel execution time is minimized. We formulate the problem as an optimal decomposition and scheduling of a pattern set, then propose a heuristic algorithm, which combines dynamic programming and greedy techniques, to solve the optimization problem. Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.
With the increasing demand for and wide application of high-performance commodity multi-core processors, both the number and the scale of data centers are growing dramatically, bringing heavy energy consumption. Researchers and engineers have put much effort into reducing hardware energy consumption, but software is the true consumer of power and another key to making better use of energy. System software is critical to better energy utilization, because it is not only the manager of hardware but also the bridge and platform between applications and hardware. In this paper, we summarize some trends that can affect the efficiency of data centers. Then, we investigate the causes of software inefficiency. Based on these studies, major technical challenges and corresponding possible solutions for attaining green system software in programmability, scalability, efficiency and software architecture are discussed. Finally, some of our research progress on trusted energy-efficient system software is briefly introduced.
This paper describes the design-for-testability (DFT) features and low-cost testing solutions of a general-purpose microprocessor. The optimized DFT features are presented in detail. A hybrid scan compression structure was implemented and achieved a compression ratio of more than ten. Memory built-in self-test (BIST) circuits were designed with scan collars instead of bitmaps to reduce area overhead and to improve test and debug efficiency. The implemented DFT framework also uses internal phase-locked loops (PLLs) to provide complex at-speed test clock sequences. Since there are still limitations in this DFT design, the test strategies for this case are quite complex, with complicated automatic test pattern generation (ATPG) and debugging flows. Sample testing results are given in the paper. All the DFT methods discussed in the paper are prototypes for a high-volume manufacturing (HVM) DFT plan intended to meet high-quality test goals as well as low test power consumption and cost.
As technology shrinks to the nanometer scale, network-on-chip (NOC) has become a reasonable solution for connecting many IP blocks on a single chip. However, it suffers from both crosstalk effects and single event upsets (SEUs), especially crosstalk-induced delay, which may constrain the overall performance of the NOC. In this paper, we introduce a reliable NOC design using a code capable of both crosstalk avoidance and single error correction. This code, named selected crosstalk avoidance code (SCAC) in our previous work, combines crosstalk avoidance code (CAC) and error correction code (ECC) through codeword selection from an original CAC codeword set. It can handle errors caused by either crosstalk effects or SEUs. In the reliable NOC design, data are encoded into SCAC codewords and can be transmitted rapidly and reliably across the NOC. Experimental results show that the NOC design with SCAC achieves higher performance and reliably tolerates single errors. Compared with previous crosstalk avoidance methods, SCAC reduces wire overhead, power dissipation and total delay. When SCAC is used in an NOC, it can save 20% of the area overhead and reduce power dissipation by 49%.
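The idea of selecting codewords from a CAC set so that the result also corrects single errors can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the base CAC is taken to be a forbidden-pattern-free code (no `010` or `101` on adjacent wires), and the selection is a simple greedy pass that keeps codewords at pairwise Hamming distance of at least 3. The function names are hypothetical, and the actual SCAC selection rule may differ.

```python
from itertools import product


def is_fpf(word):
    """True if the word contains neither forbidden pattern 010 nor 101."""
    bits = "".join(str(b) for b in word)
    return "010" not in bits and "101" not in bits


def hamming(a, b):
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def select_codewords(width, min_distance=3):
    """Greedily pick forbidden-pattern-free words that stay min_distance apart.

    Assumption: a greedy scan in lexicographic order; this is an illustration,
    not the selection procedure of the SCAC paper.
    """
    chosen = []
    for word in product((0, 1), repeat=width):
        if is_fpf(word) and all(hamming(word, c) >= min_distance for c in chosen):
            chosen.append(word)
    return chosen


if __name__ == "__main__":
    for codeword in select_codewords(6):
        print("".join(str(b) for b in codeword))
```

A minimum distance of 3 is the standard condition for correcting any single bit error, which is why the sketch uses it as the default threshold.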
The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers have a unique opportunity to introduce new architecture features together with adequate compiler technology; combined, they may have a profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amenable performance optimization techniques on the many-core architecture Godson-T. The Godson-T architecture has several unique features that were chosen for this study: 1) chip-level globally addressable memory, in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory-based synchronization (e.g., full-empty bits). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implemented a number of many-core-based optimizations for Godson-T. Our experimental study shows good performance in both execution time speedup and scalability, validates the value of globally accessed SPM and the fine-grain synchronization mechanism (full-empty bits) on Godson-T, and provides some useful guidelines for future compiler technology for many-core chip architectures.
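For readers unfamiliar with the benchmark, the 1-D Jacobi computation can be sketched as follows. This is a plain sequential illustration under assumed details (a 3-point averaging stencil with fixed boundary cells), not the Godson-T implementation; the blocked variant only shows that the interior can be split into independent chunks within one sweep, which is the property parallel and tiled versions build on. Both function names are hypothetical.

```python
def jacobi_1d(values, steps):
    """Run `steps` Jacobi sweeps of a 3-point averaging stencil.

    The two boundary cells are held fixed (an assumption of this sketch).
    """
    current = list(values)
    for _ in range(steps):
        nxt = current[:]  # boundary cells carry over unchanged
        for i in range(1, len(current) - 1):
            nxt[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = nxt
    return current


def jacobi_1d_blocked(values, steps, block):
    """Same computation, with the interior split into blocks of `block` cells.

    Each block reads only the previous sweep's array, so within one sweep
    the blocks are independent and could be assigned to different cores.
    """
    current = list(values)
    n = len(current)
    for _ in range(steps):
        nxt = current[:]
        for start in range(1, n - 1, block):
            for i in range(start, min(start + block, n - 1)):
                nxt[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = nxt
    return current


if __name__ == "__main__":
    print(jacobi_1d([0.0, 0.0, 3.0, 0.0, 0.0], 1))
```

Tiling in the time dimension goes further by letting a block advance several sweeps before exchanging data with its neighbors; that refinement is not shown here.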
Recent studies have focused on leveraging large-scale artificial intelligence (LAI) models to improve semantic representation and compression capabilities. However, the substantial computational demands of LAI models ...
In wireless sensor networks (WSNs), a faulty sensor may produce incorrect data and transmit them to other sensors. This consumes the limited energy and bandwidth of WSNs. Furthermore, the base station may make inappropriate decisions when it receives incorrect data sent by faulty sensors. To solve these problems, this paper develops an online distributed algorithm that detects such faults using a weighted majority vote scheme. Exploiting the spatial correlations in WSNs, a faulty sensor can diagnose itself using the spatial and temporal information provided by its neighbor sensors. Simulation results show that even when as many as 30% of the sensors are faulty, over 95% of faults can be correctly detected with our algorithm. These results indicate that the proposed algorithm performs very well in detecting faulty sensor measurements in WSNs.
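A weighted majority vote for self-diagnosis can be sketched as below. The details are assumptions made for illustration only: each neighbor supplies a reading and a trust weight, a neighbor "disagrees" when its reading differs from the sensor's own by more than a threshold, and the sensor declares itself faulty when the disagreeing weight exceeds half of the total. The abstract does not give the paper's weighting rule or its use of temporal information, so neither is modeled here, and `is_faulty` is a hypothetical name.

```python
def is_faulty(own_reading, neighbors, threshold):
    """Decide by weighted majority vote whether a sensor's reading is faulty.

    neighbors: list of (reading, weight) pairs from nearby sensors.
    threshold: largest difference still counted as agreement (assumed parameter).
    """
    total_weight = sum(weight for _, weight in neighbors)
    disagreeing_weight = sum(
        weight
        for reading, weight in neighbors
        if abs(own_reading - reading) > threshold
    )
    # With no neighbors there is no evidence, so the sensor is not flagged.
    return total_weight > 0 and disagreeing_weight > total_weight / 2


if __name__ == "__main__":
    nearby = [(20.1, 1.0), (19.8, 1.0), (20.3, 0.5)]
    print(is_faulty(50.0, nearby, threshold=2.0))
    print(is_faulty(20.0, nearby, threshold=2.0))
```

Lowering the weight of a neighbor that was itself flagged earlier is one natural way to keep a faulty neighbor from dominating the vote.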
The Godson project, with an R&D history of 10 years, is an independent national program of China that aims to develop advanced microprocessor technologies based on fundamental research and commercialization of the chip technology. We give a comprehensive presentation of the Godson project, including its history, technical roadmaps, and several unique technical merits.
Looking back, processor architecture has improved by spiraling from simple to complex and from complex back to simple. We now face another shift from complex to simple, and innovative architectures will emerge to exploit continuously increasing transistor budgets. The growing importance of wire delays, changing workloads, power consumption, and design/verification complexity will drive the forthcoming era of chip multiprocessors (CMPs). Typical CMP projects from both industry and academia are surveyed. By examining in depth some primary theoretical and implementation problems of CMPs, the major challenges and opportunities for future CMPs are presented and discussed. Finally, the Godson series microprocessors designed in China are introduced.
It is well known that test power consumption may exceed power consumption during functional operation. Leakage power dissipation caused by leakage current in complementary metal-oxide-semiconductor (CMOS) circuits during test has become a significant part of the total power dissipation. Hence, it is important to reduce leakage power in order to prolong battery life in portable systems that employ periodic self-test, to increase test reliability and to reduce test cost. This paper analyzes leakage current and presents a leakage current simulator based on the transistor stacking effect. Using it, we propose techniques that exploit don't-care bits (denoted by Xs) in test vectors to optimize leakage current in integrated circuit (IC) test with a genetic algorithm. The techniques identify a set of don't-care inputs in the given test vectors and use the genetic algorithm to assign specified logic values to the X inputs so as to obtain a minimum leakage vector (MLV). Experimental results indicate that the techniques can effectively optimize the leakage current of combinational and sequential circuits during test while maintaining high fault coverage.