Memory bandwidth has become a major bottleneck to performance improvement in modern microprocessors. By investigating cache store misses, this paper proposes an adaptive cache write-allocate policy that significantly improves microprocessor bandwidth. The policy collects fully modified blocks in the miss queue. Fully modified blocks are written to lower-level memory under a non-write-allocate policy, which can switch adaptively to a write-allocate policy. Compared with other cache store miss policies, the adaptive write-allocate policy avoids unnecessary memory traffic, reduces cache pollution, and lowers the rate at which the load and store queues fill up, without increasing hardware overhead. Experimental results indicate that the policy improves memory bandwidth on the STREAM benchmarks by 62.6% on average. The performance of the SPEC CPU 2000 benchmarks also improves, with an average IPC speedup of 5.9%.
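The idea of treating fully modified blocks differently from partially modified ones can be illustrated with a toy model. The sketch below is not the paper's hardware design: the block size, the function name `store_miss_traffic`, and the assumption that all stores to a block are merged before a decision is made are hypothetical simplifications.

```python
BLOCK_SIZE = 32  # bytes per cache block (assumed value, not from the paper)

def store_miss_traffic(stores, block_size=BLOCK_SIZE):
    """Classify store-miss blocks for a list of (address, size) stores.

    Hypothetical model: stores to the same block are merged in one
    miss-queue entry. A block whose bytes are all overwritten can be
    written to the lower level without being fetched first
    (non-write-allocate); any other block falls back to write allocate
    and needs a fetch.

    Returns (blocks_fetched, blocks_written_without_fetch).
    """
    modified = {}  # block number -> set of modified byte offsets
    for addr, size in stores:
        for byte in range(addr, addr + size):
            modified.setdefault(byte // block_size, set()).add(byte % block_size)
    full = sum(1 for offsets in modified.values() if len(offsets) == block_size)
    partial = len(modified) - full
    return partial, full

# Four 8-byte stores cover block 0 completely; block 1 is only partly written.
stores = [(0, 8), (8, 8), (16, 8), (24, 8), (32, 8)]
print(store_miss_traffic(stores))
```

In this model, every fully overwritten block saves one fetch from the lower level, which is where the reduction in unnecessary memory traffic would come from.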
Current computer architecture research focuses on how to use the CPU's energy to attain as much performance as possible, so architecture-level power estimation tools are important. Existing architecture-level power simulators model only full-custom dynamic circuits and ignore the power modeling of ASIC designs, which are mainly composed of static circuits or standard cell libraries. This paper considers the implementation of a high-performance, low-power general-purpose CPU, the Godson-2 processor, analyzes its power characteristics, and implements an architecture-level power estimation methodology targeting it. The methodology carefully accounts for the power modeling of CMOS static circuits, so it is better suited to estimating current high-performance CPU architectures designed with an ASIC methodology. Compared with RTL power estimation, it is fast and flexible while remaining accurate. On an Intel Xeon 2.4 GHz platform, it runs at about 300 K instructions per second, 5000 times the speed of RTL power estimation, with only a small loss of accuracy.
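Architecture-level power estimation is commonly organized as per-unit activity counts multiplied by per-access energy. The sketch below shows that general scheme only; it is not the Godson-2 methodology itself, and the unit names, energy values, and the function `estimate_power` are hypothetical.

```python
def estimate_power(access_counts, energy_per_access_nj, cycles, freq_hz, leakage_w=0.0):
    """Estimate average power in watts from simulator activity counts.

    access_counts:        unit name -> number of accesses during the run
    energy_per_access_nj: unit name -> energy per access in nanojoules
                          (assumed to come from circuit-level characterization)
    cycles, freq_hz:      simulated cycle count and clock frequency
    leakage_w:            constant static power added on top (assumed model)
    """
    dynamic_j = sum(
        count * energy_per_access_nj[unit] for unit, count in access_counts.items()
    ) * 1e-9
    seconds = cycles / freq_hz
    return dynamic_j / seconds + leakage_w

# Hypothetical numbers for illustration only.
counts = {"alu": 1000, "dcache": 500}
energy = {"alu": 0.1, "dcache": 0.4}
print(estimate_power(counts, energy, cycles=1000, freq_hz=1e9))
```

Because the model only needs activity counts from an architectural simulator, it avoids gate-level switching simulation, which is the usual reason such estimators run much faster than RTL-based ones.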
This paper proposes a random routing algorithm with end-to-end feedback. Random routing handles random transmission errors efficiently while keeping forwarding speed high. End-to-end feedback ensures the correctness of transmission and reduces power consumption. Experimental results demonstrate that the proposed algorithm achieves lower latency and lower power consumption and provides highly reliable on-chip communication.
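The combination of random forwarding and end-to-end feedback can be sketched on a 2D mesh. The abstract does not specify the algorithm, so everything here is an assumption: the mesh topology, the choice among neighbors that move closer to the destination, the per-hop error probability, and the function names `route_once` and `send_with_feedback`.

```python
import random

def route_once(src, dst, rng, error_prob):
    """Forward one packet hop by hop; return (hops, corrupted)."""
    x, y = src
    hops, corrupted = 0, False
    while (x, y) != dst:
        moves = []
        if x != dst[0]:
            moves.append((x + (1 if dst[0] > x else -1), y))
        if y != dst[1]:
            moves.append((x, y + (1 if dst[1] > y else -1)))
        x, y = rng.choice(moves)  # random choice among productive neighbors
        hops += 1
        if rng.random() < error_prob:
            corrupted = True  # link error, checked only at the destination
    return hops, corrupted

def send_with_feedback(src, dst, error_prob=0.01, max_attempts=10, seed=0):
    """Retransmit until the destination reports an error-free packet.

    Returns (attempts_used, total_hops); attempts_used is None if every
    attempt failed. The end-to-end acknowledgment is modeled as the
    source simply learning whether the packet arrived intact.
    """
    rng = random.Random(seed)
    total_hops = 0
    for attempt in range(1, max_attempts + 1):
        hops, corrupted = route_once(src, dst, rng, error_prob)
        total_hops += hops
        if not corrupted:
            return attempt, total_hops
    return None, total_hops

print(send_with_feedback((0, 0), (3, 2), error_prob=0.0))
```

Routers in this model do no per-hop checking or buffering for retransmission, so forwarding stays simple; correctness is restored only at the endpoints.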
Looking back, processor architecture has advanced in a spiral, shifting from simple to complex and from complex to simple. We now face another shift from complex to simple, and innovative new architectures will emerge to use the continuously increasing transistor budget. The growing importance of wire delays, changing workloads, power consumption, and design and verification complexity will drive the coming era of Chip Multiprocessors (CMPs). This paper surveys typical CMP projects from both industry and academia. By examining in depth some of the primary theoretical and implementation problems of CMPs, it presents and discusses the major challenges and opportunities for future CMPs. Finally, the Godson series of microprocessors designed in China is introduced.
In a cluster or database server system, the performance of some data-intensive applications degrades substantially because of limited local memory and frequent interaction with slow disks. On a high-speed network, using the remote memory of other nodes or a dedicated memory server as a second-level buffer can reduce the number of disk accesses and improve application performance. For this second-level buffer mode, this paper improves a recently proposed buffer cache replacement algorithm, LIRS, and puts forward an adaptive algorithm, LIRS-A. LIRS-A adjusts itself according to application characteristics, thereby avoiding LIRS's poor fit for workloads with temporal locality. On the TPC-H benchmarks, LIRS-A improves the hit rate over LIRS by up to 7.2%. On a Groupby query against a network stream analysis database, LIRS-A improves the hit rate over LIRS by up to 31.2%. Compared with other algorithms, LIRS-A also shows similar or better performance.
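Hit rate, the metric quoted above, can be measured by replaying a block trace against a simulated buffer. The sketch below uses plain LRU as a stand-in replacement policy; it is neither LIRS nor LIRS-A, and the function name `lru_hit_rate` and the sample trace are hypothetical.

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay a sequence of block IDs against an LRU buffer of the given
    capacity and return the fraction of accesses that hit."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / len(trace) if trace else 0.0

# A looping scan slightly larger than the buffer defeats LRU entirely.
loop_trace = [1, 2, 3, 1, 2, 3]
print(lru_hit_rate(loop_trace, capacity=2))  # 0.0
print(lru_hit_rate(loop_trace, capacity=3))  # 0.5
```

Swapping a different policy into the same replay loop and comparing the returned fractions is one simple way to obtain relative hit-rate figures of the kind reported for LIRS-A versus LIRS.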
Dynamic programming has been one of the most efficient approaches to sequence analysis and structure prediction in biology. However, its performance is limited due to the drastic increase in both the number of biolo...
This paper presents a new approach for building networked embedded software. The approach is based on the composition of reusable components, with the addition of a perspective contract principle for modeling non-functional properties. Non-functional properties are an important aspect of networked embedded software, which is why they are modeled separately. The component view presented here therefore differs from traditional component-based views, which focus on the functional part. The ideas discussed in the paper have been implemented in a tool that enables the construction of networked embedded software by means of components and perspective contracts. Currently, a queuing-network-based algorithm that considers all non-functional properties together performs a static analysis of the perspective contracts before the application is executed.
Under a virtualization approach based on large-scale disaggregation and sharing, the computing and storage components that are tightly coupled in a traditional server are interconnected over a network in a loosely coupled way, so that computing capacity, storage capacity, and service capacity can be distributed on demand at the application level. Under this new server model, the isolation and protection of user space and system space, together with the security monitoring of virtual resources, are key factors in the ultimate security guarantee. This article presents a large-scale, scalable distributed intrusion detection system for virtual computing environments based on virtual machines. The system supports security monitoring and management of global resources and provides a uniform view of security attacks in the virtual computing environment, thereby protecting user applications and system security within the capacity services domain.