Authors:
Schoellkopf, Jean-Pierre (IMAG Lab, Computer Architecture Research Group, St. Martin d'Heres, Fr)
ISBN: (print) 0444867511
The 'Familiar Instruction Set Computer' (FISC) project applies the CAPRI Silicon Compiler project to the implementation of 'computer-like' VLSI chips defined by their behavior. The paper first presents LUBRICK, a 'Silicon Assembler' that supports hierarchical design of functional cells according to basic interconnection structures, with the aim of good results in terms of silicon area and correctness. Interconnection problems are emphasized. The second part of the paper presents the data-path design for FISC, showing how bit-sliced structures can be designed in a short time using the silicon assembler.
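As an illustration of the bit-slicing idea only, the sketch below builds a small adder by replicating one identical one-bit cell and wiring the carry from each slice to the next. It is a behavioral model in Python, not LUBRICK notation (which the abstract does not show); the names `full_adder_slice` and `bit_sliced_add` are hypothetical.

```python
def full_adder_slice(a, b, carry_in):
    """One bit slice (hypothetical cell): returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def bit_sliced_add(x, y, width):
    """Chain `width` identical slices; the carry ripples from slice i to i+1."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder_slice((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # carry is the carry-out of the top slice


if __name__ == "__main__":
    print(bit_sliced_add(5, 3, 4))
    print(bit_sliced_add(15, 1, 4))
```

The point of the structure is that only one cell has to be designed and verified; the data path width is then a matter of how many copies are abutted.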
ISBN: (print) 9781450373395
This paper describes a commercial software and hardware platform for telecommunications and multimedia processing. The software architecture loosely follows the CORBA and ODP standards of distributed computing and supports a number of application types on different hardware configurations. This paper is the result of lessons learned in the process of designing, building, and modifying an industrial telecommunications platform. In particular, the use of the trading function in the design of the system led to such benefits as support for the dynamic evolution of the system, the ability to dynamically add services and data types to a running system, support for heterogeneous systems, and a simple design performing well enough to handle traffic in excess of 40,000 busy-hour calls.
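The trading function mentioned above can be pictured as a registry in which providers export service offers and clients import them by type and properties at run time. The sketch below is a minimal in-process illustration under that assumption; it is not the platform's API, and the class `Trader`, its methods, and the codec names are all hypothetical.

```python
class Trader:
    """Toy trader: stores service offers and matches them against queries."""

    def __init__(self):
        self._offers = []  # entries are (service_type, properties, provider)

    def export(self, service_type, properties, provider):
        """Register an offer; may be called while the system is running."""
        self._offers.append((service_type, dict(properties), provider))

    def withdraw(self, provider):
        """Remove every offer made by the given provider."""
        self._offers = [o for o in self._offers if o[2] is not provider]

    def import_(self, service_type, **required):
        """Return providers of the type whose properties match all constraints."""
        return [
            provider
            for offered_type, props, provider in self._offers
            if offered_type == service_type
            and all(props.get(key) == value for key, value in required.items())
        ]


# Hypothetical services, used only for illustration.
def g711(data):
    return "g711:" + data


def g729(data):
    return "g729:" + data


trader = Trader()
trader.export("codec", {"name": "g711"}, g711)
trader.export("codec", {"name": "g729"}, g729)  # added later, without a restart
matches = trader.import_("codec", name="g729")
```

Because clients look services up by description rather than by fixed reference, a new service or data type becomes usable as soon as its offer is exported.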
To communicate with a controller board, several deflection systems use the digital XY2-100 protocol, which is not available on most microcontroller units (MCUs). This paper presents a solution to implement the XY2-100 p...
The challenge for on-chip networks is to provide low-latency communication within a very low power budget. To reduce the latency and keep the simplicity of a mesh network, a torus network is proposed. As torus networks have...
Based on the homotopy analysis method, a general analytic technique for strongly nonlinear problems, a Maple package for automated derivation (ADHO) for periodic nonlinear oscillation systems is presented. The package is valid for rather general periodic oscillation systems and can automatically deliver accurate approximations of the frequency ω and the mean of motion δ of a nonlinear periodic oscillator. Because the homotopy analysis method is valid even for highly nonlinear problems, the package can give accurate approximate expressions even for oscillation systems with strong nonlinearity. The package is also user-friendly: one only needs to input a governing equation and initial conditions to obtain satisfactory analytic approximations in a few seconds. Several different types of examples are given in this paper to illustrate the validity of the package. Such a package provides a helpful and easy-to-use tool in science and engineering for analyzing periodic nonlinear oscillations. The electronic version of this Maple package is free to download from the address http://***/***/*** once the paper is published publicly.
ISBN: (print) 0769518893
An adiabatic process in thermodynamics transfers energy across zero temperature difference. The adiabatic CMOS design style attempts to switch a transistor to transfer energy across its source and drain while the voltage difference is zero. We define an adiabatic micro-architecture as one that pushes instructions across a zero IPC gradient. The IPC gradient can be zero across time (the IPC of a given stage does not vary over time) or across space (adjacent pipeline stages have zero variance). The reason to consider adiabatic micro-architectures is that the energy for a given computation can be shown to be minimum for an adiabatic micro-architecture. An adiabatic compiler, really a back-end, is defined to be a compiler that helps an adiabatic micro-architecture achieve its goals. The minimal support provided by an adiabatic compiler includes a static estimation of program ILP. We add new passes to the MachineSUIF compiler to flag instruction groups that can potentially walk through a superscalar pipeline as a group. Hence, these instruction groups offer a fairly robust model of superscalar microarchitecture ILP. A compile-time scheduling analysis can also generate instruction slack values. The slack indicates the program region within which an instruction can be scheduled. We also present a dispatch-stage dynamic scheduling algorithm that uses the compiler-annotated slacks to reschedule instructions with the explicit objective of minimizing the dispatch-stage IPC variance. In other words, the proposed dispatch stage is adiabatic. Preliminary experimental results demonstrate an average reduction of 4.16% in IPC variance over SPEC2000 benchmarks with the adiabatic compiler and microarchitecture. The preliminary evaluation also shows an average processor dispatch-stage energy reduction of 3.9% over the same SPEC2000 benchmarks. We expect to add similar IPC smoothing control knobs at the instruction fetch and issue stages as well in the future, which should result in a more signifi
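To make the objective concrete, the sketch below measures the variance of instructions dispatched per cycle and then uses per-instruction slack to even out the load. It is a toy greedy heuristic written for illustration, not the paper's dispatch algorithm: it ignores data dependencies and issue width, and the function names and the `(earliest_cycle, slack)` encoding are assumptions.

```python
from statistics import pvariance


def ipc_variance(dispatched_per_cycle):
    """Population variance of the number of instructions dispatched per cycle."""
    return pvariance(dispatched_per_cycle)


def smooth_with_slack(instrs, num_cycles):
    """Greedy smoothing (assumed heuristic, not the paper's algorithm).

    instrs: list of (earliest_cycle, slack); an instruction may be placed in
    any cycle from earliest_cycle to earliest_cycle + slack.
    Returns the per-cycle dispatch counts after placement.
    """
    load = [0] * num_cycles
    # Place the least flexible instructions first.
    for earliest, slack in sorted(instrs, key=lambda item: item[1]):
        last = min(earliest + slack, num_cycles - 1)
        best = min(range(earliest, last + 1), key=lambda cycle: load[cycle])
        load[best] += 1
    return load


# Hypothetical instruction window: (earliest_cycle, slack) pairs.
window = [(0, 0), (0, 0), (0, 2), (0, 2), (1, 1), (3, 0)]

# Baseline: every instruction dispatched at its earliest cycle.
baseline = [0] * 4
for earliest, _ in window:
    baseline[earliest] += 1

smoothed = smooth_with_slack(window, 4)
print(baseline, ipc_variance(baseline))
print(smoothed, ipc_variance(smoothed))
```

In this toy window the same six instructions are dispatched either way; only their spread over the four cycles changes, which is the quantity the abstract's IPC variance captures.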
Authors:
Tang, Yuan; You, Ronghui (School of Computer Science, School of Software, Fudan University; Shanghai Key Lab. of Intelligent Information Processing; State Key Lab. of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China)
It's important to hit a space-time balance for a real-world algorithm to achieve high performance on modern shared-memory multi-core or many-core systems. However, a large class of dynamic programs with more than ...
ISBN: (print) 9780897916653
This paper presents an algorithm that allocates registers optimally for straight-line code running on a generic multi-issue computer. On such a machine, an optimal register allocation is one that minimizes the number of issue slots that the code requires. Optimal spill selection and load/store placement are used to minimize the number of additional issue slots needed, given a schedule for the non-memory reference instructions and a fixed number of available physical registers. The generic multi-issue machine model closely models the operation of vector and VLIW processors, and could be extended to model super-scalar processors. The algorithm uses dynamic programming to search the state space of feasible register allocations; implicit and explicit state pruning are used to make the problem tractable without sacrificing optimality. The optimal allocation produced by the algorithm for a substantial example is presented.
ISBN: (print) 1595935789
Low-cost protection of embedded systems against soft errors has recently become a major concern. This issue is even more critical in memory elements, which are inherently more prone to transient faults. In this paper, we propose a reliability-aware data placement technique to partially protect embedded memory systems. We show that by adopting this method instead of traditional placement schemes with complete memory protection, an acceptable level of fault tolerance can be achieved while incurring less area and power overhead. In this approach, each variable in the program is placed in either a protected or a non-protected memory area according to a profile-driven liveness analysis of all memory variables. To measure the level of fault coverage, we inject faults into the memory during program execution in a Monte Carlo simulation framework. We then calculate the coverage of the partial protection scheme based on the number of protected, failed and crashed runs during the fault injection experiment. Copyright 2006 ACM.
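The fault-injection experiment can be pictured with the toy Monte Carlo model below. Everything specific in it is an assumption made for illustration: the variable table, the rule that a fault in an unprotected pointer crashes the run while one in unprotected data yields a failed run, and the coverage formula (protected runs divided by all runs). The abstract states only that coverage is derived from the protected, failed and crashed counts.

```python
import random


def inject_faults(variables, runs, seed=0):
    """Inject one fault per run into a variable chosen in proportion to its size.

    variables: name -> {"size": words, "protected": bool, "pointer": bool}
    Returns counts of protected, failed and crashed runs (assumed categories).
    """
    rng = random.Random(seed)
    names = list(variables)
    weights = [variables[name]["size"] for name in names]
    counts = {"protected": 0, "failed": 0, "crashed": 0}
    for _ in range(runs):
        hit = variables[rng.choices(names, weights=weights)[0]]
        if hit["protected"]:
            counts["protected"] += 1  # fault lands in protected memory
        elif hit["pointer"]:
            counts["crashed"] += 1    # assumed: corrupted pointer crashes the run
        else:
            counts["failed"] += 1     # assumed: corrupted data gives wrong output
    return counts


def coverage(counts):
    """Assumed definition: fraction of runs whose fault was caught."""
    total = sum(counts.values())
    return counts["protected"] / total if total else 0.0


# Hypothetical placement produced by some liveness-driven policy.
placement = {
    "buffer": {"size": 64, "protected": False, "pointer": False},
    "state": {"size": 16, "protected": True, "pointer": False},
    "head": {"size": 1, "protected": True, "pointer": True},
}
counts = inject_faults(placement, runs=10_000)
print(counts, coverage(counts))
```

A real experiment would replace the classification rule with actual program runs; the sketch only shows how the three run counts feed a single coverage figure.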
The sustained push toward smaller and smaller technology sizes has reached a point where device reliability has moved to the forefront of concerns for next-generation designs. Silicon failure mechanisms, such as trans...