ISBN (print): 1581139373
High-level hardware modeling is an essential, yet time-consuming, part of system design. However, effective component-based reuse in hardware modeling languages can reduce model construction time and enable the exploration of more design alternatives, leading to better designs. While component overloading and parametric polymorphism are critical for effective component-based reuse, no existing modeling language supports both. The lack of these features creates overhead for designers that discourages reuse, negating any benefits of reuse. This paper presents a type system that supports both component overloading and parametric polymorphism. It proves that performing type inference for any such system is NP-complete and presents a heuristic that works efficiently in practice. The result is a type system and type inference algorithm that can encourage reuse, reduce design specification time, and lead to better designs.
ISBN (print): 1581139373
This paper proposes a novel Deadlock Avoidance Algorithm (DAA) and its hardware implementation, the Deadlock Avoidance Unit (DAU), as an Intellectual Property (IP) core that provides a mechanism for very fast and automatic deadlock avoidance in a MultiProcessor System-on-a-Chip (MP-SoC) with multiple (e.g., 10) processing elements and multiple (e.g., 40) resources. The DAU avoids deadlock by not allowing any grant or request that leads to a deadlock. In case of livelock, the DAU asks one of the processes involved in the livelock to release its resource(s) so that the livelock can also be resolved. We simulated two realistic examples that can benefit from the DAU, and demonstrated that the DAU not only avoids deadlock in a few clock cycles but also achieves a 37% speed-up of application execution time over avoiding deadlock in software. Finally, the SoC area overhead due to the DAU is small, under 0.01% in our example.
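The idea of refusing any request that would lead to a deadlock can be illustrated with a small software model. The sketch below is not the paper's DAA or DAU; it is a textbook wait-for-chain check, assuming single-instance resources and at most one outstanding request per process. The class name `DeadlockAvoider` and its methods are hypothetical.

```python
class DeadlockAvoider:
    """Toy model (an assumption, not the DAU): single-instance resources,
    each process blocked on at most one resource at a time."""

    def __init__(self):
        self.owner = {}    # resource -> process currently holding it
        self.waiting = {}  # process -> resource it is blocked on

    def _would_deadlock(self, proc, res):
        # Walk the chain: owner of res -> resource that owner waits for -> ...
        # If the chain returns to proc, blocking proc on res closes a cycle.
        seen = set()
        current = self.owner.get(res)
        while current is not None and current not in seen:
            if current == proc:
                return True
            seen.add(current)
            blocked_on = self.waiting.get(current)
            current = self.owner.get(blocked_on) if blocked_on is not None else None
        return False

    def request(self, proc, res):
        """Return 'granted', 'pending', or 'denied'."""
        if res not in self.owner or self.owner[res] == proc:
            self.owner[res] = proc
            return "granted"
        if self._would_deadlock(proc, res):
            return "denied"  # refusing the request keeps the system deadlock-free
        self.waiting[proc] = res
        return "pending"

    def release(self, proc, res):
        """Release res and hand it to one waiting process, if any."""
        if self.owner.get(res) != proc:
            raise ValueError(f"{proc} does not hold {res}")
        del self.owner[res]
        for waiter, wanted in list(self.waiting.items()):
            if wanted == res:
                del self.waiting[waiter]
                self.owner[res] = waiter
                return waiter
        return None
```

In this model, a process that receives `denied` is expected to back off or release what it holds, which loosely mirrors the abstract's description of asking a process to give up resources.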
The multi-objective genetic algorithm is an effective solution to the complex problem of hardware/software codesign. An extended genetic algorithm (EGA) has been developed that implements a novel selection method with...
ISBN (print): 1581139373
In this paper, we present an approach for automatic synthesis of System-on-Chip (SoC) multiprocessor architectures for applications expressed as process networks. Our approach is targeted towards design space exploration (DSE), and thus the speed of synthesis is of critical interest. The focus here is on the problem of resource allocation and binding with a view to optimizing cost under performance constraints. Our approach exploits the adjacency relation of processes and uses a dynamic-programming-based algorithm to synthesize the architecture, including the interconnection network. We have performed a number of experiments on real as well as randomly generated process networks. The results have been compared with an optimal MILP formulation. They conclusively show that this approach is fast as well as effective and can be employed for DSE.
ISBN (print): 1581139373
For complex system-on-chips (SoCs) fabricated in nanometer technologies, the system-level on-chip communication architecture is emerging as a significant source of power consumption. Managing and optimizing this important component of SoC power requires a detailed understanding of the characteristics of its power consumption. Various power estimation and low-power design techniques have been proposed for the global interconnects that form part of SoC communication architectures (e.g., low-swing buses, bus encoding, etc). While effective, they only address a limited part of communication architecture power consumption. A state-of-the-art communication architecture, viewed in its entirety, is quite complex, comprising several components, such as bus interfaces, arbiters, bridges, decoders, and multiplexers, in addition to the global bus lines. Relatively little research has focused on analyzing and comparing the power consumed by different components of the communication architecture. In this work, we present a systematic evaluation and analysis of the power consumed by a state-of-the-art communication architecture (the AMBA on-chip bus), using a commercial design flow. We focus on developing a quantitative understanding of the relative contributions of different communication architecture components to its power consumption, and the factors on which they depend. We decompose the communication architecture power into power consumed by logic components (such as arbiters, decoders, bus bridges), global bus lines (that carry address, data, and control information), and bus interfaces. We also perform studies that analyze the impact of varying application traffic characteristics, and varying SoC complexity, on communication architecture power. Based on our analyses, we evaluate different techniques for reducing the power consumed by the on-chip communication architecture, and compare their effectiveness in achieving power savings at the system level. In addition to quanti
ISBN (print): 0769520855
In previous work, we showed the benefits and feasibility of having a processor dynamically partition its executing software such that critical software kernels are transparently partitioned to execute as a hardware coprocessor on configurable logic, an approach we call warp processing. The configurable logic place and route step is the most computationally intensive part of such hardware/software partitioning, normally running for many minutes or hours on powerful desktop processors. In contrast, dynamic partitioning requires place and route to execute in just seconds and on a lean embedded processor. We have therefore designed a configurable logic architecture specifically for dynamic hardware/software partitioning. Through experiments with popular benchmarks, we show that by specifically focusing on the goal of software kernel speedup when designing the FPGA architecture, rather than on the more general goal of ASIC prototyping, we can perform place and route for our architecture 50 times faster, using 10,000 times less data memory and 1,000 times less code memory, than popular commercial tools mapping to commercial configurable logic. Yet, we show that we obtain speedups (2x on average, and as much as 4x) and energy savings (33% on average, and up to 74%) when partitioning even just one loop, which are comparable to commercial tools and fabrics. Thus, our configurable logic architecture represents a good candidate for platforms that will support dynamic hardware/software partitioning, and enables ultra-fast desktop tools for hardware/software partitioning, and even for fast configurable logic design in general.
A hands-free telephone application has been implemented on a low-cost, custom-configured, block-floating-point digital signal processor (DSP). The application consists of an acoustic echo canceller and a spectral subtraction (SS) based noise suppressor. The objective of pursuing a custom configuration was the minimization of hardware cost for the given application. For this objective, implementation has been carried out through a software/hardware codesign flow on a resizable DSP platform. The intention of exploiting block-floating-point arithmetic was to remove the burden of time-consuming fixed-point model development, while employing an inexpensive fixed-point DSP. This paper describes an implementation of the application. Signal processing quality evaluation results are also presented for some critical computation modules in the application.
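Block-floating-point arithmetic stores a block of samples as integer mantissas that share one exponent, so the datapath stays fixed-point while the scale follows the signal. The sketch below illustrates that general idea only; the function names `bfp_encode` and `bfp_decode` and the 16-bit mantissa width are assumptions, not details taken from the paper.

```python
import math


def bfp_encode(block, mantissa_bits=16):
    """Encode a block as (mantissas, exponent), value ~= mantissa * 2**exponent.

    All samples share one exponent, chosen so the largest magnitude
    still fits in a signed mantissa of `mantissa_bits` bits.
    """
    peak = max(abs(x) for x in block)
    if peak == 0:
        return [0] * len(block), 0
    limit = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa value
    # Smallest exponent for which peak / 2**exponent does not exceed limit.
    exponent = math.ceil(math.log2(peak / limit))
    scale = 2.0 ** exponent
    # Round to the nearest integer mantissa; clamp as a safeguard.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return mantissas, exponent


def bfp_decode(mantissas, exponent):
    """Recover approximate sample values from mantissas and the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]
```

Because the exponent is set by the largest sample in the block, small samples in a block with one large peak lose relative precision; the rounding error per sample is bounded by half of `2**exponent`.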
ISBN (print): 0769521258
Today's Systems-on-Chip have reached a complexity that demands high-level modelling for both design and verification. By raising the level of abstraction and supporting seamlessness in the methodology, new design flows increase productivity. High-level models described on the basis of the C/C++ language family are widely used. Introducing a new flow based on the SDL SystemC allows reuse of such legacy models. A refinement method and a supporting framework are presented to integrate C code for software and hardware components into a system-level model. The focus of the presentation is on the multilevel model support of the framework.
ISBN (print): 0769520871
Developing an ASIP (Application Specific Instruction set Processor) requires developing both the HW (hardware) and the SWDE (software development environment). Developing the HW and the SWDE separately in a short time is difficult, so a HW/SW co-design system is necessary for rapid development of ASIPs. We have developed C-DASH (C-like Design Automation Shell), a HW/SW co-design system for designing processors based on an ISA (Instruction Set Architecture). This paper describes the HW/SW co-design system C-DASH and gives, as an example, a description of a Java processor that directly executes Java bytecode.
ISBN (print): 0769522262
As technology scales, transient faults due to single event upsets have emerged as a key challenge for reliable embedded system design. This paper proposes a design methodology that incorporates reliability into the hardware-software co-design paradigm for embedded systems. We introduce an allocation and scheduling algorithm that efficiently handles conditional execution in multi-rate embedded systems, and selectively duplicates critical tasks to detect soft errors, such that the reliability of the system is increased. The increased reliability is achieved by utilizing otherwise idle computation resources and incurs no resource or performance penalty. The proposed algorithm is fast and efficient, and is suitable for use in the inner loop of our hardware/software co-synthesis framework, where the scheduling routine has to be invoked many times.