ISBN (print): 1424401550
In an MP-SoC environment, customized run-time management should be incorporated on top of the basic OS services to globally optimize costs (e.g. energy consumption) across all active applications, according to constraints (e.g. performance, user requirements) and available platform resources. To that end, we have proposed a Pareto-based approach combining design-time application mapping and platform exploration with a low-complexity run-time manager. This relieves the OS of part of its run-time decision making and avoids conservative worst-case assumptions. In this paper, we focus on the characterization of the Pareto-based application specification that results from our design-time exploration. This specification is essential as input to our run-time manager. A representative video codec multimedia application, simulated on our MP-SoC platform simulator, is used as a case study. For the resulting Pareto-based specification, the overhead in both binary size and performance is negligible.
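The abstract does not spell out the format of the Pareto-based specification or the manager's selection rule, so the following Python sketch is only an illustration of the general idea: design-time exploration produces a set of operating points, dominated points are discarded, and a low-complexity run-time step picks the cheapest point that fits the current constraints. The `OperatingPoint` fields, the function names, and the numbers are all assumptions made for this example, not the authors' data or implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OperatingPoint:
    """One design-time mapping of an application (hypothetical fields)."""
    name: str
    energy_mj: float     # cost to minimize
    exec_time_ms: float  # performance constraint
    processors: int      # platform resources required


def dominates(a: OperatingPoint, b: OperatingPoint) -> bool:
    """True if a is no worse than b in every dimension and better in one."""
    no_worse = (a.energy_mj <= b.energy_mj
                and a.exec_time_ms <= b.exec_time_ms
                and a.processors <= b.processors)
    better = (a.energy_mj < b.energy_mj
              or a.exec_time_ms < b.exec_time_ms
              or a.processors < b.processors)
    return no_worse and better


def pareto_front(points: List[OperatingPoint]) -> List[OperatingPoint]:
    """Design-time step: keep only the non-dominated operating points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def select_point(front: List[OperatingPoint], deadline_ms: float,
                 free_processors: int) -> Optional[OperatingPoint]:
    """Run-time step: cheapest point meeting the deadline and resource limit."""
    feasible = [p for p in front
                if p.exec_time_ms <= deadline_ms
                and p.processors <= free_processors]
    return min(feasible, key=lambda p: p.energy_mj, default=None)


# Made-up exploration results for a single application.
explored = [
    OperatingPoint("fast", energy_mj=90.0, exec_time_ms=40.0, processors=4),
    OperatingPoint("balanced", energy_mj=60.0, exec_time_ms=60.0, processors=2),
    OperatingPoint("slow", energy_mj=35.0, exec_time_ms=100.0, processors=1),
    OperatingPoint("wasteful", energy_mj=80.0, exec_time_ms=70.0, processors=3),
]
front = pareto_front(explored)
choice = select_point(front, deadline_ms=80.0, free_processors=2)
```

Because the front is computed once at design time, the run-time step reduces to a filter and a minimum over a short list, which is one plausible reading of "low-complexity" here.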
Digital signal processing (DSP) applications involve processing long streams of input data. It is important to take this form of processing into account when implementing embedded software for DSP systems. Task-level vectorization, or block processing, is a useful dataflow graph transformation that can significantly improve execution performance by allowing subsequences of data items to be processed through individual task invocations. In this way, several benefits can be obtained, including reduced context-switch overhead, increased memory locality, improved utilization of processor pipelines, and use of more efficient DSP-oriented addressing modes. On the other hand, block processing generally results in increased memory requirements, since it effectively increases the sizes of the input and output values associated with processing tasks. In this paper, we investigate the memory-performance trade-off associated with block processing. We develop novel block processing algorithms that carefully take memory constraints into account to achieve efficient block processing configurations within given memory space limitations. Our experimental results indicate that these methods derive optimal memory-constrained block processing solutions most of the time. We demonstrate the advantages of our block processing techniques on practical kernel functions and applications in the DSP domain.
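The memory-performance trade-off can be made concrete with a deliberately simplified model. The sketch below is not the paper's algorithm: it assumes a linear chain of tasks, a single uniform block size for all of them, a fixed per-invocation overhead per task, and buffer memory that grows linearly with the block size. All function names and numbers are hypothetical.

```python
from typing import List, Optional, Tuple


def best_block_size(
    tasks: List[Tuple[float, float]],  # (cycles per item, overhead per invocation)
    edge_bytes_per_item: List[int],    # buffer bytes per data item on each edge
    memory_budget_bytes: int,
    max_block: int,
) -> Optional[Tuple[int, float, int]]:
    """Return (block size, cycles per item, buffer bytes), or None if no block fits.

    Assumed cost model: processing a block of b items costs
    b * cycles_per_item + overhead per task, so the amortized cost per item
    is sum(cycles_per_item) + sum(overhead) / b. Every edge buffers one
    full block, so memory is b * sum(edge_bytes_per_item).
    """
    work = sum(cycles for cycles, _ in tasks)
    overhead = sum(ovh for _, ovh in tasks)
    bytes_per_item = sum(edge_bytes_per_item)

    best: Optional[Tuple[int, float, int]] = None
    for block in range(1, max_block + 1):
        memory = block * bytes_per_item
        if memory > memory_budget_bytes:
            break  # memory only grows with the block size
        cost = work + overhead / block
        if best is None or cost < best[1]:
            best = (block, cost, memory)
    return best


# Two tasks in a chain, three edges (input, intermediate, output).
tasks = [(10.0, 200.0), (25.0, 300.0)]
edges = [4, 4, 2]
result = best_block_size(tasks, edges, memory_budget_bytes=256, max_block=64)
unblocked = best_block_size(tasks, edges, memory_budget_bytes=10, max_block=64)
```

Under this model a budget of 256 bytes admits a block size of 25, which cuts the amortized cost from 535 to 55 cycles per item. Real dataflow graphs have multirate edges and per-task block sizes, which is where a more careful algorithm becomes necessary.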
Traditional design techniques for embedded systems apply transformations on the source code to optimize hardware-related cost factors. Unfortunately, such transformations cannot adequately deal with the highly dynamic nature of today's multimedia applications. Therefore, we go one step back in the design process. Starting from a conceptual UML model, we first transform the model before refining it into executable code. This paper presents various model transformations, an estimation technique for the steering cost parameters, and three case studies that show how our model transformations yield factor-level improvements in memory footprint and performance with respect to the initial implementation. (c) 2006 Elsevier B.V. All rights reserved.
ISBN (print): 354026969X
Platform architectures for modern embedded systems are increasingly heterogeneous and parallel. Early design decisions, such as the allocation of hardware resources and the partitioning of functionality onto architecture building blocks, become even more complex and important for the resulting design quality. To effectively support designers during the concept phase, we base our design flow SystemQ on queuing systems. We show how, starting with a performance model, the system's behavior and structure can be refined systematically. SystemQ is implemented in SystemC and seamlessly supports the refinement of SystemQ models down to established transaction and RT levels. Compared with existing approaches, SystemQ's formalism exposes transaction scheduling as one key aspect of the system's performance and allows the modeling of time- and resource-workload-dependent behavior. A case study underpins the usefulness of SystemQ's approach by evaluating a network access platform at three refinement levels.
ISBN (print): 354026969X
In multitasking, priority-driven systems, resource access-control protocols such as the Priority Ceiling Protocol (PCP) reduce the undesirable effects of resource contention. In general, software implementations of these protocols entail costly computations that can degrade system performance to unacceptable levels. In this paper, we present the design of a hardware accelerator that executes the PCP functionality for controlling access to multiple-unit resources, and we illustrate that the proposed implementation reduces execution time by a factor of up to 30.
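To show the kind of computation such a protocol requires on every lock request, here is a minimal Python sketch of the grant test from the basic, single-unit form of the Priority Ceiling Protocol. It is not the paper's hardware design and does not cover the multiple-unit case the paper targets; priority inheritance is also omitted. The convention that a larger number means a higher priority, and all names and task sets, are assumptions for this example.

```python
from typing import Dict, Set


def priority_ceilings(task_priority: Dict[str, int],
                      uses: Dict[str, Set[str]]) -> Dict[str, int]:
    """Ceiling of a resource = highest priority of any task that may lock it."""
    ceilings: Dict[str, int] = {}
    for task, resources in uses.items():
        for res in resources:
            prio = task_priority[task]
            ceilings[res] = max(ceilings.get(res, prio), prio)
    return ceilings


def may_lock(task: str, resource: str, task_priority: Dict[str, int],
             ceilings: Dict[str, int], held: Dict[str, str]) -> bool:
    """Single-unit PCP grant test.

    `held` maps each locked resource to the task holding it. A request is
    granted only if the resource is free and the requester's priority is
    strictly higher than the ceilings of all resources locked by other tasks.
    """
    if resource in held:
        return False
    other_ceilings = [ceilings[r] for r, holder in held.items() if holder != task]
    if not other_ceilings:
        return True
    return task_priority[task] > max(other_ceilings)


# Hypothetical task set: larger number = higher priority.
prio = {"T1": 3, "T2": 2, "T3": 1}
uses = {"T1": {"R1"}, "T2": {"R2"}, "T3": {"R1", "R2"}}
ceil = priority_ceilings(prio, uses)

# T3 holds R1 (ceiling 3), so T2 is refused R2 even though R2 is free.
blocked = may_lock("T2", "R2", prio, ceil, held={"R1": "T3"})
granted = may_lock("T2", "R2", prio, ceil, held={})
```

Refusing T2 although R2 is free is the characteristic behavior of the protocol: it is what bounds blocking and prevents deadlock. The scan over all held resources on each request hints at why a software implementation can become costly as the number of resources grows.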
ISBN (print): 9783642019692
Modeling and simulation of multiphysics multiscale systems (SMMS) poses a grand challenge to computational science. To adequately simulate numerous intertwined processes characterized by different spatial and temporal scales spanning many orders of magnitude, sophisticated models and advanced computational techniques are required. The aim of the SMMS workshop is to encourage and review progress in this multidisciplinary research field. This short paper describes the scope of the workshop and gives pointers to the papers reflecting the latest developments in the field.
ISBN (print): 354026969X
The design of appropriate communication architectures for complex Systems-on-Chip (SoCs) is a challenging task. One promising approach to this problem is the Network-on-Chip (NoC). Recently, the application of deterministic and stochastic Petri nets (DSPNs) to model on-chip communication has proven to be an attractive method for evaluating and exploring different communication aspects. In this contribution, the modeling of basic NoC communication scenarios featuring different processor cores, network topologies, and communication schemes is presented. To provide a test bed for verifying the modeling results, a state-of-the-art FPGA platform has been utilized. This platform allows the instantiation of a soft-core processor network that can be adapted in terms of communication network topologies and communication schemes. It will be shown that DSPN modeling yields good prediction results at low modeling effort. Different DSPN modeling aspects in terms of accuracy and computational effort are discussed.
ISBN (print): 354026969X
The emergence of programmable logic devices as processing platforms for digital signal processing applications poses challenges concerning rapid implementation and high-level optimization of algorithms on these platforms. This paper describes Abhainn, a rapid implementation methodology and tool suite for translating an algorithmic expression of the system to a working implementation on a heterogeneous multiprocessor/field programmable gate array platform, or a standalone system-on-programmable-chip solution. Two particular focuses for Abhainn are the automated but configurable realisation of inter-processor communication fabrics, and the establishment of novel dedicated hardware component design methodologies allowing algorithm-level transformation for system optimization. This paper outlines the approaches employed in both of these instances.
ISBN (print): 354026969X
This paper presents a UML 2.0 based design flow for real-time embedded systems. The flow starts with UML 2.0 application, architecture, and mapping models for our TUTWLAN terminal with its medium access control protocol. As a result, a hardware/software implementation on an Altera Excalibur FPGA is achieved. The implementation uses the eCos real-time operating system and hardware accelerators for time-critical protocol functions. The design flow is prototyped in practice, showing rapid UML 2.0 application model modification, real-time protocol processing in an image transfer application, and execution monitoring.