ISBN: 9780769534251 (print)
Most common real-time embedded programming languages provide a means to specify functionality; however, they have few constructs to specify precise timing constraints. LabVIEW is one example of a graphical programming language that supports timing specifications in the form of timed loops. In this work, we present a plug-in for LabVIEW Embedded that maps the LabVIEW G graphical programming language and its timing specifications to the PREcision Timed machine (PRET), an architecture that exposes timing instructions in its instruction set architecture. We demonstrate the use of the plug-in with a simple producer/consumer example that uses timing to enforce synchronization.
ISBN: 9780769532431 (print)
High-performance computing with low-cost machines is becoming a reality. As an example, the Sony PlayStation 3 gaming console offers performance of up to 150 GFLOPS at a retail price of $400. Unfortunately, higher performance is achieved only when the programmer exploits the architectural specifics of its Cell processor: the programmer has to focus on inter-processor communication, task allocation among the processors, task scheduling, external memory prefetching, and synchronization. In this paper, we propose and evaluate a compile flow that automates the transformation of a program expressed in the high-level system design language SystemC, used as a programming model, into its implementation on the Cell processor. SystemC constructs and the SystemC scheduler are mapped directly to the Cell API, preserving their semantics. Inter-processor and external memory communications are abstracted by means of SystemC channels. We illustrate the approach on two case studies implemented on a Sony PlayStation 3.
The constant race for faster and more powerful CPUs is drawing to a close. It is no longer feasible to significantly increase the speed of the CPU without paying a crushing penalty in power consumption and production costs. Instead of increasing single-thread performance, the industry is turning to multiple CPU threads or cores (such as SMT and CMP) and heterogeneous CPU architectures (such as the Cell Broadband Engine). While this is a step in the right direction, every modern PC contains a wealth of untapped compute resources. The NIC has a CPU; the disk controller is programmable; some high-end graphics adapters are already more powerful than host CPUs. Some of these processors can perform certain functions more efficiently than the host CPUs. Our operating systems and programming abstractions should be expanded to let applications tap into these computational resources and make the best use of them. Therefore, we propose the HYDRA framework, which lets application developers use the combined power of every compute resource in a coherent way. HYDRA is a programming model and a runtime support layer that enables utilization of host processors as well as the processors of various programmable peripheral devices. We present the framework and its application to a demonstrative use case, and provide a thorough evaluation of its capabilities. Using HYDRA, we were able to significantly cut the development cost of a system that uses multiple heterogeneous compute resources.
ISBN: 9781605581156 (print)
Multiprocessor SoCs (MPSoCs) are being designed today. MPSoC design can help achieve aggressive performance and low-power targets, but it creates new design challenges: How should the interconnect fabric and memory subsystem be designed to allow the massive data movement required in a multiprocessor SoC environment? How can HW and SW functionality be developed, debugged, and verified in an MPSoC design? Is MPSoC design an inflection point that will require new design methods, including ESL methodologies?
Network processors are designed to handle the inherently parallel nature of network processing applications. However, partitioning and scheduling of application tasks and data allocation to reduce memory contention remain major challenges in realizing the full performance potential of a given network processor. The large variety of processor architectures in use and the increasing complexity of network applications further aggravate the problem. This work proposes a novel framework, called FEADS, for automating the task of application partitioning and scheduling for network processors. FEADS uses simulated annealing to perform design space exploration of application mapping onto processor resources. Further, it uses cyclic and r-periodic scheduling to achieve higher-throughput schedules. To evaluate dynamic performance metrics such as throughput and resource utilization under realistic workloads, FEADS automatically generates a Petri net (PN) that models the application, architectural resources, mapping, and the constructed schedule, together with their interaction. The throughput obtained by schedules constructed by FEADS is comparable to that obtained by manual scheduling for linear task flow graphs; for more complicated task graphs, FEADS' schedules have a throughput up to 2.5 times higher than the manual schedules. Further, static scheduling of tasks increases throughput by up to 30% compared to an implementation of the same mapping without task scheduling.
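As a rough illustration of simulated annealing applied to task-to-processor mapping, the sketch below searches for a mapping that minimizes the load of the busiest processor. It is not the FEADS implementation: the cost function (makespan of independent tasks), the neighbour move, the cooling schedule, and all names (`makespan`, `anneal_mapping`, `costs`) are assumptions made for this example, whereas the abstract describes evaluation through a generated Petri net.

```python
import math
import random


def makespan(mapping, costs, n_procs):
    """Load of the busiest processor for a task -> processor mapping."""
    load = [0.0] * n_procs
    for task, proc in enumerate(mapping):
        load[proc] += costs[task]
    return max(load)


def anneal_mapping(costs, n_procs, steps=5000, t0=10.0, cooling=0.999, seed=0):
    """Simulated annealing over mappings; returns (best_mapping, best_cost)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    current = [rng.randrange(n_procs) for _ in costs]
    current_cost = makespan(current, costs, n_procs)
    best, best_cost = current[:], current_cost
    temp = t0
    for _ in range(steps):
        # Neighbour move (assumed): reassign one randomly chosen task.
        candidate = current[:]
        candidate[rng.randrange(len(costs))] = rng.randrange(n_procs)
        cand_cost = makespan(candidate, costs, n_procs)
        delta = cand_cost - current_cost
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical task costs, mapped onto two processors.
    mapping, cost = anneal_mapping([4, 3, 3, 2, 2, 2], n_procs=2)
    print(mapping, cost)
```

A real mapping tool would replace `makespan` with a throughput estimate that accounts for task dependencies and memory contention; the acceptance rule and cooling loop would stay the same.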
Although Grid computing has been the main theme of distributed computing research over the last few years, programming on the Grid is still very difficult for ordinary users. The POP-C++ programming system has been built to provide Grid programming facilities that greatly ease the development and deployment of parallel applications on the Grid. The original parallel object model used in POP-C++ combines powerful features of object-oriented programming with high-level distributed programming capabilities. The model is based on the simple idea that objects are suitable structures to encapsulate and distribute heterogeneous data and computing elements over the Grid. Programmers can guide the resource allocation for each object through high-level resource descriptions. The object creation process, supported by the POP-C++ runtime system, is transparent to programmers. Both inter-object and intra-object parallelism are supported through various method invocation semantics. The POP-C++ programming language extends C++ to support the parallel object model with just a few new keywords. In this paper, we present the Grid programming aspects of POP-C++. With POP-C++, writing a Grid-enabled application becomes as simple as writing a sequential C++ application. (C) 2006 Elsevier B.V. All rights reserved.
A method for modeling parallel machine scheduling problems with fuzzy parameters and precedence constraints based on the credibility measure is provided. For the given n jobs to be processed on m machines, it is assumed that the processing times and the due dates are nonnegative fuzzy numbers and all the weights are positive, crisp numbers. Based on the credibility measure, three parallel machine scheduling problems and a goal-programming model are formulated. Feasible schedules are evaluated not only by their objective values but also by the credibility degree of satisfaction with their precedence constraints. A genetic algorithm is used to find the best solutions in a short period of time. An illustrative numerical example is also given. Simulation results show that the proposed models are effective and can handle parallel machine scheduling problems with fuzzy parameters and precedence constraints based on the credibility measure.
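To make the credibility measure concrete, the sketch below computes Cr{ξ ≤ r} for a triangular fuzzy number, using the usual definition of credibility as the average of possibility and necessity. The triangular shape and the function name `credibility_leq` are assumptions for this example; the abstract only states that processing times and due dates are nonnegative fuzzy numbers.

```python
def credibility_leq(tri, r):
    """Cr{xi <= r} for a triangular fuzzy number tri = (a, b, c), a < b < c.

    Uses Cr = (Pos + Nec) / 2, which gives a piecewise-linear function
    rising from 0 at r = a, through 0.5 at r = b, to 1 at r = c.
    """
    a, b, c = tri
    if r <= a:
        return 0.0
    if r <= b:
        return (r - a) / (2.0 * (b - a))
    if r <= c:
        return (r + c - 2.0 * b) / (2.0 * (c - b))
    return 1.0


if __name__ == "__main__":
    # Hypothetical fuzzy processing time (2, 4, 7): credibility that the
    # job finishes within 5 time units is about 0.667.
    print(credibility_leq((2, 4, 7), 5))
```

A schedule evaluation in this style would compute such a credibility degree for each constraint and require it to exceed a chosen confidence level.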
In this paper, the author proposes viewing a program as a collection of "free" objects, each of which has its own thread of control and executes its operations within atomic transactions. Such objects would communicate by asynchronous message passing with futures: objects that encapsulate the results of server invocations and let clients retrieve them while hiding the actual state of readiness from the client. Except for the unleashing of newly created free objects, there's no need for specific language constructs other than those already available in OO languages. After all, C++ has been successful in part because it looked like C, Java has been successful in part because it looked like C++, and C# looks like both. With free objects, the program looks exactly like a traditional OO program, with one exception: certain object creations fork new activity threads and must be distinguished as such. This column discusses how to do that.
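The idea of an object with its own thread that answers asynchronous calls with futures can be sketched as follows. This is only an illustration in Python, not the construct proposed in the column: the `FreeObject` base class, its `send` and `stop` methods, and the `Counter` example are hypothetical names, and atomic transactions are not modeled (calls are simply served one at a time from a mailbox).

```python
import queue
import threading
from concurrent.futures import Future


class FreeObject:
    """Hypothetical base class: owns one thread and serves method calls
    from a mailbox, one at a time, in arrival order."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._mailbox.get()
            if item is None:  # shutdown sentinel
                break
            future, name, args = item
            try:
                future.set_result(getattr(self, name)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def send(self, name, *args):
        """Asynchronous invocation: returns immediately with a future."""
        future = Future()
        self._mailbox.put((future, name, args))
        return future

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()


class Counter(FreeObject):
    def __init__(self):
        self.value = 0          # initialise state before the thread starts
        super().__init__()

    def add(self, n):
        self.value += n
        return self.value


counter = Counter()
pending = [counter.send("add", n) for n in (1, 2, 3)]
totals = [f.result() for f in pending]  # blocks only if a result is not ready
counter.stop()
print(totals)  # [1, 3, 6]
```

The client code never checks whether a result is ready; `result()` hides that state, which is the role the abstract assigns to futures. Only the creation of `Counter` differs from an ordinary object creation, since it starts a new thread.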
ISBN: 9781595936059 (print)
This paper presents a framework for designing a shared-memory multiprocessor on a programmable platform. We propose a complete flow, composed of a programming model and a template architecture. Our framework allows a parallel application to be written using a shared-memory model. It maintains the consistency of shared data without a hardware coherence protocol, instead using a software model to properly synchronize the local copies with the shared memory image. This idea can be applied to either a scratchpad-based or a cache-based architecture. The architecture is synthesizable with standard IPs, such as the soft cores and interconnect elements that may be found in any commercial FPGA toolset.
ISBN: 9780972147903 (print)
An embedded parallel system for image processing is described in this *** system is used to process real-time *** system is composed of an embedded parallel computer and a set of software *** write image processing programs in a C-like *** to a programming model a set of software tools maps these programs to code that can run on the embedded parallel *** embedded parallel computer includes a host processor, a SIMD coprocessor, a stream memory, and a controller for the stream memory, together providing an embedded parallel system for image processing.