ISBN: 9781467322973; 9781467322966 (print)
Computing in cyber-physical systems has to be efficient with respect to several objectives; in particular, it has to be efficient in execution time and energy. In this paper, we consider optimization techniques aimed at these two objectives. In the first part, we consider techniques for integrating compilers with worst-case execution time (WCET) estimation and demonstrate how such integration opens the door to WCET-reduction algorithms. For example, an algorithm for WCET-aware compilation reduces the WCET of an automotive application by more than 50% by exploiting scratchpad memories (SPMs). In the second part, we demonstrate techniques for improving the energy efficiency of cyber-physical systems, in particular the use of SPMs. In the third part, we demonstrate how multiple objectives can be taken into account during optimization. This paper provides an overview of work performed at the Chair for Embedded Systems of TU Dortmund and the Informatik Centrum Dortmund, Germany.
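As a rough illustration of what SPM exploitation involves, choosing which memory objects to place in a small scratchpad can be modelled as a 0/1 knapsack problem. The sketch below is not the algorithm from the paper: the object names, sizes, and savings are hypothetical, and it assumes each object's saving is fixed and additive. A real WCET-aware allocation has to account for the worst-case path changing as objects are moved.

```python
def select_for_spm(objects, capacity):
    """Pick objects for the scratchpad with a 0/1 knapsack.

    objects:  list of (name, size_bytes, saving) tuples (hypothetical values)
    capacity: scratchpad size in bytes
    Returns (total_saving, tuple_of_chosen_names).
    """
    # best[c] holds the best (saving, chosen names) using at most c bytes.
    best = [(0, ())] * (capacity + 1)
    for name, size, saving in objects:
        # Iterate capacities downwards so each object is placed at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + saving
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + (name,))
    return best[capacity]


# Hypothetical functions and data of an application, with assumed savings.
objects = [
    ("fir_loop", 400, 900),
    ("crc_table", 512, 700),
    ("isr", 300, 650),
    ("init", 600, 50),
]
total_saving, chosen = select_for_spm(objects, capacity=1024)
print(total_saving, chosen)
```

With a 1024-byte scratchpad, the three most profitable objects do not fit together, so the selection has to trade size against saving.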
ISBN: 9781467322973; 9781467322966 (print)
Many modern computing systems deal with streams of data, which have to be processed in parallel in order to be handled in real time. This is particularly the case for some kinds of cyber-physical systems, which process data provided by physical devices. We consider an approach, relying on the polyhedral model, to generate efficient hardware for a particular class of such systems. Flexible parallel components, described in the ALPHA functional language, are modelled and assembled using a scheduling method that combines the synchronous data-flow principle of balance equations with the polyhedral scheduling technique. The modelling of flexible components relies on a simple, affine-periodic, delayable and stretchable time model, which allows a full system to be assembled and synthesized by combining the component hardware descriptions with automatically generated wrappers. We illustrate this method on a simplified WCDMA system and discuss the relationship of this approach to stream languages, latency-insensitive design, and multidimensional data-flow systems.
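The balance-equation principle mentioned above can be sketched in isolation. For every edge on which a producer emits p tokens per firing and a consumer reads c tokens per firing, the firing counts r must satisfy r[src] * p = r[dst] * c. The following minimal sketch solves these equations for a connected graph; the actor names and rates are hypothetical, and the polyhedral scheduling part of the method is not covered.

```python
from fractions import Fraction
from math import gcd, lcm  # multi-argument gcd/lcm need Python 3.9+


def repetition_vector(actors, edges):
    """Solve the SDF balance equations r[src] * p == r[dst] * c.

    actors: list of actor names (graph assumed connected)
    edges:  list of (src, produced, dst, consumed)
    Returns the smallest positive integer firing counts, or None if the
    rates are inconsistent (no bounded periodic schedule exists).
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative rates along edges
        changed = False
        for src, p, dst, c in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * p / c
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * c / p
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Every edge must balance, including edges that closed a cycle.
    if any(rate[s] * p != rate[d] * c for s, p, d, c in edges):
        return None
    scale = lcm(*(r.denominator for r in rate.values()))
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = gcd(*counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical chain: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
print(repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)]))
```

In the example, A must fire three times for every two firings of B and one firing of C, so that each buffer returns to its initial fill level after one period.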
ISBN: 9781467322973; 9781467322966 (print)
While advances in processor architecture continue to increase hardware parallelism, creating parallel software remains hard. There is an increasing need for tools and methodologies that narrow the entry gap for non-experts in parallel software development and streamline the work of experts. This paper presents the methodology and algorithms for creating parallel software from Scilab source code for multicore embedded processors in the context of the "Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb" (ALMA) EU FP7 project. In a nutshell, the ALMA parallelization approach attempts to manage the complexity of the task by alternating between very localized and holistic views of program optimization.
ISBN: 9781467322973; 9781467322966 (print)
This paper introduces a methodology for prototyping forward error correction (FEC) architectures, oriented to system verification and characterization. A complete design flow is described, which satisfies the requirements of error-free hardware design and accelerated FEC simulation. FPGA devices give the designer the ability to observe rare events, owing to the tremendous speed-up of FEC operations. A Matlab-based system assists in investigating the impact of very rare decoding failure events on FEC system performance and in finding solutions aimed at parameter optimization and BER performance improvement of LDPC codes in the error floor region. Furthermore, the development of an embedded system that offers remote access to the system under test and automates the verification process is explored. The prototyping approach presented here exploits the high processing speed of FPGA-based emulators and the observability and usability of software-based models.
Smart cameras allow pre-processing of video data on the camera instead of sending it to a remote server for further analysis. Having a network of smart cameras allows various vision tasks to be processed in a distributed fashion. While cameras may have different tasks, we concentrate on distributed tracking in smart camera networks. This application introduces several highly interesting problems. Firstly, how can conflicting goals be satisfied, for example when cameras in the network try to track objects while also trying to keep communication overhead low? Secondly, how can cameras in the network self-adapt in response to the behavior of objects and changes in scenarios, to ensure continued efficient performance? Thirdly, how can cameras organise themselves to improve the overall network's performance and efficiency? This paper presents a simulation environment, called CamSim, which allows distributed self-adaptation and self-organisation algorithms to be tested without setting up a physical smart camera network. The simulation tool is written in Java and is therefore highly portable across operating systems. Relaxing various problems of computer vision and network communication enables a focus on implementing and testing new self-adaptation and self-organisation algorithms for cameras to use.
Epileptic detection techniques rely heavily on electroencephalography (EEG) as a representative signal carrying valuable information about the current brain state. In this work, we investigate the stability of time-domain EEG features under varying channel conditions. We identify the feature sets that provide the most robust EEG classification accuracy. Moreover, we propose an embedded Compressive Sensing (CS)-based EEG encoding system whose complexity is adapted to the channel condition. We also propose a framework called Classification Accuracy-Compression Ratio-Signal to Noise Ratio (CA-CR-SNR) that adapts the compression ratio according to the channel condition. Simulation results show that selecting appropriate EEG feature combinations can partly overcome the impact of bad channel conditions; however, this simple solution is still inadequate. The proposed adaptive algorithm reconfigures the compression ratio based on a channel feedback signal to further improve the classification accuracy.
ISBN: 9781467322973; 9781467322966 (print)
FPGA-based prototyping is nowadays common practice in the functional verification of hardware components, since it allows a large number of test cases to be covered in a shorter time than HDL simulation. In addition, an FPGA-based emulator significantly accelerates simulation with respect to bit-true software models. This speed-up is crucial when the statistical properties of a system have to be analyzed by Monte Carlo techniques. In this paper, we consider a multiple-input multiple-output (MIMO) wireless communication system and show how integrating an FPGA accelerator into the software simulation framework is key to enabling the development of complex hardware components in the receiver, from algorithm all the way to chip testing. In particular, we focus on a MIMO detector implementation based on the depth-first sphere decoding algorithm. The speed-up of up to three orders of magnitude achieved by hardware-accelerated simulation over a pure software testbed enables an extensive fixed-point exploration. Furthermore, it allows a unique characterization of the system's communication performance and of the MIMO detector's run-time characteristics, which vary across configuration parameters and operating scenarios and hence require thorough investigation.
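To see why the speed-up matters for Monte Carlo analysis, consider how many bits must be simulated before a given bit error rate (BER) can be estimated at all. A common rule of thumb is to wait for roughly 100 bit errors, which gives a relative standard error of about 10% for independent errors. The sketch below applies that rule; the target BER and both throughput figures are hypothetical and are chosen only to reflect a speed-up of three orders of magnitude.

```python
def bits_required(target_ber, min_errors=100):
    """Bits to simulate so that about `min_errors` errors are expected."""
    return round(min_errors / target_ber)


def simulation_hours(target_ber, throughput_bps, min_errors=100):
    """Wall-clock hours needed at a given simulation throughput (bits/s)."""
    return bits_required(target_ber, min_errors) / throughput_bps / 3600.0


# Hypothetical throughputs: a software model versus an FPGA-accelerated one.
software_bps = 1e5
fpga_bps = 1e8
target_ber = 1e-6

print(bits_required(target_ber))
print(simulation_hours(target_ber, software_bps))
print(simulation_hours(target_ber, fpga_bps))
```

Because the required number of bits grows inversely with the target BER, each further decade of BER multiplies the run time by ten, which is what makes low error rates impractical to characterize in software alone.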
ISBN: 9781467322973; 9781467322966 (print)
Systems with tightly interacting computational (cyber) units and physical systems are generally referred to as cyber-physical systems. They involve an interplay between embedded systems, control theory, real-time systems and software engineering. A very good example of cyber-physical systems design arises in the context of automotive architectures and software. Modern high-end cars have 50-100 processors or electronic control units (ECUs) that communicate over a network of buses such as CAN and FlexRay. In such complex settings, traditional control-theoretic approaches, in which control engineers are concerned only with high-level plant and controller models, start breaking down. This is because implementation-level realities such as message delay, jitter, and task execution times are not adequately considered when designing the controller. Hence, it is becoming necessary to adopt a more holistic, cyber-physical systems design approach in which the semantic gap between high-level control models and their actual implementations on multiprocessor automotive platforms is quantified and consciously closed. In this paper, we give several examples of how this may be done and outline the current research challenges in this area faced by academia and industry.
ISBN: 9781467322973; 9781467322966 (print)
Due to the energy efficiency requirements of modern embedded systems, chip vendors are inclined towards multicore architectures with different types of processing engines and non-uniform interconnect fabrics. At the same time, multiple applications are intended to run concurrently on devices with such heterogeneous architectures. This rapid growth in the complexity of the hardware and its use cases imposes new challenges on software development tools. To overcome this complexity, approaches based on models of computation are becoming increasingly promising. Synchronous Data Flow (SDF) is a popular specification formalism for streaming applications, which are inherently concurrent. However, the parallelism expressed in the original representation is often not sufficient to fully exploit the potential of multicore platforms. In this paper, we present a holistic methodology for improving the throughput of streaming applications while mapping them onto heterogeneous architectures. The approach uses transformations that adapt the parallelism in SDF to the available platform resources. We use a genetic algorithm to explore SDF instances with the objective of maximizing throughput on a target platform. Our model supports architecture heterogeneity and multi-application scenarios. The experiments indicate that our approach outperforms other techniques for exploiting parallelism in a single application in most of the test cases and enables the optimization of concurrent applications.
ISBN: 9781467322973 (print)
In recent years, research and industry have introduced several parallel programming models to simplify the development of parallel applications. A popular class among these is task-based programming models, which promise ease of use, portability, and high performance. A novel model in this class, OpenMP Superscalar (OmpSs), combines advanced features such as automated runtime dependency resolution with simple pragma-based programming for C/C++. OmpSs has proven effective in leveraging parallelism in HPC workloads. Embedded and consumer applications, however, are currently still mainly parallelized using traditional thread-based programming models. In this work, we investigate how effective OmpSs is for embedded and consumer applications in terms of usability and performance. To determine the usability of OmpSs, we show in detail how to implement complex parallelization strategies such as those used in parallel H.264 decoding. To evaluate the performance, we created a collection of ten embedded and consumer benchmarks parallelized in both OmpSs and Pthreads.
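The idea behind automated runtime dependency resolution can be modelled in a few lines: each task declares the data it reads and writes, and the runtime derives the ordering constraints from those declarations. The sketch below is a Python model of that idea only, not OmpSs syntax or its runtime; the task and data names are hypothetical.

```python
def build_dependencies(tasks):
    """Derive task dependencies from declared inputs and outputs.

    tasks: list of (name, inputs, outputs) in program order.
    Returns {name: set of earlier tasks that must finish first}.
    """
    last_writer = {}          # data item -> task that last wrote it
    readers_since_write = {}  # data item -> tasks that read it since then
    deps = {}
    for name, inputs, outputs in tasks:
        wait = set()
        for item in inputs:
            if item in last_writer:
                wait.add(last_writer[item])                   # read-after-write
        for item in outputs:
            if item in last_writer:
                wait.add(last_writer[item])                   # write-after-write
            wait.update(readers_since_write.get(item, ()))   # write-after-read
        deps[name] = wait
        for item in inputs:
            readers_since_write.setdefault(item, set()).add(name)
        for item in outputs:
            last_writer[item] = name
            readers_since_write[item] = set()
    return deps


def schedule_waves(deps):
    """Group tasks into waves; tasks within one wave may run in parallel."""
    done, waves = set(), []
    while len(done) < len(deps):
        ready = sorted(t for t in deps if t not in done and deps[t] <= done)
        waves.append(ready)
        done.update(ready)
    return waves


# Hypothetical decoder-like pipeline.
tasks = [
    ("parse", [], ["header"]),
    ("decode_a", ["header"], ["block_a"]),
    ("decode_b", ["header"], ["block_b"]),
    ("filter", ["block_a", "block_b"], ["frame"]),
]
print(schedule_waves(build_dependencies(tasks)))
```

The two decode tasks share only a read of the same input, so they land in the same wave and could run concurrently, while the filter task waits for both.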