ISBN: 9781509030767 (print)
A hybrid approach for mapping applications represented as Directed Acyclic Graphs (DAGs) is introduced in this work. It combines the Benders decomposition principle, which integrates Integer Linear Programming (ILP) and Constraint Programming (CP) methods, with a pure ILP model to find optimal solutions. The cuts generated during the iterative Benders solution process are later exploited by the ILP solver to prune the remaining search space. The proposed model provides the optimal solution in cases where either method alone fails to do so, and it also reduces the total solution time.
ISBN: 9783319675848 (print)
The proceedings contain 82 papers. The special focus in this conference is on Ubiquitous Computing and Ambient Intelligence. The topics include: multi-layer security mechanism for networked embedded devices; smart cities in Latin America; combining fog architectures and distributed event-based systems for mobile sensor location certification; IoT service recommendation strategy based on attribute relevance; methodology for analyzing the travel time variability in public road transport; user-centered design of agriculture automation systems using internet of things paradigm; study of dynamic factors in indoor positioning for harsh environments; a secure, out-of-band, mechanism to manage internet of things devices; secure system communication to emergencies for victims management through identity based signcryption scheme; prosumerization approach to semantic ambient intelligence platforms; modeling the origin-destination matrix with incomplete information; decision-making intelligent system for passenger of urban transports; improving tourist experience through an IoT application based on fatbeacons; protecting Industry 4.0 systems against the malicious effects of cyber-physical attacks; fuzzy-based approach of concept alignment; a location-based service to support collaboration and strategic control in a real estate broker; a proposal for a distributed computational framework in IoT context; data structures modelling for citizen tracking based applications in smart cities; system model for a continuous improvement of road mass transit; analysis of distance and similarity metrics in indoor positioning based on Bluetooth Low Energy; usability and acceptance of a mobile and cloud-based platform for supporting diabetes self-management; classification of pathologies using a vision based feature extraction.
ISBN: 9781509030767 (print)
Deep neural network algorithms show very high performance; however, their increased arithmetic and memory access demands hinder their adoption in embedded systems. This paper explores a programmable neural network processing architecture that can efficiently execute feed-forward, recurrent, and convolutional deep neural networks. The neural network algorithms are transformed to matrix-vector multiplication operations, which are then executed using a very wide SIMD (Single Instruction Multiple Data) functional unit. In particular, functional and data-level parallelism are compared in this architecture exploration, and auxiliary hardware support for data rearrangement is added. The simulation results show that the architecture with a 128-wide SIMD functional unit can execute deep neural network algorithms for voice command, gesture, and handwritten digit recognition in real time.
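As a rough illustration of the matrix-vector formulation mentioned in this abstract (not the paper's architecture), the sketch below computes one fully connected layer as a matrix-vector product, processing output rows in chunks of a hypothetical `lanes` width to mimic a wide SIMD unit. The names `dense_layer` and `lanes` and the choice of a ReLU activation are assumptions made for illustration only.

```python
def dense_layer(weights, bias, x, lanes=4):
    """Compute relu(W @ x + b), handling `lanes` output rows per step.

    `lanes` is a hypothetical stand-in for the SIMD width; the abstract
    reports a 128-wide unit, but nothing here models real hardware.
    """
    out = []
    for start in range(0, len(weights), lanes):
        chunk = weights[start:start + lanes]
        # Each lane starts from its bias and accumulates one dot product.
        acc = list(bias[start:start + lanes])
        for j, xj in enumerate(x):
            # One input element is broadcast to every lane in the chunk.
            for lane, row in enumerate(chunk):
                acc[lane] += row[j] * xj
        # ReLU activation (an assumption; the abstract names no activation).
        out.extend(max(0.0, a) for a in acc)
    return out


W = [[1.0, 0.0], [0.0, -1.0], [2.0, 1.0]]
b = [0.0, 0.5, -1.0]
y = dense_layer(W, b, [1.0, 2.0])
```

The result does not depend on `lanes`; the parameter only changes how many rows are grouped per step, which is the property a data-parallel unit would rely on.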
ISBN: 9781509030767 (print)
Next generation deep neural networks for classification hosted on embedded platforms will rely on fast, efficient, and accurate learning algorithms. Initialization of weights in learning networks has a great impact on the classification accuracy. In this paper we focus on deriving good initial weights by modeling the error function of a deep neural network as a high-dimensional landscape. We observe that due to the inherent complexity in its algebraic structure, such an error function may conform to general results of the statistics of large systems. To this end we apply some results from Random Matrix Theory to analyse these functions. We model the error function in terms of a Hamiltonian in N-dimensions and derive some theoretical results about its general behavior. These results are further used to make better initial guesses of weights for the learning algorithm.
ISBN: 9781509030767 (print)
For decades computer architects pursued one primary goal: performance. Transistor scaling has translated into remarkable gains in operating frequency and reduction in power consumption. However, increased complexity from the device to architecture levels imposes several new challenges, including a decrease in dependability/reliability due to physical failures. Reconfigurable platforms are highly susceptible to scaling-related complexity, typically resulting in higher power consumption as compared to application-specific integrated circuits. The concern becomes far more important in the 3-D integrated circuit (IC) domain, as vertically stacked blocks exhibit increased thermal resistance to the heat sink. The degradation in dependability becomes an important design challenge, not only for safety-critical systems, but for the majority of architectures. In this paper, a framework for exploring alternative fault-tolerant schemes is proposed that masks the degradation in reliability for 3-D FPGA platforms. Simulation results at the RTL level highlight the benefits of the introduced solution, as the maximum operating frequency and power consumption are improved by 33% and 26%, respectively, as compared to similar state-of-the-art solutions.
ISBN: 9781509030767 (print)
Microservers have recently gained attention as low-cost, low-power, reduced-footprint servers that are mainly based on energy-efficient processors such as the ones used in embedded systems. Microservers based on low-power embedded processors mainly target lightweight applications or parallel applications that benefit most from individual low-power servers with sufficient I/O between nodes rather than high-performance processors. In this paper we evaluate the mapping of Apache Spark on low-power SoC-based processors. Apache Spark is one of the most widely used frameworks in cloud computing for batch and streaming data analytics. We evaluate the energy efficiency of low-power SoCs that are used in embedded systems for the execution of several Spark applications. The performance evaluation shows that low-power SoCs have the potential to offer up to 3x higher energy efficiency compared to high-performance processors typically used in data centers.
ISBN: 9781509030767 (print)
The design of an adequate memory subsystem is critical for achieving high performance in application-specific and data plane processors. Applications running on such processors must fully exploit the memory hierarchy, so that the gains achieved through hardware optimization are not invalidated by software inefficiency. This paper presents a framework that vertically integrates the process of memory subsystem design with application optimization and algorithmic exploration. Based on an abstract model, the framework is able to predict the impact that a customization in the memory hierarchy will have on performance, while applying optimization techniques to utilize it efficiently. To perform these processes, the tool flow relies on source-level information and a novel data model, avoiding the need for expensive cycle-accurate simulations. Throughout the evaluation, we show that the framework is capable of predicting memory-related performance metrics with an accuracy of ±10% when compared to simulation. Furthermore, we show that the approach is significantly more efficient than simulation, and can lead to gains in designer productivity of up to a factor of 40x.
ISBN: 9781509030767 (print)
Design space exploration (DSE) at system level needs to cover all parameters and has to find the best trade-off between performance and power of modern heterogeneous multi- and many-processor SoCs (MPSoC). Modelling virtual platforms with SystemC TLM offers fast HW and SW co-design using the loosely-timed (LT) coding style. However, simulations at this high abstraction level lack the capability of providing power estimates in case no insight into the models of the virtual platform is possible. This paper extends a well-proven black box power estimation methodology. The proposed method is capable of estimating the power with high accuracy using fast LT modelling. Two case studies reveal average estimation errors of just 5.1% and 3.5% for the ARM Cortex-A9 on the PandaBoard and the Blackfin 609 DSP on the FinBoard, respectively.
ISBN: 9781509030767 (print)
Quantum computing has been attracting increasing attention in recent years because of the rapid advancements that have been made in quantum algorithms and quantum system design. Quantum algorithms are implemented with the help of quantum circuits. These circuits are inherently reversible in nature and often contain a sizeable Boolean part that needs to be synthesized. The logic design of such quantum circuits constitutes a non-trivial task and, hence, has been heavily investigated by researchers in the recent past. This paper provides a brief overview of this research. We review the major steps to be conducted in the logic design of quantum circuits and provide a sketch for each single step. These descriptions are enriched with discussions as well as references to the respective related work.
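The reversibility property mentioned in this abstract can be made concrete with a small sketch (an illustration only, not taken from the paper). It checks that the Toffoli gate, a standard reversible gate, maps the eight 3-bit inputs to eight distinct outputs and is its own inverse. The function and variable names are illustrative.

```python
from itertools import product


def toffoli(a, b, c):
    """Flip the target bit c only when both control bits a and b are 1."""
    return a, b, c ^ (a & b)


# All eight 3-bit input states, in lexicographic order.
states = list(product((0, 1), repeat=3))
images = [toffoli(*s) for s in states]

# Reversible means the mapping is a bijection: the outputs are a
# permutation of the inputs, so no information is lost.
is_bijection = sorted(images) == states

# Applying the gate twice restores every input.
is_self_inverse = all(toffoli(*toffoli(*s)) == s for s in states)
```

By contrast, an ordinary AND gate maps four inputs onto two outputs, so it cannot be inverted without keeping extra bits, which is one reason Boolean parts need dedicated synthesis for reversible circuits.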
We present a novel approach that assists the task of porting code to an embedded platform. Our tool automatically identifies code segments in the input program that can be replaced with optimized kernels from a platfo...