ISBN: (print) 9781450364942
The globalization of the IC supply chain has brought forth the era of fabless companies. Security issues during the design and fabrication processes have raised various concerns, ranging from IP piracy and reverse engineering to hardware Trojans. Logic encryption has emerged as a mitigation against these threats. However, no generic metric for quantifying the security of logic encryption algorithms has been reported so far, making it impossible to formally compare different approaches. In this paper, we propose a unifying metric that captures the key security aspects of logic encryption algorithms. The metric is evaluated on state-of-the-art algorithms and benchmarks.
ISBN: (print) 9781538634370
Deep neural networks have been widely applied in many areas, such as computer vision, natural language processing, and information retrieval. However, due to their high computation and memory demands, deep learning applications have not been adopted in edge learning. In this paper, we exploit the sparsity in tensors to reduce the computation overheads and memory demands. Unlike other approaches, which rely on hardware accelerator designs or sacrifice model accuracy for performance by pruning parameters, we adaptively partition and deploy the workload to heterogeneous devices to reduce computation and memory requirements and increase computing efficiency. We implemented our partitioning algorithms in Google's TensorFlow and evaluated them on an AMD Kaveri system, which is an HSA-based heterogeneous computing system. Our method effectively reduces the computation time, cache accesses, and cache miss rates without impacting the accuracy of the learning models. Our approach achieves 66% and 88% speedup for the lenet-5 model and the lenet-1024-1024 model, respectively. To reduce memory traffic, our approach cuts instruction cache references by 71% and data cache references by 32%. Our system also improves the cache miss rate from 1.6% to 0.5% during the training of the lenet-1024-1024 model.
ISBN: (print) 9781467373111
Processor models for electronic system level (ESL) simulations are usually provided by their vendors as binary object code. Those binaries appear as black boxes whose internals cannot be observed. This prevents the application of most existing ESL power estimation methodologies. To remedy this situation, this work presents an estimation methodology for the case of black box models. The evaluation for the ARM Cortex-A9 processor shows that the proposed approach achieves high accuracy. In comparison to hardware power measurements obtained from the OMAP4460 chip on the PandaBoard, the ESL estimation error is below 5%.
ISBN: (print) 1424401550
In this paper, a new methodology is presented for topology optimization of networked embedded systems as they occur in automotive and avionic systems and partially in wireless sensor networks. By introducing a model that (1) is suitable for heterogeneous networks with different communication bandwidths, (2) captures routing restrictions, and (3) allows flexible binding of tasks onto processors, current design issues of networked embedded systems can be investigated. On the basis of this model, the presented methodology first allocates the required resources, which can be communication links as well as computational nodes, and then binds the functionality onto the nodes and the data dependencies onto the links such that no routing restrictions are violated and no capacities on communication links are exceeded. By applying Evolutionary Algorithms, we are able to consider multiple objectives simultaneously during the optimization process and allow for subsequent unbiased decision making. An experimental evaluation as well as a case study from the field of automotive electronics show the applicability of the presented approach.
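Considering multiple objectives simultaneously usually means keeping the set of non-dominated (Pareto-optimal) candidates rather than a single best one. The sketch below shows only that dominance filter, assuming every objective is minimized. The objective pairs (read them as cost and latency) are hypothetical, and the code is not the paper's Evolutionary Algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Assumes all objectives are to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the candidates that no other candidate dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, latency) values for five candidate topologies.
candidates = [(4, 9), (5, 7), (6, 8), (8, 3), (9, 5)]
front = pareto_front(candidates)
print(front)
```

Here (6, 8) is dominated by (5, 7) and (9, 5) by (8, 3), so three candidates remain. Choosing among them is the later decision step, which the filter itself does not bias.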
ISBN: (print) 9783540736226
The objective of this paper is to analyze how the memory management configuration in Linux influences the run-time performance of embedded systems. Extensive experiments confirm that the configuration of the memory management subsystem significantly affects the overall execution time, the memory performance, and the system call overhead. Our quantitative experimental results will help embedded systems designers to understand the effect of memory management configurations on the applications within a system, and contribute to the design of more efficient systems with an OS-level design space exploration.
ISBN: (print) 3540364102
This paper provides two contributions to the research on applying domain-specific modeling languages to distributed real-time embedded (DRE) systems. First, we present the ALDERIS platform-independent visual language for component-based system development. Second, we demonstrate the use of the ALDERIS language on a helicopter autopilot DRE design. The ALDERIS language is based on the concept of platform-based design, and explicitly captures asynchronous event-driven component interactions as well as the underlying platform for the computation. Unlike most modeling languages, ALDERIS has formally defined semantics providing a way for the formal verification of dense real-time properties and energy consumption.
ISBN: (print) 9781467322973; 9781467322966
Many modern computing systems deal with streams of data, which have to be processed in parallel in order to be handled in real time. This is in particular the case for some kinds of cyber-physical systems, which process data provided by physical devices. We consider here an approach to generate efficient hardware for a particular class of such systems, which relies upon the polyhedral model. Flexible parallel components, described in the ALPHA functional language, are modelled and assembled using a scheduling method that combines the synchronous data-flow principle of balance equations with the polyhedral scheduling technique. The modelling of flexible components relies on a simple, affine-periodic, delayable and stretchable time model, which allows a full system to be assembled and synthesized by combining the component hardware descriptions with automatically generated wrappers. We illustrate this method on a simplified WCDMA system and we discuss the relationship of this approach with stream languages, latency-insensitive design, and multidimensional data-flow systems.
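The synchronous data-flow balance equations mentioned above require, for every edge, that the tokens produced per schedule period equal the tokens consumed: q[src] * prod = q[dst] * cons. The following is a minimal sketch of solving them for the smallest integer repetition vector q. The three-actor graph and its rates are hypothetical, and the polyhedral scheduling part of the method is not modelled.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve q[src] * prod == q[dst] * cons for the smallest positive integers q.

    edges is a list of (src, dst, prod, cons) tuples. The graph is assumed to
    be connected; ValueError is raised if it is not or if rates are inconsistent.
    """
    # For each node, store neighbours with the ratio q[neighbour] / q[node].
    adj = {a: [] for a in actors}
    for src, dst, prod, cons in edges:
        adj[src].append((dst, Fraction(prod, cons)))
        adj[dst].append((src, Fraction(cons, prod)))

    # Propagate rational firing rates from an arbitrary start actor.
    q = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        node = stack.pop()
        for nxt, ratio in adj[node]:
            rate = q[node] * ratio
            if nxt not in q:
                q[nxt] = rate
                stack.append(nxt)
            elif q[nxt] != rate:
                raise ValueError("inconsistent rates: no periodic schedule exists")
    if len(q) != len(actors):
        raise ValueError("graph is not connected")

    # Scale the rational solution to the smallest integer solution.
    scale = reduce(lambda x, y: x * y // gcd(x, y), (f.denominator for f in q.values()))
    ints = {a: int(f * scale) for a, f in q.items()}
    common = reduce(gcd, ints.values())
    return {a: v // common for a, v in ints.items()}


# Hypothetical chain: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
reps = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
print(reps)
```

With these rates, A fires three times, B twice, and C once per period, so every edge is balanced (3 * 2 = 2 * 3 and 2 * 1 = 1 * 2).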
ISBN: (print) 9781509030767
Next-generation deep neural networks for classification hosted on embedded platforms will rely on fast, efficient, and accurate learning algorithms. The initialization of weights in learning networks has a great impact on the classification accuracy. In this paper, we focus on deriving good initial weights by modeling the error function of a deep neural network as a high-dimensional landscape. We observe that, due to the inherent complexity of its algebraic structure, such an error function may conform to general results of the statistics of large systems. To this end, we apply some results from Random Matrix Theory to analyse these functions. We model the error function in terms of a Hamiltonian in N dimensions and derive some theoretical results about its general behavior. These results are further used to make better initial guesses of weights for the learning algorithm.
ISBN: (print) 9783540736226
This paper discusses efficiency measures for the evaluation of high-performance multimedia systems on a chip (SOC), considering a throughput rate R, chip size A, power dissipation P, and a flexibility criterion F. Based on the analysis of recently published multimedia chips, the paper shows equivalences between the ratio of R over AP, a weighted sum on 1/R, A, P, and a fuzzy multicriteria analysis on R, A, P. The paper identifies the fuzzy multicriteria analysis as a generalization of the other efficiency measures, which can easily be applied to multiple cost and performance criteria. Because it applies fuzzy set theory, the multicriteria approach supports quantitative criteria with a physical background as well as qualitative criteria expressed by linguistic variables.
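One way to see how a ratio and a weighted sum can rank designs identically is to take the sum over logarithms: with unit weights, log(1/R) + log A + log P equals -log(R / (A P)). The sketch below illustrates only that identity. The log-domain formulation, the weights, and the chip figures are assumptions for illustration, not the paper's definitions or data.

```python
import math


def ratio_efficiency(r, a, p):
    """Efficiency as throughput over the area-power product (higher is better)."""
    return r / (a * p)


def log_weighted_cost(r, a, p, w_r=1.0, w_a=1.0, w_p=1.0):
    """Weighted sum over log(1/R), log(A), log(P) (lower is better).

    The log-domain form and the default unit weights are assumptions.
    """
    return w_r * math.log(1.0 / r) + w_a * math.log(a) + w_p * math.log(p)


# Hypothetical chips: (throughput R, chip size A, power dissipation P).
chips = {
    "chip_x": (100.0, 50.0, 2.0),
    "chip_y": (80.0, 20.0, 1.5),
    "chip_z": (200.0, 120.0, 4.0),
}

# Rank best-first under each measure.
by_ratio = sorted(chips, key=lambda name: ratio_efficiency(*chips[name]), reverse=True)
by_cost = sorted(chips, key=lambda name: log_weighted_cost(*chips[name]))
print(by_ratio, by_cost)
```

With non-unit weights the two rankings can diverge, which is where a weighted or multicriteria formulation expresses preferences that a plain ratio cannot.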
ISBN: (print) 9781479901036
Recently introduced processors such as Tilera's Tile Gx100 and Intel's 48-core SCC have delivered on the promise of high performance per watt in manycore processors, making these architectures ostensibly as attractive for low-power embedded processors as for cloud services. However, these architectures space-multiplex the microarchitectural resources between many threads to increase utilization, which leads to potentially large and varying levels of interference. This decorrelates CPU-time from actual application progress and decreases the ability of traditional software to accurately track and finely control application progress, hindering the adoption of manycore processors in embedded computing. In this paper we propose Progress Time as the counterpart of CPU-time in space-multiplexed systems and show how it can be used to track application progress. We also introduce TimeCube, a manycore embedded processor that uses dynamic execution isolation and shadow performance modeling to provide an accurate online measurement of each application's Progress Time. Our evaluation shows that a 32-core TimeCube processor can track application progress with less than 1% error even in the presence of a 6x average worst-case slowdown. TimeCube also uses Progress Times to perform online architectural resource management that leads to a 36% improvement in throughput compared to existing microarchitectural resource allocation schemes. Overall, the results argue for adding the requisite microarchitectural structures to support Progress Time in manycore chips for embedded systems.