ISBN (print): 9781509030767
Quantum computing has attracted increasing attention in recent years because of rapid advances in quantum algorithms and quantum system design. Quantum algorithms are implemented with the help of quantum circuits. These circuits are inherently reversible and often contain a sizeable Boolean part that needs to be synthesized. The logic design of such quantum circuits is a non-trivial task and has therefore been heavily investigated by researchers in the recent past. This paper provides a brief overview of this research. We review the major steps in the logic design of quantum circuits and provide a sketch of each step. These descriptions are enriched with discussions as well as references to the respective related work.
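To illustrate why the Boolean part of a quantum circuit must be made reversible before synthesis, the following minimal sketch embeds an irreversible function f into the bijection (x, y) -> (x, y XOR f(x)). The helper names `embed_reversible` and `is_reversible` are hypothetical and chosen for illustration; this is a textbook embedding, not necessarily the flow reviewed in the paper.

```python
from itertools import product


def embed_reversible(f, n):
    """Build the truth table of (x, y) -> (x, y XOR f(x)) for an n-input f.

    One extra line y is added so that the overall mapping is a bijection
    even when f itself is not.
    """
    table = {}
    for bits in product((0, 1), repeat=n):
        for y in (0, 1):
            table[bits + (y,)] = bits + (y ^ f(*bits),)
    return table


def is_reversible(table):
    """A truth table is reversible iff no two inputs share an output."""
    return len(set(table.values())) == len(table)


def and_gate(a, b):
    return a & b


# Plain AND maps four inputs onto two outputs, so it is not reversible.
plain_and = {bits: (and_gate(*bits),) for bits in product((0, 1), repeat=2)}

# Embedding AND with one extra line yields the Toffoli gate.
toffoli = embed_reversible(and_gate, 2)
```

With the extra line initialised to 0, the last output bit of `toffoli` carries the AND of the two inputs, and applying the table twice restores the original input.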
We present a novel approach that assists the task of porting code to an embedded platform. Our tool automatically identifies code segments in the input program that can be replaced with optimized kernels from a platfo...
ISBN (print): 9781509030767
A new instruction scheduling algorithm for Transport Triggered Architecture (TTA) is introduced. The proposed algorithm is based on operation-based two-level list scheduling. It tries to aggressively bypass data moves before scheduling them and resolves deadlocks by backtracking and bypassing less aggressively those moves that cause the deadlocks. Compared to two earlier list schedulers for TTA processors, the proposed scheduler creates code that is on average 2.0 % and 2.2 % faster (best case 15.2 % and 16.3 %), while reducing register file reads by on average 9.7 % and 6.9 % (best case 31.0 % and 19.0 %) and register file writes by on average 18.0 % and 18.9 % (best case 48.1 % and 36.8 %). Unlike some instruction schedulers based on mathematical models, the proposed scheduler is fast enough to be used in design space exploration. The scheduler also introduces a framework that makes it easy to extend with new optimizations.
ISBN (print): 9781509030767
Spin-based memory devices offer multiple benefits, such as zero standby power, fast operation speed and high write endurance. Besides storage applications, Spin Torque Transfer (STT)-based Magnetic Tunnel Junctions (MTJs) and Racetrack Memories (RMs) are also being investigated for logic operations, especially in the context of in-memory computing and neuromorphic architectures. In this paper, we propose a spin-based encoder/decoder design that can be used to reduce the power consumption of interconnect architectures in digital systems. Encoding schemes provide a useful way to reduce the power consumption of interconnects and buses. Realizing these schemes requires encoders and decoders. Our proposed design can be reconfigured to implement different encoding schemes. The same design can also be reconfigured to function either as an encoder or as a decoder. Our simulations show a 49.65% reduction in dynamic power consumption for the proposed 6-bit reconfigurable design when compared with a CMOS-only implementation. Compared to the CMOS implementations, the RM-based non-reconfigurable designs show improvements of up to 1.8X and 17.34X in leakage and dynamic power, respectively.
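The abstract does not name the encoding schemes it targets. As one well-known example of a low-power bus code, the sketch below shows a simplified bus-invert scheme: when more than half of the data lines would toggle, the inverted word is sent together with an extra invert flag. The function names and the all-zero initial bus state are assumptions made for illustration only.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Encode words as (sent_value, invert_flag) pairs.

    Simplified variant: the decision compares data lines only and
    assumes the bus starts at all zeros.
    """
    mask = (1 << width) - 1
    prev = 0
    encoded = []
    for word in words:
        if hamming(prev, word) > width / 2:
            sent, flag = (~word) & mask, 1
        else:
            sent, flag = word, 0
        encoded.append((sent, flag))
        prev = sent
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words by undoing the flagged inversions."""
    mask = (1 << width) - 1
    return [(~sent) & mask if flag else sent for sent, flag in encoded]


def transitions(values):
    """Total number of line toggles when values are driven in sequence."""
    total, prev = 0, 0
    for value in values:
        total += hamming(prev, value)
        prev = value
    return total
```

Dynamic power on a bus grows with the number of line toggles, so comparing `transitions` for the raw words against the encoded words (including the invert line) gives a rough indication of the saving for a given data stream.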
ISBN (print): 9781509030767
In this paper, an extension of the OVP-based MPSoC simulator MPSoCSim is presented. MPSoCSim is itself an extension of the OVP simulator with a SystemC Network-on-Chip (NoC), allowing the modeling and evaluation of NoC-based Multiprocessor Systems-on-Chip (MPSoCs). In the proposed version, the extended simulator enables the modeling and evaluation of complex clustered MPSoCs and many-cores. The clusters are composed of several independent subgroups. Each subgroup includes an OVP processor connected by a local bus to its own local memory for code, stack and heap. Because the subgroups are independent, the OVP processor model of one subgroup can differ from that of the others (ARM, MicroBlaze, MIPS, ...), allowing the simulation of heterogeneous platforms. Each processor also executes its own code. Subgroups are connected to each other through a shared bus that allows all subgroups in the cluster to access a shared memory. Finally, clusters are connected through a SystemC NoC supporting a mesh topology with wormhole switching and different routing algorithms. The NoC is scalable and the number of subgroups in each cluster is parameterizable. For dynamic execution, the OVP processor models support different operating systems (OS). Mechanisms are also available to control the dynamic execution of applications on the platform. Different platforms and applications have been evaluated in terms of simulated execution time, simulation time on the host machine and number of simulated instructions.
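The abstract mentions a mesh NoC with several routing algorithms but does not list them. As an illustration of what a mesh routing algorithm computes, here is a minimal sketch of dimension-ordered (XY) routing, a common deterministic choice for meshes. The function name `xy_route` and the coordinate convention are assumptions, not part of MPSoCSim.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst on a 2D mesh.

    Coordinates are (x, y) tuples. The packet first travels along the
    X dimension until the column matches, then along Y.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def hop_count(src, dst):
    """Number of links traversed by the XY route."""
    return len(xy_route(src, dst)) - 1
```

Because every packet finishes its X movement before turning into Y, the route length always equals the Manhattan distance between source and destination.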
ISBN (print): 9781509030767
Software defines the functionality of today's Cyber-Physical Systems (CPS). Many product innovations are based on software, and thus the complexity of software, even when running on platforms equipped with small microprocessors, is increasing dramatically. This calls for adequate embedded software integration testing, even before the actual hardware platform is available. Virtual platforms for functional validation, which allow simulating a CPS running real target platform application code on a generic host computer, are currently being adopted by the industry. Since the correct behavior of a CPS depends not only on the correctness of computation but also on its timeliness, virtual platforms contain a certain notion of time. This work focuses on enhancing OVP processor models with a quasi-cycle-accurate timing model. This paper demonstrates and evaluates the accuracy of the proposed timing model against real hardware measurements for the Xilinx MicroBlaze and ARM Cortex-M0 processors. Results show a mean error of 0.16% for the MicroBlaze and 0.72% for the ARM Cortex-M0 processor over all considered benchmarks, which is a clear improvement compared to previously published work.
Networked Control Systems (NCS) form a key element of Cyber-Physical Systems (CPS) in which sensors, actuators (plants) and controller functions are spatially distributed and interconnected by communication networks. In this contribution, models are developed in which significant properties of communication networks, such as stochastic delays, error control protocols and shared-network influences, are embedded in a closed-loop control system. The resulting integrated systems are analyzed with classical system-theoretic and discrete-time state analysis methods as well as computer simulations. Fundamental insights into the properties of NCSs can already be gained through system-theoretic studies of basic architectures, which provide closed-form expressions. These are verified and extended by more detailed studies based on tool-supported discrete-time system state analysis and simulation results. The approach allows a more detailed detection of network protocol impacts on, e.g., the real-time behavior of NCSs when certain Service Level Agreements (SLA) have to be guaranteed as prescribed percentiles of control reaction times.
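To make the notion of an SLA expressed as a percentile of control reaction times concrete, the following minimal sketch samples reaction times under a stochastic delay with timeout-based retransmission and checks a percentile bound. The delay model (uniform jitter, independent losses) and all names and parameters are hypothetical simplifications, not the models developed in the paper.

```python
import math
import random


def reaction_time(rng, loss_prob, base_delay, jitter, timeout):
    """Sample one reaction time.

    Each lost packet costs one `timeout` before it is retransmitted;
    the packet that gets through adds `base_delay` plus uniform jitter.
    """
    total = 0.0
    while rng.random() < loss_prob:
        total += timeout
    return total + base_delay + rng.uniform(0.0, jitter)


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


def meets_sla(samples, q, bound):
    """True if the q-th percentile of the samples is within the bound."""
    return percentile(samples, q) <= bound


rng = random.Random(0)  # fixed seed so that runs are repeatable
samples = [reaction_time(rng, 0.1, 2.0, 1.0, 5.0) for _ in range(10_000)]
```

Calling `meets_sla(samples, 99, bound)` for different values of `loss_prob` or `timeout` shows how protocol parameters shift the tail of the reaction-time distribution, which is the quantity the SLA constrains.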
ISBN (print): 9781509030767
Simulation tools are indispensable to computer architects. Detailed execution-driven CPU models offer high accuracy, but at the cost of simulation speed. Trace-driven simulation is widely adopted to alleviate this problem, especially for studies focusing on memory-system exploration. Ideally, trace-driven core models will mimic out-of-order processors executing full-system workloads to enable computer architects to evaluate modern systems. Additionally, to be useful to the broader community, the tracing and replay models should be publicly available. However, existing trace-driven approaches are limited in their applicability and availability. We propose elastic traces, in which we accurately capture data and load/store order dependencies by instrumenting a detailed out-of-order processor model. In contrast to existing work, we do not rely on offline analysis of timestamps, and instead use accurate dependency information tracked inside the processor pipeline. We thereby account for the effects of speculation and branch misprediction, resulting in a more accurate trace playback. We provide a trace player that honours the dependencies and thus adapts its execution time to memory-system changes, as would the actual CPU. Compared to the detailed CPU, our trace player achieves a speed-up of 6-8 times. When modifying the memory-system parameters, the average error in absolute execution time is 7% for SPEC 2006 benchmarks on a bare-metal system and 17% for HPC benchmarks on Linux. Relative performance is predicted with less than 3% error, achieving fast and accurate system performance exploration. We make this functionality available to the broader community via a widely used open-source full-system simulator.
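The idea of a trace player that honours dependencies and therefore stretches or shrinks with memory latency can be sketched in a few lines. The record layout, the `replay` function and the single fixed memory latency below are hypothetical simplifications; the sketch ignores pipeline resource limits and is not the trace format or player described in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str                 # "comp", "load" or "store"
    cycles: int = 0           # fixed duration, used for "comp" only
    deps: list = field(default_factory=list)  # indices of earlier records


def replay(trace, mem_latency):
    """Return the total execution time of a dependency-annotated trace.

    A record starts once all records it depends on have finished.
    Memory records take `mem_latency` cycles, compute records take
    their recorded duration.
    """
    finish = []
    for rec in trace:
        start = max((finish[d] for d in rec.deps), default=0)
        latency = rec.cycles if rec.kind == "comp" else mem_latency
        finish.append(start + latency)
    return max(finish, default=0)


trace = [
    Record("load"),                  # 0: independent load
    Record("comp", 2, [0]),          # 1: uses the value loaded by 0
    Record("load"),                  # 2: independent load, overlaps with 0
    Record("store", deps=[1, 2]),    # 3: waits for both 1 and 2
]
```

Replaying the same trace with `mem_latency=10` and `mem_latency=4` yields different totals without re-recording anything, which is the sense in which a dependency-based trace is elastic where a timestamp-based trace is not.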
Image processing algorithms applied on programmable embedded systems very often do not meet the given constraints in terms of real-time capability. Mapping these algorithms to reconfigurable hardware solves this issue...
ISBN (print): 9781450344838
This paper presents a fast and cycle-accurate simulation environment for early power-performance analysis of multi-threaded applications targeted at symmetric multiprocessing embedded architectures. Our simulation environment leverages the hybrid prototyping technique, in which a lightweight emulation kernel performs logical simulation of multiple identical cores on top of a single physical instance of a core. The technique does not require a detailed timing model of the core hardware because the application threads execute directly on the target core. Previous work on hybrid prototyping supported the modeling of statically scheduled threads only, which severely limited its modeling capabilities. In this work, we describe the modeling of a dynamic RTOS scheduler as well as hardware interrupts on top of the emulation kernel, in order to support the simulation of unmodified multi-threaded applications. Our experimental results demonstrate the high accuracy, simulation speed and scalability of our hybrid prototyping-based simulation models.