Authors: Svore, KM; Aho, AV; Cross, AW; Chuang, I; Markov, IL
Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
MIT, Media Lab, Ctr Bits & Atoms, Quanta Grp, Cambridge, MA 02139 USA
MIT, Dept Phys, Cambridge, MA 02139 USA
Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
Despite convincing laboratory demonstrations of quantum information processing, it remains difficult to scale because it relies on inherently noisy components. Adequate use of quantum error correction and fault tolerance theoretically should enable much better scaling, but the sheer complexity of the techniques involved limits what is achievable today. The authors propose a layered software architecture consisting of a four-phase computer-aided design flow that assists with such computations by mapping a high-level language source program representing a quantum algorithm onto a quantum device. By weighing different optimization and error-correction procedures at appropriate phases of the design flow, researchers, algorithm designers, and tool builders can trade off performance and accuracy.
Traditional dependence tests detect dependences between linear array subscripts, but give only conservative results for non-linear expressions, which can produce a multitude of pseudo-dependences. To maximise the parallelism of applications and improve an optimising compiler's ability to detect dependences between program statements, a non-linear dependence test is needed to eliminate these pseudo-dependences. This study presents a new non-linear dependence test that analyses the optimal solution of the quadratic subscripts under the index bound constraints. The authors prove that non-linear dependences caused by subscripts that can be written in the form of a quadratic programming model can be detected, and introduce a non-linear dependence testing algorithm based on quadratic programming. The effectiveness of this algorithm is verified. The authors developed a prototype implementation of the test in the Open64 compiler and evaluated it on real-world applications from the Perfect Club and SPEC2006 benchmark suites. The experimental results indicate that, compared to existing testing methods, the quadratic programming (QP) test is more efficient for quadratic cases.
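The idea of bounding a quadratic subscript difference over the index bounds can be illustrated with a deliberately simplified sketch. This is not the authors' QP algorithm: it assumes single-variable subscripts of the form a*i*i + b*i + c, a shared loop bound for both references, and the hypothetical helper names `quad_range` and `may_depend`. It proves independence only when zero lies outside the range of f(i) - g(j).

```python
import math


def quad_range(a, b, lo, hi):
    """Exact min and max of a*x*x + b*x over the integers lo..hi."""
    candidates = {lo, hi}
    if a != 0:
        # A one-variable quadratic is extremal at an endpoint or next to its vertex.
        vertex = -b / (2 * a)
        for x in (math.floor(vertex), math.ceil(vertex)):
            if lo <= x <= hi:
                candidates.add(x)
    values = [a * x * x + b * x for x in candidates]
    return min(values), max(values)


def may_depend(write_sub, read_sub, bounds):
    """Return False only when the two subscripts can never be equal.

    write_sub and read_sub are (a, b, c) triples for a*i*i + b*i + c;
    bounds is the inclusive (lo, hi) range of both loop indices (an assumption).
    """
    a1, b1, c1 = write_sub
    a2, b2, c2 = read_sub
    lo, hi = bounds
    min1, max1 = quad_range(a1, b1, lo, hi)
    min2, max2 = quad_range(a2, b2, lo, hi)
    # h(i, j) = f(i) - g(j); a dependence requires h(i, j) = 0 for some i, j.
    h_min = min1 - max2 + (c1 - c2)
    h_max = max1 - min2 + (c1 - c2)
    return h_min <= 0 <= h_max
```

For a loop over 1..10 that writes A[i*i] and reads A[j*j + 200], `may_depend((1, 0, 0), (1, 0, 200), (1, 10))` rules out the dependence, whereas a linear-only test would have to assume one. A `True` result means only that a dependence could not be excluded.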
An arithmetic-based address translation technique is presented for low-power and real-time embedded processors with virtual memory support. General-purpose virtual memory support comes with the disadvantages of excessive power consumption and nondeterministic execution times, which are the main reasons virtual memory is not adopted in energy-efficient and real-time embedded systems. To address these issues, an application-driven address translation is proposed, in which most of the address translations traditionally performed as translation lookaside buffer (TLB) lookups are replaced with fast and energy-efficient addition operations. To achieve this, program and system-wide information is used to identify sequences of consecutive virtual page numbers that are mapped to sequences of consecutive physical page frames. For such pairs of page sequences, only the addition of a constant to the virtual page number is needed to produce the physical page frame. The proposed methodology relies on the combined efforts of the compiler, operating system, and hardware architecture to achieve a significant power reduction. Because the approach fundamentally eliminates the conflicts inherent in the hardware translation table, execution time is not only improved but also made predictable for a large number of memory reference instructions. Experiments show power reductions in the range of 80-95% compared to a general-purpose TLB.
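The add-a-constant translation described above can be sketched in software as follows. This is only an illustration under stated assumptions: the 4 KiB page size, the `PageRange` record, the `translate` function, and the dictionary standing in for a conventional TLB or page-table lookup are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, NamedTuple, Optional

PAGE_SHIFT = 12  # assumed 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class PageRange(NamedTuple):
    first_vpn: int  # first virtual page number of a consecutive run
    length: int     # number of pages in the run
    delta: int      # constant added to a VPN to obtain the physical frame


def translate(vaddr: int,
              ranges: List[PageRange],
              fallback: Dict[int, int]) -> Optional[int]:
    """Translate a virtual address, preferring addition over table lookup."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & OFFSET_MASK
    for r in ranges:
        if r.first_vpn <= vpn < r.first_vpn + r.length:
            # Consecutive pages map to consecutive frames: one addition suffices.
            return ((vpn + r.delta) << PAGE_SHIFT) | offset
    # Pages outside every run fall back to an ordinary lookup.
    pfn = fallback.get(vpn)
    if pfn is None:
        return None  # unmapped page; a real system would raise a fault
    return (pfn << PAGE_SHIFT) | offset
```

With `ranges = [PageRange(0x10, 4, 0x30)]`, the address `0x12345` lies in virtual page `0x12`, so its frame is `0x12 + 0x30 = 0x42` and the translated address is `0x42345`, with no table lookup involved.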
As our society becomes more technologically complex, computer systems are finding an alarming number of uses in safety-critical applications. In many such systems, the software component's reliability is essential to the system's safe operation, so it becomes natural to ask, "How can software be made to behave correctly when executed?" Using program transformations to produce trusted software simplifies verification. Program transformations use proven laws to manipulate programs in a manner analogous to algebraic transformations. The authors have sketched how a formal method based on program transformations can be used to construct a verified compiler. Such a compiler has been proved to correctly compile any correct program into assembly language. While the compiler itself may not execute efficiently (after all, you need only use the verified compiler the last time you compile a program), the transformational approach should enable the verified compiler to produce efficient assembly code.
The most promising technique for automatically parallelizing loops when the system cannot determine dependences at compile time is speculative parallelization. Also called thread-level speculation, this technique assumes optimistically that the system can execute all iterations of a given loop in parallel. A hardware or software monitor divides the iterations into blocks and assigns them to different threads, one per processor, with no prior dependence analysis. If the system discovers a dependence violation at runtime, it stops the incorrectly computed work and restarts it with correct values. Of course, the more parallel the loop, the more benefits this technique delivers. To better understand how speculative parallelization works, it is necessary to distinguish between private and shared variables. Informally speaking, private variables are those that the program always modifies in each iteration before using them. On the other hand, values stored in shared variables are used in different iterations.
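The block-and-restart scheme described above can be mimicked in a small, purely sequential simulation. It is a sketch under stated assumptions, not an actual thread-level speculation runtime: the names `run_block` and `speculative_loop`, the `body(i, read, write)` callback shape, and the dictionary used as shared memory are hypothetical. Every block first runs against the same initial snapshot, as if in parallel. Blocks then commit in order, and a block that read a location written by an earlier block is re-executed with the correct values.

```python
def run_block(iterations, body, base):
    """Run one block of iterations, recording exposed reads and buffered writes."""
    reads, writes = set(), {}

    def read(key):
        if key in writes:        # written earlier in this block: behaves as private
            return writes[key]
        reads.add(key)           # exposed read: may conflict with an earlier block
        return base[key]

    def write(key, value):
        writes[key] = value      # buffered until the block commits

    for i in iterations:
        body(i, read, write)
    return reads, writes


def speculative_loop(n_iters, block_size, body, shared):
    """Execute the loop speculatively; return the number of restarted blocks."""
    blocks = [range(start, min(start + block_size, n_iters))
              for start in range(0, n_iters, block_size)]
    snapshot = dict(shared)
    # Optimistic phase: all blocks see the same initial state.
    speculated = [run_block(block, body, snapshot) for block in blocks]

    committed = set()            # locations written by already committed blocks
    restarts = 0
    for block, (reads, writes) in zip(blocks, speculated):
        if reads & committed:
            # Dependence violation: discard the work and redo it on current values.
            reads, writes = run_block(block, body, shared)
            restarts += 1
        shared.update(writes)
        committed.update(writes)
    return restarts
```

A loop whose iterations touch disjoint locations commits every block without a restart, while a recurrence such as x[i] = x[i-1] + 1 forces every block after the first to be re-executed, which is why the technique pays off most on highly parallel loops.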
At the end of August, AMD demonstrated a chip that put two of its Opteron processors onto the same piece of silicon. IBM's Power4 already has two processors working alongside each other on a chip; and four of those chips go into a supercomputer processing module. Soon, the trickle of general-purpose multiprocessors will become more of a stream.