ISBN: 3540643591 (print)
The proceedings contain 118 papers. The special focus in this conference is on Parallel and Distributed Processing. The topics include: dynamic reconfiguration of a PMMLA for high-throughput applications; a parallel algorithm for minimum cost path computation on polymorphic processor array; a performance modeling and analysis environment for reconfigurable computers; an integrated partitioning and synthesis system for dynamically reconfigurable multi-FPGA architectures; temporal partitioning for partially-reconfigurable-field-programmable gate; a Java development and runtime environment for reconfigurable computing; synthesizing reconfigurable sequential machines using tabular models; evaluation of a low-power reconfigurable DSP architecture; a reconfigurable hardware-monitor for communication analysis in distributed real-time systems; a mathematical benefit analysis of context switching reconfigurable computing; a configurable computing approach towards real-time target tracking; hardware reconfigurable neural networks; a simulator for the reconfigurable mesh architecture; processor architectures for circuit emulation; an empirical comparison of runtime systems for conservative parallel simulation; synchronizing operations on multiple objects; migration and rollback transparency for arbitrary distributed applications in workstation clusters; a topology based approach to coordinated multicast operations; a parallel evolutionary algorithm for the vehicle routing problem with heterogeneous fleet; artificial neural networks on reconfigurable meshes; a molecular quasi-random model of computations applied to evaluate collective intelligence; replicated shared object model for edge detection with spiral architecture; and scheduling tasks of a parallel program in two-processor systems with use of cellular automata.
The fundamental premise behind the DASH project is that it is feasible to build large-scale shared-memory multiprocessors with hardware cache coherence. While paper studies and software simulators are useful for unde...
Our recent work in microarchitecture has identified a new model of execution, restricted data flow, in which data flow techniques are used to coordinate out-of-order execution of sequential instruction streams. We beli...
The performance of multiple-instruction-issue processors can be severely limited by the compiler's ability to generate efficient code for concurrent hardware. In the IMPACT project, we have developed IMPACT-I, ...
ISBN: 0818687045 (print)
The proceedings contain 52 papers. The topics discussed include: reconfigurable hardware for tomographic processing; microelectronics education using WWW and CAD tools; implementation of an edge detection algorithm in a reconfigurable computing system; automatic synthesis of hashing function circuits using evolutionary techniques; synthesis tools and design environment for dynamically reconfigurable FPGAs; generation of tests for the localization of single gate design errors in combinational circuits using the stuck-at fault model; exploring concurrency in data path functional units BIST plan optimization: a study-case; integrated CMOS linear dosimeters; a physical layer controller for wireless infrared networks; a temporal logic for data-flow VHDL; formal verification of VHDL - the model checker CV; formalization of finite state machines with data path for the verification of high-level synthesis; VHDL models for high level synthesis of fuzzy logic controllers; a co-synthesis approach based on symbolic reachability analysis; synthesis of CMOS operational amplifiers through genetic algorithms; MorphoSys: a reconfigurable architecture for multimedia applications; designing the dispatch stage of a superscalar microprocessor; a high-performance switching element for a multistage interconnection network; and a two-level pipelined implementation of the IDEA cryptographic algorithm.
This paper presents new techniques for speeding up deterministic test pattern generation for VLSI circuits. These techniques improve the PODEM algorithm by reducing the number of backtracks at a low computational cost. This is achieved by finding more necessary signal line assignments, by detecting conflicts earlier, and by avoiding unnecessary work during test generation. We have incorporated these techniques into an ATPG system for combinational circuits, called ATOM. The performance results for the ISCAS85 and full scan version of the ISCAS89 benchmark circuits demonstrate the effectiveness of these techniques on test generation performance.
Systems which employ a microprocessor together with an application-specific FPGA-based coprocessor are common today. These applications can reduce power consumption and system costs by incorporating the microprocessor in the FPGA. For such applications, a microprocessor which has good performance, occupies a minimal amount of FPGA resources, has a good high-level language software development environment and good code density is desirable. In this paper a 16-bit FPGA-based microprocessor, called MSL16, optimised for such applications is described. MSL16 utilises a stack architecture with each instruction occupying only 4 bits, leading to a small instruction set, simple datapath and control, and high code density. MSL16 was specifically designed to efficiently execute the programming language "Forth". The Forth language has the desirable features of portability and high code density, and it is well suited to control, DSP, real-time and embedded applications.
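The code-density argument can be illustrated with a minimal sketch: with 4-bit instructions, one 16-bit word holds four stack operations. The opcode values, the nibble order, and the function names `pack`, `unpack` and `run` below are all assumptions made for illustration; the abstract does not give the actual MSL16 encoding.

```python
# Hypothetical 4-bit opcodes; the real MSL16 encoding is not given in the abstract.
OPS = {"NOP": 0x0, "DUP": 0x1, "ADD": 0x2, "DROP": 0x3}
NAMES = {code: name for name, code in OPS.items()}


def pack(mnemonics):
    """Pack up to four 4-bit opcodes into one 16-bit word.

    The first opcode lands in the top nibble (an assumed ordering);
    unused slots are padded with NOP.
    """
    if len(mnemonics) > 4:
        raise ValueError("a 16-bit word holds at most four 4-bit instructions")
    padded = list(mnemonics) + ["NOP"] * (4 - len(mnemonics))
    word = 0
    for mnemonic in padded:
        word = (word << 4) | OPS[mnemonic]
    return word


def unpack(word):
    """Split a 16-bit word back into its four opcode mnemonics."""
    return [NAMES[(word >> shift) & 0xF] for shift in (12, 8, 4, 0)]


def run(word, stack):
    """Execute the four instructions of one word on a copy of the data stack."""
    stack = list(stack)
    for op in unpack(word):
        if op == "DUP":
            stack.append(stack[-1])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & 0xFFFF)  # 16-bit wrap-around
        elif op == "DROP":
            stack.pop()
        # NOP does nothing
    return stack
```

Under these assumptions, `run(pack(["DUP", "ADD"]), [21])` doubles the top of the stack using a single 16-bit instruction word, because stack operations need no register operand fields.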
Explicitly Parallel Instruction Computing (EPIC) architectures require the compiler to express program instruction-level parallelism directly to the hardware. EPIC techniques which enable the compiler to represent control speculation, data dependence speculation, and predication have individually been shown to be very effective. However, these techniques have not been studied in combination with each other. This paper presents the IMPACT EPIC architecture to address the issues involved in designing processors based on these EPIC concepts. In particular, we focus on new execution and recovery models in which microarchitectural support for predicated execution is also used to enable efficient recovery from exceptions caused by speculatively executed instructions. This paper demonstrates that a coherent framework to integrate the three techniques can be elegantly designed to achieve much better performance than each individual technique could alone provide.
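As a rough illustration of predication alone (not of the IMPACT EPIC execution or recovery models, which the abstract does not detail), the sketch below runs branch-free code in which each instruction carries an optional guard and is nullified when the guard does not hold. The instruction tuple layout and the names `run_predicated` and `ABS_PROGRAM` are hypothetical.

```python
import operator


def run_predicated(program, regs):
    """Run straight-line code in which each instruction has an optional guard.

    Each instruction is (guard, dest, fn, srcs). guard is None (always
    execute) or a (predicate_register, expected_value) pair; an instruction
    whose guard does not match is nullified and writes nothing.
    """
    regs = dict(regs)
    for guard, dest, fn, srcs in program:
        if guard is not None:
            name, expected = guard
            if bool(regs[name]) != expected:
                continue  # nullified: no register is written
        regs[dest] = fn(*(regs[s] for s in srcs))
    return regs


# If-converted form of: y = -x if x < 0 else x
# The branch is replaced by a compare that sets predicate p and by two
# guarded moves, only one of which takes effect.
ABS_PROGRAM = [
    (None, "p", lambda x: x < 0, ["x"]),
    (("p", True), "y", operator.neg, ["x"]),
    (("p", False), "y", lambda x: x, ["x"]),
]
```

Calling `run_predicated(ABS_PROGRAM, {"x": -5})` leaves the absolute value in register `y` without any control-flow change in the instruction stream.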
The SHRIMP cluster-computing system has progressed to a point of relative maturity; a variety of applications are running on a 16-node system. We have enough experience to understand what we did right and wrong in designing and building the system. In this paper we discuss some of the lessons we learned about computer architecture, and about the challenges involved in building a significant working system in an academic research environment. We evaluate significant design choices by modifying the network interface firmware and the system software in order to empirically compare our design to other approaches.
In modern processors, the dynamic translation of virtual addresses to support virtual memory is done before or in parallel with the first-level cache access. As processor technology improves at a rapid pace and the working sets of new applications grow insatiably, the latency and bandwidth demands on the TLB (Translation Lookaside Buffer) are getting more and more difficult to meet. The situation is worse in multiprocessor systems, which run larger applications and are plagued by the TLB consistency problem. We evaluate and compare five options for virtual address translation in the context of COMAs (Cache Only Memory Architectures). The dynamic address translation mechanism can be located after the cache access provided the cache is virtual. In a particular design, which we call V-COMA for Virtual COMA, the physical address concept and the traditional TLB are eliminated. While still supporting virtual memory, V-COMA reduces the address translation overhead to a minimum. V-COMA scales well and works better in systems with a large number of processors. As a machine running on virtual addresses, V-COMA provides a simple and consistent hardware model to the operating system and the compiler, in which further optimization opportunities are possible.
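The idea of placing translation after a virtual cache can be sketched as follows: the cache is looked up by virtual address, and the page table is consulted only on a miss. This is a simplified model of that general option, not of V-COMA itself (which removes physical addresses altogether); the class name `VirtualCache`, the 4 KB page size, and the dictionary-based page table and memory are assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class VirtualCache:
    """Toy cache indexed by virtual address, with translation done only on misses."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.memory = memory          # physical address -> value
        self.lines = {}               # virtual address -> cached value
        self.translations = 0         # how many address translations were needed

    def load(self, vaddr):
        # Hit: the virtual address is enough, so no translation takes place.
        if vaddr in self.lines:
            return self.lines[vaddr]
        # Miss: translate after the cache lookup, then fetch from memory.
        self.translations += 1
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        paddr = self.page_table[vpn] * PAGE_SIZE + offset
        value = self.memory[paddr]
        self.lines[vaddr] = value
        return value
```

In this model the translation count grows with the miss rate rather than with the number of memory accesses, which is why moving translation behind the cache relieves the latency and bandwidth pressure described above.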