ISBN: (print) 0769515126
Our new architecture, known as the Scheduled DataFlow (SDF) system, deviates from the current trend of building complex hardware to exploit instruction-level parallelism (ILP) by exploring a simpler yet powerful execution paradigm based on dataflow, multithreading, and decoupling of memory accesses from execution. A program is partitioned into non-blocking threads. In addition, all memory accesses are decoupled from the thread's execution: data is pre-loaded into the thread's context (registers), and all results are post-stored after the thread completes execution. Even though multithreading and decoupling are possible with control-flow architectures, the non-blocking and functional nature of the SDF system makes it easier to coordinate the memory accesses and execution of a thread. In this paper we show some recent improvements to the SDF implementation, whereby threads exchange data directly in register contexts, thus eliminating the need to create thread frames. It is therefore now possible to explore how our architecture's performance scales when more register contexts are included on the chip.
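The pre-load / execute / post-store split described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the function name `run_decoupled_thread`, the dictionary-based "memory" and "register context", and the load/store maps are all hypothetical and are not taken from the SDF instruction set or implementation.

```python
from typing import Callable, Dict


def run_decoupled_thread(
    memory: Dict[str, int],
    preload: Dict[str, str],      # register name -> memory address to read (hypothetical)
    body: Callable[[Dict[str, int]], None],
    poststore: Dict[str, str],    # register name -> memory address to write (hypothetical)
) -> None:
    """Run one non-blocking thread in three decoupled phases."""
    # Phase 1 (pre-load): every memory read happens before execution starts.
    registers = {reg: memory[addr] for reg, addr in preload.items()}

    # Phase 2 (execute): the body touches only the register context,
    # so it never waits on memory and runs to completion.
    body(registers)

    # Phase 3 (post-store): results are written back only after the body finishes.
    for reg, addr in poststore.items():
        memory[addr] = registers[reg]


def multiply(regs: Dict[str, int]) -> None:
    # Pure register-to-register computation.
    regs["r2"] = regs["r0"] * regs["r1"]


memory = {"x": 3, "y": 4, "z": 0}
run_decoupled_thread(memory, {"r0": "x", "r1": "y"}, multiply, {"r2": "z"})
print(memory["z"])
```

Because the body sees only its register context, the memory phases can in principle be handled by a separate unit from the one that executes the body, which is the coordination benefit the abstract attributes to non-blocking threads.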
ISBN: (print) 3540440496
In recent years multimedia technology has emerged as a key technology, mainly because of its ability to represent information in disparate forms as a bit-stream. This enables everything from text to video and sound to be stored, processed, and delivered in digital form. Much of the research community's current effort has emphasized the delivery of data as an important issue in multimedia technology. However, the creation, processing, and management of multimedia forms are the issues most likely to dominate scientific interest in the long run. The aim of dealing with information coming from video, text, and sound will result in a data explosion. This requirement to store, process, and manage large data sets naturally leads to the consideration of programmable parallel processing systems as strong candidates for supporting and enabling multimedia technology. This fact, taken together with the inherent data parallelism in these data types, makes multimedia computing a natural application area for parallel and distributed processing. In addition, the concepts developed for parallel and distributed algorithms are quite useful for the implementation of distributed multimedia systems and applications. Thus, the adaptation of these methods for distributed multimedia systems is an interesting topic for study.
ISBN: (print) 0780375963
This paper presents an advanced ASIC architecture for a wideband digital frequency demultiplexer designed for on-board satellite communication systems. The structure is able to handle up to 32 uniformly spaced channels extracted from a digital, real frequency-division-multiplexed signal sampled at 672 MHz; contiguous demultiplexing and remultiplexing are supported. An efficient two-stage parallel demux architecture was developed in which all internal coefficients (including the FFT ones) are coded with the canonic signed digit (CSD) technique, which saves area relative to 2's complement coding. Performance evaluations, in terms of bit error rate, noise to power ratio and gate complexity, were carried out to define the architecture. Subsequently the design was coded in VHDL and synthesised on a 0.18 µm CMOS technology by means of Synopsys(TM) tools.
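To make the area argument concrete, the following sketch converts an integer coefficient to canonic signed digit form, in which each digit is -1, 0, or +1 and no two adjacent digits are nonzero. It uses the standard textbook recoding and is not taken from the paper; the function names `to_csd` and `from_csd` are hypothetical.

```python
from typing import List


def to_csd(n: int) -> List[int]:
    """Return the canonic signed digit form of n, least significant digit first."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            d = 0
        else:
            # n mod 4 == 1 gives digit +1; n mod 4 == 3 gives digit -1,
            # which forces the next digit to be zero.
            d = 2 - (n % 4)
        digits.append(d)
        n = (n - d) // 2
    return digits


def from_csd(digits: List[int]) -> int:
    """Rebuild the integer from its signed digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


coeff = 119  # binary 1110111 has six nonzero bits
csd = to_csd(coeff)
print(csd, sum(1 for d in csd if d != 0))
```

Each nonzero digit of a constant coefficient corresponds to one shifted add or subtract in a shift-and-add multiplier, so fewer nonzero digits generally means fewer adders. Here 119 becomes 128 - 8 - 1, three nonzero digits instead of six.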
ISBN: (print) 3540003037
The proceedings contain 70 papers. The special focus in this conference is on High Performance Computing. The topics include: high-performance computing and visualization; 2-D wavelet transform enhancement on general-purpose microprocessors; a general data layout for distributed consistency in data parallel applications; duplication-based scheduling algorithm for interconnection-constrained distributed memory machines; evaluating arithmetic expressions using tree contraction; a mechanism to reduce I-cache power consumption in high performance microprocessors; exploiting web document structure to improve storage management in proxy caches; high performance multiprocessor architecture design methodology for application-specific embedded systems; a low latency messaging infrastructure for Linux clusters; low-power high-performance adaptive computing architectures for multimedia processing; a technique to construct high performance CORBA applications; automatic search for performance problems in parallel and distributed programs by using multi-experiment analysis; an adaptive value-based scheduler and its RT-Linux implementation; effective selection of partition sizes for moldable scheduling of parallel jobs; runtime support for multigrain and multiparadigm parallelism; a fully compliant OpenMP implementation on software distributed shared memory; a fast connection-time redirection mechanism for internet application scalability; an efficient resource sharing scheme for dependable real-time communication in multihop networks; improving web server performance by network aware data buffering and caching; WRAPS scheduling and its efficient implementation on network processors; performance comparison of pipelined hash joins on workstation clusters and iterative algorithms on heterogeneous network computing.
ISBN: (print) 0769517129
This paper introduces PAPA: Packed Arithmetic on a Prefix Adder, a new approach to parallel prefix adder design that supports a wide variety of packed arithmetic computations, including packed add and subtract with saturation, packed rounded average, and packed absolute difference. The approach consists of altering the prefix adder cell logic equations to take advantage of a previously unused "don't care" state. Logical Effort is employed to assess the delay of the new adder architecture by establishing the extra effort needed to select and drive the appropriate carry signal to the requisite sum sub-word. This adder will find applications in video processors and other multimedia-orientated processor chips that implement packed arithmetic operations.
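For readers unfamiliar with the packed operations named in this abstract, the sketch below is a plain software reference model of their results. It says nothing about the prefix adder cell logic itself. It assumes unsigned sub-words, four 8-bit lanes in a 32-bit word, and round-half-up averaging; those choices and all function names are illustrative assumptions, not details from the paper.

```python
from typing import Callable, List

WIDTH = 8   # assumed sub-word width in bits
LANES = 4   # assumed number of sub-words per word
MASK = (1 << WIDTH) - 1


def unpack(word: int) -> List[int]:
    """Split a word into unsigned sub-words, least significant lane first."""
    return [(word >> (i * WIDTH)) & MASK for i in range(LANES)]


def pack(lanes: List[int]) -> int:
    """Join sub-words (least significant lane first) back into one word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & MASK) << (i * WIDTH)
    return word


def lanewise(op: Callable[[int, int], int], a: int, b: int) -> int:
    """Apply op independently to each pair of sub-words; no carry crosses lanes."""
    return pack([op(x, y) for x, y in zip(unpack(a), unpack(b))])


def packed_add_sat(a: int, b: int) -> int:
    return lanewise(lambda x, y: min(x + y, MASK), a, b)    # clamp at lane maximum


def packed_sub_sat(a: int, b: int) -> int:
    return lanewise(lambda x, y: max(x - y, 0), a, b)       # clamp at zero


def packed_avg_round(a: int, b: int) -> int:
    return lanewise(lambda x, y: (x + y + 1) >> 1, a, b)    # round half up


def packed_abs_diff(a: int, b: int) -> int:
    return lanewise(lambda x, y: abs(x - y), a, b)


a, b = 0xF01080FF, 0x20208001
print(hex(packed_add_sat(a, b)), hex(packed_abs_diff(a, b)))
```

A model like this is mainly useful as a golden reference when checking a hardware adder's outputs lane by lane.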
Huang et al. (1996, 2002) proposed an architecture selection algorithm called SEDNN to find the minimum architectures for feedforward neural networks based on the golden section search method and the upper bounds on the...
In the plasmas used in semiconductor fabrication, collisions between electrons and polyatomic molecules produce reactive fragments that drive etching and other processes at the wafer surface. Extensive and reliable da...
In most distributed memory computations, node programs are executed on processors according to the owner computes rule. However, the owner computes rule is not best suited for irregular application codes. In irregular...
ISBN: (print) 0780375963
A novel endoscope device has been developed based on a new laser scanning technology that uses silicon micromirrors to achieve superior resolution and chromatic representation compared with existing endoscopes. A critical part of this device, reported here, is the data acquisition, control and processing (DACP) system, which acquires, processes, displays, and stores the collected images in real time and handles the control signals. The software uses multithreading to execute specific software tasks in parallel on the available CPUs of the multiprocessing system employed. In this manner, the necessary computational power is provided for the realization of a high-performance real-time imaging system. First-trial results are also given.
Power dissipation reduction is a stringent constraint in modern mobile devices. It can be obtained by supply voltage or frequency reduction, but a strong reduction in the number of cycles per operation must be achieved. T...