ISBN: (print) 0769521525
Reconfigurable computing is an emerging research paradigm that offers cost-effective solutions for computationally intensive applications through hardware reuse. There is a growing need in this domain for techniques that exploit the parallelism inherent in the target application and schedule the parallelized application. This paper proposes a method to estimate the optimal number of resources through critical path analysis while keeping resource utilization near optimal. We also propose a novel algorithm to optimally schedule the parallel threads of execution in linear time. Our algorithm is based on the idea of the enhanced Partial Critical Path (ePCP) and handles memory latencies and reconfiguration overheads. The results show the effectiveness of our approach over other critical-path-based methods.
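As a rough illustration of the critical path analysis this abstract refers to, the sketch below computes the critical path length of a task graph and a simple lower bound on the number of resources (total work divided by critical path length, rounded up). It is not the paper's ePCP algorithm, which the abstract does not spell out; the function names, the task-graph encoding, and the lower-bound formula are assumptions chosen for illustration, and memory latencies and reconfiguration overheads are ignored.

```python
from collections import deque
from math import ceil


def critical_path_length(durations, edges):
    """Length of the longest duration-weighted path through a task DAG.

    durations: dict mapping task name -> execution time (hypothetical encoding)
    edges: iterable of (predecessor, successor) pairs
    """
    successors = {task: [] for task in durations}
    indegree = {task: 0 for task in durations}
    for pred, succ in edges:
        successors[pred].append(succ)
        indegree[succ] += 1

    # Process tasks in topological order (Kahn's algorithm).
    ready = deque(task for task, deg in indegree.items() if deg == 0)
    finish = dict(durations)  # earliest finish time of each task
    visited = 0
    while ready:
        task = ready.popleft()
        visited += 1
        for succ in successors[task]:
            finish[succ] = max(finish[succ], finish[task] + durations[succ])
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)

    if visited != len(durations):
        raise ValueError("task graph contains a cycle")
    return max(finish.values(), default=0)


def resource_lower_bound(durations, edges):
    """Fewest resources that could finish all work within the critical path."""
    length = critical_path_length(durations, edges)
    if length == 0:
        return 0
    return ceil(sum(durations.values()) / length)
```

With fewer resources than this bound, the schedule cannot finish within the critical path length; with many more, utilization drops, which is the trade-off the abstract describes.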
We present a library for the parallel computation of particle simulations called ParaSPH. It is portable and performs well on a variety of parallel architectures with shared and distributed memory. We give details of ...
In the context of the Italian Space Agency COSMO SkyMed project, a quantitative and qualitative study of a set of image processing algorithms for SAR processors has been carried out. The algorithms showed some interesting ...
ISBN: (print) 0769522297
When adding reconfigurability to custom hardware, one must take great care that the reduction in speed due to the reconfigurable logic does not cancel out the gains obtained by reconfiguration. These gains are greatest in very specific and computation-intensive applications, and lessen as the applications become more general and heterogeneous. In the case of superscalar processors, this leads to limiting the amount of reconfigurability to precise changes in existing functional units instead of adding a fully configurable functional unit. We present a detailed study of the modifications necessary in a superscalar processor to allow an FPU to be dynamically reconfigured as several ALUs with a minimal increase in the latency of these functional units. The timing of the FPU's multiplier tree and the reconfiguration decision are presented. As there is more than one simple unit involved, this decision is more global than a cycle-by-cycle reconfiguration and must be made for a longer period of time. We discuss possible policies for the dynamic reconfiguration decisions. The results show gains of up to 56% in the best cases, and average gains of 10%, on typical architectures over a wide range of applications.
ISBN: (print) 3540225706
Concavity trees are structures for 2-D shape representation. In this paper, we present a new recursive method for concavity tree matching that returns the distance between two attributed concavity trees. The matching is based on both the structure of the tree and the attributes stored at each node. Moreover, the method can be implemented on parallel architectures, and it supports occluded and partial matching. To the best of our knowledge, this is the first work to detail a method for concavity tree matching. We test our method on 625 silhouettes in the context of shape-based nearest-neighbour retrieval.
ISBN: (print) 0769521525
This paper discusses efficient hash partitioning using workload access patterns to place and process relations in a cluster or distributed query-intensive database environment. In such an environment, there is usually more than one partitioning alternative for each relation. We discuss a method and algorithm to determine the hash partitioning attributes and placement. Among the alternatives, our algorithm chooses a placement that reduces repartitioning overheads using expected or historical query workloads. The paper includes a simulation study showing how our strategy outperforms ad hoc placement and previously proposed distributed database strategies.
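To make the idea of workload-driven hash partitioning concrete, here is a minimal sketch that picks, for each relation, the attribute most often used by the workload and counts how much query traffic would still need repartitioning. This is a simple greedy heuristic chosen for illustration, not the algorithm from the paper, whose details the abstract does not give. The workload encoding as `(relation, attribute, frequency)` triples and all function names are assumptions.

```python
import zlib
from collections import Counter, defaultdict


def choose_partitioning(workload):
    """Pick one hash-partitioning attribute per relation.

    workload: iterable of (relation, attribute, frequency) triples, where
    attribute is the column a query needs the relation partitioned on
    (hypothetical encoding of a historical or expected workload).
    """
    usage = defaultdict(Counter)
    for relation, attribute, frequency in workload:
        usage[relation][attribute] += frequency
    # Most frequently used attribute wins; ties are broken by name.
    return {
        relation: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for relation, counts in usage.items()
    }


def repartition_cost(workload, choice):
    """Total frequency of accesses that find the relation partitioned
    on a different attribute and would therefore need repartitioning."""
    return sum(
        frequency
        for relation, attribute, frequency in workload
        if choice[relation] != attribute
    )


def node_for(key, num_nodes):
    """Map a partitioning-key value to a node with a stable hash."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_nodes
```

`zlib.crc32` is used instead of the built-in `hash` because Python randomizes string hashes per process, which would make placement unstable across runs.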
ISBN: (print) 0780386019
A highly parallel, pipelined VLSI architecture for MPEG-4 motion estimation is proposed in this paper. It searches for the best match to the reference block using the full-search block-matching algorithm to enhance video quality. It features low embedded memory, a low clock rate with high accuracy, and the flexibility to adapt to different search window sizes, targeting both mobile applications and high-definition fields. The full-search block-matching algorithm has been mapped onto this architecture using an optimized processing element array that can evaluate the motion vectors of CIF video in real time at an 8 MHz clock rate with only 5 10 embedded memory. The proposed architecture has been prototyped, simulated and synthesized for 0.35 µm CMOS technology using FUJITSU CE66 cells. The prototyped architecture consumes 190.3 mW with a 3.3 V supply voltage and has a core area of 5.52 mm² with 5 layers of metal. It is more than 33 times faster than the reference motion estimation architecture of MPEG-4 Part 9, at a cost of only 34.6% more core area.
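For readers unfamiliar with full-search block matching, the following software sketch shows the basic computation that such hardware parallelizes: every displacement inside the search window is tried, and the one with the lowest sum of absolute differences (SAD) becomes the motion vector. It says nothing about the paper's processing element array or memory organization. The use of SAD as the matching cost, the frame layout as lists of rows, and the function names are assumptions for illustration.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustively search `ref` for the best match to a block of `cur`.

    Frames are lists of rows of integer pixel values. Returns a tuple
    (cost, dx, dy) for the lowest-cost displacement, or None if no
    candidate block fits inside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            # Strict comparison keeps the first candidate on ties.
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The search costs (2 * search_range + 1)² SAD evaluations of n² pixels per block, and the candidate evaluations are independent of one another, which is why the algorithm maps well onto parallel, pipelined hardware.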