Efficient use of data-reuse transformations, combined with a custom memory hierarchy that exploits the temporal locality of data-related memory accesses, can have a significant impact on system power consumption, especially in data-dominated applications such as multimedia processing. This paper explores the effect of data-reuse decisions on the power consumption, area, and performance of multimedia applications implemented on uni- and dual-processor embedded cores. The work shows that conclusions about the effect of the transformations on multi-processor architectures can be drawn from the corresponding effect on the uniprocessor architecture, which significantly reduces the exploration space. A motion estimation algorithm, namely the two-dimensional logarithmic search, and a discrete cosine transform (DCT) algorithm are used as demonstrator applications.
When employing fuzzy relational structures in the development of intelligent systems, a unified generic tool is needed to assist designers, knowledge experts, and users in constructing the application's data dictionary and relational structures for the observed environment being modeled. Such a tool must apply some form of "computing with words" to help users conceptualize the semantics of the fuzzy relations themselves. Everyday terms and those used in special environments form a natural means of conceptualizing the reasoning process of fuzzy analysis on fuzzy relations using words rather than numbers. The recognized words and terms give potential users of fuzzy systems the opportunity to step back and see the big picture of a typical application's overall relational structures and compositions. A front-end English Query Language (EQL) tool is specified, along with the supporting grammar, to view the emerging technologies employed in representing fuzzy relational structures and to show how the relational approach can be used for "computing with words" systems. The desired logical analysis can therefore be expressed using natural language queries as opposed to the mathematical product forms of the multi-valued logics used.
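As an illustration of the kind of mathematical product form that such queries would hide from the user, the following is a minimal sketch of max-min composition of two fuzzy relations. The abstract does not say which composition the EQL tool uses, so the choice of max-min, the function name `max_min_compose`, and the sample membership values are all assumptions made for this example.

```python
def max_min_compose(r, s):
    """Max-min composition of two fuzzy relations given as membership matrices.

    r[i][k] is the membership degree of (x_i, y_k); s[k][j] that of (y_k, z_j).
    Both the representation and the max-min operator are assumptions for this sketch.
    """
    inner = len(s)       # size of the shared middle set Y
    width = len(s[0])    # size of the target set Z
    return [
        [max(min(r[i][k], s[k][j]) for k in range(inner)) for j in range(width)]
        for i in range(len(r))
    ]


# Hypothetical membership values, chosen only for illustration.
R = [[0.2, 0.8],
     [0.6, 0.4]]
S = [[0.5, 0.9],
     [0.7, 0.3]]
composed = max_min_compose(R, S)
```

A natural-language front end would map a query over two named relations onto a call of this kind, so the user never has to write the product form directly.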
Retrograde analysis is an efficient exhaustive search method. It is a powerful tool for solving problems where end states have known values but starting states do not. It has been widely used to solve mathematically precise games such as chess endgames, and is potentially usable in energy-minimization problems. With increasing computing power, both in speed and storage capacity, retrograde analysis will become more and more useful. This paper looks at successful applications to games, the challenges ahead, and the modifications that are required to utilize distributed hardware. The power and usefulness of retrograde analysis are still limited by the computing resources one has access to. Today, the best sequential retrograde algorithms are capable of solving problems with about 10^9 states in a few hours on a standard personal computer. Bigger problems need more powerful computers, take much longer to solve, or are simply out of the reach of today's technologies. Introducing parallelism to retrograde analysis is a natural way to attack the bigger problems. There are today three main architectures available for doing parallel retrograde analysis, namely symmetric multiprocessor (SMP) systems, high-speed network-based distributed systems, and Internet-based distributed systems. In this paper, we discuss some of the key issues in doing parallel retrograde analysis on these different architectures. Technical challenges are addressed in detail, together with some examples and proposals. These examples and proposals are drawn from various board games, but the ideas can be applied to other problem domains.
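The backward propagation described above can be sketched on a toy problem. The example below is not taken from the paper: it assumes a simple subtraction game (a player removes 1, 2, or 3 stones, and the player who cannot move loses), and the name `retrograde_solve` is hypothetical. It starts from the terminal state, whose value is known, and works backward through predecessor states.

```python
from collections import deque


def retrograde_solve(n_max, moves=(1, 2, 3)):
    """Label every state 0..n_max as "W" (win) or "L" (loss) for the player to move.

    Toy subtraction game, assumed for illustration: a move removes m stones
    (m in moves), and a player with no legal move loses.
    """
    value = {0: "L"}  # terminal state: known value, no move available
    # Number of successors of each state not yet shown to be wins for the opponent.
    remaining = {s: sum(1 for m in moves if s - m >= 0) for s in range(n_max + 1)}
    queue = deque([0])
    while queue:
        s = queue.popleft()
        for m in moves:
            p = s + m  # predecessor: a state from which s is reachable in one move
            if p > n_max or p in value:
                continue
            if value[s] == "L":
                # The mover at p can hand the opponent a lost state.
                value[p] = "W"
                queue.append(p)
            else:
                # One more successor of p turned out to be a win for the opponent.
                remaining[p] -= 1
                if remaining[p] == 0:
                    value[p] = "L"
                    queue.append(p)
    return value


values = retrograde_solve(20)
```

In this toy game the lost states are exactly the multiples of 4. In a parallel setting, the work queue and the value table are the structures that would have to be partitioned across processors or machines.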
ISBN (digital): 9789812792037
ISBN (print): 9789810244811
ICA3PP 2000 was an important conference that brought together researchers and practitioners from academia, industry, and governments to advance the knowledge of parallel and distributed computing. The proceedings constitute a well-defined set of innovative research papers in two broad areas of parallel and distributed computing: (1) architectures, algorithms and networks; (2) systems and applications.
ISBN (print): 0769507166
In this paper we present an approach to determine scheduling functions suitable for the design of processor arrays. The considered scheduling functions support a subsequent LSGP partitioning of the processor array by allowing the tasks of those processors of the full-size array that are mapped onto one processor of the partitioned array to be executed in an arbitrary order. Several constraints are derived to ensure the causality of computations and to prevent access conflicts to both modules and registers. We propose an optimization problem that generates the scheduling functions and outline its implementation as an integer linear program. The proposed methods are also applicable to the mapping of algorithms onto parallel architectures. In this case, the scheduling function produces identical, independent small threads which can be combined to utilize the target architecture as much as possible.
ISBN (print): 3540679561
The emergence of multimedia technology in recent years is strongly driven by an enormous commercial potential. For the scientific community this development is interesting because a number of attractive disciplines in computer science and engineering flow together into the multimedia mainstream: image processing, computer graphics, data compression, encoding, cryptography, and broadband communication, to mention just a few. These fields have always been driving forces behind the design of massively parallel architectures and algorithms as well as special-purpose processors and storage systems.
ISBN (print): 3540679561
This paper presents the design of highly optimized TTA architectures for image processing applications. An automatic processor design framework as described in [2] is used. Specialized hardware is used to improve the performance-cost ratio of the processors. An explorer searches the design space for solutions that are good in terms of cost and performance. We show that architectures can be found that efficiently execute very different algorithms at low cost. A hardware-feasible architecture is presented that efficiently executes a set of image processing algorithms and performs almost as well as or better than alternative, commercially available solutions.
Networks of workstations are widely used to carry out resource-intensive applications. Applications running on these architectures may not produce the anticipated speedups. This is mainly because of the existence of m...
ISBN (print): 0780365429
This paper presents a new implementation of a 2D wavelet transform in a VLSI circuit for real-time digital signal processing. The parallel algorithm of the 2D wavelet transform (2D-WT) used for designing and implementing this new architecture enhances the performance of computations. The proposed multi-elementary-processor architecture of the 2D-WT yields a very flexible hardware configuration. This approach offers a high processing speed, relative to other methods, for providing the wavelet coefficients. The 2D-WT is a powerful tool for several applications, the most important one being image processing.
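For readers unfamiliar with the 2D-WT, a minimal software sketch of one decomposition level follows. The abstract does not name the wavelet filter or describe the hardware data flow, so this example assumes the Haar wavelet with simple averaging (not the orthonormal scaling) and a separable row-then-column pass; the function names `haar_1d` and `haar_2d` are hypothetical.

```python
def haar_1d(signal):
    """One Haar level on an even-length sequence: pairwise averages, then pairwise differences."""
    half = len(signal) // 2
    averages = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    details = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return averages + details


def haar_2d(image):
    """One separable 2D Haar level: transform every row, then every column.

    Assumes a rectangular list of lists with even height and width.
    """
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


coefficients = haar_2d([[1, 2], [3, 4]])
```

Each row, and afterwards each column, is transformed independently of the others, which is the property a parallel or multi-processor implementation can exploit.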
This paper examines implementations of a multi-layer perceptron (MLP) on bus-based shared memory (SM) and on distributed memory (DM) multiprocessor systems. The goal has been to optimize the HW and SW architectures in order to obtain the fastest response possible. Parallel MLP algorithms for up to 8 processing nodes, with both DM and SM, were prototyped using the CSP-based TRANSIM tool. The results of prototyping MLPs of different sizes on various numbers of processing nodes demonstrate the feasible speedups, efficiency, and time responses for a given CPU speed, link speed, or bus bandwidth.