In the past few years, increasing interest has been shown in using Java as a language for performance-oriented distributed and parallel computing. Most Java-based systems that support portable parallel and distributed computing either require the programmer to deal with intricate low-level details of Java, which can be a tedious, time-consuming, and error-prone task, or prevent the programmer from controlling the locality of data. In contrast to most existing systems, JavaSymphony, a class library written entirely in Java, allows parallelism, load balancing, and locality to be controlled at a high level. Objects can be explicitly distributed and migrated based on virtual architectures, which impose a virtual hierarchy on a distributed/parallel system of physical computing nodes. The concept of blocking/nonblocking remote method invocation is used to exchange data among distributed objects and to delegate work to remote objects. We evaluate the JavaSymphony programming API on a variety of distributed/parallel algorithms, comprising backtracking, N-body, encryption/decryption, and asynchronous nested optimization algorithms. Performance results are presented for both homogeneous and heterogeneous cluster architectures. Moreover, we compare JavaSymphony with an alternative well-known semi-automatic system.
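The blocking/nonblocking remote method invocation concept described above can be illustrated with a minimal sketch. This is not the JavaSymphony API; it is a generic analogy using Python's standard `concurrent.futures`, with a hypothetical `RemoteWorker` standing in for a distributed object:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a remote object; in a real distributed setting the
# call would cross the network to another node.
class RemoteWorker:
    def compute(self, x):
        return x * x

executor = ThreadPoolExecutor(max_workers=2)
worker = RemoteWorker()

# Blocking invocation: the caller waits until the result is available.
result = worker.compute(6)
assert result == 36

# Non-blocking invocation: the caller immediately receives a handle
# (a future) and collects the result later, overlapping other work.
future = executor.submit(worker.compute, 7)
# ... the caller is free to do other work here ...
assert future.result() == 49
executor.shutdown()
```

The non-blocking form is what lets a caller keep several remote objects busy at once instead of idling on each reply.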
Conventional approaches for fixed-point implementation of digital signal processing algorithms require scaling and word-length (WL) optimization at the algorithm level and high-level synthesis for functional unit sharing at the architecture level. However, algorithm-level WL optimization has a few limitations because it can neither utilize the functional unit sharing information for signal grouping nor estimate the hardware cost for each operation accurately. In this study, we develop a combined WL optimization and high-level synthesis algorithm not only to minimize the hardware implementation cost, but also to reduce the optimization time significantly. This software initially finds the WL sensitivity or minimum WL of each signal through fixed-point simulations of a signal flow graph, performs WL-conscious high-level synthesis in which signals having similar WL sensitivity are assigned to the same functional unit, and then conducts the final WL optimization by iteratively modifying the WLs of the synthesized hardware model. A list-scheduling-based and an integer linear-programming-based algorithm are developed for the WL-conscious high-level synthesis. The hardware cost function to be minimized is generated using a synthesized hardware model. Since fixed-point simulation is used to measure performance, this method can be applied to general digital signal processing systems, including nonlinear and time-varying ones. A fourth-order infinite-impulse response filter, a fifth-order elliptic filter, and a 12th-order adaptive least mean square filter are implemented using this software.
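The simulation-based search for a minimum word length can be sketched in a few lines. This is a toy stand-in for the paper's WL sensitivity analysis, not its actual algorithm: it quantizes a signal at increasing fractional word lengths until the RMS quantization error meets a target (the signal, error metric, and threshold are illustrative assumptions):

```python
import math

def quantize(x, frac_bits):
    """Round x onto a fixed-point grid with frac_bits fractional bits."""
    step = 2.0 ** -frac_bits
    return round(x / step) * step

def min_wordlength(signal, max_rms_error):
    """Smallest fractional word length whose RMS quantization error
    stays below the target -- a toy version of finding each signal's
    minimum WL by fixed-point simulation."""
    for frac_bits in range(1, 32):
        err = [s - quantize(s, frac_bits) for s in signal]
        rms = math.sqrt(sum(e * e for e in err) / len(err))
        if rms <= max_rms_error:
            return frac_bits
    return 32

signal = [math.sin(0.1 * n) for n in range(200)]
wl = min_wordlength(signal, 1e-3)
```

In the paper this per-signal search feeds the synthesis step: signals with similar minimum WLs become candidates to share one functional unit.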
ISBN:
(print) 1581132972
This paper presents a high-level design methodology, called input space adaptive design, and new design automation algorithms for optimizing energy consumption and performance. An input space adaptive design exploits the well-known fact that the quality of hardware circuits and software programs can be significantly optimized by employing algorithms and implementation architectures that adapt to the input statistics. We propose a methodology for such designs which includes identifying parts of the behavior to be optimized, selecting appropriate input sub-spaces, transforming the behavior, and verifying the equivalence of the original and optimized designs. Experimental results indicate that such designs can reduce energy consumption by up to 70.6% (average of 55.4%), and simultaneously improve performance by up to 85.1% (average of 58.1%), leading to a reduction in the energy-delay product by up to 95.6% (average of 80.7%), compared to well-optimized designs that do not employ such techniques.
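The core idea of adapting an implementation to a dominant input sub-space can be sketched in software. The example below is illustrative, not from the paper: a generic bit-counting routine gains a precomputed fast path for the sub-space (values 0..255) assumed to dominate under profiling, while the general path preserves equivalence for all other inputs:

```python
# Input-space adaptive sketch: a generic routine plus a specialized
# fast path for the input sub-space assumed common under profiling.
# The routine and the 0..255 "common sub-space" are illustrative.

def popcount_generic(x):
    """General algorithm: count set bits one at a time."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count

# Precomputed table covering the common sub-space.
_TABLE = [popcount_generic(i) for i in range(256)]

def popcount_adaptive(x):
    if 0 <= x < 256:                # common case: one table lookup
        return _TABLE[x]
    return popcount_generic(x)      # fall back to the general algorithm

# Equivalence of original and optimized behavior on both paths:
assert popcount_adaptive(5) == popcount_generic(5) == 2
assert popcount_adaptive(1 << 20) == popcount_generic(1 << 20) == 1
```

The equivalence checks mirror the paper's final methodology step: verifying that the transformed design agrees with the original on every input, not just the favored sub-space.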
ISBN:
(print) 0819441589
Non-mechanical beam steering technologies, utilizing acousto-optics, allow for achieving the high-bandwidth laser beam positioning required for optical communications, laser scanners, LADARs, etc. The properties of the Bragg cell, chiefly responsible for the efficiency and attainable characteristics of the entire positioning system, are assured by successful design of this optical component. However, the design of Bragg cells is dominated by the experience and intuition of the designers, and the potential of this technology is not fully utilized. An optimal design problem for a Bragg cell is formulated on the basis of known equations of the underlying physical phenomena, and a genetic optimization scheme is applied to solve the resulting formidable problem. The approach not only yields a design solution, but also allows the design criterion to be varied and particular properties of the resultant component to be emphasized. The prowess of the proposed approach has been demonstrated by design optimization examples.
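A genetic optimization scheme of the kind applied here follows a standard select/crossover/mutate loop. The sketch below uses a toy quadratic objective in place of the Bragg-cell design criterion (the objective, population size, and mutation rate are all illustrative assumptions):

```python
import random

random.seed(1)

# Toy objective standing in for the Bragg-cell design criterion:
# maximize fitness over a vector of design parameters in [0, 1].
def fitness(genes):
    return -sum((g - 0.25) ** 2 for g in genes)

def evolve(pop_size=30, n_genes=4, generations=60):
    pop = [[random.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]           # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n_genes)     # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(n_genes)          # mutation
            child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

Swapping in a different `fitness` is exactly what lets the design criterion be varied without touching the optimizer, which is the flexibility the abstract highlights.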
ISBN:
(print) 0769511538
In this paper we describe computer middleware called the Hyper Artificial Life Optimization System, abbreviated HAL, based on Artificial Life theories, which is effective for almost all kinds of combinatorial optimization problems in the real world. This middleware supports the efficient development of parallel Application Programs (APs) for combinatorial optimization problems by adopting a conventional evolution procedure. An application built on this middleware gains high autonomy and high robustness, and improves its performance on a parallel computer. In this work, an SCM (Supply Chain Management) scheduling program, which is actually used by many users, was ported to this middleware and parallelized in order to verify and evaluate HAL. In the evaluation, we obtained a remarkable improvement in performance. The model has the characteristics of "reproduction", "mutation", and "genetics", and we observed a rare phenomenon in the actual results that can be considered "emergence". This clearly transcends the concepts and optimization ability of many conventional algorithms. Moreover, the model has a hyper structure; therefore we named it the Hyper Artificial Life System, abbreviated HAL.
ISBN:
(print) 0780372476
In this paper we describe a software pipelining framework, CALiBeR (Cluster Aware Load Balancing Retiming Algorithm), suitable for compilers targeting clustered embedded VLIW processors. CALiBeR can be effectively used by embedded system designers to explore different code optimization alternatives, i.e., can assist the generation of high-quality customized retiming solutions for desired program memory size and throughput requirements, while minimizing register pressure. An extensive set of experimental results is presented, considering several representative benchmark loop kernels and a wide variety of clustered datapath configurations, demonstrating that our algorithm compares favorably with one of the best state-of-the-art algorithms, achieving up to 50% improvement in performance and up to 47% improvement in register requirements.
Phylogeny reconstruction from molecular data poses complex optimization problems: Almost all optimization models are NP-hard and thus computationally intractable. Yet approximations must be of very high quality in o...
Various composite structural configurations are increasingly manufactured using liquid composite molding processes. These processes provide unitized structures with repeatability and excellent dimensional tolerances. ...
The application of receding horizon control (RHC) with the linear parameter varying (LPV) design methodology to a high-fidelity, nonlinear F-16 aircraft model is demonstrated. The highlights of the paper are: i) the use of RHC to improve upon the performance of an LPV regulator; ii) a discussion of implementation details such as control space formulation, tuning of RHC parameters, computation time, and numerical properties of the algorithms; and iii) the simulated responses of the nonlinear RHC and LPV regulators.
Routers must perform packet classification at high speeds to efficiently implement functions such as firewalls and diffserv. Classification can be based on an arbitrary number of fields in the packet header. Performing classification quickly on an arbitrary number of fields is known to be difficult, and has poor worst-case complexity. In this paper, we re-examine two basic mechanisms that have been dismissed in the literature as being too inefficient: backtracking search and set pruning tries. We find using real databases that the time for backtracking search is much better than the worst-case bound; instead of Ω((log N)^(k-1)), the search time is only roughly twice the optimal search time. Similarly, we find that set pruning tries (using a DAG optimization) have much better storage costs than the worst-case bound. We also propose several new techniques to further improve the two basic mechanisms. Our major ideas are (i) backtracking search on a small memory budget, (ii) a novel compression algorithm, (iii) pipelining the search, (iv) the ability to trade-off smoothly between backtracking and set pruning, and (v) algorithms to effectively make use of hardware if hardware is available. We quantify the performance gain of each technique using real databases. We show that on real firewall databases our schemes, with the accompanying optimizations, are close to optimal in time and storage.
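The backtracking search the paper revisits can be sketched for two prefix fields. This is a minimal textbook-style hierarchical trie, not the paper's optimized scheme: every source-prefix node on the lookup path may own a destination sub-trie, and the search must visit all of them (the rules, bit strings, and longest-source-prefix tie-break are illustrative; real classifiers resolve matches by rule priority):

```python
class Node:
    def __init__(self):
        self.children = {}   # '0'/'1' -> Node
        self.subtrie = None  # destination trie hung off a source prefix
        self.rule = None     # rule id stored at a destination prefix

def _insert(root, bits):
    node = root
    for b in bits:
        node = node.children.setdefault(b, Node())
    return node

def add_rule(src_root, src_prefix, dst_prefix, rule_id):
    src_node = _insert(src_root, src_prefix)
    if src_node.subtrie is None:
        src_node.subtrie = Node()
    _insert(src_node.subtrie, dst_prefix).rule = rule_id

def _best_dst_match(dst_root, dst_bits):
    """Longest-prefix match in one destination trie."""
    best, node = dst_root.rule, dst_root
    for b in dst_bits:
        node = node.children.get(b)
        if node is None:
            break
        if node.rule is not None:
            best = node.rule
    return best

def classify(src_root, src_bits, dst_bits):
    """Backtracking search: walk the source trie and, at every source
    prefix owning a destination sub-trie, search that sub-trie too;
    the last (longest source prefix) match wins in this sketch."""
    result, node = None, src_root
    if node.subtrie is not None:
        m = _best_dst_match(node.subtrie, dst_bits)
        if m is not None:
            result = m
    for b in src_bits:
        node = node.children.get(b)
        if node is None:
            break
        if node.subtrie is not None:
            m = _best_dst_match(node.subtrie, dst_bits)
            if m is not None:
                result = m
    return result

src = Node()
add_rule(src, "10", "0", "R1")   # src 10*, dst 0*
add_rule(src, "1", "01", "R2")   # src 1*,  dst 01*
assert classify(src, "101", "011") == "R1"
assert classify(src, "111", "010") == "R2"
assert classify(src, "000", "010") is None
```

The cost of visiting a destination sub-trie at every source-prefix node is exactly what the worst-case bound captures; a set pruning trie avoids the backtracking by copying rules downward, trading that time for storage.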