Recently there has been growing interest in applications of wireless sensor networks. Innovative techniques that improve energy efficiency to prolong the network lifetime are in high demand. Clustering is an ef...
ISBN:
(print) 9781424415595
Utilizing desktop grid infrastructures is challenging for parallel discrete event simulation (PDES) codes due to characteristics such as inter-process messaging, restricted execution, and overall lower concurrency than typical volunteer computing projects. The Aurora2 system uses an approach that simultaneously provides both replicated execution support and scalable performance of PDES applications through public resource computing. This is accomplished through a multi-threaded distributed back-end system, low overhead communications middleware, and an efficient client implementation. This paper describes the Aurora2 architecture and issues pertinent to PDES executions in a desktop grid environment that must be addressed when distributing back-end services across multiple machines. We quantify improvement over the first generation Aurora system through a comparative performance study detailing PDES programs with various scalability characteristics for execution over desktop grids.
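The core computation Aurora2 distributes is a discrete event simulation loop: events are processed in timestamp order, and each event may schedule further events. The sketch below is a minimal sequential version of that loop; all names are illustrative assumptions, not Aurora2's actual API.

```python
import heapq

def run_simulation(initial_events, handler, end_time):
    """Process (time, event) pairs in timestamp order until end_time.

    `handler(time, event)` may return new (time, event) pairs to schedule.
    Returns the number of events processed.
    """
    queue = list(initial_events)
    heapq.heapify(queue)              # min-heap keyed on timestamp
    processed = 0
    while queue and queue[0][0] <= end_time:
        t, ev = heapq.heappop(queue)
        for new in handler(t, ev) or []:
            heapq.heappush(queue, new)
        processed += 1
    return processed

# Example: a single "tick" event that reschedules itself every 1.0 time units.
count = run_simulation([(0.0, "tick")],
                       lambda t, ev: [(t + 1.0, "tick")],
                       end_time=5.0)
# count == 6  (events at t = 0, 1, 2, 3, 4, 5)
```

In a parallel (PDES) setting this loop is split across logical processes that exchange timestamped messages, which is exactly what makes desktop grids, with their restricted inter-process messaging, a hard target.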
ISBN:
(print) 9781595938299
This paper presents a comparison of scheduling algorithms applied in the context of load balancing the query traffic on distributed inverted files. We implemented a number of algorithms taken from the literature. We propose a novel method to formulate the cost of query processing so that these algorithms can be used to schedule queries onto processors. We avoid measuring load balance at the search engine side because this can lead to imprecise evaluation. Our method is based on the simulation of a bulk-synchronous parallel computer at the broker machine side. This simulation determines an optimal way of processing the queries and provides a stable baseline upon which both the broker and search engine can tune their operation in accordance with the observed query traffic. We conclude that the simplest load balancing heuristics are good enough to achieve efficient performance. Our method can be used in practice by broker machines to schedule queries efficiently onto the cluster processors of search engines. Copyright 2007 ACM.
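The "simplest load balancing heuristics" the abstract refers to can be as plain as greedy least-loaded assignment: send each query, with an estimated processing cost, to whichever processor currently has the smallest accumulated load. A sketch under that assumption (the cost values and function names are illustrative, not the paper's formulation):

```python
import heapq

def schedule(query_costs, num_procs):
    """Greedy least-loaded assignment; returns (assignment, sorted final loads)."""
    loads = [(0.0, p) for p in range(num_procs)]  # (accumulated load, processor id)
    heapq.heapify(loads)
    assignment = {}
    for qid, cost in enumerate(query_costs):
        load, p = heapq.heappop(loads)            # least-loaded processor
        assignment[qid] = p
        heapq.heappush(loads, (load + cost, p))
    return assignment, sorted(load for load, _ in loads)

assignment, final_loads = schedule([5, 3, 3, 2, 2, 1], num_procs=2)
# final_loads == [8.0, 8.0]: the queries split evenly across the two processors.
```

The broker-side simulation described in the abstract is what supplies the per-query cost estimates that such a heuristic consumes.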
ISBN:
(print) 9781424413072
An inductively-degenerated common-source (CS) open-drain cascode LNA was designed for W-CDMA application. The operating frequency for the design was 2.14 GHz, the center of the reception range of the W-CDMA standard. The supply voltage is 1.8 V in a 0.18 μm CMOS process. The LNA was designed using the power-constrained noise optimization method, yielding a transistor width of 290 μm. Post-layout simulations with distributed resistors and capacitors were performed. On-chip inductors with a quality factor of 8 were utilized to resonate with the metal-insulator-metal capacitor (mimcap). The mimcap was also used to isolate VDD and ground. The input was matched to 50 Ω using the transistor as well as an inductor at the gate and three parallel 1.65 nH inductors acting as a 0.55 nH degeneration inductor at the source. Detailed design steps are described in this paper, with plots of the post-layout simulation and measurement results provided. These plots are analyzed extensively and justifications for the discrepancies are given. The 12.8 dB of S21 obtained from the post-layout simulation and the much lower 7.8 dB from the measurement show an offset of 5 dB. Derivations are given to show that the unmatched output is the cause of the gain offset. S11 is measured at -24 dB, very close to the simulated value of -25.4 dB. The current measured and simulated at a bias voltage of 0.65 V is 4.1 mA.
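Two of the numbers quoted above can be checked directly (a quick sanity check, not the paper's design procedure): inductors in parallel combine like resistors in parallel, so three 1.65 nH inductors give the quoted 0.55 nH degeneration inductance, and the quoted gain offset is simply simulated minus measured S21.

```python
def parallel_inductance(inductances_nH):
    """Inductors in parallel combine reciprocally: 1/L_eff = sum(1/L_i)."""
    return 1.0 / sum(1.0 / L for L in inductances_nH)

L_eff = parallel_inductance([1.65, 1.65, 1.65])
# L_eff == 0.55 nH, matching the quoted degeneration inductance.

gain_offset_dB = 12.8 - 7.8   # simulated S21 minus measured S21
# gain_offset_dB == 5.0 dB, the offset the paper attributes to the unmatched output.
```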
ISBN:
(print) 1595937145
Virtual observatories will give astronomers easy access to an unprecedented amount of data. Extracting scientific knowledge from these data will increasingly demand both efficient algorithms and the power of parallel computers. Nearly all efficient analyses of large astronomical datasets use trees as their fundamental data structure. Writing efficient tree-based techniques, a task that is time-consuming even on single-processor computers, is exceedingly cumbersome on massively parallel platforms (MPPs). Most applications that run on MPPs are simulation codes, since the expense of developing them is offset by the fact that they will be used for many years by many researchers. In contrast, data analysis codes change far more rapidly, are often unique to individual researchers, and therefore accommodate little reuse. Consequently, the economics of the current high-performance computing development paradigm for MPPs does not favor data analysis applications. We have therefore built a library, called Ntropy, that provides a flexible, extensible, and easy-to-use way of developing tree-based data analysis algorithms for both serial and parallel platforms. Our experience has shown that not only does our library save development time, it can also deliver excellent serial performance and parallel scalability. Furthermore, Ntropy makes it easy for an astronomer with little or no parallel programming experience to quickly scale their application to a distributed multiprocessor environment. By minimizing development time for efficient and scalable data analysis, we enable wide-scale knowledge discovery on massive datasets. Copyright 2007 ACM.
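The tree-based pattern Ntropy generalizes looks roughly like this: partition the points into a tree whose nodes store spatial bounds, then answer queries (here, counting points within a radius) while pruning subtrees whose bounds cannot contain any answers. This 1-D toy is an illustrative sketch, not Ntropy's actual interface.

```python
def build(points):
    """Recursively split sorted points; each node stores its interval bounds."""
    if len(points) <= 2:
        return {"leaf": points, "lo": min(points), "hi": max(points)}
    mid = len(points) // 2
    return {"left": build(points[:mid]), "right": build(points[mid:]),
            "lo": points[0], "hi": points[-1]}

def count_in_range(node, q, r):
    """Count points p with |p - q| <= r, pruning non-intersecting subtrees."""
    if node["hi"] < q - r or node["lo"] > q + r:
        return 0                      # prune: node interval misses the query ball
    if "leaf" in node:
        return sum(1 for p in node["leaf"] if abs(p - q) <= r)
    return (count_in_range(node["left"], q, r)
            + count_in_range(node["right"], q, r))

tree = build(sorted([0.1, 0.4, 0.5, 0.9, 1.2, 2.0]))
# count_in_range(tree, 0.5, 0.45) == 4  (0.1, 0.4, 0.5, 0.9 lie within the ball)
```

On a parallel platform, the hard parts that a library like Ntropy takes over are distributing the tree across nodes and hiding the remote traversals, which is exactly the machinery individual analysis codes cannot afford to rewrite.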
Andrews et al. [Automatic method for hiding latency in high bandwidth networks, in: Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 257-265; Improved methods for hiding latency in high bandwidth networks, in: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 52-61] introduced a number of techniques for automatically hiding latency when performing simulations of networks with unit delay links on networks with arbitrary unequal delay links. In their work, they assume that processors of the host network are identical in computational power to those of the guest network being simulated. They further assume that the links of the host are able to pipeline messages, i.e., they are able to deliver P packets in time O(P + d) where d is the delay on the link. In this paper we examine the effect of eliminating one or both of these assumptions. In particular, we provide an efficient simulation of a linear array of homogeneous processors connected by unit-delay links on a linear array of heterogeneous processors connected by links with arbitrary delay. We show that the slowdown achieved by our simulation is optimal. We then consider the case of simulating cliques by cliques; i.e., a clique of heterogeneous processors with arbitrary delay links is used to simulate a clique of homogeneous processors with unit delay links. We reduce the slowdown from the obvious bound of the maximum delay link to the average of the link delays. In the case of the linear array we consider both links with and without pipelining. For the clique simulation the links are not assumed to support pipelining. The main motivation of our results (as was the case with Andrews et al.) is to mitigate the degradation of performance when executing parallel programs designed for different architectures on a network of workstations (NOW). In such a setting it is unlikely that the links provided by the NOW will support pipelining and it is quite probab...
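The improvement claimed for the clique-on-clique simulation, slowdown proportional to the average link delay rather than the maximum, is easy to appreciate numerically. The delay values below are made up for illustration only:

```python
# Host link delays; the guest clique being simulated has unit-delay links.
delays = [1, 1, 2, 2, 3, 15]

naive_slowdown = max(delays)                    # obvious bound: slowest link sets the pace
improved_slowdown = sum(delays) / len(delays)   # bound proven in the paper: average delay

# naive_slowdown == 15, improved_slowdown == 4.0: with one slow link among
# many fast ones, the average-based bound is nearly 4x better here.
```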