Server Load Balancing (SLB) is a popular technique for building high-availability web services such as those offered by Google and Amazon. Credit-based load balancing strategies have been proposed in the literature, in which the back-end servers dynamically report a metric called Credit, which reflects their current capacity, to the Load Balancer (LB). This enables the LB to adapt its load balancing strategy. The benefit of Credit-based SLB has been shown in simulations, but it has not yet been used in production systems because efficient implementations were missing. This paper presents the evaluation of an implementation of Credit-based SLB, the Self-Adapting Load Balancing Network (salbnet). We evaluate salbnet for a cluster of web servers. The measurements use a representative workload based on a Wikipedia trace and confirm the benefit of the self-adapting load balancing approach.
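The credit mechanism described above can be illustrated with a minimal sketch. It is not the salbnet implementation: the class name `CreditLoadBalancer`, the server names, and the rule that each dispatched request consumes one credit until the next report are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CreditLoadBalancer:
    """Toy load balancer that routes each request to the server with the most credit."""

    credits: dict[str, int] = field(default_factory=dict)

    def report(self, server: str, credit: int) -> None:
        # A back-end server reports its current capacity as a credit value.
        self.credits[server] = credit

    def dispatch(self) -> str:
        # Pick the server with the highest reported credit (ties go to the
        # alphabetically first server name).
        if not self.credits:
            raise RuntimeError("no server has reported credit yet")
        server = max(sorted(self.credits), key=lambda s: self.credits[s])
        # Assumption: one request consumes one credit until the next report.
        self.credits[server] -= 1
        return server


lb = CreditLoadBalancer()
lb.report("web1", 2)  # hypothetical server names and credit values
lb.report("web2", 3)
picks = [lb.dispatch() for _ in range(4)]
```

Because the servers themselves report the metric, the balancer needs no model of server load; it only compares the latest reported values.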
ISBN: 9780889869431 (print)
We investigate the relative significance of kernelization versus branching for parallel FPT implementations. Using the well-known vertex cover problem as a familiar example, we build and experiment with a testbed of five different classes of difficult graphs. For some, we find that kernelization alone obviates the need for parallelism. For others, we show that kernelization and branching work in synergy to produce efficient implementations. For still others, kernelization fails completely, leaving branching to solve the entire problem. Structural graph properties are studied in an effort to explicate this trichotomy. The NP-completeness of vertex cover makes scalability an extreme challenge. We mainly employ Hopper, named after the famous computing pioneer Admiral Grace Murray Hopper. The Hopper platform is currently one of the world's fastest supercomputers.
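To make the interplay of kernelization and branching concrete, here is a small sequential sketch for vertex cover. It is not the authors' parallel implementation: it assumes a simple graph without self-loops, uses only the textbook high-degree reduction rule as its kernelization step, and the function name `vertex_cover` is chosen for illustration.

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists."""
    edges = {frozenset(e) for e in edges}
    cover = set()
    # Kernelization (high-degree rule): a vertex with more than k incident
    # edges must belong to every cover of size <= k.
    changed = True
    while changed and k >= 0:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for v, d in degree.items():
            if d > k:
                cover.add(v)
                edges = {e for e in edges if v not in e}
                k -= 1
                changed = True
                break
    if k < 0:
        return None
    if not edges:
        return cover
    # After the reduction every vertex has degree <= k, so k vertices can
    # cover at most k * k edges.
    if k == 0 or len(edges) > k * k:
        return None
    # Branching: one endpoint of any remaining edge must be in the cover.
    u, v = tuple(next(iter(edges)))
    for pick in (u, v):
        rest = vertex_cover([e for e in edges if pick not in e], k - 1)
        if rest is not None:
            return cover | rest | {pick}
    return None


star = [(0, leaf) for leaf in range(1, 6)]
triangle = [(1, 2), (2, 3), (1, 3)]
star_cover = vertex_cover(star, 1)          # solved by kernelization alone
triangle_cover = vertex_cover(triangle, 2)  # requires branching
```

The star graph is settled entirely by the reduction rule, while the triangle survives it untouched and is solved by branching, which mirrors two of the three cases in the abstract on a toy scale.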
ISBN: 9780889869790 (print)
A distributed data store can satisfy at most two of three properties: (strict) consistency, availability, and partition tolerance. A distributed data store that satisfies availability and partition tolerance can still satisfy a weak form of consistency, in particular causal consistency, which is the strongest consistency that can coexist with the other two properties. Moreover, if the networks between nodes are free of faults and have very low latency, the data store can satisfy a consistency stronger than causal consistency. Sequential consistency is one such stronger model. To satisfy sequential consistency, a distributed data store must apply data changes in the same order on all nodes. In this paper, we propose a distributed data store model containing special nodes, called "casting nodes", and algorithms for deciding the order of operations. Thanks to the casting nodes, our model satisfies sequential consistency when all networks are connected and causal consistency when any network is disconnected.
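The requirement that every node applies changes in the same order can be sketched with a generic sequencer scheme. This is an assumption-laden illustration, not the paper's casting-node algorithm: the `Sequencer` and `Replica` classes are hypothetical, and the sketch covers only the fully connected case.

```python
import itertools


class Sequencer:
    """Hypothetical ordering node: stamps every write with a global sequence number."""

    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, key, value):
        return (next(self._counter), key, value)


class Replica:
    """Applies stamped writes strictly in sequence-number order."""

    def __init__(self):
        self.store = {}
        self._next = 0      # next sequence number this replica may apply
        self._pending = {}  # writes that arrived ahead of their turn

    def deliver(self, message):
        seq, key, value = message
        self._pending[seq] = (key, value)
        # Apply every write whose predecessors have all been applied.
        while self._next in self._pending:
            k, v = self._pending.pop(self._next)
            self.store[k] = v
            self._next += 1


sequencer = Sequencer()
m0 = sequencer.stamp("x", 1)
m1 = sequencer.stamp("x", 2)
m2 = sequencer.stamp("y", 3)

a, b = Replica(), Replica()
for message in (m0, m1, m2):
    a.deliver(message)
for message in (m2, m1, m0):  # same writes, reversed network arrival order
    b.deliver(message)
```

Both replicas end in the same state even though the messages arrive in different orders, because each replica buffers a write until all earlier sequence numbers have been applied.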
ISBN: 9780889868649 (print)
The idea behind Cloud computing is to deliver Infrastructure-, Platform-, and Software as a Service (IaaS, PaaS, and SaaS) on a simple pay-per-use basis. In this paper, we introduce our work, OSGi Service Platform as a Service (OSPaaS), a PaaS model for running an OSGi service platform in the cloud for e-Learning and teaching purposes. OSPaaS leverages OpenNebula, a virtual infrastructure manager, to dynamically launch virtual machines (VMs) on idle resources or dedicated servers. In addition, OSPaaS uses Shibboleth as a Single Sign-On mechanism for seamless authentication and authorization. To assess the suitability of OSGi for cloud computing, this paper investigates and analyzes three OSGi frameworks: Knopflerfish, Equinox, and Apache Felix. Subsequently, an OSPaaS architecture is presented and described. Finally, this paper shows a use case scenario and the advantages of OSPaaS for e-Learning and teaching purposes.
ISBN: 9780889867840 (print)
A modern high-performance multi-core processor has large shared cache memories. However, simultaneously running threads do not always require the entire capacity of the shared caches. Moreover, some threads cause severe performance degradation through inter-thread cache conflicts and a shortage of capacity in the shared cache. To achieve high performance on multi-core processors, effective use of the shared cache memories plays an important role. In this paper, we propose a cache-aware thread scheduling policy for multi-core processors with multiple shared cache memories. The total processor performance becomes more sensitive to a cache capacity shortage as the threads sharing one cache request larger capacities. The proposed policy prevents multiple threads that request a large cache capacity from sharing one cache. As a result, the policy prevents inter-thread resource conflicts and hence severe performance degradation. Experimental results clearly demonstrate that the policy assists the cache partitioning mechanisms and avoids unfair performance degradation among threads. Thread scheduling based on the proposed policy improves performance by up to 10%, and by 5% on average, compared with thread scheduling without the proposed policy.
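The goal of keeping cache-hungry threads on different shared caches can be sketched with a simple greedy heuristic. This is not the paper's policy: the per-thread cache demand values are assumed to be known in advance, and the function name `assign_threads` is hypothetical.

```python
def assign_threads(demands, num_caches, cores_per_cache):
    """Greedily group threads so that large cache demands land on different caches.

    demands maps a thread name to its estimated cache demand (assumed known).
    Returns one list of thread names per shared cache.
    """
    if len(demands) > num_caches * cores_per_cache:
        raise ValueError("more threads than cores")
    groups = [[] for _ in range(num_caches)]
    load = [0.0] * num_caches
    # Place the most demanding threads first, each on the least loaded cache
    # that still has a free core.
    for thread in sorted(demands, key=demands.get, reverse=True):
        candidates = [c for c in range(num_caches) if len(groups[c]) < cores_per_cache]
        target = min(candidates, key=lambda c: load[c])
        groups[target].append(thread)
        load[target] += demands[thread]
    return groups


# Hypothetical demands, for example in cache ways requested by each thread.
groups = assign_threads({"A": 8, "B": 7, "C": 1, "D": 2}, num_caches=2, cores_per_cache=2)
```

In this example the two demanding threads A and B end up on different caches, each paired with a thread that needs little capacity.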
ISBN: 9780889869431 (print)
This paper presents a distributed on-line service selection (probe/access) scheme: an optimal stopping web service selection scheme based on the rate-of-return problem from optimal stopping theory. Our scheme differs from conventional schemes in three ways. First, it does not need to probe all web services; it probes only a few. Second, it focuses on maximizing the average Quality of Service (QoS) return per unit of cost over all stages of probe and access over a long period, rather than maximizing the QoS return of a single stage of probe and access as conventional schemes do. Third, it develops a return function based on three factors that have seldom been considered together: QoS return, the user's requirement, and probe cost. Through theoretical analysis and computation, we demonstrate that our scheme offers these additional advantages over conventional schemes while achieving equally good performance.
ISBN: 9780889868205 (print)
Many attempts have been made to optimize the median filter in both software and hardware. An architectural design of hardware capable of performing real-time median filtering is presented. The architecture uses the histogram approach to calculate the median, while optimizing the sliding window method to reuse all its calculations. Data is output row by row, and every input pixel is processed only once. The design is independent of window size and image size, and allows more processing elements to be added to handle wider images. The control unit design is minimized to enable self-adjustment of plug-and-play processing elements. The architecture is implemented in VHDL and synthesized for a Virtex-2 Pro FPGA. The architecture's performance and operation are compared to previous work.
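The histogram approach with a sliding window can be sketched in software as follows. This is a plain sequential illustration, not the VHDL architecture: it assumes an 8-bit grayscale image stored as a list of rows, leaves border pixels unchanged, and rebuilds the histogram at the start of each row, so it reuses less work than the design described above.

```python
def median_filter(image, radius):
    """Median-filter the interior of an 8-bit grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    size = 2 * radius + 1
    half = (size * size) // 2        # 0-based rank of the median in the window
    out = [row[:] for row in image]  # border pixels are copied unchanged
    if width < size:
        return out
    for y in range(radius, height - radius):
        # Build the histogram of the first window in this row.
        hist = [0] * 256
        for dy in range(-radius, radius + 1):
            for x in range(size):
                hist[image[y + dy][x]] += 1
        for x in range(radius, width - radius):
            if x > radius:
                # Slide right: drop the column that left, add the one that entered.
                for dy in range(-radius, radius + 1):
                    hist[image[y + dy][x - radius - 1]] -= 1
                    hist[image[y + dy][x + radius]] += 1
            # Walk the histogram until the cumulative count passes the median rank.
            count = 0
            for value in range(256):
                count += hist[value]
                if count > half:
                    out[y][x] = value
                    break
    return out


noisy = [[1, 2, 3], [4, 100, 6], [7, 8, 9]]
filtered = median_filter(noisy, 1)
```

Sliding the window by one pixel touches only two columns of the window instead of all of it, which is the reuse the histogram method relies on.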
ISBN: 9780889868649 (print)
This paper proposes a task-allocation optimization method for reducing contention. There have been some attempts to optimize task allocation by minimizing the product of the amount of communication and the number of communication hops. However, since those methods do not consider the occurrence of contention, their effect has not been sufficient. The method proposed in this paper uses information about concurrent communication to estimate the effect of contention and find the optimal task allocation. Of the three environments examined in the experiments, the proposed method showed a better effect than the existing method in two, tree and fat tree. In these environments, the maximum performance gain over the existing method was about 25%. On the other hand, in a mesh environment, IBM BlueGene/L, the existing method showed a better effect than the proposed method. As one of the reasons for this, the influence of the packet priority on the BlueGene/L network on the behavior of the proposed method is discussed.
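The existing objective mentioned above, the product of communication amount and hop count, can be written down in a few lines. This sketch covers only that baseline metric, not the proposed contention-aware method; it assumes a 2D mesh with Manhattan-distance routing, and the task names and traffic volumes are hypothetical.

```python
def hop_bytes(traffic, placement):
    """Sum of (communication volume x hop count) over all communicating task pairs.

    traffic maps (task_a, task_b) to a volume in bytes; placement maps each
    task to its (x, y) node coordinates on a 2D mesh.
    """
    total = 0
    for (a, b), volume in traffic.items():
        (ax, ay), (bx, by) = placement[a], placement[b]
        hops = abs(ax - bx) + abs(ay - by)  # assumption: Manhattan routing
        total += volume * hops
    return total


traffic = {("t0", "t1"): 100, ("t1", "t2"): 10}
near = {"t0": (0, 0), "t1": (0, 1), "t2": (1, 1)}
far = {"t0": (0, 0), "t1": (1, 1), "t2": (0, 1)}
cost_near = hop_bytes(traffic, near)
cost_far = hop_bytes(traffic, far)
```

The metric rewards placing heavily communicating tasks close together, but it says nothing about whether several messages compete for the same link at the same time, which is the gap the abstract identifies.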
ISBN: 9780889866379 (print)
We consider the problem of scheduling parallel applications, represented by directed acyclic graphs (DAGs), onto Grid-style resource pools. The core issue is that the availability and performance of grid resources, which are already heterogeneous by nature, can be expected to vary dynamically, even during the course of an execution. Typical scheduling methods in the literature only partially address this issue because they consider static heterogeneous computing environments (i.e., heterogeneous resources are dedicated and unchanging over time). This paper presents the Grid Task Positioning (GTP) scheduling method, which addresses the problem by allowing an executing application to be rescheduled in response to significant variations in resource characteristics. GTP considers the impact of partial completion of tasks and of task migration. We compare the performance of GTP with that of the well-known, static Heterogeneous Earliest Finish Time (HEFT) algorithm.
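For readers unfamiliar with the HEFT baseline, its task-prioritization phase can be sketched as below. This covers only the upward-rank computation that HEFT uses to order tasks, not processor selection and not GTP; the average computation and communication costs are assumed to be given, and the small DAG is hypothetical.

```python
from functools import lru_cache


def upward_ranks(avg_cost, comm_cost, successors):
    """Compute the HEFT upward rank of every task in a DAG.

    avg_cost maps a task to its average computation cost, comm_cost maps an
    edge (task, successor) to its average communication cost, and successors
    maps a task to its immediate successors.
    """

    @lru_cache(maxsize=None)
    def rank(task):
        # rank(t) = cost(t) + max over successors s of (comm(t, s) + rank(s))
        return avg_cost[task] + max(
            (comm_cost.get((task, s), 0) + rank(s) for s in successors.get(task, ())),
            default=0,
        )

    return {task: rank(task) for task in avg_cost}


avg_cost = {"A": 2, "B": 3, "C": 4, "D": 1}
comm_cost = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 1}
successors = {"A": ("B", "C"), "B": ("D",), "C": ("D",)}
ranks = upward_ranks(avg_cost, comm_cost, successors)
order = sorted(ranks, key=ranks.get, reverse=True)  # highest rank scheduled first
```

Because the ranks are computed once from fixed cost estimates, the resulting order cannot react to resources that change during execution, which is the static behavior the abstract contrasts with rescheduling.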
ISBN: 9780889868205 (print)
High-performance computing has been very useful to researchers in bioinformatics, medicine, and related fields. The bioinformatics domain is rich in applications that require extracting useful information from very large and continuously growing sequence databases. Automated techniques such as DNA sequencers, DNA microarrays, and others continually grow the datasets stored in large public databases such as GenBank and the Protein Data Bank. Most methods used for analyzing genetic and protein data have been found to be extremely computationally intensive, which motivates the use of powerful computers or systems with high throughput characteristics. In this paper, we provide a case study of one such bioinformatics application, BLAT, running in a high-performance computing environment. We use sequences gathered from researchers and parallelize the runs to study the performance characteristics under three different query and data partitioning models. This research highlights the need to develop a parallel model carefully, with energy awareness in mind and based on an understanding of the application, so that the model works well for the specific application and domain. We found that the BLAT program is highly parallelizable and that a high degree of speedup is achievable. The experiments suggest that the speedup depends on the model used for query and database segmentation.