This paper presents a concept called virtual clusters (VCs) to allocate resources for an application from a computing utility with a geographically distributed resource base. The VC creation process is modeled as a facility location problem, and an efficient heuristic is devised to solve it. We extend the model to include an "overload partition" in a VC so that demand surges can be handled efficiently. Extensive simulations have been conducted to examine the performance of VCs under different scenarios and to compare it with a fully dynamic scheme called the Service Grid. The results indicate that VCs are more cost-effective and robust than the Service Grid.
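To make the facility location view concrete, the following is a minimal sketch of a generic greedy heuristic. It is not the heuristic devised in the paper, which the abstract does not specify. The function name `greedy_facility_location`, the inputs `open_cost` and `link_cost`, and the cost model (a fixed cost per opened site plus a per-client service cost) are all assumptions made for illustration.

```python
def greedy_facility_location(open_cost, link_cost):
    """Generic greedy heuristic (illustrative assumption, not the paper's method).

    open_cost[s]    -- hypothetical fixed cost of opening resource site s
    link_cost[s][c] -- hypothetical cost of serving client c from site s
    Returns (opened sites in opening order, client -> site assignment, total cost).
    """
    clients = sorted({c for row in link_cost.values() for c in row})

    # Start with the single site that serves all clients most cheaply.
    first = min(open_cost,
                key=lambda s: open_cost[s] + sum(link_cost[s][c] for c in clients))
    opened = [first]
    best = {c: link_cost[first][c] for c in clients}  # cheapest service cost so far

    def gain(s):
        # Service cost saved by opening s, minus the cost of opening it.
        saved = sum(max(0, best[c] - link_cost[s][c]) for c in clients)
        return saved - open_cost[s]

    while True:
        candidates = [s for s in open_cost if s not in opened]
        if not candidates:
            break
        s = max(candidates, key=gain)
        if gain(s) <= 0:  # no remaining site lowers the total cost
            break
        opened.append(s)
        for c in clients:
            best[c] = min(best[c], link_cost[s][c])

    assignment = {c: min(opened, key=lambda s: link_cost[s][c]) for c in clients}
    total = sum(open_cost[s] for s in opened) + sum(best.values())
    return opened, assignment, total
```

The greedy rule opens one site at a time and stops as soon as opening another site would cost more than it saves. It carries no optimality guarantee and serves only to show the shape of the problem.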
Program analysis is an important activity to evaluate and subsequently improve the quality of software. Many different visualization tools offer more or less sophisticated functionality for this task. However, the visual capabilities of a tool are usually pre-defined by the tool developers' intentions or are only marginally adaptable to the user's needs. In contrast, the VisWiz tool offers a means of providing user-defined visualization for the analysis of parallel and distributed programs. By configuring the mapping of observed events and their relations in an XML configuration file, users are able to develop specialized graphical displays that better suit their expectations and improve program comprehension. Examples of VisWiz are given for debugging, performance tuning, and runtime monitoring of parallel and distributed programs.
On parallel systems, jobs that request a large fraction of the maximum resources available on the system may incur long wait times. This paper evaluates whether giving a reservation to every waiting job can improve the performance of large jobs without significantly degrading the performance of other jobs. Using a wide range of workloads, including more recent workloads than the SP2 workloads, and a more complete set of performance measures than in previous studies, we provide new observations on the potential benefits and problems of reservation policies that give all jobs a reservation.
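A policy that gives every waiting job a reservation can be sketched in the spirit of conservative backfilling. The code below is an illustrative assumption and not one of the policies evaluated in the paper. The names `reserve_all`, `fits`, and `used_at` are hypothetical, as is the job format `(name, nodes, runtime)` with a positive runtime estimate.

```python
def used_at(reservations, t):
    """Nodes in use at time t, given (start, end, nodes) reservations."""
    return sum(n for s, e, n in reservations if s <= t < e)


def fits(reservations, start, end, nodes, total_nodes):
    """True if `nodes` are free during the whole interval [start, end)."""
    # Usage only rises at reservation starts, so checking those points suffices.
    points = [start] + [s for s, _, _ in reservations if start < s < end]
    return all(used_at(reservations, p) + nodes <= total_nodes for p in points)


def reserve_all(jobs, total_nodes):
    """Give each job, in queue order, the earliest start that delays no earlier job.

    jobs: list of (name, nodes, runtime) tuples (hypothetical format).
    Returns a dict mapping job name to reserved start time.
    """
    reservations = []
    starts = {}
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name} requests more nodes than the system has")
        # The earliest feasible start is time 0 or the end of an existing reservation.
        candidates = sorted({0} | {end for _, end, _ in reservations})
        for t in candidates:
            if fits(reservations, t, t + runtime, nodes, total_nodes):
                reservations.append((t, t + runtime, nodes))
                starts[name] = t
                break
    return starts
```

On a hypothetical 10-node system, the queue `[("a", 6, 10), ("big", 8, 5), ("small", 4, 10), ("tiny", 4, 5)]` gives `big` a reservation at time 10. The job `small` backfills at time 0 because it finishes before that reservation begins, while `tiny` has to wait until time 15.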
As the complexity of chip designs increases, simulation time also increases. Unit and variable delay simulation takes the most simulation time in the IC design process; however, parallel processing performs inefficiently due to the large amount of synchronization. In this paper, techniques to reduce the number of synchronization points in synchronous designs are proposed, along with a partitioner that partitions designs along flip-flop boundaries so that these techniques can be employed on real designs.
This paper describes a general technique to identify control flow errors in parallel programs, which can be automated in a compiler. The compiler builds a system of linear equations that describes the global control flow of the whole program. Solving these equations using standard techniques of linear algebra can locate a wide range of control flow bugs at compile time. This paper also describes an implementation of this control flow analysis technique in a prototype compiler for a well-known parallel programming language. In contrast to previous research in automated parallel program analysis, our technique is efficient for large programs and does not limit the range of language features.
A methodology is presented that allows for the distributed execution of systems on several microcontrollers and an FPGA (Field Programmable Gate Array). By using an FPGA, the system performance can be increased significantly by means of parallel processing. The focus is on hybrid electronic systems, which contain both state-based and continuous model parts. In order to fulfill real-time requirements, a real-time operating system is used. To measure the system performance, a method for analyzing the timing behavior is presented. It provides a graphical representation of the execution time intervals and execution points in time of the tasks, allows idle times to be recognized, and thus supports the optimization of the task scheduling. The data exchange is realized with CAN (Controller Area Network).
Update methods are an important aspect of the burgeoning Artificial Life research area. Artificial Life models, like the Predator-Prey model, operate quite efficiently when implemented sequentially only while population numbers are low to moderate. We find that for large populations, sequential implementations are too slow to extract meaningful measurement statistics. In this paper, we discuss the parallelisation of sequential update methods for use in Artificial Life systems. We also discuss the ramifications of parallel update algorithms for data dependencies and the meaning of correctness in parallel models.
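The data dependency issue can be seen in a toy example. The sketch below is an assumption made for illustration and does not implement the Predator-Prey model or any update method from the paper. It applies one hypothetical rule (each cell copies its left neighbour on a ring) in two ways: sequentially in place, and synchronously from a snapshot, which is the form that parallelises naturally.

```python
def sequential_update(cells):
    """Update cells one by one in place; later cells see earlier updates."""
    cells = list(cells)
    for i in range(len(cells)):
        cells[i] = cells[i - 1]  # index -1 wraps around to the last cell
    return cells


def synchronous_update(cells):
    """Compute every new cell from the old state only (double buffering)."""
    old = list(cells)
    return [old[i - 1] for i in range(len(old))]


state = [1, 0, 0, 0]
print(sequential_update(state))   # [0, 0, 0, 0]
print(synchronous_update(state))  # [0, 1, 0, 0]
```

The two schemes disagree on the same input: the in-place loop overwrites the first cell before its right neighbour reads it. This is why the notion of correctness has to be stated explicitly when a sequential update method is parallelised.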
This paper presents an XML-based data format for the parallel numerical integration package PARINT. As with many other numeric computation programs, PARINT accepts a long list of arguments for describing the user's problem, selecting the algorithm to be used, and specifying parallel run characteristics. Supporting XML input allows platform-independent creation and manipulation of input specifications and simplifies the addition of new integration algorithms. We discuss the purpose of each section in the proposed XML data format and describe how new sections can be added to the XML data structure in order to support new computing paradigms. We also explain how data are processed efficiently and give some application examples. The format can serve more generally for various software packages.
This paper presents the coordinated virtual partition (CVP) for Grid computing systems. The CVP is a way of regulating the resources supplied to different components of an application in unison according to an agreed relative proportion. This study shows that coordinated resource provisioning has several benefits, including (a) reducing the wait times experienced by an application and (b) improving the overall application performance by reducing the wait times. The CVP achieves these benefits by releasing resources from "fast" running application components so that the Grid can reallocate them to other applications.
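One possible reading of regulating resources "in unison according to an agreed relative proportion" is sketched below. This is an interpretation made for illustration, not the CVP mechanism itself, which the abstract does not detail. The function `coordinate` and its inputs `available` and `proportion` are hypothetical names.

```python
def coordinate(available, proportion):
    """Scale all components to the agreed proportion, limited by the scarcest one.

    available[c]  -- resources currently obtainable for component c (assumed input)
    proportion[c] -- agreed relative share of component c (assumed input, > 0)
    Returns (allocated, released): what each component keeps and what it gives back.
    """
    # The most constrained component determines the common scale factor.
    scale = min(available[c] / proportion[c] for c in proportion)
    allocated = {c: scale * proportion[c] for c in proportion}
    released = {c: available[c] - allocated[c] for c in proportion}
    return allocated, released


allocated, released = coordinate(
    available={"a": 8, "b": 6, "c": 2},
    proportion={"a": 2, "b": 1, "c": 1},
)
print(allocated)  # {'a': 4.0, 'b': 2.0, 'c': 2.0}
print(released)   # {'a': 4.0, 'b': 4.0, 'c': 0.0}
```

In this reading, component `c` is the bottleneck. The surplus held by the faster components `a` and `b` is released so that it could be reallocated to other applications.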
We present a novel dynamic on-the-fly race detection mechanism called the parallel Nondeterminator, which checks for determinacy races during the parallel execution of a program with Spawn-Sync parallelism. The parallel Nondeterminator provides provable correctness and efficiency. Let D denote the maximum depth of the recursion in the parallel program. The worst-case slowdown in execution incurred for each spawn operation is O(D), the overhead for each sync operation is O(1), and the time required to monitor any shared memory access is O(log D). Moreover, we have implemented the parallel Nondeterminator in Cilk, a parallel language developed at MIT. Both theoretical and experimental results give strong evidence for the efficiency of our algorithm.