This article introduces SYD, a tool for generating timestamps synchronized with Coordinated Universal Time (UTC); the implementation is written in Java and includes an NTP client. Possible applications include (but are not limited to) measuring the performance of distributed applications, debugging them, and implementing real-time requirements in distributed systems. The features that characterize SYD are its portability across platforms, its low network overhead, and the absence of interference with other applications running on the same host (SYD does not synchronize the system clock!). In this paper, we prove that the choice of Java does impact the performance of the tool when compared with an equivalent application written in C. SYD is available on the Web as public domain software.
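The idea of producing UTC-aligned timestamps without touching the system clock can be illustrated with the standard NTP offset and delay formulas. The following is only a minimal sketch under stated assumptions, not SYD's actual implementation: the names `NtpSample`, `clock_offset`, `round_trip_delay`, and `OffsetClock` are hypothetical, and the four timestamps are assumed to come from a single NTP request/reply exchange.

```python
import time
from dataclasses import dataclass


@dataclass
class NtpSample:
    """Four timestamps of one NTP exchange (hypothetical container)."""
    t0: float  # request sent, read on the local clock
    t1: float  # request received, read on the server clock
    t2: float  # reply sent, read on the server clock
    t3: float  # reply received, read on the local clock


def clock_offset(s: NtpSample) -> float:
    # Standard NTP estimate of (server clock - local clock),
    # assuming the network delay is symmetric.
    return ((s.t1 - s.t0) + (s.t2 - s.t3)) / 2.0


def round_trip_delay(s: NtpSample) -> float:
    # Time spent on the network, excluding server processing time.
    return (s.t3 - s.t0) - (s.t2 - s.t1)


class OffsetClock:
    """Keeps a correction term instead of adjusting the system clock."""

    def __init__(self, local_clock=time.time):
        self._local_clock = local_clock
        self.offset = 0.0

    def update(self, sample: NtpSample) -> None:
        self.offset = clock_offset(sample)

    def timestamp(self) -> float:
        # Local reading plus the last estimated offset.
        return self._local_clock() + self.offset
```

Because the correction lives only inside `OffsetClock`, other applications on the same host keep seeing the unmodified system clock, which is the property the abstract emphasizes.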
Most successful examples of parallel simulation models were developed for parallel execution from the beginning. A number of simulation models are designed only for sequential simulation, even in languages like PARSEC that support both sequential and parallel simulation algorithms. Converting such simulation models to a form that yields good performance with a parallel implementation can be non-trivial. In this paper we describe a case study showing this conversion process for a simulation model of replicated file systems. The major steps taken in converting the simulation into a parallel simulation are presented: correctness changes; performance changes such as communication topology simplification and lookahead specification; and modeling changes to eliminate performance bottlenecks. The details and performance improvements of each step are presented in this paper.
Although many approaches exist for classifying computer architectures, no system is yet able to distinguish parallel computers adequately. The first part of this paper is a brief survey of the main characteristics of various parallel architecture concepts and of several approaches to classification. We realized that the idea of defining a sharp classification system is questionable in the world of parallel computing, but we also recognized important structural similarities among computing systems. We therefore introduce a modeling system, called ρ (recursive hierarchical objects), which allows the structure of various computing systems to be described hierarchically. However, we do not aim at a new classification system but at a scheme for modeling parallel architectures.
Leader election is a fundamental problem in distributed computing and is relevant to a wide range of applications. To solve this problem, it is possible and convenient to exploit the topological properties of the specific distributed system, so as to reduce time and message complexity. In this paper we study the problem of leader election in a hypercube network under the assumption that the system possesses a sense of direction, i.e. it is capable of distinguishing between adjacent communication links; two new optimal algorithms are presented. The correctness of the proposed algorithms does not depend on the simultaneous activation of a subset of the processors in the network: the wake-up of a single processor suffices. The time and message complexity of both algorithms makes them competitive with other solutions found in the literature.
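To make the notion of sense of direction concrete, the sketch below labels each link of a node with the dimension it spans, so that link `dim` of node `node` leads to `node XOR 2^dim`. It then runs a simple synchronous, dimension-by-dimension exchange that elects the processor with the largest identifier. This is an illustration only and not one of the two algorithms of the paper: it assumes all processors start at once and uses N log N messages, and the names `neighbor` and `elect_leader` are hypothetical.

```python
def neighbor(node: int, dim: int) -> int:
    # With a sense of direction, link `dim` always flips bit `dim`
    # of the node address.
    return node ^ (1 << dim)


def elect_leader(ids, d):
    """Simulate a synchronous election on a d-dimensional hypercube.

    ids[node] is the identifier held by the processor at address `node`.
    Returns (per-node view of the leader id, total messages sent).
    """
    n = 1 << d
    if len(ids) != n:
        raise ValueError("need exactly 2**d identifiers")
    best = list(ids)
    messages = 0
    for dim in range(d):
        # Every node sends its current candidate across link `dim`.
        incoming = [best[neighbor(node, dim)] for node in range(n)]
        messages += n
        # Each node keeps the larger of its own and the received candidate.
        best = [max(own, got) for own, got in zip(best, incoming)]
    return best, messages


if __name__ == "__main__":
    views, sent = elect_leader([7, 3, 9, 1, 4, 8, 2, 6], d=3)
    print(views, sent)
```

After round `dim`, every node knows the maximum identifier within its subcube of dimension `dim + 1`, so after `d` rounds all nodes agree on the same leader.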
The Scalable Simulation Framework (SSF) is an effort established to address the concern about tool quality. This work addresses the concern over unpredictable behavior. It is shown how the internal overheads of the Dartmouth implementation of the SSF API (DaSSF) have been measured, and how such measurements can be used to predict the performance of a given model, using given features of the simulator, without having to run, or even build, the model.
In a journalistic scenario, new dimensions of cooperative feature writing are described. Composing a highly up-to-date article on economic events with global uses and consequences, under the tight time constraints of a daily newspaper, means drawing on the geographic competence of correspondents and of economic or political experts and on their latest observations and views (in contrast to relying heavily on archived material, as even weekly magazines do). This requires novel forms of cooperative work, including distributed on-line writing of feature sections, which would have to be supported by innovative distributed operating system services, in particular complex group-based authorization and authentication procedures. The related aspects of our distributed operating system DRAGON SLAYER III are presented, the utilization of services is explained in the context of the scenario, and the authentication of all users and sites participating in the scenario is described. Outlines of ongoing and future work are also discussed.
The High Level Architecture (HLA) provides the specification of a software architecture for distributed simulation. The baseline definition of the HLA includes the HLA Rules, the HLA Interface Specification, and the HLA Object Model Template (OMT). The HLA Rules are a set of 10 basic rules that define the responsibilities and relationships among the components of an HLA federation. The HLA Interface Specification specifies the functional interfaces between HLA federates and the HLA Runtime Infrastructure. The HLA OMT provides a common presentation format for HLA Simulation and Federation Object Models. The HLA was developed over the past three years. It is currently being applied with simulations developed for analysis, training, and test and evaluation, and is being incorporated into industry standards for distributed simulation by both the Object Management Group and the IEEE. This paper discusses key areas where there are technology challenges in the future implementation and application of the HLA.
We present two different algorithms implemented through neural networks on a multiprocessor device, the parallel single-chip TI TMS320C80 Multimedia Video Processor (MVP). The goal of this experimentation is to test, on real problems, the performance of this powerful unit, which is made up of one Master RISC Processor and four Slave Digital Signal Processors (DSPs), and to evaluate its suitability for neural network applications. The first problem implemented is a typical classification algorithm in which the network recognises which points belong to different regions inside a 2D space. The second problem is computationally heavier and consists of a network able to recognise 'handwritten' digits. The parallel version of the first algorithm was also tested on a commercially available supercomputer.
PQE2000 is an Italian project on High Performance Computing (HPC) whose goal is the realization of innovative general-purpose systems and programming tools, as well as the development of new strategic HPC applications for industry, commerce and public services. The research activities of PQE2000 include MPP architecture, software tools and programming environments, and applications in technical, transactional and new media areas. The results of these activities are transferred into industrial products. This paper describes the general guidelines, the current products and technologies of PQE2000, and the trends in the relationship between programming environments and architectural models for the efficient exploitation of a large degree of variable-grain parallelism in complex applications.
In this work, we propose a heuristic algorithm based on a genetic algorithm for the task-to-processor mapping problem in the context of local-memory multiprocessors with a hypercube interconnection topology. Hypercube multiprocessors offer a cost-effective and feasible approach to supercomputing through parallelism at the processor level: they directly connect a large number of low-cost processors with local memory, which communicate by message passing instead of shared variables. We use concepts from graph theory (task precedence graphs to represent parallel programs, graph partitioning to solve the program decomposition problem, etc.) to model the problem. The problem is NP-complete, which means that heuristic approaches must be adopted. We develop a heuristic algorithm based on genetic algorithms to solve it.
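A genetic search over task-to-processor mappings on a hypercube can be sketched as follows. This is a generic illustration under stated assumptions, not the algorithm proposed in the paper: the cost function (communication weight times Hamming distance plus a load-imbalance penalty), the one-point crossover, the elitist selection, and every name and parameter value are hypothetical choices, and precedence constraints are ignored.

```python
import random


def hamming(a: int, b: int) -> int:
    # Hop distance between two hypercube nodes.
    return bin(a ^ b).count("1")


def mapping_cost(mapping, edges, n_procs, load_penalty=10):
    # Communication cost: edge weight times hop distance.
    comm = sum(w * hamming(mapping[u], mapping[v]) for u, v, w in edges)
    # Load imbalance: tasks on the busiest processor beyond the ideal share.
    loads = [0] * n_procs
    for proc in mapping:
        loads[proc] += 1
    ideal = -(-len(mapping) // n_procs)  # ceiling division
    return comm + load_penalty * (max(loads) - ideal)


def genetic_mapping(n_tasks, edges, dim, pop_size=30, generations=60,
                    mutation_rate=0.1, seed=0):
    """Return a mapping task -> processor; needs n_tasks >= 2, pop_size >= 4."""
    rng = random.Random(seed)
    n_procs = 1 << dim

    def cost(m):
        return mapping_cost(m, edges, n_procs)

    pop = [[rng.randrange(n_procs) for _ in range(n_tasks)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]  # elitist: the best half is kept
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tasks)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            for t in range(n_tasks):  # per-gene mutation
                if rng.random() < mutation_rate:
                    child[t] = rng.randrange(n_procs)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)


if __name__ == "__main__":
    # Hypothetical task graph: (task_u, task_v, communication weight).
    edges = [(0, 1, 5), (1, 2, 3), (2, 3, 4), (3, 0, 2), (4, 5, 6), (5, 6, 1)]
    best = genetic_mapping(n_tasks=8, edges=edges, dim=2)
    print(best, mapping_cost(best, edges, n_procs=4))
```

Because the best half of each generation survives unchanged, the best cost in the population never gets worse from one generation to the next. The load penalty prevents the trivial solution of placing every task on a single processor.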