Distributed real-time control systems require a powerful communication link between remote subsystems. The Controller Area Network (CAN) was developed to support such applications. CAN is a serial bus offering high speed, high reliability, and low cost for distributed real-time control applications. It is a desirable, inexpensive solution for networks in industrial environments, but there is a limit on the maximum length of a single CAN bus. A solution is to divide the CAN network into segments and connect them using bridges. Bridges are high-performance devices used to interconnect LANs at the Logical Link Control (LLC) or Medium Access Control (MAC) level of the protocol hierarchy. Unlike messages in many serial communication protocols, a CAN message contains no destination or source address. Because of this, traditional address-based bridges cannot be used to connect CAN segments. The aim of this study is to design and implement a bridge that connects CAN segments based on features of the CAN protocol.
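The abstract does not describe the bridge design itself. As a minimal sketch of one way a bridge could work without addresses, the Python below forwards frames by message identifier only. The class names, the segment labels, and the subscription table are hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class CanFrame:
    """A classical CAN data frame: an identifier plus up to 8 data bytes.

    Note that there is no source or destination address field.
    """
    can_id: int   # 11-bit standard identifier
    data: bytes   # payload, 0 to 8 bytes


@dataclass
class IdFilterBridge:
    """Hypothetical bridge that forwards by identifier (an assumption)."""
    # segment name -> identifiers that some node on that segment consumes
    wanted: Dict[str, Set[int]] = field(default_factory=dict)

    def subscribe(self, segment: str, can_id: int) -> None:
        """Record that `segment` needs frames carrying `can_id`."""
        self.wanted.setdefault(segment, set()).add(can_id)

    def forward_targets(self, source_segment: str, frame: CanFrame) -> List[str]:
        """Return the segments, other than the source, that need this frame."""
        return sorted(
            segment
            for segment, ids in self.wanted.items()
            if segment != source_segment and frame.can_id in ids
        )


bridge = IdFilterBridge()
bridge.subscribe("B", 0x120)
bridge.subscribe("C", 0x120)
bridge.subscribe("A", 0x300)
targets = bridge.forward_targets("A", CanFrame(0x120, b"\x01"))
```

In this sketch a frame with identifier 0x120 arriving from segment A is forwarded to segments B and C, while a frame whose identifier no other segment has subscribed to is dropped at the bridge, which keeps local traffic off the other segments.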
Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2. Measured results are reported in terms of execution time and speedup, and they match the analytical results closely. In addition to implementation issues, performance considerations such as problem size and models of speedup are also discussed.
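For orientation, the sketch below shows the standard sequential Thomas algorithm for a tridiagonal system, together with the usual definition of speedup as serial time divided by parallel time. It is not one of the three parallel algorithms studied in the paper; the function names and the argument layout (all four arrays of length n, with `lower[0]` and `upper[n-1]` unused) are assumptions made for illustration.

```python
from typing import List, Sequence


def thomas_solve(lower: Sequence[float], diag: Sequence[float],
                 upper: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Solve a tridiagonal system with the sequential Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    No pivoting is performed, so the matrix is assumed to be well behaved
    (for example, diagonally dominant).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def speedup(serial_time: float, parallel_time: float) -> float:
    """Speedup as the ratio of serial to parallel execution time."""
    return serial_time / parallel_time


# 3x3 example: [[2, 1, 0], [1, 2, 1], [0, 1, 2]] x = [4, 8, 8]
solution = thomas_solve([0.0, 1.0, 1.0], [2.0, 2.0, 2.0],
                        [1.0, 1.0, 0.0], [4.0, 8.0, 8.0])
```

The forward sweep carries a dependency from each row to the next, which is why the sequential algorithm does not parallelize directly and partition-based variants are of interest.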
ISBN: 0818674601 (print)
In this paper, we describe an object-based distributed shared memory (DSM) called Adsmith. In an object-based DSM, the shared memory consists of many shared objects, through which the shared memory is accessed. Adsmith is built on top of PVM at the library layer using C++. PVM is used as the communication subsystem because it is a de facto standard and encapsulates many system-related details. Several mechanisms are used to improve the performance of Adsmith, including release memory consistency, load/store-like memory accesses, nonblocking accesses, and atomic operations. Performance results show that even though Adsmith is implemented on top of PVM, programs running on Adsmith can achieve performance comparable to that of programs running directly on PVM.
SmartNet is a scheduling framework for heterogeneous systems. Preliminary conservative simulation results for one of the optimization criteria show a 1.21 improvement over Load Balancing and a 25.9 improvement over Limited Best Assignment, the two policies that evolved from homogeneous environments. SmartNet achieves these improvements through several innovations. It recognizes and capitalizes on the inherent heterogeneity of computers in today's distributed environments; it recognizes and accounts for the underlying non-determinism of the distributed environment; it implements an original partitioning approach, making runtime prediction more accurate and useful; it schedules based on all shared resource usage, including network characteristics; and it uses statistical and filtering techniques, making a greater amount of prediction information available to the scheduling engine. In this paper, the issues associated with automatically managing a heterogeneous environment are reviewed, SmartNet's architecture and implementation are described, and performance data are summarized.
Two facts that suggest the desirability of a hierarchical approach to cost-effective high-performance computing are empirically established in this paper. The first fact is the temporal locality of programs with respect to the degree of parallelism. Two temporal (instruction and data) locality principles are identified and empirically established for a set of programs. The impact of this behavior is discussed with respect to the proposed heterogeneous multilevel architecture. The second fact that supports the hierarchical architecture is the cost-efficiency advantage of heterogeneous over homogeneous multiprocessor systems. An initial performance analysis is presented which quantifies this fact for the proposed heterogeneous hierarchical organization. The proposed multilevel processor configuration uses fast and costly resources sparingly to reduce sequential and low-parallelism bottlenecks. The resulting organization tries to balance cost, speed, and parallelism granularity.
Shared memory is widely believed to provide an easier programming model than message passing for expressing parallel algorithms. Distributed Shared Memory (DSM) systems provide the illusion of shared memory on top of standard message-passing hardware at very low implementation cost, but provide acceptable performance for only a limited class of applications. We argue that the principal sources of overhead in DSM systems can be dramatically reduced with modest amounts of hardware support (substantially less than is required for hardware cache coherence). Specifically, we present and evaluate a family of protocols designed to exploit hardware support for a global, but non-coherent, physical address space. We consider systems both with and without remote cache fills, fine-grain access faults, "doubled" writes to local and remote memory, and merging write buffers. We also consider varying levels of latency and bandwidth. We evaluate our protocols using execution-driven simulation, comparing them to each other and to a state-of-the-art protocol for traditional message-based networks. For the programs in our application suite, protocols taking advantage of the global address space improve performance by a minimum of 50% and sometimes by as much as an order of magnitude.
ISBN: 9780818675829 (print)
We present a unified approach for delivering hypermedia/multimedia objects over broadband networks. Documents are stored in various multimedia servers, while the inline data may reside in their own media servers, attached to the multimedia servers. The described service consists of several multimedia servers and a set of functions intended to present interactive information to the end user in real time. Users interact with the service by requesting multimedia documents on demand. Various media streams are transmitted over different parallel connections according to their transmission requirements. The hypermedia documents are structured using a hypermedia markup language that records the spatio-temporal relationships among the document's media components. To deal with variable network behavior, buffer manipulation mechanisms and techniques for grading the quality of the transmitted media are proposed to smooth presentation and synchronization anomalies.
In this paper we investigate the combination of multitasking and multithreading in a (virtual) shared memory parallel machine running a number of parallel applications. In particular, we investigate whether it is better to run related threads, or unrelated threads on each node to achieve the best system throughput and to complete a mix of applications as quickly as possible. The experiments provide results for a range of mixes of applications. One of our benchmarks has a clear preference to place its threads across the whole machine, while the others have a slight preference to run their threads on smaller partitions of the machine. The differences are mostly slight, suggesting that the system scheduler has considerable flexibility in thread placement without jeopardising performance.
Networks of workstations and high-performance microcomputers have rarely been used for running high-performance applications such as multimedia, simulations, and scientific and engineering applications, because, although they have significant aggregate computing power, they lack support for efficient message-passing and shared-memory communication. In this paper we present Telegraphos, a distributed system that provides efficient shared-memory support on top of a workstation cluster. We focus on the network interface of Telegraphos, which provides a variety of shared-memory operations such as remote reads, remote writes, and remote atomic operations, all launched from user level without any intervention of the operating system. Telegraphos I, the first Telegraphos prototype, has been implemented. Emphasis was put on rapid prototyping, so the technology used was conservative: FPGAs, SRAMs, and TTL buffers. Telegraphos II is the single-chip version of the Telegraphos architecture; its switch was implemented and its network interface is being debugged.
The main contribution of this work is to show that a number of seemingly unrelated problems in database design, pattern recognition, robotics, and image processing can be solved simply and elegantly by formulating them as instances of a general problem: the multiple query (MQ) problem. An arbitrary instance of the multiple query problem consists of a collection $A=\{a_1, a_2, \ldots, a_n\}$ of items, a collection $Q=\{q_1, q_2, \ldots, q_m\}$ ($1 \le m \le n$) of queries, a decision problem $\phi: Q \times A \rightarrow \{\text{"yes"}, \text{"no"}\}$, and an associative and commutative function $f$ operating on subsets of $A$. For every query $q_i$, let $S_i$ be the set of items $a_j$ in $A$ for which $\phi(q_i, a_j) = \text{"yes"}$. The solution of $q_i$ is defined to be $f(S_i)$. In this context, the multiple query problem involves solving all the queries in $Q$. We begin by showing that if the collections $A$ and $Q$ are stored one item and at most one query per processor on a mesh with multiple broadcasting of size $\sqrt{n} \times \sqrt{n}$, then any algorithm that solves the MQ problem requires $\Omega(m^{1/3} n^{1/6})$ time in the worst case. Second, we show that a number of fundamental problems can be solved simply and elegantly by formulating them as instances of the MQ problem.
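To make the definition concrete, here is a minimal sequential reference sketch of the MQ problem in Python. It only restates the definition (for each query, collect the matching items and apply f); it is not the mesh algorithm from the paper, and the function name and the example instance are assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Item = TypeVar("Item")
Query = TypeVar("Query")
Result = TypeVar("Result")


def solve_multiple_query(
    items: Sequence[Item],
    queries: Sequence[Query],
    phi: Callable[[Query, Item], bool],
    f: Callable[[List[Item]], Result],
) -> List[Result]:
    """Sequential reference solution of the multiple query problem.

    For each query q_i, build S_i = {a_j : phi(q_i, a_j) is "yes"} and
    return f(S_i). Here phi returns True for "yes" and False for "no".
    Runs in O(m * n) evaluations of phi.
    """
    return [f([a for a in items if phi(q, a)]) for q in queries]


# Hypothetical instance: for each threshold q, sum the items larger than q.
# Summation is associative and commutative, as the definition requires.
answers = solve_multiple_query(
    items=[3, 8, 1, 6],
    queries=[2, 5],
    phi=lambda q, a: a > q,
    f=sum,
)
```

For this instance, the query 2 selects the items 3, 8, and 6, and the query 5 selects 8 and 6, so the two solutions are 17 and 14.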