In this paper, we propose a new virtual cache architecture that reduces memory latency by combining the merits of both direct-mapped and set-associative caches. The entire cache memory is divided into n banks, and the operating system assigns one of the banks to a process when it is created. Each process then runs on its assigned bank, and the bank behaves like a direct-mapped cache. If a cache miss occurs in the active home bank, the data is fetched either from other banks or from main memory, as in a set-associative cache. A victim for cache replacement is selected from the lines that belong to the process which is most remote from being scheduled. Trace-driven simulations confirm that the new scheme removes almost as many conflict misses as a set-associative cache does, while cache access time is similar to that of a direct-mapped cache.
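The lookup order described in this abstract can be illustrated with a small model. This is only a sketch under stated assumptions, not the authors' design: the class `BankedCache`, the `schedule_distance` list (how far each bank's owning process is from being scheduled), and the rule of filling an empty home line before choosing a victim are all hypothetical choices made for illustration.

```python
class BankedCache:
    """Toy model: each bank is direct-mapped; other banks act as extra 'ways'."""

    def __init__(self, num_banks, lines_per_bank):
        self.lines_per_bank = lines_per_bank
        # banks[b][i] holds the block address cached in line i of bank b, or None.
        self.banks = [[None] * lines_per_bank for _ in range(num_banks)]

    def access(self, home_bank, block_addr, schedule_distance):
        """Return 'home_hit', 'remote_hit', or 'miss'.

        schedule_distance[b] is an assumed input: larger means the process
        owning bank b will be scheduled later.
        """
        index = block_addr % self.lines_per_bank

        # Fast path: probe only the home bank, as a direct-mapped cache would.
        if self.banks[home_bank][index] == block_addr:
            return "home_hit"

        # Slow path: probe the same index in the other banks.
        for b, bank in enumerate(self.banks):
            if b != home_bank and bank[index] == block_addr:
                return "remote_hit"

        # Miss: fill an empty home line if there is one (assumption) ...
        if self.banks[home_bank][index] is None:
            self.banks[home_bank][index] = block_addr
            return "miss"

        # ... otherwise evict from the bank whose owner runs furthest in the future.
        victim = max(range(len(self.banks)), key=lambda b: schedule_distance[b])
        self.banks[victim][index] = block_addr
        return "miss"


cache = BankedCache(num_banks=2, lines_per_bank=4)
distance = [0, 5]  # bank 0's process is running; bank 1's runs much later
results = [cache.access(0, addr, distance) for addr in (1, 1, 5, 5, 1)]
```

In this model, blocks 1 and 5 map to the same line index, yet after the initial misses both remain cached: one in the home bank and one in the remote bank.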
A program's working set is the collection of segments recently referenced. This concept has led to efficient methods for measuring a program's intrinsic memory demand. It has assisted in understanding and modeling program behavior, and it has been used as the basis of optimal multiprogrammed memory management. The total cost of a working set dispatcher is no larger than the total cost of other common dispatchers. It is unlikely that anyone will find a cheaper nonlookahead memory policy that delivers significantly better performance. Experiments with real programs revealed that the working set policy is the most likely to generate minimum space-time for any given program, and that one properly chosen control parameter value is normally sufficient to cause any program's working-set space-time to be within 10% of the minimum possible for that program. Working set dispatchers automatically control the level of multiprogramming while maintaining near-minimum space-time for each program.
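As a minimal illustration of the working-set idea, the sketch below computes the set of distinct segments referenced within a trailing window of a reference string. The function name `working_set`, the 1-indexed time convention, and the sample reference string are assumptions chosen for the example; the window length plays the role of the single control parameter mentioned in the abstract.

```python
def working_set(refs, t, window):
    """Distinct segments among the last `window` references up to time t.

    `t` counts references from 1, so t=1 covers only the first reference.
    """
    start = max(0, t - window)
    return set(refs[start:t])


refs = ["a", "b", "a", "c", "d", "d"]  # hypothetical reference string

# Working-set size at each time step for a window of two references.
sizes = [len(working_set(refs, t, 2)) for t in range(1, len(refs) + 1)]
```

Summing the sizes over time gives a simple space-time measure for this window, which is the kind of quantity a working-set dispatcher tries to keep near its minimum.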
A new derivation of Peterson's well-known mutual exclusion algorithm is presented. The derivation is driven by the formally stated requirement of Individual Progress, as opposed to the more traditional approach, which starts from the requirement of Mutual Exclusion. The only formalisms used in the derivation are the predicate calculus and the theory of Owicki and Gries. No use is made of temporal logic. In particular, the complicating oscillating behaviour of an await-condition is fully absorbed by the use of a variant function. (C) 1997 Elsevier Science B.V.
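For reference, the two-thread algorithm whose derivation this abstract discusses can be sketched as follows. This is the textbook form of Peterson's algorithm, not the derivation from the paper. The function `run_peterson` and the shared counter are hypothetical scaffolding, and the sketch assumes that reads and writes take effect in program order, which CPython's global interpreter lock provides in practice; on hardware with weaker memory ordering, a real implementation needs memory fences.

```python
import threading
import time


def run_peterson(iterations=200):
    flag = [False, False]  # flag[i]: thread i wants to enter
    turn = [0]             # the thread that must wait if both want to enter
    counter = [0]          # shared data guarded by the critical section

    def worker(me):
        other = 1 - me
        for _ in range(iterations):
            # Entry protocol: announce intent, then give priority to the other.
            flag[me] = True
            turn[0] = other
            while flag[other] and turn[0] == other:
                time.sleep(0)  # await-condition: yield while waiting

            # Critical section: a deliberately non-atomic read-modify-write.
            value = counter[0]
            time.sleep(0)  # invite a thread switch to expose any race
            counter[0] = value + 1

            # Exit protocol.
            flag[me] = False

    threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


total = run_peterson(200)
```

If mutual exclusion holds, no increment is lost and the final count equals twice the iteration count. Both threads finishing is the Individual Progress property in miniature.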
Permeability prediction from well logs is of great importance in reservoir characterization and engineering. In this paper, a new method is proposed to correlate conventional well logs and core permeability data. It uses an improved "windowing" technique to incorporate adjacent core data into the permeability predictor in such a way that the scales of the well log and core measurements are matched. It also has the capability to evaluate the reliability of each and every prediction. The method is implemented by the use of a neural network and is demonstrated by means of a case study. The study uses a set of well logs and limited core permeability data to produce continuous permeability profiles. The results show that the permeability profiles are consistent with the core permeability and the geological sequence of the reservoir. The reliability indicator is particularly useful for examining reservoir heterogeneity and sampling.
Most multiprocessors are multiprogrammed to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for a concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) nonblocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Nonblocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional and are widely regarded as inefficient. We evaluate the performance of preemption-safe lock-based and nonblocking implementations of important data structures (queues, stacks, heaps, and counters), including nonblocking and lock-based queue algorithms of our own, in microbenchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our nonblocking queue consistently outperforms the best known alternatives and that data-structure-specific nonblocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose nonblocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform conventional locks by significant margins on multiprogrammed systems. (C) 1998 Academic Press.
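The retry pattern behind a nonblocking counter built on compare-and-swap can be sketched as below. This is a generic illustration, not one of the algorithms evaluated in the paper. Python has no hardware compare-and-swap, so the hypothetical `AtomicInt` class emulates the primitive with an internal lock; that lock stands in for the atomicity of a single instruction and is not part of the technique itself.

```python
import threading


class AtomicInt:
    """Emulated atomic integer; the lock only models hardware atomicity."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the current value equals `expected`."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def increment(counter):
    """Nonblocking increment: read, compute, then retry until the CAS succeeds."""
    while True:
        old = counter.load()
        if counter.compare_and_swap(old, old + 1):
            return old + 1


def run_increments(num_threads, per_thread):
    counter = AtomicInt()

    def worker():
        for _ in range(per_thread):
            increment(counter)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.load()


final_count = run_increments(num_threads=4, per_thread=500)
```

A thread preempted between `load` and `compare_and_swap` holds no lock, so other threads keep making progress; the preempted thread simply retries when it resumes. This is the property that makes such algorithms attractive on multiprogrammed machines.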
A program ROM has been developed for the BBC microcomputer, which makes it function as a powerful window management terminal, with a high resolution display, mouse and programmable keyboard. The BBC Window terminal ROM provides a cheap alternative to the more expensive windowing workstations that provide the same resolution and facilities. It also enables windowing interfaces to be attached to existing computer hardware without having to port the software to the new workstation environment – a fast and cheap method for improved man–machine interface. All window management operations are handled locally, with control information being passed between the terminal and the host computer via an RS232 interface line.
In this letter, the analysis of a new technique for fault-tolerant computing is presented. This technique is applicable to systems with a large number of modules which require the simultaneous and reliable execution of multiple jobs.
System designs that do not require a significant investment in training are important. A toolbox of simple utilities provides a way to create multiple execution threads in a single program, and does this with a minimal learning curve. This solution is quick to implement, requires no special training in the theory of operation, and avoids some common problems found in more advanced systems.
The design and implementation of a microcomputer network to support laboratory automation is described and discussed. It is a multi-level hierarchical star network connected to a multiprogrammed computer on which all program development is done. The system is capable of supporting a large number of experimental setups. The microcomputers are used as either (1) local microcomputers for experiment control, (2) multiplexers for other microcomputers or (3) controllers for peripherals known in the multiprogrammed computer. The system combines the advantages of a large multiprogrammed computer with those of a small cheap dedicated computer close to the experiment. Character-oriented peripherals are connected to the multiprogrammed computer only. This reduces the amount of system software to be written for the network by an order of magnitude and eliminates the need for interfacing to existing small computer software. The system software consists of three small programs (monitors) providing a process concept and multibuffering of data in the involved computers. The monitors establish a hierarchy of control, and because they reside in a read-only store they eliminate the need for any local load device for the microcomputers. The system developed is designed for control of experiments in an environment where the experiments and the control strategies change with time and where the data refinement required is beyond what can be done on the present generation of microcomputers.
Small-scale shared-memory multiprocessors are commonly used in a workgroup environment where multiple applications, both parallel and sequential, are executed concurrently while sharing the processors and other system resources. To utilize the processors efficiently, an effective allocation strategy is required. In this paper, we use performance data obtained from an SGI multiprocessor to evaluate several processor allocation strategies when running two parallel programs simultaneously. We examine gang scheduling (coscheduling), static space-sharing (space partitioning), and a dynamic allocation scheme called loop-level process control (LLPC) with three different dynamic allocation heuristics. We use regression analysis to quantify the measured data and thereby explore the relationship between the degree of parallelism of the application, specific system parameters (such as the size of the system), the processor allocation strategy, and the resulting performance. This study shows that dynamically partitioning the system using LLPC or similar heuristics provides better performance for applications with a high degree of parallelism than either gang scheduling or static space-sharing. (C) 1998 Academic Press.