ISBN (print): 3540292357
To meet the requirements of next-generation mobile communication, an adaptive power and bit allocation scheme based on the discrete wavelet packet transform (DWPT) is presented. The subcarrier modulation and rate allocation method is applied to an OFDM-DS/CDMA system. Based on the downlink channel bit error rate (BER) fed back over the uplink channel, an optimal wavelet packet multicarrier modulation allocation is established, so that for a given BER and QoS the transmitting power is minimized and the system achieves high spectral efficiency. The results show that the adaptive wavelet packet algorithm not only converges quickly but also has minimal complexity, and the proposed allocation outperforms a traditional OFDM system. Notably, the system can adaptively adjust power and bit rate to achieve minimum total transmission power at high rate and efficiency.
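As a rough illustration of this kind of per-subcarrier bit and power allocation, the sketch below implements a greedy bit-loading loop in the Hughes-Hartogs style: each additional bit is placed on the subcarrier where it costs the least extra power. The channel gains, SNR-gap constant, and target rate are hypothetical placeholders and are not taken from the paper.

```python
# Greedy adaptive bit loading: minimise total power for a target bit count.
# All numeric values below are illustrative assumptions, not the paper's data.
import math

def incremental_power(bits, gain, ber_gap=4.0):
    """Extra transmit power needed to carry one more bit on a subcarrier
    currently loaded with `bits` bits (uncoded QAM approximation)."""
    return ber_gap * (2 ** (bits + 1) - 2 ** bits) / gain

def greedy_bit_loading(gains, target_bits, max_bits_per_carrier=8):
    """Distribute `target_bits` across subcarriers so total power stays small."""
    bits = [0] * len(gains)
    power = [0.0] * len(gains)
    for _ in range(target_bits):
        # Pick the subcarrier where the next bit is cheapest in power.
        costs = [
            incremental_power(bits[i], gains[i]) if bits[i] < max_bits_per_carrier
            else math.inf
            for i in range(len(gains))
        ]
        k = min(range(len(gains)), key=costs.__getitem__)
        power[k] += costs[k]
        bits[k] += 1
    return bits, power

if __name__ == "__main__":
    # Hypothetical per-subcarrier channel power gains (e.g. from uplink feedback).
    gains = [0.9, 0.4, 1.6, 0.2, 1.1, 0.7, 0.05, 1.3]
    bits, power = greedy_bit_loading(gains, target_bits=24)
    print("bits per subcarrier:", bits)
    print("total power        :", round(sum(power), 2))
```

Subcarriers with poor gains end up carrying few or no bits, which is the behaviour the adaptive allocation relies on.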
The star graph, as an interesting network topology, has been extensively studied in the past. In this paper, we address some of the combinatorial properties of the star graph. In particular, we consider the problem of calculating the surface area of the star graph, answering an open problem previously posed in (K. Qiu et al., 1995).
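For concreteness, the surface area in question counts the vertices of the star graph S_n at each distance d from a fixed vertex. The brute-force sketch below computes it by breadth-first search over all n! permutations; it only illustrates the quantity being studied and does not reproduce the paper's closed-form answer.

```python
# Surface area of the star graph S_n by exhaustive BFS (small n only).
from collections import deque

def star_graph_surface_area(n):
    start = tuple(range(1, n + 1))            # identity permutation
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        # In S_n a vertex is adjacent to the permutations obtained by
        # swapping the first symbol with the symbol in position i (i >= 2).
        for i in range(1, n):
            u = list(v)
            u[0], u[i] = u[i], u[0]
            u = tuple(u)
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    surface = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        surface[d] += 1
    return surface                            # surface[d] = #vertices at distance d

if __name__ == "__main__":
    for n in range(2, 6):
        print(f"S_{n}:", star_graph_surface_area(n))
```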
ISBN (digital): 9783540320715
ISBN (print): 3540292357
Shared memory is an interesting communication paradigm for SMP machines and clusters. Weak consistency models have been proposed to improve the efficiency of shared memory applications. In a programming environment offering weak consistency, it is necessary to worry about individual load and store operations and about proper synchronization. In contrast to this explicit style of distributed programming, shared memory systems implementing strong consistency models are easy to program, and consistency is implicit. In this paper we compare two representatives, Kerrighed and Plurix, implementing sequential and transactional consistency respectively. Kerrighed is a single system image operating system (OS) based on Linux, whereas Plurix is a native OS for PC clusters designed for shared memory operation. The measurements presented in this paper show that strong consistency models implemented at the OS level are competitive.
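To make the contrast in programming style concrete, the sketch below mimics, with ordinary Python threads, how a transaction-like update restarts on conflict instead of surrounding every access with explicit synchronization. It is only a loose software analogy under assumed semantics; the class and method names are illustrative and are not Kerrighed or Plurix APIs.

```python
# A restartable, optimistic update loop as a rough analogy for transactional
# consistency: read a versioned value, compute, and commit only if nobody
# wrote in between; otherwise restart the "transaction".
import threading

class VersionedCell:
    """Shared cell with a version counter, used for optimistic commits."""
    def __init__(self, value=0):
        self._lock = threading.Lock()
        self.value = value
        self.version = 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_commit(self, new_value, expected_version):
        """Commit only if nobody wrote since we read; else signal a retry."""
        with self._lock:
            if self.version != expected_version:
                return False
            self.value = new_value
            self.version += 1
            return True

def transactional_increment(cell, times):
    for _ in range(times):
        while True:                       # restart on conflict
            value, version = cell.read()
            if cell.try_commit(value + 1, version):
                break

if __name__ == "__main__":
    cell = VersionedCell()
    threads = [threading.Thread(target=transactional_increment, args=(cell, 10_000))
               for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(cell.value)                     # 40000: no lost updates despite restarts
```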
Static algorithms have been proposed to parallelize loops with uniform dependencies for networks of workstations. However, the heterogeneous and dynamic nature of these networks demands a dynamic solution to the scheduling and load balancing problem. At the same time, many dynamic scheduling algorithms have been proposed, but all of them deal with programs with parallel loops, i.e., loops without dependencies. In this paper we extend the applicability of dynamic algorithms by presenting a dynamic scheduling algorithm that uses simple data structures in order to handle programs with data dependencies. Experimental results validate the proposed algorithm in both homogeneous and heterogeneous networks.
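A minimal sketch of the general idea, under assumed details: iterations of a 2D loop nest with uniform dependencies (i-1, j) and (i, j-1) carry a counter of unfinished predecessors, and worker threads pull ready iterations from a shared queue, which gives dynamic load balancing while respecting the dependencies. The loop body, extents, and thread count are placeholders, not the paper's algorithm or data structures.

```python
# Dependence-aware dynamic self-scheduling of a 2D loop nest with uniform
# dependencies (i-1, j) and (i, j-1), using a shared ready queue.
import queue
import threading

N, M = 8, 8                                   # iteration-space extent
deps_left = [[(1 if i else 0) + (1 if j else 0) for j in range(M)] for i in range(N)]
ready = queue.Queue()
ready.put((0, 0))
lock = threading.Lock()
result = [[0] * M for _ in range(N)]

def release(i, j):
    """Decrement the dependence counter of (i, j); enqueue it when ready."""
    with lock:
        deps_left[i][j] -= 1
        if deps_left[i][j] == 0:
            ready.put((i, j))

def worker():
    while True:
        item = ready.get()
        if item is None:                      # poison pill: shut down
            ready.put(None)
            return
        i, j = item
        north = result[i - 1][j] if i else 0
        west = result[i][j - 1] if j else 0
        result[i][j] = north + west + (1 if (i, j) == (0, 0) else 0)  # placeholder body
        if i + 1 < N: release(i + 1, j)
        if j + 1 < M: release(i, j + 1)
        if (i, j) == (N - 1, M - 1):          # last iteration of the wavefront
            ready.put(None)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(result[N - 1][M - 1])                   # 3432 = C(14, 7): all lattice paths
```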
PLVIP, an experimental library for parallel image processing, designed and implemented in the Image Processing Laboratory of the Institute of Computational Mathematics and Mathematical Geophysics SB RAS, is described. ...
Efficient utilization of processing resources in a multicomputer system depends on fast allocation algorithms that minimize system fragmentation. A small number of jobs with large submesh allocation requirements may significantly increase external fragmentation and the queuing delay of the remaining jobs. Under such circumstances, the proposed strategy further tries to allocate L-shaped submeshes instead of signaling allocation failure. A simple and effective algorithm for finding allocatable L-shaped submeshes is proposed, which is shown to reduce the average turnaround time by minimizing the queuing delay, even though jobs are scheduled in FCFS order to preserve fairness. Extensive simulations show that the strategy performs more efficiently in terms of task turnaround time and system utilization.
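The sketch below illustrates the fallback idea in a heavily simplified form: a first-fit search for a free rectangular submesh in a 2D mesh and, when that fails, a search for two free, edge-adjacent rectangles whose union holds the requested number of processors (an L-shaped region). The mesh size, request sizes, and the particular L decomposition are illustrative assumptions, not the paper's algorithm.

```python
# Rectangular first-fit submesh allocation with a simplified L-shaped fallback.
def region_free(busy, r, c, h, w):
    rows, cols = len(busy), len(busy[0])
    if r < 0 or c < 0 or r + h > rows or c + w > cols:
        return False
    return all(not busy[r + i][c + j] for i in range(h) for j in range(w))

def mark(busy, r, c, h, w):
    for i in range(h):
        for j in range(w):
            busy[r + i][c + j] = True

def allocate(busy, h, w):
    """Return the allocated piece(s) as (row, col, height, width), or None."""
    rows, cols = len(busy), len(busy[0])
    # 1) Plain rectangular allocation, first fit.
    for r in range(rows):
        for c in range(cols):
            if region_free(busy, r, c, h, w):
                mark(busy, r, c, h, w)
                return [(r, c, h, w)]
    # 2) Simplified L-shaped fallback: a top piece h1 x w plus a left-aligned
    #    piece h2 x w2 below it, covering the remaining (h - h1) * w processors.
    for r in range(rows):
        for c in range(cols):
            for h1 in range(1, h):
                rest = (h - h1) * w
                for h2 in range(1, rows - r - h1 + 1):
                    if rest % h2:
                        continue
                    w2 = rest // h2
                    if w2 == w:
                        continue              # that would just be the plain rectangle
                    if region_free(busy, r, c, h1, w) and \
                       region_free(busy, r + h1, c, h2, w2):
                        mark(busy, r, c, h1, w)
                        mark(busy, r + h1, c, h2, w2)
                        return [(r, c, h1, w), (r + h1, c, h2, w2)]
    return None                               # signal allocation failure

if __name__ == "__main__":
    busy = [[False] * 8 for _ in range(8)]    # 8 x 8 mesh, all processors free
    mark(busy, 4, 4, 4, 4)                    # a running job holds a 4 x 4 corner
    print(allocate(busy, 5, 6))               # no 5 x 6 rectangle fits -> L-shape
```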
ISBN (print): 3540281266
The paper is dedicated to the Open T-system (OpenTS), a programming system that supports automatic parallelization of computations for high-performance and distributed applications. In this paper, we describe the system architecture and input programming language, as well as the system's distinctive features. The paper focuses on the achievements of the last two years of development, including support for distributed, meta-cluster computations.
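OpenTS is built around an extension of C++ (T++) in which T-functions return not-yet-ready values that are filled in as the computation proceeds. Purely as a language-level analogy, and not OpenTS code, the sketch below mimics that dataflow style with Python futures: a call fires off work immediately, and the caller blocks only when it actually touches the value.

```python
# Futures as a stand-in for "not-ready values" produced by T-functions.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

def tfun(fn):
    """Calling a 'T-function' returns a future, i.e. a not-yet-ready value."""
    def submit(*args):
        return pool.submit(fn, *args)
    return submit

@tfun
def partial_sum(lo, hi):
    # Placeholder for a computation-heavy, side-effect-free T-function.
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    # Calls return immediately; the values are produced concurrently.
    parts = [partial_sum(k * 100_000, (k + 1) * 100_000) for k in range(8)]
    # The caller blocks only where a not-ready value is actually used.
    print(sum(p.result() for p in parts))
    pool.shutdown()
    # Note: CPython threads share the GIL, so this sketch shows the dataflow
    # style rather than real CPU-level speedup.
```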
Author: E. Frachtenberg
Modeling, Algorithms and Informatics Group (CCS-3), Computer and Computational Sciences Division, Los Alamos National Laboratory, USA
Commodity hardware and software are growing increasingly more complex, with advances such as chip heterogeneity and specialization, deeper memory hierarchies, fine-grained power management, and most importantly, chip parallelism. Similarly, workloads are growing more concurrent and diverse. With this new complexity in hardware and software, process scheduling in the operating system (OS) becomes more challenging. Nevertheless, most commodity OS schedulers are based on design principles that are 30 years old. This disparity may soon lead to significant performance degradation. Most significantly, parallel architectures such as multicore chips require more than scalable OSs: parallel programs require parallel-aware scheduling. This paper posits that imminent changes in hardware and software warrant reevaluating the scheduler's policies in the commodity OS. We discuss and demonstrate the main issues that the emerging parallel desktops are raising for the OS scheduler. We propose that a new approach to scheduling is required, applying and generalizing lessons from different domain-specific scheduling algorithms, and in particular, parallel job scheduling. Future architectures can also assist the OS by providing better information on process scheduling requirements.
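One lesson from parallel job scheduling often cited in this context is gang (co)scheduling: all threads of a parallel job run in the same time slice, so they never spin waiting for a descheduled peer. The sketch below packs whole gangs into time slices for a hypothetical 8-core desktop workload; it is only an illustration of the principle, not a proposal from the paper.

```python
# Pack whole gangs (all threads of a job) into time slices; never split a gang.
def gang_schedule(jobs, cores):
    """jobs: dict name -> thread count. Returns a list of time slices,
    each a list of (job, threads) gangs that run together."""
    slices = []
    pending = dict(jobs)
    while pending:
        free = cores
        slot = []
        # Greedily fill the slice with the widest gangs that still fit.
        for name, threads in sorted(pending.items(), key=lambda kv: -kv[1]):
            if threads <= free:
                slot.append((name, threads))
                free -= threads
        if not slot:                          # a gang wider than the machine
            raise ValueError("job exceeds core count")
        for name, _ in slot:
            del pending[name]
        slices.append(slot)
    return slices

if __name__ == "__main__":
    # Hypothetical desktop workload on an 8-core chip.
    jobs = {"render": 8, "compile": 4, "browser": 2, "indexer": 2, "av_scan": 1}
    for i, slot in enumerate(gang_schedule(jobs, cores=8)):
        print(f"slice {i}: {slot}")
```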
ISBN (print): 3540281266
A parallel computational code is developed for the execution of the Proper Orthogonal Decomposition (POD) of turbulent flow fields in fluid dynamics. The POD is an analytically founded statistical technique that permits the eduction of appropriately defined modes of the flow from the background flow, allowing the determination of the coherent structures of turbulence. The computational aspects of the different phases of the computing procedure are analyzed, and the development of the related parallel computational code is described. Computational tests corresponding to different computing domains and numbers of processors are executed on an HP V2500 parallel computing system, and the results are shown in terms of the parallel performance of the different phases of the calculation, considered separately, and of the computational code as a whole.
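For readers unfamiliar with the decomposition, the single-process sketch below applies the snapshot method of the POD with NumPy: subtract the mean (background) flow, build the temporal correlation matrix of the snapshots, and educe the spatial modes from its eigenvectors. The snapshot data are random placeholders, and the sketch says nothing about the paper's parallelization; it only shows the phases such a code has to distribute.

```python
# Snapshot POD: modes and energies of fluctuating fields around the mean flow.
import numpy as np

def snapshot_pod(snapshots):
    """snapshots: (n_points, n_snapshots) array of sampled flow fields.
    Returns the spatial POD modes and their energies (eigenvalues)."""
    # Remove the mean (background) flow so modes describe fluctuations only.
    fluct = snapshots - snapshots.mean(axis=1, keepdims=True)
    n = fluct.shape[1]
    # Temporal correlation matrix, size n_snapshots x n_snapshots.
    corr = fluct.T @ fluct / n
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]                 # strongest modes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Project snapshots onto eigenvectors to obtain spatial modes; normalise.
    modes = fluct @ eigvecs
    modes /= np.linalg.norm(modes, axis=0)
    return modes, eigvals

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((10_000, 64))          # 64 snapshots, 10^4 points
    modes, energy = snapshot_pod(data)
    print(modes.shape, energy[:3])
```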
A chip multiprocessor is one of the promising architectures that can overcome the ILP limitation, high power consumption, and high heat dissipation that current processors face. On a shared memory multiprocessor, performance improvement relies on an efficient communication and synchronization method via shared variables. The TSVM cache combines communication and synchronization with coherence maintenance on a chip multiprocessor; that is, communication and synchronization via shared variables are realized by a single coherence transaction over a high-speed on-chip interconnection. The TSVM cache provides several instructions, each with its own coherence maintenance scheme. Combinations of these instructions can realize producer-consumer synchronization, mutual exclusion, and barrier synchronization with communication easily and systematically. This paper describes how those instructions construct the three primitives and shows the effect of these primitives using a clock-cycle-accurate simulator written in VHDL. The results show that the TSVM cache improves performance by a factor of 9.8 compared with a traditional cache memory, and by a factor of 2 compared with a conventional cache memory with a synchronization mechanism.
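The TSVM cache realizes these primitives with dedicated instructions and coherence transactions in hardware; purely as a software analogy of the three primitives named above, the sketch below builds producer-consumer synchronization, mutual exclusion, and a barrier on shared variables with standard Python threading objects. The class and variable names are illustrative and mirror the roles only, not the cache-level mechanism.

```python
# Producer-consumer synchronisation, mutual exclusion and a barrier built on
# shared variables (software analogy only).
import threading

class SharedVar:
    """A shared variable with producer-consumer style full/empty semantics."""
    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def produce(self, value):                 # write and mark full
        with self._cond:
            self._value, self._full = value, True
            self._cond.notify_all()

    def consume(self):                        # wait until a value is present
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            return self._value

if __name__ == "__main__":
    var = SharedVar()
    total = 0
    lock = threading.Lock()                   # mutual exclusion
    barrier = threading.Barrier(4)            # barrier synchronisation

    def worker(i):
        global total
        x = var.consume()                     # blocks until the producer writes
        with lock:                            # serialise the shared update
            total += x * i
        barrier.wait()                        # all workers finish the phase together

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads: t.start()
    var.produce(10)                           # producer releases all consumers
    for t in threads: t.join()
    print(total)                              # 10 * (0 + 1 + 2 + 3) = 60
```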