We investigate two strategies to improve the performance of collective communications on a cluster of SMPs. In the first part of the paper, we explore the advantages and disadvantages of the commonly used strategy "optimize shared memory send and receive only". In the second part of the paper, we present a model for designing collective communications on SMP clusters that can be used to overcome the disadvantages of the former strategy. Several collective operations are implemented using the latter approach, and the experimental results show better performance than vendor-supplied implementations.
A real-time unwanted-audio cancellation system is developed. The system enhances recorded sound by canceling unwanted loudspeaker sounds picked up during the recording. After cancellation, the resulting sound gives an...
ISBN: 0780385217 (print)
We propose a training-sequence-based frequency offset estimator for a system with multiple transmit and receive antennas in frequency-flat fading channels. The estimator is based on a maximum likelihood (ML) criterion and does not require channel information. To reduce the computational load, we propose to use special training sequences, namely periodic orthogonal codes. Using these codes, we obtain a closed-form estimator with a much lower computational load (only some additions and multiplications). For a high signal-to-noise ratio (SNR) and small frequency offset, the proposed estimator achieves the performance of the optimal ML estimator, which locates the peak of the likelihood function. We also apply the proposed estimator to an orthogonal frequency division multiplexing (OFDM) system. We evaluate the performance of the proposed estimator through simulations.
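To illustrate how a periodic training sequence allows a closed-form frequency offset estimate without channel knowledge, here is a minimal sketch. It is not the estimator from the abstract: it assumes a single transmit and receive antenna, a noise-free flat-fading channel, and a simple lag-`period` correlation. The function names, the training block, and the channel gain are hypothetical.

```python
import cmath
import math


def estimate_frequency_offset(received, period):
    """Estimate the normalized frequency offset (cycles per sample).

    Assumes the training part of `received` repeats every `period` samples,
    so samples one period apart differ only by a phase rotation.
    """
    acc = 0j
    for n in range(len(received) - period):
        # The unknown flat-fading gain cancels in this product (|gain|^2 remains).
        acc += received[n + period] * received[n].conjugate()
    # The accumulated phase equals 2*pi*offset*period if |offset*period| < 0.5.
    return cmath.phase(acc) / (2 * math.pi * period)


def simulate_received(offset, period=8, repeats=4):
    """Build a noise-free received signal (hypothetical training block and gain)."""
    block = [cmath.exp(1j * math.pi * k * k / period) for k in range(period)]
    training = block * repeats
    channel_gain = 0.7 - 0.4j  # unknown to the estimator
    return [
        channel_gain * s * cmath.exp(2j * math.pi * offset * n)
        for n, s in enumerate(training)
    ]


if __name__ == "__main__":
    rx = simulate_received(offset=0.01)
    print(estimate_frequency_offset(rx, period=8))
```

The estimate needs only multiplications, additions, and one phase computation, which matches the low computational load described above. The unambiguous range is limited to offsets smaller in magnitude than `1 / (2 * period)`.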
ISBN: 0780382552 (print)
We consider beamforming methods for the downlink of space division multiple access (SDMA) systems when there are channel information errors. A single antenna at the mobile is generally assumed in SDMA systems, but we consider multiple-input multiple-output (MIMO) systems, which adopt multiple antennas at both transmitter and receiver. If the exact channel information of each user is known, cochannel interference among users can be perfectly removed by the zero-forcing nulling method in MIMO/SDMA systems. However, the channel estimate always contains estimation errors in real systems, which can cause significant performance degradation. To cope with channel information errors, we develop two beamforming weights based on null-space constraints. The first one is designed to minimize the transmit power while not influencing the desired receive signal. The interfering effect from channel information errors is reduced since the transmit power of cochannel interfering users is minimized. The second one is designed to minimize the power of the received interference signal. It computes the beamforming weight that has the minimum expected interference power, given knowledge of the variance of the channel estimation errors. The proposed beamforming weights are tested by computer simulations.
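The effect of channel errors on zero-forcing nulling can be sketched with a toy example. This is a simplified illustration, not the beamforming weights proposed in the abstract: it assumes a two-antenna transmitter and a single-antenna victim user, and all function names and numeric values are hypothetical.

```python
def zero_forcing_weight(victim_channel):
    """Unit-norm weight that nulls the signal at a user with the given
    estimated channel (h0, h1), for a two-antenna transmitter."""
    h0, h1 = victim_channel
    norm = (abs(h0) ** 2 + abs(h1) ** 2) ** 0.5
    # (h1, -h0) is orthogonal to (h0, h1): h0*h1 - h1*h0 = 0.
    return (h1 / norm, -h0 / norm)


def received_amplitude(channel, weight):
    """Signal amplitude seen through `channel` when transmitting with `weight`."""
    return channel[0] * weight[0] + channel[1] * weight[1]


def leakage_power(estimated_channel, error, amplitude_scale=1.0):
    """Interference power at the victim when its true channel differs
    from the estimate by `error`."""
    w = zero_forcing_weight(estimated_channel)
    w = (amplitude_scale * w[0], amplitude_scale * w[1])
    true_channel = (
        estimated_channel[0] + error[0],
        estimated_channel[1] + error[1],
    )
    return abs(received_amplitude(true_channel, w)) ** 2


if __name__ == "__main__":
    estimate = (0.8 + 0.3j, -0.2 + 0.9j)   # hypothetical channel estimate
    error = (0.05 - 0.02j, 0.01 + 0.04j)   # hypothetical estimation error
    print(leakage_power(estimate, (0j, 0j)))       # essentially zero
    print(leakage_power(estimate, error))          # nonzero leakage
    print(leakage_power(estimate, error, 0.5))     # a quarter of the above
```

With a perfect estimate the interference vanishes. With an erroneous estimate the leakage scales with the interferer's transmit power, which is the intuition behind minimizing transmit power under a null-space constraint.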
This paper proposes a novel coordinated aggregate scheduling (CAS) algorithm that combines both EDF (earliest-deadline-first) scheduling and rate-based fair queueing. CAS uses guaranteed rate (GR) scheduling (Goyal, P. et al., 1995) for traffic aggregates at the inter-aggregate level, but employs EDF-like scheduling at the intra-aggregate level. Computation of the deadline D_N of a packet at an intermediate node N is coordinated between the node N and its upstream nodes, and D_N is related to the packet's guaranteed rate clock (GRC) value at the flow-aggregation node. CAS provides tighter end-to-end (e2e) delay bounds than the "vanilla" GR aggregate scheduling that relies on FIFO queueing within an aggregate. Our in-depth simulation results demonstrate CAS's superior performance. Moreover, as an aggregate-based work-conserving scheduling algorithm, CAS incurs lower scheduling and state-maintenance overheads at routers than per-flow scheduling. These salient features make CAS very attractive for use in Internet core networks.
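The two building blocks named above, guaranteed rate clock values and EDF ordering, can be sketched as follows. The GRC recurrence used here is the one commonly given for GR scheduling (the clock of a packet is the later of its arrival time and the previous packet's clock, plus its length divided by the reserved rate). Using the GRC value directly as the EDF deadline is an assumption made for illustration. The abstract does not specify how CAS coordinates deadlines across nodes, and this sketch does not attempt to reproduce it.

```python
import heapq


def guaranteed_rate_clocks(packets, rate):
    """Compute GRC values for one flow.

    packets: list of (arrival_time, length_bits) in arrival order.
    rate: reserved rate in bits per time unit.
    """
    clocks = []
    previous = 0.0
    for arrival, length in packets:
        previous = max(arrival, previous) + length / rate
        clocks.append(previous)
    return clocks


def edf_order(flows):
    """Order all packets of an aggregate by deadline (earliest first).

    flows: dict mapping flow name -> (packets, rate).
    Assumption: each packet's deadline is its GRC value.
    Returns a list of (deadline, flow_name, packet_index).
    """
    heap = []
    for name, (packets, rate) in flows.items():
        for index, deadline in enumerate(guaranteed_rate_clocks(packets, rate)):
            heapq.heappush(heap, (deadline, name, index))
    return [heapq.heappop(heap) for _ in range(len(heap))]


if __name__ == "__main__":
    flows = {
        "a": ([(0.0, 1000), (0.0, 1000)], 1000.0),
        "b": ([(0.0, 500)], 1000.0),
    }
    for entry in edf_order(flows):
        print(entry)
```

In contrast to FIFO queueing within an aggregate, this ordering lets a short packet of flow `b` leave before earlier-queued packets of flow `a` that have later deadlines.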
Allocating software components while meeting multiple platform resource constraints is crucial for model-based design of large embedded real-time software and automatic design model transformation. In this paper, we propose a new method for component allocation using an informed branch-and-bound and forward checking mechanism subject to a combination of resource constraints. We have implemented this method in the automatic integration of reusable embedded software (AIRES) toolkit - which has been developed under the DARPA MoBIES Program - and applied it to an automotive electronic throttle control (ETC) system. Our evaluation based on randomly-generated design models has shown that the proposed method scales well for large and complex embedded real-time software.
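A minimal sketch of branch-and-bound with forward checking for component allocation is shown below. It is not the AIRES implementation: the two resources (CPU and memory), the objective (minimizing the number of nodes used), and all names are assumptions chosen for illustration, and the search uses no informed heuristic ordering.

```python
def allocate(components, capacities):
    """Assign components to nodes under CPU and memory limits.

    components: list of (cpu, mem) demands.
    capacities: list of (cpu, mem) capacities, one per node.
    Returns (nodes_used, assignment) with the fewest nodes used, or None.
    """
    remaining = [list(c) for c in capacities]
    assignment = [None] * len(components)
    best = {"cost": len(capacities) + 1, "assignment": None}

    def fits(component, node):
        return (component[0] <= remaining[node][0]
                and component[1] <= remaining[node][1])

    def forward_check(start):
        # Every unassigned component must still fit on at least one node.
        return all(
            any(fits(components[i], node) for node in range(len(capacities)))
            for i in range(start, len(components))
        )

    def search(index, used):
        if len(used) >= best["cost"]:
            return  # bound: cannot improve on the best known solution
        if index == len(components):
            best["cost"] = len(used)
            best["assignment"] = assignment[:]
            return
        component = components[index]
        for node in range(len(capacities)):
            if not fits(component, node):
                continue
            remaining[node][0] -= component[0]
            remaining[node][1] -= component[1]
            assignment[index] = node
            if forward_check(index + 1):
                search(index + 1, used | {node})
            remaining[node][0] += component[0]
            remaining[node][1] += component[1]
        assignment[index] = None

    search(0, frozenset())
    if best["assignment"] is None:
        return None
    return best["cost"], best["assignment"]


if __name__ == "__main__":
    print(allocate([(2, 2), (2, 2), (3, 1)], [(4, 4), (4, 4)]))
```

Forward checking prunes a branch as soon as some unassigned component no longer fits anywhere, while the bound discards partial allocations that already use as many nodes as the best complete one.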
The Extended Finite Element Method (XFEM) is a technique used in fracture mechanics to predict how objects deform as cracks form and propagate through them. Here, we propose the use of XFEM to model the deformations r...
ISBN: 9783540241287 (print)
Adaptive algorithms are increasingly acknowledged in leading parallel and distributed research. In the past, algorithms were manually tuned to be executed efficiently on a particular architecture. However, interest has shifted towards algorithms that can adapt themselves to the computational resources. A cost model representing the behavior of the system (i.e., system parameters) and the algorithm (i.e., algorithm parameters) plays an important role in adaptive parallel algorithms. In this paper, we contribute a computational model based on Bulk Synchronous Parallel processing that predicts the performance of a parallelized split-step Fourier transform. We extracted the system parameters of a cluster (upon which our algorithm was executed) and showed the use of an algorithmic parameter in the model that exhibits optimal behavior. Our model can thus be used for the purpose of self-adaptation.
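The role of a cost model in self-adaptation can be sketched with the standard Bulk Synchronous Parallel superstep cost, w + h·g + l, where w is local work, h is the maximum number of words sent or received by a processor, g is the per-word communication cost, and l is the synchronization cost. The workload formula, the choice of processor count as the tunable parameter, and the machine parameter values below are hypothetical and are not taken from the paper.

```python
import math


def superstep_cost(work, h_relation, g, l):
    """Standard BSP cost of one superstep: w + h*g + l."""
    return work + h_relation * g + l


def predicted_cost(n, p, g, l):
    """Predicted cost of a hypothetical transform-like step on p processors.

    Assumption: n*log2(n) operations split evenly across processors, followed
    by one exchange in which each processor sends n/p words.
    """
    return superstep_cost(n * math.log2(n) / p, n / p, g, l)


def best_processor_count(n, machine):
    """Pick the processor count with the lowest predicted cost.

    machine: dict mapping processor count -> (g, l) measured for that count.
    """
    return min(machine, key=lambda p: predicted_cost(n, p, *machine[p]))


if __name__ == "__main__":
    # Hypothetical measured parameters: g and l grow with the processor count.
    machine = {2: (4.0, 1000.0), 4: (5.0, 4000.0), 8: (8.0, 20000.0)}
    print(best_processor_count(1024, machine))
```

Because the synchronization cost grows with the number of processors in this example, the predicted optimum lies at an intermediate processor count, which is the kind of decision a self-adapting algorithm would take from the model.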
In this paper, we propose an Expectation-Maximization (EM) approach to separate a shape database into different shape classes, while simultaneously estimating the shape contours that best exemplify each of the differe...
The design of multi-layer printed circuit boards is vital in the construction of complex electronic systems. Wire routing is a crucial step in the overall design process, which can be decomposed into a number of single row routing (SRR) problems. This paper proposes an approach to solve the SRR problem based on parallel meta-heuristics. The development of this technique involves the design of an encoding strategy that allows all possible routings to be uniquely represented and the derivation of cost functions that maximize the quality of the developed solutions. Further, parallelization of the proposed approach is attempted to improve the computational efficiency. The different stages of the development are backed by experiments to show the pros and cons of the sequential and parallel implementations.