ISBN: 3540406735 (print)
This paper is a practical study of the performance impact of avoiding data dependencies at the algorithm level when targeting recent deeply pipelined, superscalar processors. We are interested in multiple-precision libraries offering the equivalent of quad-double precision. We show that a combination of today's processors, today's compilers, and algorithms written in C using a data representation that exposes parallelism can outperform the reference GMP library, which is partially written in assembler. We observe that the gain is related to better use of the processor's instruction-level parallelism.
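To make the idea of a parallelism-exposing data representation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's C code: the abstract does not describe the representation, so the 16-bit limb size and the names `to_limbs`, `add_no_carry`, `normalize`, and `from_limbs` are hypothetical. Each limb is stored with spare headroom, so limb-wise additions carry no dependency from one limb to the next, and carry propagation is deferred to a single later pass.

```python
# Hypothetical illustration: limbs carry 16 payload bits and are stored in a
# wider word, so a limb-wise sum never needs the carry of its neighbour.
LIMB_BITS = 16
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(value, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def add_no_carry(a, b):
    """Limb-wise addition: every output limb depends only on its own inputs."""
    return [x + y for x, y in zip(a, b)]


def normalize(limbs):
    """Propagate the deferred carries in one sequential pass."""
    out, carry = [], 0
    for limb in limbs:
        total = limb + carry
        out.append(total & LIMB_MASK)
        carry = total >> LIMB_BITS
    out.append(carry)  # final carry becomes an extra top limb
    return out


def from_limbs(limbs):
    """Recombine limbs (normalized or not) into a Python integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))
```

In this sketch the independent limb additions in `add_no_carry` are the part a superscalar processor could overlap, while `normalize` is the only step with a serial carry chain.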
ISBN: 3540219463 (print)
The proceedings contain 149 papers. The special focus of this conference is on parallel and distributed architectures, scheduling, and load balancing. The topics include: session guarantees to achieve PRAM consistency of replicated shared objects; an extended atomic consistency protocol for recoverable DSM systems; hyper-threading technology speeds clusters; configurable microprocessor array for DSP applications; on generalized Moore digraphs; RDMA communication based on rotating buffers for efficient parallel fine-grain computations; communication on the fly in dynamic SMP clusters; accelerated diffusion algorithms on general dynamic networks; suitability of load scheduling algorithms to workload characteristics; minimizing time-dependent total completion time on parallel identical machines; diffusion based scheduling in the agent-oriented computing system; approximation algorithms for scheduling jobs with chain precedence constraints; combining vector quantization and ant-colony algorithm for mesh-partitioning; wavelet-neuronal resource load prediction for multiprocessor environment; fault-tolerant scheduling in distributed real-time systems; online scheduling of multiprocessor jobs with idle regulation; predicting the response time of a new task on a Beowulf cluster; space decomposition solvers and their performance in PC-based parallel computing environments; evaluation of execution time of mathematical library functions based on historical performance information; empirical modelling of parallel linear algebra routines; efficiency of divisible load processing; gray box based data access time estimation for tertiary storage in grid environment; performance modeling of parallel FEM computations on clusters; asymptotical behaviour of the communication complexity of one parallel algorithm; and analytical modeling of optimized sparse linear code.
ISBN: 0780381637 (print)
Applications based on the Fast Fourier Transform (FFT), such as signal and image processing, require high computational power plus the ability to experiment with algorithms. To meet the dual requirements of high performance and ease of development, in this work we present a high-level framework for the implementation of FFTs for real-time image processing applications. The frequency-domain (convolution-based) image filtering problem is targeted by developing an FPGA-based parametrisable environment based on the proposed parallel 2-D FFT architecture for real-time operation. Results show that the parallel implementation of the 2-D FFT achieves virtually linear speed-up and real-time performance for large matrix sizes.
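As a rough illustration of why a 2-D FFT lends itself to parallel hardware, the sketch below uses the standard row-column decomposition: 1-D FFTs over every row, then over every column. This is an assumption for illustration only; the abstract does not say how the proposed architecture decomposes the transform, and the function names `fft` and `fft2` are hypothetical. The code is plain Python, not an FPGA design.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(matrix):
    """2-D FFT by row-column decomposition.

    The row transforms are independent of each other, and so are the
    column transforms, which is what a parallel implementation can exploit.
    """
    rows = [fft(row) for row in matrix]            # one 1-D FFT per row
    cols = [fft(list(col)) for col in zip(*rows)]  # one 1-D FFT per column
    return [list(r) for r in zip(*cols)]           # transpose back
```

Because no row transform reads another row's data, the rows (and later the columns) could in principle be handed to separate processing elements.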
We examined a model of sound wave propagation in a laminar composite as a one-dimensional elastic dissipative system of nodes and ties. This model is defined by a system of differential equations. We used the implemen...
ISBN: 3540406735 (print)
This paper presents the Multimedia C language, which is designed for the multimedia extensions included in all modern microprocessors. The paper discusses the language syntax, the implementation of its compiler, and its use in developing multimedia applications. The goal was to provide programmers with the most natural way of using multimedia processing facilities in the C language.
ISBN: 0889863350 (print)
With the increasing size and complexity of interconnected power systems, dynamic stability simulation in the time domain is becoming more time consuming. To speed up dynamic analysis and meet the demands of real-time simulation of large-scale power systems, a parallel piecewise solution method is proposed for solving sparse network equations on a MIMD computer. A large-grain parallel algorithm for LU factorization and forward and backward substitution is proposed and validated for the time-consuming sparse network solution arising in network analysis. This algorithm has been successfully incorporated into a conventional dynamic stability simulator for practical power systems, and results obtained on an 8-processor SGI Origin 2100 are reported.
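For readers unfamiliar with the three kernels named above, the following is a minimal dense, sequential sketch of LU factorization followed by forward and backward substitution. It is not the paper's parallel piecewise sparse method; it assumes a small dense matrix with nonzero pivots (no pivoting is performed), and all function names are hypothetical.

```python
def lu_factor(a):
    """Doolittle LU factorization without pivoting: a = lower * upper."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(v) for v in row] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]  # assumes a nonzero pivot
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


def forward_substitution(lower, b):
    """Solve lower * y = b, where lower has a unit diagonal."""
    y = []
    for i, row in enumerate(lower):
        y.append(b[i] - sum(row[j] * y[j] for j in range(i)))
    return y


def backward_substitution(upper, y):
    """Solve upper * x = y."""
    n = len(upper)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / upper[i][i]
    return x


def solve(a, b):
    """Solve a * x = b using the three steps above."""
    lower, upper = lu_factor(a)
    return backward_substitution(upper, forward_substitution(lower, b))
```

In a time-domain simulation the factorization is typically reused across many right-hand sides, so the two substitution steps dominate the repeated cost; that observation is general background, not a claim taken from the abstract.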
ISBN: 3540406735 (print)
This paper presents a new scheme for parallel computations on cluster systems for time-consuming problems of globally optimal decision making. This uniform scheme (without any centralized control processor) is based on the idea of multidimensional problem reduction. Using some new multiple mappings (of the Peano curve type), a multidimensional problem is reduced to a family of univariate problems that can be solved in parallel in such a way that each processor shares the information obtained by the other processors.
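The reduction idea can be sketched with a discrete Hilbert curve, which is one space-filling curve of the Peano type. The sketch below is a simplified stand-in, not the paper's scheme: it uses a single curve rather than the multiple mappings the abstract mentions, an exhaustive scan rather than a global optimization algorithm, and a sequential loop over index chunks in place of real processors. The names `hilbert_point` and `minimize_along_curve` are hypothetical.

```python
def hilbert_point(order, index):
    """Map a curve index in [0, 4**order) to a cell (x, y) on a 2**order grid."""
    x = y = 0
    t = index
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant so consecutive cells stay adjacent
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def minimize_along_curve(f, order, workers=4):
    """Minimize f over the unit square by scanning one scalar curve index.

    The index range is split into `workers` chunks; each chunk is a
    univariate search that could run on its own processor, with the best
    value found so far as the shared information.
    """
    n = 1 << order
    total = n * n
    chunk = (total + workers - 1) // workers
    best = None
    for w in range(workers):
        for index in range(w * chunk, min((w + 1) * chunk, total)):
            cx, cy = hilbert_point(order, index)
            point = ((cx + 0.5) / n, (cy + 0.5) / n)  # cell centre in [0, 1]^2
            value = f(*point)
            if best is None or value < best[0]:
                best = (value, point)
    return best
```

The key property used here is that a two-dimensional search has been rewritten as a search over one scalar index, so any univariate method could replace the exhaustive scan.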
ISBN: 354040211X (print)
In this paper we analyze parallel processing, on clusters of computers, of an improved prediction method based on RBF neural networks and matrix decomposition techniques (SVD and QR-cp). Parallel processing is required because of the extensive computation involved in such a hybrid prediction technique, the reward being better prediction performance and lower network complexity. We discuss two alternatives for concurrency: a parallel implementation of the prediction procedure on top of the ScaLAPACK suite, and the formulation of another parallel routine, customized to a higher degree for better performance, for the QR-cp procedure.
ISBN: 3540408045 (print)
This paper presents a new approach to processing capacity protection in a service system with multiple server units. In integrated service communication networks, an important problem is to implement call admission control and routing so as to use the network resources optimally. We assumed several classes of jobs and tried to reserve an amount of processing capacity for high-priority jobs that can arrive in a burst. In parallel, current low-priority jobs were processed with continuous regulation of server load. Simulation results showed rapid adaptation and good balancing around the predetermined maximum processing level. Inspiration was found in the paradigm of software agent technology and the potential advantages that appear when it is applied in telecommunication networks. The main characteristic of using an intelligent agent is the opportunity to continually adapt one or more parameters according to the optimal, requested policy.
To estimate the velocity of sea ice movement, an effective method has been adopted based on the characteristics of sea ice images. Proper candidate points are found using interest operators, and a locally parallel matching model is constructed to analyze the disparity between sequential images. After some post-processing, the velocity estimate is obtained using an orientation-selection and maximum-probability decision method. The algorithm has shown good performance in practice for estimating the velocity of sea ice in the Bohai Sea.