In this paper, the directivity of the circular array is analyzed, and a real-time frequency-domain beamforming algorithm for circular arrays, similar in structure to a parallel FIR filter, is proposed by using the characteristic ...
VASP (Vienna Ab initio Simulation Package) is a prevalent first-principles software framework. It is so widely used that its runtime usually dominates the usage of current supercomputers. The porting and optimization o...
ISBN: (print) 9783030050573; 9783030050566
Recent decades have seen the rapid development of cloud computing, resulting in a huge breakthrough in people's ability to handle the data produced every second and everywhere. Meanwhile, data compression is becoming increasingly important, due to its great potential to benefit both network transport and storage. Motivated by the urgent demand for a highly efficient compression method that balances compression time and compression ratio, this paper presents PLZMA, a parallel design of LZMA. Process-level and thread-level parallelism are implemented according to the LZMA algorithm, greatly improving compression time while ensuring a fair compression ratio. Experimental results on a real-world application show that PLZMA achieves more balanced performance than other well-known methods. The parallel design achieves a speedup of 8x over the serial baseline using 12 threads.
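The general idea of parallel compression can be illustrated with a minimal sketch. This is not PLZMA itself: the abstract does not describe its internals, so the sketch below simply assumes independent fixed-size chunks, each compressed with Python's standard `lzma` module on a thread pool. The function names `compress_chunks` and `decompress_chunks`, the chunk size, and the worker count are all hypothetical choices for illustration.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> list[bytes]:
    """Split data into fixed-size chunks and compress each one independently.

    Assumption: chunks share no dictionary state, which costs some
    compression ratio in exchange for parallelism.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so blocks line up with chunks.
        return list(pool.map(lzma.compress, chunks))


def decompress_chunks(blocks: list[bytes]) -> bytes:
    """Decompress each block and concatenate the results in order."""
    return b"".join(lzma.decompress(block) for block in blocks)


if __name__ == "__main__":
    payload = b"abc" * 100_000
    blocks = compress_chunks(payload, chunk_size=65_536, workers=4)
    assert decompress_chunks(blocks) == payload
    print(f"{len(blocks)} blocks, {sum(map(len, blocks))} compressed bytes")
```

Smaller chunks expose more parallelism but give each compressor less context, which is one way the trade-off between compression time and ratio mentioned in the abstract can arise. For reference, a speedup of 8x on 12 threads corresponds to a parallel efficiency of 8/12, or about 67%.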
ISBN: (print) 9783319654829; 9783319654812
In this manuscript, we present an optimized and parallel version of our previous work IMSAME, an exhaustive gapped aligner for the pairwise and accurate comparison of metagenomes. Parallelization strategies are applied to take advantage of modern multiprocessor architectures. In addition, sequential optimizations in CPU time and memory consumption are provided. These algorithmic and computational enhancements enable IMSAME to calculate near-optimal alignments, which are used to directly assess similarity between metagenomes without requiring reference databases. We show that the overall efficiency of the parallel implementation is above 80% while retaining scalability as the number of parallel cores used increases. Moreover, we also show that sequential optimizations yield up to 8x speedup for scenarios with larger data.
ISBN: (print) 9783319271361
The proceedings contain 59 papers. The special focus in this conference is on Applications of Parallel and Distributed Computing. The topics include: on exploring a virtual agent negotiation inspired approach for route guidance in urban traffic networks; optimization of binomial option pricing on Intel MIC heterogeneous system; stencil computations on HPC-oriented ARMv8 64-bit multi-core processor; a particle swarm optimization algorithm for controller placement problem in software defined network; a streaming execution method for multi-services in mobile cloud computing; economy-oriented deadline scheduling policy for render system using IaaS cloud; towards detailed tissue-scale 3D simulations of electrical activity and calcium handling in the human cardiac ventricle; task parallel implementation of matrix multiplication on multi-socket multi-core architectures; refactoring for separation of concurrent concerns; exploiting scalable parallelism for remote sensing analysis models by data transformation graph; resource-efficient vibration data collection in cyber-physical systems; a new approach for vehicle recognition and tracking in multi-camera traffic system; a scalable distributed fingerprint identification system; energy saving and load balancing for SDN based on multi-objective particle swarm optimization; pre-stack Kirchhoff time migration on Hadoop and Spark; a cyber physical system with GPU for CNC applications; a solution of the controller placement problem in software defined networks; parallel column subset selection of kernel matrix for scaling up support vector machines; real-time deconvolution with GPU and Spark for big imaging data analysis; and parallel Kirchhoff pre-stack depth migration on large high performance clusters.
ISBN: (print) 9783319271613; 9783319271606
Scheduling precedence-constrained stochastic tasks on heterogeneous cluster systems is an important issue that significantly impacts cluster performance. Unlike deterministic task models, the stochastic task model assumes that the workload of a task and the quantity of data transmitted between tasks are stochastic variables, which is more realistic than other task models. Scheduling models and algorithms for precedence-constrained stochastic tasks have recently attracted the attention of many researchers. The SDLS (Stochastic Dynamic Level Scheduling) algorithm has been shown to perform well in scheduling stochastic tasks on heterogeneous clusters. However, its assumption about communication time between tasks is much simpler than its assumptions about task computing time, so it cannot depict the communication cost among heterogeneous links well. In this paper, the quantity of data communicated between tasks is assumed to be a normally distributed stochastic variable, instead of directly assuming that the communication time on all heterogeneous links is the same stochastic variable. Moreover, a modified scheduling model and algorithm, SDLS-HC (Stochastic Dynamic Level Scheduling on Heterogeneous Communication links), are proposed. The work in this paper focuses on accounting for communication cost in much more detail in SDLS-based task scheduling. Evaluation on many randomly generated task experiments demonstrates that SDLS-HC achieves better performance than SDLS on cluster systems with heterogeneous links.
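To make the modeling change concrete, the following sketch shows how a normally distributed data quantity could translate into a per-link communication time. It rests on an assumption not stated in the abstract: that communication time equals data quantity divided by a fixed link bandwidth. Under that assumption, scaling a normal variable by a constant gives another normal variable with the mean and standard deviation both divided by the bandwidth. The names `Normal` and `comm_time` are hypothetical and are not taken from SDLS-HC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Normal:
    """A normal distribution described by its mean and standard deviation."""
    mean: float
    std: float


def comm_time(data: Normal, bandwidth: float) -> Normal:
    """Distribution of transfer time for a stochastic data quantity on one link.

    Assumption (for illustration only): time = data / bandwidth with a
    constant bandwidth, so the result is the input scaled by 1 / bandwidth.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return Normal(data.mean / bandwidth, data.std / bandwidth)


if __name__ == "__main__":
    data = Normal(mean=100.0, std=10.0)   # hypothetical data quantity
    for bw in (50.0, 10.0):               # two hypothetical heterogeneous links
        print(bw, comm_time(data, bw))
```

The same data quantity yields different time distributions on links of different bandwidth, which is the kind of per-link variation that a single shared communication-time variable cannot express.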
Matrix multiplication is a very important computation kernel in many science and engineering applications. This paper presents a parallel implementation framework for dense matrix multiplication on multi-socket multi-...
R is a widely used statistical programming language in the data science community. However, in the big data era, R faces challenges from large-scale data analysis tasks. It lacks the ability of distributed linear ...
Today, millions of legacy programs are awaiting their parallelization. For this reason, the automatic discovery of parallelism in sequential programs is now receiving considerable attention. However, past efforts main...
The Intel Xeon Phi is a many-core accelerator aimed at high-performance applications. To characterize the performance of the Intel Xeon Phi, a system of dual 8-core Intel Xeon E5-2670 processors is employe...