Background: High-throughput DNA/RNA sequencing has revolutionized biological and clinical research. Sequencing is widely used and generates very large amounts of data, mainly because of reduced cost and advanced technologies. Quickly assessing the quality of giga- to tera-base levels of sequencing data has become a routine but important task. Identifying and eliminating low-quality sequence data is crucial for the reliability of downstream analysis results. There is a need for a high-speed tool that uses optimized parallel programming for batch processing and simply gauges the quality of sequencing data from multiple datasets, independent of any other processing steps. Results: FQStat is a stand-alone, platform-independent software tool that assesses the quality of FASTQ files using parallel programming. Based on the machine architecture and input data, FQStat automatically determines the number of cores and the amount of memory to be allocated per file for optimum performance. Our results indicate that in a core-limited case, core assignment overhead exceeds the benefit of additional cores. In a core-unlimited case, performance reaches a saturation point as additional cores are assigned per file. We also show that memory allocation per file has a lower priority in performance than the allocation of cores. FQStat's output is summarized as an HTML web page, a tab-delimited text file, and high-resolution images. FQStat calculates and plots read count, read length, quality score, and high-quality base statistics. FQStat identifies and marks low-quality sequencing data to suggest removal from downstream analysis. We applied FQStat to real sequencing data to optimize performance and to demonstrate its capabilities. We also compared FQStat's performance to similar quality control (QC) tools that utilize parallel programming and attained improvements in run time. Conclusions: FQStat is a user-friendly tool with a graphical interface that emplo
Motivation: In modern IT systems, the increasing demand for computational power is tightly coupled with ever higher energy consumption. Traditionally, energy efficiency research has focused on reducing energy consumption at the hardware level. Nevertheless, the software itself provides numerous opportunities for improving energy efficiency. Goal: Given that energy efficiency for IT systems is a rising concern, we investigate existing work in the area of energy-aware software development and identify open research challenges. Our goal is to reveal limitations, features, and tradeoffs regarding energy-performance for software development and provide insights on existing approaches, tools, and techniques for energy-efficient programming. Method: We analyze and categorize research work mostly extracted from top-tier conferences and journals concerning energy efficiency across the software development lifecycle phases. Results: Our analysis shows that related work in this area has focused mainly on the implementation and verification phases of the software development lifecycle. Existing work shows that the use of parallel and approximate programming, source code analyzers, efficient data structures, coding practices, and specific programming languages can significantly increase energy efficiency. Moreover, the utilization of energy monitoring tools and benchmarks can provide insights for the software practitioners and raise energy-awareness during the development phase.
We present the most recent release of our parallel implementation of the BFS and BC algorithms for the study of large-scale graphs. Although our reference platform is a high-end cluster of new-generation Nvidia GPUs and some of our optimizations are CUDA specific, most of our ideas can be applied to other platforms offering multiple levels of parallelism. We exploit multi-level parallel processing through a hybrid programming paradigm that combines highly tuned CUDA kernels, for the computations performed by each node, and explicit data exchange through the Message Passing Interface (MPI), for the communications among nodes. The results of the numerical experiments show that the performance of our code is comparable to or better than that of other state-of-the-art solutions. For the BFS, for instance, we reach a peak performance of 200 GigaTEPS on a single GPU and 5.5 TeraTEPS on 1024 Pascal GPUs. We release our source codes both for reproducing the results and for facilitating their usage as a building block for the implementation of other algorithms.
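For readers unfamiliar with the traversal being parallelized, the following is a minimal serial sketch of a frontier-based (level-synchronous) BFS, the general structure that parallel BFS codes commonly build on. It is not the authors' CUDA/MPI implementation; the function names `bfs_levels` and `traversed_edges_per_second` are hypothetical, and the graph is assumed to be undirected and given as an edge list.

```python
from collections import defaultdict


def bfs_levels(edges, source):
    """Return {vertex: hop distance from source} for an undirected edge list."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # In a parallel version, the vertices (or edges) of the current
        # frontier are the units of work distributed across threads or nodes.
        for u in frontier:
            for v in adjacency[u]:
                if v not in level:  # first visit fixes the BFS level
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level


def traversed_edges_per_second(edge_count, seconds):
    """TEPS metric: edges traversed divided by wall-clock time."""
    return edge_count / seconds


if __name__ == "__main__":
    sample_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
    print(bfs_levels(sample_edges, 0))
```

Each iteration of the `while` loop processes one whole BFS level before moving to the next, which is why the frontier is the natural unit to split across parallel workers.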
A novel parallel technique that couples the lattice-Boltzmann method and a finite volume scheme for the prediction of concentration polarisation and pore blocking in axisymmetric cross-flow membrane separation processes is presented. The model uses the lattice-Boltzmann method to solve the incompressible Navier-Stokes equations for hydrodynamics and the finite volume method to solve the convection-diffusion equation for solute particles. Concentration polarisation is modelled for micro-particles by defining the diffusion coefficient as a function of particle concentration and shear rate. The model considers the effect of an incompressible cake formation. The pore blocking phenomenon is predicted for filtration membrane fouling by using the rate of particles arriving at the membrane surface. The simulation code is parallelised in two ways. Compute Unified Device Architecture (CUDA) is used for a cluster of graphical processing units (GPUs) and Message Passing Interface (MPI) is utilised for a cluster of central processing units (CPUs), with various parallelisation techniques to optimise memory usage for higher performance. The proposed model is validated by comparison with analytical solutions and experimental results.
Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are roughly defined and packaged as executable jars without any functionality being exposed or described. This means that deployed jobs are not natively composable and reusable for subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), which is a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution, integration and management of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations can achieve improvements of 10 to 40 percent in job completion time for different types of applications when compared with the current state of the art, Apache Spark.
As the open standard for parallel programming of heterogeneous systems, OpenCL has been used in this study in the context of a particular and intensive computing task, namely the voxelization of tessellated objects. F...
ISBN: 9781728103396 (print)
The article discusses a method to increase the efficiency of solving the problem of finding images in a database. This method is based on the use of perceptual hashing of an image, three levels of data parallelization and image search procedures. To implement parallel data processing, the principle of symmetric horizontal data distribution and the capabilities of modern processors (SIMD registers and corresponding instructions) are used. The results of a computational experiment, confirming the effectiveness of the proposed method, are presented.
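To make the idea of hash-based image search concrete, here is a minimal sketch that assumes an average-hash scheme, since the abstract does not say which perceptual hash the article uses. The names `average_hash`, `hamming_distance`, and `find_similar` are hypothetical, and images are assumed to be already downscaled to a small grayscale grid.

```python
def average_hash(pixels):
    """Hash a 2D grid of grayscale values: one bit per pixel, set if above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits; XOR plus popcount maps well onto SIMD hardware."""
    return bin(a ^ b).count("1")


def find_similar(query_hash, database, max_distance):
    """Return names of stored hashes within max_distance bits of the query."""
    return [
        name
        for name, stored in database.items()
        if hamming_distance(query_hash, stored) <= max_distance
    ]


if __name__ == "__main__":
    image_a = [[10, 200], [200, 10]]
    image_b = [[12, 190], [210, 220]]
    db = {"a": average_hash(image_a), "b": average_hash(image_b)}
    print(find_similar(average_hash(image_a), db, max_distance=1))
```

Because every database entry is compared independently, the scan in `find_similar` can be split horizontally across workers, with each worker holding an equal share of the stored hashes.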
ISBN: 9781728103396 (print)
A new approach to the use of nanotubes as an alloying tool for living biological objects is proposed. The basis of the mechanism of doping is chemical detonation in the nanotube. A procedure for the accelerated calculation of nanotube parameters for the purpose of doping is proposed. The possibilities of parallel programming in calculating the parameters of a detonation gas mixture are shown.
ISBN: 9781728103396 (print)
Data parallelism is inherent to the algebra of multidimensional matrices. Therefore, the operations of this algebra can be implemented in parallel using generalizations of parallel multiplication algorithms for ordinary matrices. The article discusses a recursive approach to multiplying multidimensional matrices for the operation of (2, 1)-contracted multiplication of four-dimensional matrices.