Parallel video coding has emerged from the need to map video algorithms onto multi- and many-core architectures and to meet ever-growing performance goals in video-based applications. Several parallelization methods have been proposed for H.264, but it was not until the new HEVC video standard that two parallelization strategies, Tiles and Wavefront Parallel Processing (WPP), became part of the specification. Effective selection and usage of Tiles or WPP is an open issue. In this paper we evaluate the performance of both strategies in terms of video decoding speed-up, including their correlation with additional optimization possibilities such as parallel filtering and low-level SIMD operations.
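As an illustration of the wavefront idea only, the sketch below groups coding tree units (CTUs) by the earliest step at which they could be decoded. It is not taken from the paper: the function name `wpp_schedule`, the one-step-per-CTU cost model and the `lag` parameter are assumptions made for this example, with `lag=2` reflecting the usual two-CTU offset between consecutive rows in WPP.

```python
def wpp_schedule(rows, cols, lag=2):
    """Group CTU coordinates by the earliest step at which they can be decoded.

    Assumptions (hypothetical cost model): every CTU takes one step, and the
    CTU at (r, c) has to wait for its left neighbour (r, c - 1) and for the
    above-right CTU (r - 1, c + 1). That yields a lag of two CTUs per row.
    """
    steps = {}
    for r in range(rows):
        for c in range(cols):
            # Earliest step for (r, c) under the dependencies above.
            steps.setdefault(c + lag * r, []).append((r, c))
    return [steps[t] for t in sorted(steps)]


# A 3x4 CTU grid: sequential decoding needs 12 steps, the wavefront fewer.
waves = wpp_schedule(rows=3, cols=4)
for step, ctus in enumerate(waves):
    print(step, ctus)
```

Under these assumptions the number of CTUs per step is the available parallelism, which ramps up at the start of a picture and down at the end. That ramp is one reason the choice between WPP and Tiles depends on picture size and core count.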
High-throughput molecular biology techniques are widely used to identify physical interactions between genetic elements located throughout the human genome. Chromosome Conformation Capture (3C) and related techniques make it possible to investigate the spatial organisation of chromosomes in the cell's natural state. Recent results have shown a strong correlation between co-localization and co-regulation of genes, but exploiting this important information is hampered by the lack of biologist-friendly analysis and visualisation software. In this work we introduce NuChart-II, a tool for Hi-C data analysis that provides a gene-centric view of the chromosomal neighbourhood in a graph-based manner. NuChart-II is an efficient and highly optimized C++ re-implementation of a previous prototype package developed in R. Representing Hi-C data with a graph-based approach moves beyond the common view that relies on genomic coordinates and permits the use of graph analysis techniques to explore the spatial conformation of a gene neighbourhood.
Convolution-based detection models (CDM), such as deformable part-based models (DPM) and convolutional neural networks (CNN), have achieved tremendous success in computer vision in the last few years. The simplicity of the...
Advancements in the Internet and cloud computing have resulted in a huge amount of multimedia data, and processing this data has become more complex and computationally intensive. As a result, it has be...
ISBN: (print) 9783319160856
The proceedings contain 25 papers. The special focus in this conference is on Reconciling Parallelism and Predictability in Mixed-Critical Systems. The topics include: parallel-operation-oriented optically reconfigurable gate array; exploiting outer loops vectorization in high level synthesis; cache- and communication-aware application mapping for shared-cache multicore processors; parallelizing convolutional neural networks on Intel Many Integrated Core architecture; mobile ecosystem driven dynamic pipeline adaptation for low power; a virtual execution environment for cyber-physical applications; trustworthy self-optimization in organic computing environments; improving reliability and endurance using end-to-end trust in distributed low-power sensor networks; privacy preserved content disclosure for data sharing in cloud; virtualized communication controllers in safety-related automotive embedded systems; network interface with task spawning support for NoC-based DSM architectures; MESI-based cache coherence for hard real-time multicore systems; and speeding up static probabilistic timing analysis.
ISBN: (print) 9783319174723
The proceedings contain 25 papers. The special focus in this conference is on Accelerator Programming, Algorithms for Parallelism, Compilers, Debugging and Vectorization. The topics include: a practical GPGPU optimizing compiler using data sharing and thread coarsening; evaluating performance portability of OpenACC; NAS Parallel Benchmarks for GPGPUs using a directive-based programming model; understanding co-run degradations on integrated heterogeneous processors; hiding the overhead of inspector-executor style dynamic parallelization; tiled linear algebra a system for parallel graph algorithms; fast automatic heuristic construction using active learning; jagged tiling for intra-tile parallelism and fine-grain multithreading; memory management techniques for exploiting RDMA in PGAS languages; change detection based parallelism mapping; automatic streamization of image processing applications; parallelism-aware array data flow analysis for OpenMP; static approximation of MPI communication graphs for optimized process placement; automatic parallelism through macro dataflow in MATLAB; re-engineering compiler transformations to outperform database query optimizers; systematic debugging of concurrent systems using coalesced stack trace graphs; unification of static and dynamic analyses to enable vectorization; and efficient exploitation of hyper loop parallelism in vectorization.
Color space conversion and downsampling are among the most computationally intensive steps in typical image and video codec standards, and accelerating them significantly improves the performance of these applications. In this paper, we describe a parallel implementation of color space conversion and downsampling as pre-processing steps for the JPEG encoder in a heterogeneous environment, using the cross-platform Open Computing Language (OpenCL). This work combines a multi-core CPU and a many-core GPU in a single solution to perform the computation of the JPEG encoder pre-processing stages. Compared with a CPU-based implementation, our OpenCL parallel implementation speeds up the computations by a factor of 8.78 on both CPU and GPU devices.
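For orientation, the two pre-processing steps can be sketched sequentially in plain Python. This is not the authors' OpenCL code: the function names are made up for the example, the conversion uses the full-range JFIF (BT.601) coefficients, and the downsampling is assumed to be a simple 2x2 average (4:2:0 style) on planes with even dimensions.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr with the JFIF coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(plane):
    """Average every 2x2 block of a plane (a list of rows).

    Assumption for this sketch: height and width are both even.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


# Hypothetical 2x2 image: convert each pixel, then shrink the Cb plane.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
cb_plane = [[rgb_to_ycbcr(*px)[1] for px in row] for row in image]
print(downsample_2x2(cb_plane))
```

Each output value depends only on one pixel (conversion) or on one 2x2 block (downsampling), so the work items are independent. That independence is what makes these steps a natural fit for data-parallel devices.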
This paper presents the first distributed triangle listing algorithm with provable CPU, I/O, memory, and network bounds. Finding all triangles (3-cliques) in a graph has numerous applications for density and connectivity metrics, but the majority of existing algorithms for massive graphs are sequential, while distributed versions do not guarantee their CPU, I/O, memory, or network requirements. Our Parallel and Distributed Triangle Listing (PDTL) framework focuses on efficient external-memory access in distributed environments instead of fitting subgraphs into memory. It works by performing efficient orientation and load-balancing steps, and by replicating graphs across machines using an extended version of Hu et al.'s Massive Graph Triangulation algorithm. PDTL suits a variety of computational environments, from single-core machines to high-end clusters, and computes the exact triangle count on graphs of over 6B edges and 1B vertices (e.g. Yahoo graphs), outperforming and using fewer resources than the state-of-the-art systems PowerGraph, OPT, and PATRIC by 2x to 4x. Our approach thus highlights the importance of I/O in a distributed environment.
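The role of the orientation step can be seen in a small in-memory sketch. This is a generic, sequential degree-ordered triangle listing, not PDTL and not Hu et al.'s algorithm; the function name and the tie-breaking by vertex id are assumptions made for this example.

```python
def list_triangles(edges):
    """List each triangle of an undirected graph exactly once.

    Sketch of the classic orientation trick: direct every edge from the
    lower-ranked endpoint to the higher-ranked one, where rank is
    (degree, vertex id), then intersect out-neighbourhoods.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    rank = {v: (len(nbrs), v) for v, nbrs in adj.items()}
    # Orientation: keep only neighbours with a higher rank.
    out = {v: {w for w in nbrs if rank[w] > rank[v]}
           for v, nbrs in adj.items()}

    triangles = []
    for u in out:
        for v in out[u]:
            # Any w reachable from both u and v closes a triangle.
            for w in out[u] & out[v]:
                triangles.append((u, v, w))
    return triangles


print(list_triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Because every triangle has a unique lowest-ranked and a unique middle-ranked vertex, it is reported once rather than six times. The orientation also caps the out-degree of high-degree vertices, which is what makes the per-vertex work easier to balance.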
The Vehicular Ad-Hoc Network (VANET) is one of the most important techniques in smart cities. The service discovery protocol is a foundation stone of VANET. All the location-based requests could be replied only if the...
Applications in many domains search moving object trajectory databases. The distance threshold search finds all trajectories within a given distance of a query trajectory. We develop three GPU distance threshold search implementations that use indexing techniques significantly different from those used in CPU implementations. We determine experimentally under which conditions each approach performs well, using one real-world astrophysics dataset and two synthetic datasets. Overall, we find that the GPU is an attractive technology for a broad range of relevant trajectory database scenarios.
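To make the query concrete, here is a brute-force baseline in Python. It is only a sketch under strong assumptions that are not from the paper: trajectories are lists of positions sampled at the same time steps, and a trajectory matches if it comes within distance `d` of the query at any shared time step. None of the GPU indexing schemes evaluated in the paper is reproduced here, and all names are hypothetical.

```python
import math


def within_threshold(query, candidate, d):
    """True if the two trajectories are within distance d at some time step.

    Assumption: both are position lists sampled at identical time steps.
    """
    return any(math.dist(p, q) <= d for p, q in zip(query, candidate))


def distance_threshold_search(query, database, d):
    """Return the names of all trajectories in `database` that match."""
    return [name for name, trajectory in database.items()
            if within_threshold(query, trajectory, d)]


# Hypothetical toy data: two time steps, 2-D positions.
query = [(0.0, 0.0), (1.0, 0.0)]
database = {
    "a": [(0.0, 3.0), (1.0, 0.5)],  # close to the query at the second step
    "b": [(5.0, 5.0), (6.0, 6.0)],  # never close
}
print(distance_threshold_search(query, database, d=1.0))
```

The baseline compares the query against every stored point. An index, on a CPU or a GPU, exists to prune most of those comparisons.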