ISBN: 9783642369490 (print)
FastFlow is a structured parallel programming framework targeting shared-memory multi-core architectures. In this paper we introduce a FastFlow extension that also supports networks of multi-core workstations. The extension supports the execution of FastFlow programs by coordinating, in a structured way, the fine-grain parallel activities running on a single workstation. We discuss the design and implementation of this extension and present preliminary experimental results validating it on state-of-the-art networked multi-core nodes.
ISBN: 9783642374494 (print)
The proceedings contain 74 papers. The topics discussed include: similarity joins on item set collections using zero-suppressed binary decision diagrams; adaptive query scheduling in key-value data stores; near-optimal partial linear scan for nearest neighbor search in high-dimensional space; distributed ah-tree based index technology for multi-channel wireless data broadcast; beyond click graph: topic modeling for search engine query log analysis; shortest path computation over disk-resident large graphs based on extended bulk synchronous parallel methods; NameNode and DataNode coupling for a power-proportional Hadoop distributed file system; ServiceBase: a programming knowledge-base for service oriented development; a mechanism for stream program performance recovery in resource limited compute clusters; MUSTBLEND: blending visual multi-source twig query formulation and query processing in RDBMS; and content based retrieval for lunar exploration image databases.
ISBN: 9783642374869 (print)
The proceedings contain 74 papers. The topics discussed include: similarity joins on item set collections using zero-suppressed binary decision diagrams; adaptive query scheduling in key-value data stores; near-optimal partial linear scan for nearest neighbor search in high-dimensional space; distributed ah-tree based index technology for multi-channel wireless data broadcast; beyond click graph: topic modeling for search engine query log analysis; shortest path computation over disk-resident large graphs based on extended bulk synchronous parallel methods; NameNode and DataNode coupling for a power-proportional Hadoop distributed file system; ServiceBase: a programming knowledge-base for service oriented development; a mechanism for stream program performance recovery in resource limited compute clusters; MUSTBLEND: blending visual multi-source twig query formulation and query processing in RDBMS; and content based retrieval for lunar exploration image databases.
ISBN: 9781475780123 (print)
The ever-decreasing price/performance ratio of microcontrollers makes it economically attractive to replace more and more conventional mechanical or electronic control systems within many products by embedded real-time computer systems. An embedded real-time computer system is always part of a well-specified larger system, which we call an intelligent product. Although most intelligent products start out as stand-alone units, many of them are required to interact with other systems at a later stage. At present, many industries are in the middle of this transition from stand-alone products to networked embedded systems. This transition requires reflection and architecting: the complexity of the evolving distributed artifact can only be controlled if careful planning and principled design methods replace the ad hoc engineering of the first version of many stand-alone embedded products. Design Methods and Applications for Distributed Embedded Systems documents recent approaches and results presented at the IFIP TC10 Working Conference on Distributed and Parallel Embedded Systems (DIPES 2004), which was held in August 2004 as a co-located conference of the 18th IFIP World Computer Congress in Toulouse, France, and sponsored by the International Federation for Information Processing (IFIP). The topics chosen for this working conference are very timely: model-based design methods, design space exploration, design methodologies and user interfaces, networks and communication, scheduling and resource management, fault detection and fault tolerance, and verification and analysis. These topics are supplemented by several hardware and application-oriented papers.
ISBN: 9781467345651; 9780769549033 (print)
Parallel transactions in distributed DBs incur high overhead for concurrency control and aborts. We propose an alternative approach that pre-serializes possibly conflicting transactions and parallelizes non-conflicting update transactions to different replicas. Our system, Gargamel, provides strong transactional guarantees. In effect, Gargamel partitions the database dynamically according to the update workload. Each database replica runs sequentially, at full bandwidth; mutual synchronisation between replicas remains minimal. Our simulations show that Gargamel improves both response time and load by an order of magnitude when contention is high (highly loaded system with bounded resources), and that otherwise the slow-down is negligible.
ISBN: 9781467345651; 9780769549033 (print)
Cloud computing enables diverse new application areas for distributed computing. Many upcoming cloud applications do not fit simple programming models such as "embarrassing parallelism" but have complex data dependencies and require atomic operations spanning multiple objects. Some large-scale storage systems already implement atomic multi-object operations, but they do not address the complementary problem of efficiently propagating replica updates. In this paper, we present the design and implementation of a smart replication protocol in the ECRAM in-memory storage, which supports atomic multi-object operations. The performance analysis shows that the adaptive mechanism requires much less bandwidth and less memory, and results in improved application performance and responsiveness.
ISBN: 9781467345651 (print)
Molecular docking is a widely used tool in Computer-aided Drug Design and Discovery. Due to the complexity of simulating the chemical events when two molecules interact, highly accelerated molecular docking programs are of great interest and importance for practical use. In this paper, we present a GPU-accelerated docking program implemented with CUDA. Hardware-enabled texture interpolation is employed for fast energy evaluation. Two types of parallel genetic algorithms are mapped to the CUDA computing architecture and used to search for the optimal docking result. Compared to the CPU implementation, the GPU-accelerated docking program achieved significant speedup while producing results comparable to the CPU version. The source code is made public at http://***/p/cudock/.
ISBN: 9781467345651; 9780769549033 (print)
This paper presents a GPU-based spherical coordinate conversion system for panorama video image stitching. Modern programmable GPUs make it possible to process multiple images at interactive frame rates. To perform image stitching to form a panorama view, we use OpenCL to stitch multiple images and then texture-map the result onto a spherical object. This allows us to compose an immersive environment. In the case study presented in this paper, we achieve a speedup factor of 76x.
ISBN: 9781467345651 (print)
Efficient mapping of a real-time HD video application to graphics hardware is challenging. Developers face the challenges of choosing the right parallelism model, balancing per-thread processing granularity across the massive computing resources on the GPU, and partitioning tasks between the CPU and GPU. The paper illustrates the mapping approaches through the case of an HD H.264 encoder based on the x264 reference code, which is then evaluated in depth on state-of-the-art CPUs and GPUs. In the paper, we first split most of the computing task into Single-Instruction Multiple-Thread (SIMT) kernels, which are then chained into a certain input/output data stream. Then we implement a complete H.264 encoding on the Compute Unified Device Architecture (CUDA) platform. Finally, we present methods for exploiting multi-level parallelism and memory efficiency when mapping H.264 code, which we use to increase the efficiency of execution on GPUs. Our experimental results show that efficient GPU computation, and in turn real-time encoding performance, are achieved with CUDA.
ISBN: 9781467345651; 9780769549033 (print)
Cloud computing is a suitable platform for the execution of complex computational tasks and scientific simulations that are described in the form of workflows. Such applications are managed by a Workflow Management System (WfMS). Because existing WfMSs are not able to autonomically provision resources to real-time applications and schedule them while supporting fault tolerance and data privacy, we present a highly scalable workflow-enabled analytics system that adaptively manages interdependent analytics tasks with varying operational requirements on a common platform and enables visualization of multidimensional datasets of real-world phenomena. In this paper, we present the architecture of such a WfMS and evaluate its performance for the execution of workflows in Clouds. A real-world application of climate-associated dengue fever prediction was evaluated on public, private, and hybrid Clouds and achieved effective speedup in all the environments.