ISBN: 9781509057078 (print)
Numerical reproducibility failures arise in parallel computation because floating-point summation is not associative. Optimizations on massively parallel systems dynamically modify the order of floating-point operations, so numerical results may change from one run to another. We propose to ensure reproducibility by extending as far as possible the IEEE-754 correct rounding property to larger operation sequences. Our RARE-BLAS (Reproducible, Accurately Rounded and Efficient BLAS) benefits from recent accurate and efficient summation algorithms. Solutions for level 1 (asum, dot and nrm2) and level 2 (gemv) routines are provided. We compare their performance to the Intel MKL library and to other existing reproducible algorithms. For both shared and distributed memory parallel systems, we exhibit an extra cost of 2x in the worst-case scenario, which is satisfactory for a wide range of applications. For the Intel Xeon Phi accelerator a larger extra cost (4x to 6x) is observed, which is still useful at least for debugging and validation.
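The effect described in this abstract can be illustrated with a minimal Python sketch. It is not the RARE-BLAS algorithm. It only shows that a left-to-right floating-point sum depends on operand order, while a correctly rounded sum (here the standard library's `math.fsum`, swapped in as a stand-in for the accurate summation algorithms the abstract mentions) returns the same value for every ordering. The input values and the helper name `naive_sum` are assumptions chosen for illustration.

```python
import itertools
import math


def naive_sum(values):
    """Left-to-right floating-point sum; the result depends on operand order."""
    total = 0.0
    for v in values:
        total += v
    return total


# Non-associativity with three operands: the two groupings round differently.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # a + b is exactly 0.0, so the final 1.0 survives
right = a + (b + c)   # b + c rounds back to -1e16, so the 1.0 is lost

# Illustrative data: the exact sum is 2.0, but the small terms can be
# absorbed by the large ones depending on the order of accumulation.
data = [1e16, 1.0, -1e16, 1.0]

# Every ordering stands in for a different parallel reduction schedule.
naive_results = {naive_sum(p) for p in itertools.permutations(data)}
rounded_results = {math.fsum(p) for p in itertools.permutations(data)}

print("naive results:  ", sorted(naive_results))
print("rounded results:", sorted(rounded_results))
```

Because a correctly rounded result is a function of the exact mathematical sum only, it cannot vary with the reduction order, which is the property the abstract proposes to extend to BLAS routines.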
This paper introduces DEVS distributed Modeling Framework (DEVS-DMF), a publicly available implementation of DEVS for integrating simulation models as parallel and distributed microservices suitable for containerizati...
ISBN: 9781509032051 (print)
The proceedings contain 311 papers. The topics discussed include: a robust image hashing with enhanced randomness by using random walk on zigzag blocking; securing fast learning! ridge regression over encrypted big data; computational trust model for repeated trust games; formal analysis of selective disclosure attribute-based credential system in applied Pi calculus; node trust prediction framework in mobile ad hoc networks; trust enhancement over range search for encrypted data; healthcare fraud detection based on trustworthiness of doctors; integrated security for services hosted in virtual environments; a permissioned blockchain framework for supporting instant transaction and dynamic block size; trust validation of cloud IaaS: a customer-centric approach; distributed bitcoin account management; trusted Boolean search on cloud using searchable symmetric encryption; and dynamic attribute-based access control in cloud storage systems.
ISBN: 9781467388450 (print)
Big graphs are finding increasing applications in many science and engineering domains, such as computational biology, cybermanufacturing and social media. Graphs provide a very flexible mathematical abstraction for describing relationships between entities in complex systems. Real-world graphs are characterized by high connectivity and high irregularity. Such non-uniform characteristics increase the mismatch between the vertex-centric parallel computation model and the computer hardware resources. Another problem with the vertex-centric computation model is that it treats vertices symmetrically, and this uniformity assumption breaks down when graphs exhibit high irregularity and graph algorithms reveal non-uniform workloads. In this keynote, I will advocate a fundamental revisit of graph computation models and promote a methodical framework for supporting high-performance graph parallel abstractions that are resource-aware, composable and programmable. I will discuss a suite of graph optimization techniques that explore workload characteristics of graph algorithms and irregularity hidden in graph structures. I will conclude the talk by presenting some interesting research problems and unique opportunities for big graph analytics.
ISBN: 9781450335317 (print)
By maintaining the data in main memory, in-memory databases dramatically reduce the I/O cost of transaction processing. However, for recovery purposes, in-memory systems still need to flush the log to disk which incurs a substantial number of I/Os. Recently, command logging has been proposed to replace the traditional data log (e.g., ARIES logging) in in-memory databases. Instead of recording how the tuples are updated, command logging only tracks the transactions that are being executed, thereby effectively reducing the size of the log and improving the performance. However, when a failure occurs, all the transactions in the log after the last checkpoint must be redone sequentially and this significantly increases the cost of recovery. In this paper, we first extend the command logging technique to a distributed system, where all the nodes can perform their recovery in parallel. We show that in a distributed system, the only bottleneck of recovery caused by command logging is the synchronization process that attempts to resolve the data dependency among the transactions. We then propose an adaptive logging approach by combining data logging and command logging. The percentage of data logging versus command logging becomes a tuning knob between the performance of transaction processing and recovery to meet different OLTP requirements, and a model is proposed to guide such tuning. Our experimental study compares the performance of our proposed adaptive logging, ARIES-style data logging and command logging on top of H-Store. The results show that adaptive logging can achieve a 10x boost for recovery and a transaction throughput that is comparable to that of command logging.
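The difference between the two logging styles in this abstract can be sketched with a toy Python model. It is an illustration under stated assumptions, not the paper's system or H-Store's API: the `transfer` procedure, the `AdaptiveLog` class and the caller-supplied `use_data_log` flag are all hypothetical, and the flag merely stands in for the tuning model the abstract mentions. A command entry stores only the procedure name and arguments and must be re-executed on recovery, while a data entry stores the after-images of the updated tuples and is simply installed.

```python
from dataclasses import dataclass, field


def transfer(db, src, dst, amount):
    """Hypothetical deterministic stored procedure; returns after-images."""
    db[src] -= amount
    db[dst] += amount
    return {src: db[src], dst: db[dst]}


PROCEDURES = {"transfer": transfer}


@dataclass
class AdaptiveLog:
    """Toy log that mixes command entries and data entries."""
    entries: list = field(default_factory=list)

    def execute(self, db, name, args, use_data_log):
        after_images = PROCEDURES[name](db, *args)
        if use_data_log:
            # Data logging: larger entry, but recovery needs no re-execution.
            self.entries.append(("data", after_images))
        else:
            # Command logging: small entry, but recovery must re-run it.
            self.entries.append(("cmd", name, args))


def recover(checkpoint, log):
    """Rebuild the state from the last checkpoint plus the log."""
    db = dict(checkpoint)
    for entry in log.entries:
        if entry[0] == "data":
            db.update(entry[1])              # install after-images directly
        else:
            _, name, args = entry
            PROCEDURES[name](db, *args)      # redo the transaction
    return db


checkpoint = {"a": 100, "b": 50, "c": 0}
live_db = dict(checkpoint)
log = AdaptiveLog()
log.execute(live_db, "transfer", ("a", "b", 30), use_data_log=False)
log.execute(live_db, "transfer", ("b", "c", 20), use_data_log=True)

recovered = recover(checkpoint, log)
print(recovered == live_db)
```

In this sketch recovery is sequential. The point of the data entries is that they do not depend on the outcome of earlier transactions, which is what would let a real system skip the dependency resolution that command entries require.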
ISBN: 9781509061082 (print)
Given the large amount of data from different sources that has become available to researchers in multiple fields, Data Science has emerged as a new paradigm for exploring and getting value from that data. In that context, new parallel processing environments with abstract programming interfaces, like Spark, were proposed to try to simplify the development of distributed programs. Although such solutions have become widely used, achieving the best performance with them is still not always straightforward, despite the multiple run-time strategies they use. In this work we analyze some of the causes of performance degradation in such systems and, based on that analysis, we propose a tool to improve performance by dynamically adjusting data partitioning and the degree of parallelism in recurrent applications based on previous executions. Our results applying that methodology show consistent reductions in execution time for the applications considered, with gains of up to 50%.
ISBN: 9781509004126 (print)
The quantity of data transmitted over networks has grown rapidly with the increased dependency on social media applications, sensors for data acquisition and smartphone use. Typically, such data is unstructured and originates from multiple sources in different formats. Consequently, abstracting the data for rendering is difficult, which has led to the development of computing systems that are able to store data in unstructured formats and support distributed parallel computing. To date, there exist approaches to handle big data using NoSQL. This paper provides a review and comparison of NoSQL and Relational Database Management Systems (RDBMS). By reviewing each approach, the mechanics of NoSQL systems can be clearly distinguished from those of the RDBMS. Basically, such systems rely on multiple factors, which include the query language, architecture, data model and consumer API. This paper also identifies the applications that match each type of system, so that an application can be accurately matched to a specific NoSQL system.
ISBN: 9781509022533 (print)
The microservices architecture is widely regarded as a promising approach to service-oriented systems. However, developing applications in the microservices architecture presents three main challenges: (a) how to program systems that consist of a large number of services running in parallel and distributed over a cluster of computers; (b) how to reduce the communication overhead caused by executing a large number of small services; (c) how to support the flexible deployment of services to a network to achieve system load balance. This paper presents a programming language called CAOPLE and reports the implementation of the language on a virtual machine called CAVM-2. The paper demonstrates how this approach meets these challenges.
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
ISBN: 9781509036530 (print)
The proceedings contain 81 papers. The topics discussed include: evaluation of topology-aware broadcast algorithms for dragonfly networks; compiler-assisted overlapping of communication and computation in MPI applications; vProbe: scheduling virtual machines on NUMA systems; GLAP: distributed dynamic workload consolidation through gossip-based learning; CORP: cooperative opportunistic resource provisioning for short-lived jobs in cloud systems; time optimization modeling for big data placement and analysis for geo-distributed data centers; reduced-precision floating-point formats on GPUs for high performance and energy efficient computation; conflict prediction-based transaction execution for transactional memory in multi-core in-memory databases; skyline service selection based on QoS prediction; minimizing CMT miss penalty in selective page-level address mapping table; efficient semantic-aware coflow scheduling for data-parallel jobs; spatial locality aware, fast, and scalable SLINK algorithm for commodity clusters; high throughput log-based replication for many small in-memory objects; TwinPCG: dual thread redundancy with forward recovery for preconditioned conjugate gradient methods; active learning in performance analysis; and dynamically building energy proportional data centers with heterogeneous computing resources.