ISBN (print): 9781728106762
In many domains, Discrete-Event Simulation (DES) is used to reproduce the behavior of a system or process by processing events one after another in chronological order. Classical DES is no longer a viable solution for complex and large-scale systems, Systems of Systems (SoS), or performance-evaluation setups that compare multiple simulations running simultaneously in parallel. Advances in networking and communications have made the Distributed Simulation (DS) approach one of the best solutions for simulating such systems. One of the challenges faced when building a DS from DES components is the federation behavior, including time management and synchronization between these components. In most traditional DES platforms, simulations can neither exchange messages nor change their configuration at run time, which makes connecting and integrating DES components very hard and, at times, impossible. This article presents a method for integrating different DES components using the High-Level Architecture (HLA) Evolved standard, Business Process Model and Notation (BPMN), and JaamSim, an open-source Java DES.
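For reference, the sketch below shows the minimal chronological event loop that a classical DES engine runs; the `Simulator` and `Event` names are illustrative, not JaamSim's or the HLA API. In an HLA federation, a federate could not simply pop events like this: it may only advance its logical clock after a time-advance grant from the RTI's time management.

```python
# Minimal sketch of a classical DES core: a time-ordered event queue
# processed strictly chronologically. Illustrative only.
import heapq
from dataclasses import dataclass, field
from typing import Callable

@dataclass(order=True)
class Event:
    time: float
    action: Callable = field(compare=False)

class Simulator:
    def __init__(self):
        self.now = 0.0
        self._queue = []  # min-heap keyed on event timestamp

    def schedule(self, delay, action):
        heapq.heappush(self._queue, Event(self.now + delay, action))

    def run(self, until):
        while self._queue and self._queue[0].time <= until:
            event = heapq.heappop(self._queue)
            self.now = event.time  # logical clock jumps event to event
            event.action()

sim = Simulator()
sim.schedule(2.0, lambda: print("event A at t =", sim.now))
sim.schedule(1.0, lambda: print("event B at t =", sim.now))
sim.run(until=10.0)  # B (t=1.0) fires before A (t=2.0)
```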
The present work investigates a new strategy for applying single-phase distributed generations (DGs) to the problem of phase load balancing. The work demonstrates that single-phase DGs can be applied at specific locati...
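As a toy illustration of the idea in this excerpt (the paper's actual placement method is not reproduced here), the sketch below picks the phase for a single-phase DG that minimizes the resulting load imbalance; the load and DG figures are hypothetical.

```python
# Toy phase-selection sketch: connect a single-phase DG to the phase that
# leaves the per-phase loads most balanced. All numbers are hypothetical.
loads_kw = {"A": 120.0, "B": 95.0, "C": 140.0}  # feeder loads per phase
dg_output_kw = 30.0                              # DG injection

def imbalance(loads):
    avg = sum(loads.values()) / len(loads)
    return max(abs(p - avg) for p in loads.values())

best_phase = min(
    loads_kw,
    key=lambda ph: imbalance({**loads_kw, ph: loads_kw[ph] - dg_output_kw}),
)
print("connect DG to phase", best_phase)  # here: phase C, the most loaded
```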
ISBN (print): 9781728126074
Brain-inspired computing is a novel computing technology based on neuromorphic engineering, which draws lessons from how the human brain processes and stores information. Combined with high-performance computing (HPC) platforms, it forms a foundation for general artificial intelligence. However, current brain-oriented HPC platforms generally suffer from slow speed, poor scalability, and high energy consumption, which severely restrains their potential and circumscribes the development of general artificial intelligence. The dataflow model, first proposed in the 1970s, provides a novel direction for the development of HPC. In addition, the dataflow model shares similar information-processing mechanisms with the human nervous system, which makes dataflow models a natural fit for emulating brain-inspired computing. Based on contemporary progress on the dataflow model, the Codelet model was proposed. Through fine-grained asynchronous program execution and resource allocation, the Codelet model realizes distributed computing on heterogeneous systems, effectively improves computing power and speed, and opens up a new path to overcome the shortcomings of existing high-performance computing technology. We propose a dataflow-based emulation platform that aims to provide high-performance computing support for general brain-inspired intelligent systems, and to use the characteristics of dataflow models to fully explore the potential of brain-inspired intelligence. As an example, we select a widely used convolutional neural network (LeNet5) to initially verify the superiority and feasibility of our proposal.
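To make the firing rule concrete, here is a minimal sketch of codelet-style dataflow execution, assuming the usual rule that a codelet fires once all of its input tokens have arrived; the names and structure are illustrative, not the Codelet model's actual runtime API.

```python
# Codelet-style dataflow sketch: a node fires only when all inputs arrive.
class Codelet:
    def __init__(self, name, n_inputs, fn):
        self.name, self.fn = name, fn
        self.pending = n_inputs        # unmet input dependencies
        self.inputs = []

    def deliver(self, value, ready_queue):
        self.inputs.append(value)
        self.pending -= 1
        if self.pending == 0:          # event-driven firing rule
            ready_queue.append(self)

def run(graph, sources):
    # graph maps each codelet to its downstream consumers
    ready = list(sources)
    while ready:
        codelet = ready.pop()
        result = codelet.fn(*codelet.inputs)
        for consumer in graph[codelet]:
            consumer.deliver(result, ready)

a = Codelet("load_a", 0, lambda: 3)
b = Codelet("load_b", 0, lambda: 4)
add = Codelet("add", 2, lambda x, y: print("sum =", x + y))
run({a: [add], b: [add], add: []}, sources=[a, b])  # prints: sum = 7
```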
ISBN (print): 9781538674741
The multi-way join query plays a fundamental role in many big-data analytic scenarios. Recently, the hybrid join query has become increasingly important. However, existing one-round and multi-round algorithms have limitations when processing hybrid queries. In this paper, we present a novel hybrid structure-aware multi-way join algorithm called HyMJ, which combines the one-round and multi-round algorithms to compute hybrid queries efficiently. First, we propose the query structure graph (QSG) to represent the internal structure of a given join query, and the query structure decomposition tree (QSDT) to represent the structure-aware query plan. Each internal node of the QSDT denotes a subquery with a cyclic or acyclic structure. Then, we design a graph-contraction-based algorithm to construct the QSDT from the QSG. Furthermore, to select the optimal join strategy for each subquery in the QSDT, we introduce a heuristic strategy-selection model. Experimental results on Apache Spark reveal that HyMJ outperforms both one-round and multi-round algorithms for hybrid multi-way join queries on real-world datasets.
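The cyclic/acyclic distinction can be illustrated with a small sketch, under the simplifying assumption that each subquery is a connected graph of relations (vertices) and join predicates (edges). The routing rule below, cyclic subqueries to a one-round algorithm and acyclic ones to a multi-round plan, is a common heuristic; whether it matches HyMJ's selection model exactly is an assumption, and the QSDT construction and cost model are not reproduced.

```python
# Illustrative strategy selection per connected subquery (not HyMJ's code).
def has_cycle(vertices, edges):
    # A connected undirected graph is acyclic iff |E| == |V| - 1.
    return len(edges) >= len(vertices)

def choose_strategy(vertices, edges):
    # Common heuristic: shuffle-once (one-round) for cyclic structures,
    # a sequence of binary joins (multi-round) for acyclic ones.
    return "one-round" if has_cycle(vertices, edges) else "multi-round"

triangle = ({"R", "S", "T"}, [("R", "S"), ("S", "T"), ("T", "R")])
chain = ({"R", "S", "T"}, [("R", "S"), ("S", "T")])
print(choose_strategy(*triangle))  # one-round
print(choose_strategy(*chain))     # multi-round
```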
Increasing the number of computational cores is a primary way of achieving high performance in contemporary supercomputers. However, developing parallel applications capable of harnessing the enormous number of cores is ...
With the development of information technology, more and more data is uploaded to cloud servers, and cloud encryption storage technology has become particularly important for data security. This pape...
ZooKeeper was designed to be a robust service. It exposes a simple API, inspired by the filesystem API, that allows implementing common coordination tasks, such as electing a master server, consensus, managing group m...
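For instance, the master-election task mentioned above can be expressed in a few lines with a ZooKeeper client; the sketch below uses the third-party kazoo library for Python, which the excerpt does not name, so treat the choice of client and the addresses as assumptions.

```python
# Leader election via ZooKeeper, using the kazoo client (an assumption;
# the excerpt names no client library). Requires a running ensemble.
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")  # hypothetical ensemble address
zk.start()

def lead():
    # Runs only once this process has won the election; leadership is
    # relinquished when the function returns.
    print("elected as master server")

# Contenders block here; ZooKeeper's ephemeral sequential znodes order them.
election = zk.Election("/app/election", identifier="worker-1")
election.run(lead)
```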
ISBN (digital): 9781728169262
ISBN (print): 9781728169279
Parameter updating is an important stage in parallelism-based distributed deep learning. Synchronous methods are widely used in the distributed training of Deep Neural Networks (DNNs). To reduce the communication and synchronization overhead of synchronous methods, decreasing the synchronization frequency (e.g., to every n mini-batches) is a straightforward approach; however, it often suffers from poor convergence. In this paper, we propose a new algorithm that integrates Particle Swarm Optimization (PSO) into the distributed training process of DNNs to automatically compute new parameters. In the proposed algorithm, each computing worker is encoded as a particle, and the DNN weights and training loss are modeled as particle attributes. At each synchronization stage, the weights are updated by PSO from the sub-weights gathered from all workers, instead of by averaging the weights or the gradients. To verify the performance of the proposed algorithm, experiments are performed on two commonly used image-classification benchmarks, MNIST and CIFAR10, and compared with peer competitors at multiple synchronization configurations. The experimental results demonstrate the competitiveness of the proposed algorithm.
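A minimal sketch of such a PSO synchronization step is given below, assuming each worker's flattened weight vector is a particle position and its training loss the fitness; the inertia and acceleration coefficients are conventional PSO defaults, not values from the paper.

```python
# PSO-style weight aggregation at a synchronization stage (illustrative).
import numpy as np

def pso_update(positions, velocities, pbest, pbest_loss, losses,
               w=0.7, c1=1.5, c2=1.5):
    # Update each particle's personal best from the latest training loss.
    improved = losses < pbest_loss
    pbest[improved] = positions[improved]
    pbest_loss[improved] = losses[improved]
    gbest = pbest[np.argmin(pbest_loss)]  # swarm-wide best weights
    r1 = np.random.rand(*positions.shape)
    r2 = np.random.rand(*positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (pbest - positions)
                  + c2 * r2 * (gbest - positions))
    return positions + velocities, velocities, pbest, pbest_loss

# 4 workers, 10 weights each; losses are gathered from all workers.
pos = np.random.randn(4, 10)
vel = np.zeros((4, 10))
pos, vel, pb, pbl = pso_update(pos, vel, pos.copy(), np.full(4, np.inf),
                               losses=np.random.rand(4))
```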
ISBN (digital): 9781728199986
ISBN (print): 9781728199993
Lattice-based cryptography has received attention as a next-generation encryption technique, because it is believed to be secure against attacks by both classical and quantum computers. Its essential security depends on the hardness of solving the shortest vector problem (SVP). In cryptography, to determine security levels, it is becoming significantly more important to estimate the hardness of the SVP with high-performance computing. In this study, we develop the world's first distributed and asynchronous parallel SVP solver, the MAssively parallel solver for SVP (MAP-SVP). It parallelizes algorithms for solving the SVP by applying the Ubiquity Generator framework, a generic framework for branch-and-bound algorithms. MAP-SVP is suitable for massive-scale parallelization owing to its small memory footprint, low communication overhead, and rapid checkpoint-and-restart mechanisms. We demonstrate the performance and scalability of MAP-SVP by using up to 100,032 cores to solve instances of the Darmstadt SVP Challenge.
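For intuition about the underlying problem (not about MAP-SVP's parallel solver), the brute-force sketch below enumerates small integer combinations of a 2-D basis and returns the shortest nonzero lattice vector; real SVP instances, such as the Darmstadt challenges, require enumeration or sieving with pruning far beyond this.

```python
# Brute-force SVP on a tiny lattice (toy illustration only).
import itertools
import numpy as np

def shortest_vector(basis, bound=10):
    best, best_norm = None, float("inf")
    # Try every integer coefficient vector in [-bound, bound]^n.
    for coeffs in itertools.product(range(-bound, bound + 1),
                                    repeat=len(basis)):
        if not any(coeffs):
            continue  # skip the zero vector
        v = np.asarray(coeffs) @ basis  # lattice point c1*b1 + c2*b2
        n = float(np.dot(v, v))
        if n < best_norm:
            best, best_norm = v, n
    return best, best_norm ** 0.5

B = np.array([[201, 37], [1648, 297]])  # a deliberately skewed basis
v, length = shortest_vector(B)
print(v, length)
```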
ISBN (print): 9781728151410
Because training a deep neural network (DNN) takes enormous amounts of time and computation, researchers often expedite the training process via distributed parallel training on GPUs. On one hand, the resulting lower computing-to-communication ratio makes traditional data parallelism difficult to scale, while traditional model parallelism leads to low GPU utilization; both make it difficult to obtain a higher speedup. On the other hand, multi-GPU systems exhibit complex connectivity among GPUs. Workload schedulers must therefore consider hardware topology and workload communication requirements when allocating GPU resources, to achieve optimal execution time and improved utilization in GPU clusters with heterogeneous networking. Thus, in this paper, we introduce Pipe-torch, an improved pipeline-hybrid parallelism method (using both data and model parallelism) for heterogeneous network environments. The framework's model-partition algorithm aims to expedite pipeline-hybrid parallel training across GPUs connected by heterogeneous networks. Experiments with four different DNN models show that Pipe-torch achieves an average speedup of 1.4x compared to data parallelism.
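The flavor of such topology-aware partitioning can be sketched as follows, under strong simplifying assumptions (contiguous stages, one stage per GPU, a single bottleneck objective); this illustrates the general idea, not Pipe-torch's algorithm, and all numbers are hypothetical.

```python
# Partition a layer sequence into pipeline stages, minimizing the bottleneck
# of per-stage compute plus the cost of the heterogeneous inter-stage link.
import itertools

def partition(compute, link_cost, n_stages):
    n = len(compute)
    best, best_bottleneck = None, float("inf")
    # Choose n_stages-1 cut points between layers.
    for cuts in itertools.combinations(range(1, n), n_stages - 1):
        bounds = [0, *cuts, n]
        stages = [sum(compute[a:b]) for a, b in zip(bounds, bounds[1:])]
        # Stage i also pays for sending activations over link i.
        costs = [s + (link_cost[i] if i < n_stages - 1 else 0)
                 for i, s in enumerate(stages)]
        if max(costs) < best_bottleneck:
            best, best_bottleneck = bounds, max(costs)
    return best, best_bottleneck

compute = [4, 8, 6, 3, 7, 2]  # hypothetical per-layer times
link_cost = [5, 1]            # heterogeneous link costs between 3 GPUs
print(partition(compute, link_cost, n_stages=3))
```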