A Delay Tolerant Network (DTN) is a wireless network with no fixed infrastructure. It is designed to operate effectively even under extremely diverse conditions and across large distances, as in th...
To improve task scheduling efficiency in computing power networks, this paper proposes a global task scheduling method based on network measurement and prediction in computing power networks (GTS-MP), which selects a more suitable computing power cluster and a storage platform with a higher network availability score between them to execute tasks. The experiments are conducted on a test bench of urban area networks spanning 12 cities. The results show that this method greatly reduces the execution time of artificial intelligence model training tasks, with a maximum reduction of 40% compared to other schemes.
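For illustration only, the core selection step could look like the sketch below, assuming a scoring function derived from network measurements and predictions; the function names, cluster/storage identifiers, and score values are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of the GTS-MP selection step: pick the (cluster, storage)
# pair with the highest measured/predicted network availability score.
from itertools import product

def select_pair(clusters, storages, availability_score):
    """availability_score(c, s) -> float in [0, 1], e.g. predicted from
    recent latency/bandwidth/loss measurements between c and s."""
    best, best_score = None, float("-inf")
    for c, s in product(clusters, storages):
        score = availability_score(c, s)
        if score > best_score:
            best, best_score = (c, s), score
    return best, best_score

# Toy example with illustrative scores.
scores = {("gpu-east", "obj-store-1"): 0.92, ("gpu-east", "obj-store-2"): 0.71,
          ("gpu-west", "obj-store-1"): 0.55, ("gpu-west", "obj-store-2"): 0.88}
pair, score = select_pair(["gpu-east", "gpu-west"],
                          ["obj-store-1", "obj-store-2"],
                          lambda c, s: scores[(c, s)])
print(pair, score)  # ('gpu-east', 'obj-store-1') 0.92
```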
In this paper, a machine learning approach for detecting Denial-of-Service attacks over software-defined networks (SDN) is introduced. It builds the training model using NS-3 traces to detect the Denial-of-Service (DoS...
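As a hedged illustration of the general idea (not the paper's exact pipeline), a classifier could be trained on flow-level features extracted from NS-3 traces; the feature set, model choice, and toy data below are assumptions.

```python
# Illustrative sketch: classify flows as DoS or benign from trace-derived features.
from sklearn.ensemble import RandomForestClassifier

# Each row: [packets_per_sec, bytes_per_sec, flow_duration_s, distinct_dst_ports]
X = [[1200, 9.6e5, 0.8, 1], [15, 1.2e4, 12.0, 3],
     [2500, 2.0e6, 0.5, 1], [40, 3.0e4, 8.5, 2]]
y = [1, 0, 1, 0]  # 1 = DoS, 0 = benign (toy labels)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
new_flow = [[1800, 1.4e6, 0.6, 1]]        # a high-rate, short-lived flow
print("predicted label:", clf.predict(new_flow)[0])  # likely 1 (DoS)
```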
ISBN:
(Print) 9781450392617
The proceedings contain 25 papers. The topics discussed include: temporally synchronized emulation of devices with simulation of networks; adaptive partitioning for distributed multi-agent simulations; addressing white-box modeling and simulation challenges in parallel computing; explainable artificial intelligence: requirements for explainability; data assimilation of opinion dynamics based on particle filter; towards an open repository for reproducible performance comparison of parallel and distributed discrete-event simulators; a data-driven approach for pedestrian intention prediction in large public places; integrating I/O time to virtual time system for high fidelity container-based network emulation; effective simulation of on-demand services; simulation of entanglement generation and distribution - towards practical quantum networks; and formal methods for modeling and simulation of emergent behavior in complex adaptive systems.
ISBN:
(Print) 9783030967727; 9783030967710
Multi-layer Perceptron (MLP) is a class of artificial neural networks widely used in regression, classification, and prediction. To accelerate the training of MLP, more cores can be used for parallel computing on many-core systems. With the increasing number of cores, the interconnection of cores plays a pivotal role in accelerating MLP training. Currently, the chip-scale interconnection can use either electrical signals or optical signals for data transmission among cores. The former is known as Electrical Network-on-Chip (ENoC) and the latter as Optical Network-on-Chip (ONoC). Due to the differences between optical and electrical characteristics, the performance and energy consumption of MLP training on ONoC and ENoC can be very different. Therefore, comparing the performance and energy consumption of ENoC and ONoC for MLP training is worthy of study. In this paper, we first compare the differences between ONoC and ENoC based on a parallel MLP training method. Then, we formulate their performance model by analyzing communication and computation time. Furthermore, the energy model is formulated according to their static and dynamic energy consumption. Finally, we conduct extensive simulations to compare the performance and energy consumption of ONoC and ENoC. Results show that compared with ENoC, the MLP training time of ONoC is reduced by 70.12% on average and the energy consumption of ONoC is reduced by 48.36% under batch size 32. However, with a small number of cores in MLP training, ENoC consumes less energy than ONoC.
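A minimal sketch of such performance and energy models, assuming per-iteration time = computation + communication and energy = static power × time + dynamic energy per transmitted bit; all parameter names and values are illustrative assumptions, not the paper's calibrated figures.

```python
# Illustrative per-iteration MLP training time and energy on a many-core system
# with either an ONoC or an ENoC interconnect (toy parameters, not measured data).
from dataclasses import dataclass

@dataclass
class NoC:
    name: str
    per_hop_latency_s: float      # latency per hop for one packet
    bandwidth_Bps: float          # effective link bandwidth
    static_power_W: float         # static power of the interconnect
    energy_per_bit_J: float       # dynamic energy per transmitted bit

def iteration_time(noc, compute_time_s, bytes_moved, hops):
    comm = hops * noc.per_hop_latency_s + bytes_moved / noc.bandwidth_Bps
    return compute_time_s + comm

def iteration_energy(noc, time_s, bytes_moved):
    return noc.static_power_W * time_s + noc.energy_per_bit_J * bytes_moved * 8

onoc = NoC("ONoC", per_hop_latency_s=1e-9, bandwidth_Bps=40e9,
           static_power_W=2.0, energy_per_bit_J=0.2e-12)
enoc = NoC("ENoC", per_hop_latency_s=5e-9, bandwidth_Bps=8e9,
           static_power_W=0.5, energy_per_bit_J=1.0e-12)

for noc in (onoc, enoc):
    t = iteration_time(noc, compute_time_s=2e-4, bytes_moved=4e6, hops=6)
    e = iteration_energy(noc, t, bytes_moved=4e6)
    print(f"{noc.name}: time={t*1e3:.3f} ms, energy={e*1e3:.3f} mJ")
```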
ISBN:
(Digital) 9798350349948
ISBN:
(Print) 9798350349955
Matching theory has been efficiently applied in fog computing networks (FCNs) to design distributed task offloading algorithms in the presence of the selfishness and rationality of fog nodes. Given the dynamic nature of the fog computing environment, it is challenging to obtain a stable matching, since the preference relations of the two sides of the matching game are unknown a priori. To address this challenge, this paper proposes RL-MATCH, a framework for parallel computation offloading in dynamic fog computing networks (FCNs). RL-MATCH combines matching theory with Thompson Sampling (TS) empowered Multi-Armed Bandit (MAB) learning, allowing task nodes with pending computation tasks to estimate the preference relations of helper nodes with available computing resources quickly and accurately. Extensive simulation results demonstrate the advantages of the TS-based learning over the $\epsilon$-greedy and upper confidence bound (UCB) based baselines.
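A minimal Thompson Sampling loop for helper-node selection might look like the following sketch, assuming Bernoulli rewards (an offload either meets its deadline or not) and Beta posteriors; it illustrates only the bandit component, not the full RL-MATCH matching procedure, and the node names and success probabilities are made up.

```python
# Thompson Sampling sketch: a task node learns which helper node to prefer.
import random

def thompson_select(successes, failures):
    """Sample a Beta posterior per helper node and pick the best draw."""
    draws = {h: random.betavariate(successes[h] + 1, failures[h] + 1)
             for h in successes}
    return max(draws, key=draws.get)

helpers = ["fog-A", "fog-B", "fog-C"]
true_success_prob = {"fog-A": 0.9, "fog-B": 0.6, "fog-C": 0.3}  # hidden from learner
successes = {h: 0 for h in helpers}
failures = {h: 0 for h in helpers}

for _ in range(2000):
    h = thompson_select(successes, failures)
    reward = 1 if random.random() < true_success_prob[h] else 0
    if reward:
        successes[h] += 1
    else:
        failures[h] += 1

# After enough rounds, fog-A should have been chosen most often.
print({h: successes[h] + failures[h] for h in helpers})
```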
ISBN:
(Print) 9781450397339
The resource-hungry and time-consuming process of training Deep Neural Networks (DNNs) can be accelerated by optimizing and/or scaling computations on accelerators such as GPUs. However, the loading and pre-processing of training samples then often emerges as a new bottleneck. This data loading process engages a complex pipeline that extends from the sampling of training data on external storage to the delivery of those data to GPUs, and that comprises not only expensive I/O operations but also decoding, shuffling, batching, augmentation, and other operations. We propose in this paper a new holistic approach to data loading that addresses three challenges not sufficiently addressed by other methods: I/O load imbalances among the GPUs on a node; rigid resource allocations to the data loading and data preprocessing steps, which lead to idle resources and bottlenecks; and limited efficiency of caching strategies based on pre-fetching, due to eviction of training samples needed soon at the expense of those needed later. We first present a study of key bottlenecks observed as training samples flow through the data loading and preprocessing pipeline. Then, we describe Lobster, a data loading runtime that uses performance modeling and advanced heuristics to combine flexible thread management with optimized eviction for distributed caching in order to mitigate I/O overheads and load imbalances. Experiments with a range of models and datasets show that the Lobster approach reduces both I/O overheads and end-to-end training times by up to 1.5x compared with state-of-the-art approaches.
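To illustrate the caching idea, the sketch below evicts the cached sample whose next scheduled access is farthest in the future, which keeps samples needed soon; this is an assumed, simplified heuristic over a known epoch order, not necessarily Lobster's exact eviction policy.

```python
# Illustrative eviction heuristic for a training-sample cache: on a miss with a
# full cache, evict the resident sample whose next use is farthest in the future.
from collections import defaultdict, deque

class EpochAwareCache:
    def __init__(self, capacity, access_order):
        self.capacity = capacity
        self.cache = set()
        # Pre-compute, per sample, the queue of positions at which it is used.
        self.next_uses = defaultdict(deque)
        for pos, sample in enumerate(access_order):
            self.next_uses[sample].append(pos)

    def access(self, sample):
        hit = sample in self.cache
        self.next_uses[sample].popleft()          # consume this access
        if not hit:
            if len(self.cache) >= self.capacity:
                victim = max(self.cache,
                             key=lambda s: self.next_uses[s][0]
                             if self.next_uses[s] else float("inf"))
                self.cache.discard(victim)
            self.cache.add(sample)
        return hit

order = [0, 1, 2, 0, 3, 0, 4, 1, 2, 0]        # toy shuffled epoch order
cache = EpochAwareCache(capacity=2, access_order=order)
hits = sum(cache.access(s) for s in order)
print(f"hit rate: {hits / len(order):.2f}")
```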
ISBN:
(Print) 9783030967727; 9783030967710
The current trend of using artificial neural networks to solve computationally intensive problems is omnipresent. In this scope, DeepQ learning is a common choice for agent-based problems. DeepQ combines the concept of Q-Learning with (deep) neural networks to learn different Q-values/matrices based on environmental conditions. Unfortunately, DeepQ learning requires hundreds of thousands of iterations/Q-samples that must be generated and learned for large-scale problems. Gathering data sets for such challenging tasks is extremely time-consuming and requires large data-storage containers. Consequently, a common solution is the automatic generation of input samples for agent-based DeepQ networks. However, the usual workflow is to create the samples separately from the training process, either in a (set of) pre-processing step(s) or interleaved with the training process. This requires the input Q-samples to be materialized in order to be fed into the training step of the attached neural network. In this paper, we propose a new GPU-focused method for on-the-fly generation of training samples tightly coupled with the training process itself. This allows us to skip the materialization of all samples (e.g., avoid dumping them to disk), as they are (re)constructed when needed. Our method significantly outperforms the usual workflows that generate the input samples on the CPU in terms of runtime performance and memory/storage consumption.
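The following sketch illustrates the on-the-fly idea in simplified form: transitions are constructed inside the training loop instead of being materialized first; the toy environment and tabular Q update stand in for the paper's GPU-based neural-network training and are assumptions.

```python
# On-the-fly Q-sample generation coupled with training (no dataset stored to disk).
import random

def generate_transition(env_step, state):
    """Build one (s, a, r, s') sample on demand from a simulated step."""
    action = random.randrange(4)
    reward, next_state = env_step(state, action)
    return state, action, reward, next_state

def toy_env_step(state, action):
    next_state = (state + action) % 100
    reward = 1.0 if next_state == 0 else 0.0
    return reward, next_state

def q_update(q_table, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Tabular stand-in for the neural-network Q update."""
    best_next = max(q_table[s_next])
    q_table[s][a] += alpha * (r + gamma * best_next - q_table[s][a])

q_table = [[0.0] * 4 for _ in range(100)]
state = 0
for step in range(10_000):
    s, a, r, s_next = generate_transition(toy_env_step, state)  # built when needed
    q_update(q_table, s, a, r, s_next)
    state = s_next
print("sample Q row:", [round(v, 3) for v in q_table[0]])
```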
ISBN:
(Print) 9781450396295
The proceedings contain 34 papers. The topics discussed include: large-scale parallel exact diagonalization algorithm of the Hubbard model on Tianhe-2 supercomputer; reinforcement learning enabled throughput optimization for interconnection networks of interposer-based system; a node selection scheme for data repair using erasure code in distributed storage system; research on recognition method of textual implication; research on used car valuation problem based on model fusion; an efficient parallel architecture for convolutional neural networks accelerator on FPGAs; performance optimization of sparse deep neural networks based on GPU; a quantum group signature based on quantum walk in d dimensions; parallel performance and optimization of the lattice Boltzmann method software Palabos using CUDA; secure mechanism of intelligent urban railway cloud platform based on zero-trust security; and a digital signature virtual platform based on hardware-software co-design.
ISBN:
(Print) 9783030948757
The proceedings contain 19 papers. The special focus in this conference is on Distributed Computing and Intelligent Technology. The topics include: An Introduction to Graph Neural Networks from a Distributed Computing Perspective; Towards Temporally Uncertain Explainable AI Planning; The Impact of Synchronization in Parallel Stochastic Gradient Descent; A Distributed Algorithm for Constructing an Independent Dominating Set; An Approach to Cost Minimization with EC2 Spot Instances Using VM Based Migration Policy; Transforming Medical Resource Utilization Process to Verifiable Timed Automata Models in Cyber-Physical Systems; An SDN Implemented Adaptive Load Balancing Scheme for Mobile Networks; Rewriting Logic and Petri Nets: A Natural Model for Reconfigurable Distributed Systems; Preface; Modern AI/ML Methods for Healthcare: Opportunities and Challenges; MCDPS: An Improved Global Scheduling Algorithm for Multiprocessor Mixed-Criticality Systems; Replication Based Fault Tolerance Approach for Cloud; A Comparative Study on MFCC and Fundamental Frequency Based Speech Emotion Classification; Efficient Traffic Routing in Smart Cities to Minimize Evacuation Time During Disasters; Early Detection of Parkinson’s Disease as a Pre-diagnosis Tool Using Various Classification Techniques on Vocal Features; Extracting Emotion Quotient of Viral Information Over Twitter; A Novel Modified Harmonic Mean Combined with Cohesion Score for Multi-document Summarization.