ISBN (paperback): 9781665456753
The size of databases has constantly increased with advances in technology and the Internet, so processing this vast amount of information has been a great challenge. The Extreme Learning Machine (ELM) neural network has been widely accepted in the scientific community due to its simplicity and good generalization capacity. This model consists of randomly assigning the weights of the hidden layer and analytically calculating the weights of the output layer through the Moore-Penrose generalized inverse. High-performance computing has emerged as an excellent alternative for tackling problems involving large-scale databases and reducing processing times. The use of parallel-computing tools in Extreme Learning Machines and their variants, especially the Online Sequential Extreme Learning Machine (OS-ELM), has proven to be a good alternative for tackling regression and classification problems with large-scale databases. In this paper, we present a parallel training methodology consisting of several Online Sequential Extreme Learning Machines running on different cores of the Central Processing Unit, using a balanced fingerprint database with 2,000,000 samples distributed across five classes. The results show that training and validation times decrease as the number of processes increases, since the number of samples to train in each process decreases. In addition, with several Online Sequential Extreme Learning Machines trained, new samples can be classified on any of them.
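The analytic output-weight computation described in this abstract can be sketched in a few lines. This is a generic ELM illustration, not the paper's parallel OS-ELM implementation; the function names, the tanh activation, and the use of NumPy's `pinv` are our assumptions:

```python
import numpy as np

def train_elm(X, y, n_hidden=32, seed=0):
    """Minimal ELM: random hidden-layer weights, output weights computed
    analytically via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # analytic output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Forward pass: hidden activations times the learned output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because only `beta` is learned, training reduces to one pseudoinverse, which is what makes ELM training fast enough to replicate across many CPU cores.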
ISBN (paperback): 9783030389611; 9783030389604
Internet of Things (IoT) has attracted the attention of researchers from both industry and academia. Smart city, as one of the IoT applications, includes several sub-applications, such as intelligent transportation system (ITS), smart car parking and smart grid. Focusing on traffic flow management and car parking systems because of their correlation, this paper aims to provide a framework solution to both systems using online detection and prediction based on fog computing. Online event detection plays a vital role in traffic flow management, as circumstances such as social events and congestion resulting from accidents and roadworks affect traffic flow and parking availability. We developed an online prediction model using an incremental decision tree and distributed the prediction process on fog nodes at each intersection traffic light responsible for a connecting road. It effectively reduces the load on the communication network, as the data is processed and decisions are made locally, with low storage requirements. The spatially correlated fog nodes can communicate if necessary to take action in an emergency. The experiments were conducted using the Melbourne city open data.
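Incremental decision trees of the kind used for such online prediction typically rely on the Hoeffding bound to decide when enough samples have been seen to commit to a split. A minimal sketch of that test follows; the function names, parameter defaults, and the VFDT-style split rule are illustrative, not taken from the paper:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Hoeffding epsilon: with probability 1 - delta, the observed mean of a
    statistic with range `value_range` is within epsilon of its true mean
    after n samples. This is the split test used by Hoeffding/VFDT-style
    incremental decision trees."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range=1.0, delta=1e-6, n=0):
    """Commit to a split only when the gap between the best and second-best
    attribute's gain exceeds the Hoeffding bound for the samples seen."""
    if n == 0:
        return False
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as more samples stream in, so a fog node can grow its tree from local traffic data without ever buffering the full stream.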
ISBN (paperback): 9783030369873; 9783030369866
Graph Analytics is important in different domains: social networks, computer networks, and computational biology to name a few. This paper describes the challenges involved in programming the underlying graph algorithms for graph analytics for distributed systems with CPU, GPU, and multi-GPU machines and how to deal with them. It emphasizes how language abstractions and good compilation can ease programming graph analytics on such platforms without sacrificing implementation efficiency.
Energy theft is an old and multifaceted phenomenon affecting our society on a global scale from both an operational as well as from a monetary perspective. The relatively recent decentralisation of the grid infrastruc...
ISBN (paperback): 9781728174457
Stencil computations and the general sparse matrix-vector product (SpMV) are key components in many algorithms, such as geometric multigrid and Krylov solvers. Their low arithmetic intensity, however, means that memory bandwidth and network latency are the performance-limiting factors. The current architectural trend favors computation over bandwidth, worsening the already unfavorable imbalance. Previous work approached stencil kernel optimization either by improving memory bandwidth usage or by providing a Communication Avoiding (CA) scheme that minimizes network latency in repeated sparse matrix-vector multiplication by replicating remote work in order to delay communications on the critical path. Focusing on minimizing the communication bottleneck in distributed stencil computation, in this study we combine a CA scheme with the computation-communication overlapping inherent in a dataflow task-based runtime system such as PaRSEC, to demonstrate their combined benefits. We implemented the 2D five-point stencil (Jacobi iteration) in PETSc, and over PaRSEC in two flavors: full communications (base-PaRSEC) and CA-PaRSEC, which operate directly on a 2D compute grid. Our results on two clusters, NaCL and Stampede2, indicate that we can achieve a 2x speedup over the standard SpMV solution implemented in PETSc, and in certain cases, when kernel execution does not dominate the execution time, the CA-PaRSEC version achieved up to 57% and 33% speedup over the base-PaRSEC implementation on NaCL and Stampede2, respectively.
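For reference, the 2D five-point Jacobi iteration at the heart of this study can be sketched as follows. This is a plain single-node NumPy version with fixed boundary values, without the distribution, communication avoidance, or PaRSEC tasking the paper is about:

```python
import numpy as np

def jacobi_step(u):
    """One Jacobi iteration of the 2D five-point stencil: each interior
    point becomes the average of its four neighbors; boundaries are fixed."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v

def jacobi(u, iters):
    """Repeatedly apply the stencil; each sweep touches every grid point."""
    for _ in range(iters):
        u = jacobi_step(u)
    return u
```

In a distributed CA scheme, each process would keep ghost layers of depth k so that k such sweeps can run between halo exchanges, trading replicated work for fewer messages on the critical path.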
ISBN (paperback): 9783030576752; 9783030576745
The rapid growth in edge computing devices as part of the Internet of Things (IoT) allows real-time access to time-series data from thousands of sensors. Such observations are often queried to optimize the health of the infrastructure. Recently, edge storage systems have allowed us to retain data on the edge rather than moving it centrally to the cloud. However, such systems do not support flexible querying over data spread across tens to hundreds of devices. There is also a lack of distributed time-series databases that can run on edge devices. Here, we propose TorqueDB, a distributed query engine over time-series data that operates on edge and fog resources. TorqueDB leverages our prior work on ElfStore, a distributed edge-local file store, and InfluxDB, a time-series database, to enable temporal queries to be decomposed and executed across multiple fog and edge devices. Interestingly, we move data into InfluxDB on demand while retaining the durable data within ElfStore for use by other applications. We also design a cost model that maximizes parallel movement and execution of the queries across resources, and utilizes caching. Our experiments on a real edge, fog, and cloud deployment show that TorqueDB performs comparably to InfluxDB on a cloud VM for a smart-city query workload, but without the associated monetary costs.
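The core idea of decomposing a temporal query across devices that hold disjoint time ranges can be sketched as follows. This is a simplified illustration with a hypothetical node-to-range layout map; TorqueDB's actual planner also models data movement cost and caching:

```python
def decompose_query(t_start, t_end, node_ranges):
    """Split a temporal query [t_start, t_end) into per-node subqueries,
    clipping the query window to each node's stored time range.
    `node_ranges` maps a node name to its (lo, hi) time range."""
    subqueries = []
    for node, (lo, hi) in node_ranges.items():
        s, e = max(t_start, lo), min(t_end, hi)
        if s < e:  # this node holds part of the queried window
            subqueries.append((node, s, e))
    return subqueries
```

Each subquery can then execute in parallel on its edge or fog device, with only the (much smaller) partial results shipped back for merging.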
ISBN (paperback): 9781450370523
Transient cloud servers such as Amazon Spot instances, Google Preemptible VMs, and Azure Low-priority batch VMs can reduce cloud computing costs by as much as 10x, but can be unilaterally preempted by the cloud provider. Understanding preemption characteristics (such as frequency) is a key first step in minimizing the effect of preemptions on application performance, availability, and cost. However, little is understood about temporally constrained preemptions, wherein preemptions must occur within a given time window. We study temporally constrained preemptions by conducting a large-scale empirical study of Google's Preemptible VMs (which have a maximum lifetime of 24 hours), develop a new preemption probability model and new model-driven resource management policies, and implement them in a batch computing service for scientific computing workloads. Our statistical and experimental analysis indicates that temporally constrained preemptions are not uniformly distributed, but are time-dependent and have a bathtub shape. We find that existing memoryless models and policies are not suitable for temporally constrained preemptions. We develop a new probability model for bathtub preemptions and analyze it through the lens of reliability theory. To highlight the effectiveness of our model, we develop optimized policies for job scheduling and checkpointing. Compared to existing techniques, our model-based policies can reduce the probability of job failure by more than 2x. We also implement our policies as part of a batch computing service for scientific computing applications, which reduces cost by 5x compared to conventional cloud deployments and keeps performance overheads under 3%.
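A bathtub-shaped preemption process can be illustrated with a piecewise-constant hazard rate that is high early and late in the 24-hour lifetime and low in between, as reliability theory suggests. The rates and breakpoints below are made up for illustration and are not the paper's fitted model:

```python
import math

def bathtub_hazard(t, early=0.10, flat=0.01, late=0.08, t1=3.0, t2=21.0):
    """Illustrative piecewise-constant bathtub hazard (preemptions/hour)
    over a 24-hour VM lifetime: elevated in the first and last hours."""
    if t < t1:
        return early
    if t < t2:
        return flat
    return late

def survival(t, dt=0.01):
    """S(t) = exp(-integral_0^t h(x) dx): probability the VM is still
    alive at time t, via simple numeric integration of the hazard."""
    acc, x = 0.0, 0.0
    while x < t:
        acc += bathtub_hazard(x) * dt
        x += dt
    return math.exp(-acc)
```

A scheduling or checkpointing policy can read the current hazard directly: checkpoint aggressively in the high-hazard tails and sparsely in the flat middle, which is exactly where a memoryless (constant-hazard) model goes wrong.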
ISBN (paperback): 9781728150895
Mobile edge computing (MEC) has been recognized as a promising technology to support various emerging services in vehicular networks. With MEC, vehicle users can offload their computation-intensive applications (e.g., intelligent path planning and safety applications) to edge computing servers located at roadside units. In this paper, an efficient computation offloading and server collaboration approach is proposed to reduce computing service delay and improve service reliability for vehicle users. Task partitioning is adopted, whereby the computation load offloaded by a vehicle can be divided and distributed to multiple edge servers. With the proposed approach, the computation delay can be reduced through parallel computing, and failures in delivering computing results can be alleviated via cooperation among edge servers. The offloading and computing decision-making is formulated as a long-term planning problem, and a deep reinforcement learning technique, i.e., deep deterministic policy gradient, is adopted to obtain the optimal solution of the complex stochastic nonlinear integer optimization problem. Simulation results show that our collaborative computing approach can adapt to different service environments and outperforms the greedy offloading approach.
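The benefit of task partitioning with parallel execution can be illustrated with a simple makespan calculation: when a task is split across servers that compute in parallel, the slowest share determines the delay. Transmission delay and the DRL decision-making are omitted, and all numbers are illustrative rather than from the paper:

```python
def parallel_offload_delay(task_cycles, server_rates, shares):
    """Completion delay when a task of `task_cycles` CPU cycles is split
    into fractional `shares` across servers computing in parallel at
    `server_rates` (cycles per second): the makespan is the slowest share."""
    assert abs(sum(shares) - 1.0) < 1e-9, "shares must sum to 1"
    return max(task_cycles * s / r for s, r in zip(shares, server_rates))
```

Splitting 100 cycles across servers with rates 10 and 20 in proportion to their rates (1/3 and 2/3) balances the finish times, beating an even 50/50 split; finding such splits under stochastic loads is what the paper's learned policy automates.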
In the power system, transmission lines are scattered and difficult to protect and repair. Traditional safety measures for the maintenance of transmission lines are rudimentary: most tasks are performed manually, and there is no testing facility for hanging ground wires, which is inefficient and unsafe. The main purpose of this paper is to design and improve monitoring of the state of the ground wire based on data acquisition technology. A simulation model considering the grounding wire is built, and the influence of factors such as injected signals of different frequencies, multi-branch lines, transformer no-load switching, parallel lines on the same pole, and earth-coupled grounding equivalent resistance is analyzed by simulation. Experiments show that, with a 120 Hz detection signal, the longer the detected line and the greater the grounding resistance, the smaller the difference between the response current measured with the grounding wire in place and the response current measured without it.
ISBN (paperback): 9781728199986
Electronic structure calculations based on density-functional theory (DFT) represent a significant part of today's HPC workloads and pose high demands on high-performance computing resources. To perform these quantum-mechanical DFT calculations on complex large-scale systems, so-called linear-scaling methods are required instead of conventional cubic-scaling methods. In this work, we take up the idea of the submatrix method and apply it to the DFT computations in the software package CP2K. For that purpose, we transform the underlying numeric operations on distributed, large, sparse matrices into computations on local, much smaller and nearly dense matrices. This allows us to exploit the full floating-point performance of modern CPUs and to make use of dedicated accelerator hardware, where performance was previously limited by memory bandwidth. We demonstrate both the functionality and performance of our implementation and show how it can be accelerated with GPUs and FPGAs.
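The submatrix idea, building a small, nearly dense principal submatrix from a sparse matrix's coupling pattern and evaluating a matrix function locally, can be sketched as follows. This is a strongly simplified single-column illustration with NumPy dense arrays; CP2K works with distributed sparse matrices, column blocks, and specific matrix functions:

```python
import numpy as np

def submatrix_apply(A, func, col):
    """Submatrix-method sketch: approximate column `col` of func(A) by
    evaluating `func` on the small dense principal submatrix induced by
    the rows coupled to that column (assumes a nonzero diagonal)."""
    idx = np.nonzero(A[:, col])[0]           # rows coupled to this column
    sub = A[np.ix_(idx, idx)]                # small, nearly dense submatrix
    f_sub = func(sub)                        # dense local evaluation
    local_col = np.where(idx == col)[0][0]   # position of `col` in submatrix
    result = np.zeros(A.shape[0])
    result[idx] = f_sub[:, local_col]        # scatter back to full size
    return result
```

The dense local evaluation is what lets BLAS-heavy CPU code, GPUs, or FPGAs run at full floating-point throughput instead of being bound by sparse-matrix memory traffic; for a block-diagonal matrix the submatrix captures the coupling exactly, so the result matches the true matrix function.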