Intrusion detection is an effective means of dealing with network attacks. Currently, commonly used detection methods are based on machine learning. However, traditional machine learning-based methods are centralized...
ISBN: (Print) 9783031042089
The proceedings contain 18 papers. The special focus in this conference is on Latin American High Performance Computing. The topics include: Accelerating Smart City Simulations; Distributed Artificial Intelligent Model Training and Evaluation; Large-Scale Distributed Deep Learning: A Study of Mechanisms and Trade-Offs with PyTorch; Wind Prediction Using Deep Learning and High Performance Computing; An Analysis of Neural Architecture Search and Hyper Parameter Optimization Methods; Solving the Heat Transfer Equation by a Finite Difference Method Using Multi-dimensional Arrays in CUDA as in Standard C; High-Throughput of Measure-Preserving Integrators Derived from the Liouville Operator for Molecular Dynamics Simulations on GPUs; An Efficient Parallel Model for Coupled Open-Porous Medium Problem Applied to Grain Drying Processing; Energy Consumption Studies of WRF Executions with the LIMITLESS Monitor; Improving Performance of Long Short-Term Memory Networks for Sentiment Analysis Using Multicore and GPU Architectures; A Methodology for Evaluating the Energy Efficiency of Post-Moore Architectures; Understanding COVID-19 Epidemic in Costa Rica Through Network-Based Modeling; An Efficient Vectorized Auction Algorithm for Many-Core and Multicore Architectures; Green Energy HPC Data Centers to Improve Processing Cost Efficiency; DICE: Generic Data Abstraction for Enhancing the Convergence of HPC and Big Data; A Comparative Study of Consensus Algorithms for Distributed Systems.
ISBN: (Print) 9781728176796
Aiming at the problems of the large amount of data collected by airborne sensors, the lack of data association, and low processing efficiency, this paper proposes a parallel LSTM algorithm model suitable for the Spark platform. First, the Spark platform is used to perform the traversal scan operation over the in-memory RDDs of all nodes in the distributed cluster, and a Pipeline is created from the directed acyclic graph to implement a parallel computing framework. An algorithm model for optimizing the parameters of the LSTM neural network is proposed, and a load-balancing method is introduced so that all nodes of the distributed system share the computing tasks in a balanced manner. The experimental results show that, compared with the stand-alone case, the parallelized LSTM algorithm improves efficiency. The prediction efficiency of the LSTM algorithm model after load balancing is higher, which shows that the distribution of traversal tasks across nodes is more balanced and the degree of parallelization is higher.
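As an illustration of the data-distribution idea described above, the following minimal PySpark sketch partitions sensor-data windows across a cluster and runs per-partition inference; the stand-in scoring function, the synthetic windows, and the 64-partition choice are assumptions for illustration, not the paper's LSTM implementation or load-balancing scheme.

```python
# Sketch: distributing per-window inference over Spark RDD partitions.
from pyspark.sql import SparkSession
import numpy as np

def predict_partition(windows):
    # Stand-in for per-partition LSTM inference: a real job would load a
    # pre-trained LSTM once here and score every window in the partition.
    for window in windows:
        yield float(np.mean(window))   # placeholder score, not a real prediction

spark = SparkSession.builder.appName("parallel-lstm-sketch").getOrCreate()

# Stand-in sensor data: 10,000 windows of 50 samples each.
sensor_windows = [np.random.rand(50) for _ in range(10_000)]

# numSlices controls how many partitions (parallel tasks) are created;
# spreading records evenly across partitions approximates load balancing.
rdd = spark.sparkContext.parallelize(sensor_windows, numSlices=64)
predictions = rdd.mapPartitions(predict_partition).collect()
print(len(predictions))   # 10000
```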
ISBN: (Print) 9781665463164
The proceedings contain 93 papers. The topics discussed include: edge-cloud synergy: unleashing the potential of parallel processing for big data analytics; access delegation framework for private decentralized patient health records sharing system based on blockchain; fair and robust bio-inspired resource sharing in distributed dynamic spectrum access networks; trust development for blockchain interoperability using self-sovereign identity integration; abnormality detection in network traffic by classification and graph data analysis; an automated system to calculate marks from answer scripts; the problem with regular multiple byte block boundaries in encryption; on the identification of isomorphic graphs for graph neural network using multi-graph approach; a hybrid method based on machine learning to predict the stock prices in Bangladesh; cataract detection and grading using ensemble neural networks and transfer learning; and a tool for the dynamic allocation of multiple marine activities.
Distributed execution of real-time data analytics, such as event stream processing, is the key to scalability, performance, and reliable detection of situation changes. Although real-time analytics is highly I/O centric,...
ISBN: (Digital) 9798331509828
ISBN: (Print) 9798331509835
With the continuous development of digital image processing algorithms, their application scenarios have evolved from the study of a single image with a single algorithm to a multi-algorithm fusion analysis paradigm. Therefore, this study proposes a digital media art image analysis framework that combines multiple computer image processing (CIP) algorithms. The framework covers three core modules: image edge extraction, image style transfer, and image compression. In the image edge extraction module, this study designs an improved edge detection algorithm based on a de-noising auto-encoder. This algorithm improves the accuracy of edge detection through multi-directional feature extraction and (I, O)-fuzzy rough set optimization, while maintaining global stability. In the image style transfer module, this study proposes a dual-module network consisting of a texture translation network and a perceptual loss network, which achieves cross-domain style transfer through a fusion model. At the same time, the algorithm optimizes the perceptual loss function to enhance semantic matching capabilities. In the image compression module, this study uses chaotic systems to construct an optimized measurement matrix. Specifically, a compression method based on the LPAC gradient sparse algorithm is constructed within a distributed data-parallel training framework. Testing the three modules separately, the experimental results confirm the effectiveness of the proposed algorithms.
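The edge extraction module builds on a de-noising auto-encoder; the PyTorch sketch below shows only that generic noise-in, clean-out training idea. The network shape, noise level, and random image patches are illustrative assumptions, and the multi-directional feature extraction and (I, O)-fuzzy rough set optimization steps are not modeled.

```python
# Minimal de-noising auto-encoder sketch: reconstruct clean patches from noisy input.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DenoisingAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

clean = torch.rand(8, 1, 64, 64)                      # stand-in grayscale patches
noisy = (clean + 0.1 * torch.randn_like(clean)).clamp(0, 1)
loss = loss_fn(model(noisy), clean)                   # reconstruct clean from noisy
loss.backward()
optimizer.step()
```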
ISBN: (Print) 9781450390682
The multiplication of a matrix by its transpose, A^T A, appears as an intermediate operation in the solution of a wide set of problems. In this paper, we propose a new cache-oblivious algorithm (AtA) for computing this product, based upon the classical Strassen algorithm as a sub-routine. In particular, we decrease the computational cost to 2/3 of the time required by Strassen's algorithm, amounting to (14/3) n^(log2 7) floating point operations. AtA works for generic rectangular matrices, and exploits the peculiar symmetry of the resulting product matrix to save memory. In addition, we provide an extensive implementation study of AtA on a shared memory system, and extend its applicability to a distributed environment. To support our findings, we compare our algorithm with state-of-the-art solutions specialized in the computation of A^T A. Our experiments highlight good scalability with respect to both the matrix size and the number of involved processes, as well as favorable performance for both the parallel paradigms and the sequential implementation, when compared with other methods in the literature.
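A minimal NumPy sketch can clarify the symmetry exploitation: splitting A into column blocks means only one off-diagonal block has to be multiplied, while the other is recovered by transposition. The cutoff value is arbitrary, and a plain matrix product stands in for the Strassen sub-routine used by AtA.

```python
# Sketch of C = A^T A exploiting the symmetry of the result: split A into
# column blocks [A1 | A2]; the diagonal blocks are recursive A^T A products,
# and the off-diagonal block is computed once and mirrored by transposition.
import numpy as np

def ata(A, cutoff=64):
    n = A.shape[1]
    if n <= cutoff:
        return A.T @ A
    k = n // 2
    A1, A2 = A[:, :k], A[:, k:]
    C = np.empty((n, n), dtype=A.dtype)
    C[:k, :k] = ata(A1, cutoff)      # symmetric diagonal block
    C[k:, k:] = ata(A2, cutoff)      # symmetric diagonal block
    C12 = A1.T @ A2                  # only off-diagonal block actually computed
    C[:k, k:] = C12
    C[k:, :k] = C12.T                # recovered by symmetry, no extra multiply
    return C

A = np.random.rand(300, 200)
assert np.allclose(ata(A), A.T @ A)
```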
The large datasets related to network traffic flow classification on the internet benefit machine learning (ML) and deep learning (DL) models for more accurate classification, which is used in many applications such as detecting traffic anomalies to prevent potential cyber-attacks. Data parallelization allows for faster training times on large datasets, as shown in our results, and it is also beneficial in the cloud-edge environment by allowing efficient distribution of computation and data across multiple nodes. We deployed Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), advanced hybrid Convolutional LSTM (ConvLSTM), Convolutional GRU (ConvGRU), and XGBoost algorithms using a data parallelization approach. The experimental setup was implemented in the cloud, and parallel training was executed using Nvidia Tesla Graphics Processing Units (GPUs). Lastly, a comparison of the performance metric results is presented between the non-parallel centralized (single node) and data-parallel distributed approaches.
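The data-parallel training idea can be sketched, under assumptions, with PyTorch's nn.DataParallel, which splits each batch across the available GPUs; the tiny stand-in classifier, the 78-dimensional random features, and the binary labels below are illustrative and do not correspond to the paper's models or dataset.

```python
# Illustrative data-parallel training step: one model replica per GPU,
# each processing a slice of the batch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(78, 64), nn.ReLU(), nn.Linear(64, 2))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # replicate model, split batches across GPUs
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

features = torch.randn(256, 78, device=device)        # stand-in flow features
labels = torch.randint(0, 2, (256,), device=device)   # benign / anomalous labels

optimizer.zero_grad()
loss = loss_fn(model(features), labels)
loss.backward()
optimizer.step()
```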
ISBN: (Print) 9798350397444
The attention mechanism has recently shown superior performance in natural language processing and computer vision tasks. However, its complex dataflow and large-scale matrix calculations, with huge computing and memory overhead, pose a great challenge for the design of hardware accelerators. Previous solutions that benefited from matrix partitioning are bounded by the softmax function. In this paper, we propose a new attention framework that can dramatically improve the performance of attention model inference for long sequence tasks on FPGAs. We design a novel accelerator architecture that employs two systolic arrays and a ping-pong structure to accelerate attention calculation. Meanwhile, we propose an analytical model to predict resource usage and performance, which guides a fast design space exploration. Experiments using the state-of-the-art BERT demonstrate that the design achieves 4.61x and 1.24x improvements in speed and energy efficiency compared to CPU and GPU on the Xilinx XCZU11EG platform.
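For reference, the computation being accelerated is standard scaled dot-product attention; the NumPy sketch below shows why the row-wise softmax, which needs an entire score row before normalizing, bounds naive matrix partitioning. The sequence length and head dimension are arbitrary, and the FPGA dataflow itself is not modeled here.

```python
# Plain scaled dot-product attention: the softmax normalizes over all keys,
# so a partitioned score matrix cannot be normalized until every tile of its
# row is available.
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (seq_q, seq_k) score matrix
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax over all keys
    return weights @ V

seq, d = 128, 64
Q, K, V = (np.random.rand(seq, d) for _ in range(3))
print(attention(Q, K, V).shape)   # (128, 64)
```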
ISBN: (Print) 9783031192135; 9783031192142
In this paper, we investigate device-free fall detection based on wireless channel state information (CSI). We mainly propose a method that uses the continuous wavelet transform (CWT) to generate images and then uses transfer learning of convolutional networks for classification. In addition, we add a wavelet scattering network to automatically extract features and classify them using a long short-term memory network (LSTM), which increases interpretability and reduces the computational complexity of the system. After applying these methods to wireless sensing technology, both methods achieve a high accuracy rate. The first method can cope with the problem of degraded sensing performance when the environment is not exactly the same, and the second method has more stable sensing performance.
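A minimal sketch of the CWT image-generation step, assuming pywt and a synthetic stand-in for the CSI amplitude stream: the scalogram produced here is the kind of time-frequency image the convolutional network would classify. The sampling rate, scale range, and Morlet wavelet choice are assumptions, and the transfer-learning model and wavelet-scattering/LSTM branch are not reproduced.

```python
# Turn a 1-D amplitude stream into a CWT scalogram (time-frequency image).
import numpy as np
import pywt

fs = 100.0                                   # assumed CSI sampling rate (Hz)
t = np.arange(0, 4, 1 / fs)
csi_amplitude = np.sin(2 * np.pi * 2 * t) + 0.3 * np.random.randn(t.size)  # stand-in signal

scales = np.arange(1, 65)
coeffs, freqs = pywt.cwt(csi_amplitude, scales, "morl", sampling_period=1 / fs)

scalogram = np.abs(coeffs)                   # (scales, time) image fed to the CNN
print(scalogram.shape)                       # (64, 400)
```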