ISBN:
(Print) 9798350391961; 9798350391954
This paper presents a comprehensive study on optimizing resource allocation in cloud computing environments using an ensemble of machine learning techniques and optimization algorithms. We developed a multifaceted approach, integrating Long Short-Term Memory (LSTM) networks for forecasting resource demands, Particle Swarm Optimization (PSO) for initial resource allocation, Q-learning for dynamic resource adjustment, and Linear Regression (LR) for predicting energy consumption. Our LSTM model demonstrated high accuracy in demand forecasting, with detailed performance metrics indicating its effectiveness in diverse scenarios. The PSO algorithm significantly enhanced the efficiency of resource distribution, evidenced by a reduction in the number of utilized units. Q-learning contributed to the system's adaptability, optimizing resource allocation based on changing demands in real-time. The LR model accurately predicted energy consumption, aligning closely with observed data and highlighting the potential for energy-efficient cloud management.
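As an illustration of the dynamic-adjustment component, the sketch below shows plain tabular Q-learning for scaling virtual machines up or down. The state discretization, action set, and utilization-based reward are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): tabular Q-learning for
# dynamic VM scaling. States, actions, and reward shape are assumptions.
import random
from collections import defaultdict

ACTIONS = [-1, 0, +1]            # remove a VM, hold, add a VM
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = defaultdict(float)           # Q[(state, action)] -> estimated value

def reward(load, vms):
    # Penalize both SLA risk (too much load per VM) and idle capacity.
    util = load / max(vms, 1)
    return -abs(util - 0.7)      # target ~70% utilization (assumed)

def step(load, vms):
    state = (round(load, 1), vms)
    if random.random() < EPS:                      # explore
        action = random.choice(ACTIONS)
    else:                                          # exploit
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    vms_next = max(1, vms + action)
    r = reward(load, vms_next)
    next_state = (round(load, 1), vms_next)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
    return vms_next

vms = 2
for load in [1.2, 2.5, 3.1, 1.8]:   # synthetic load trace (assumed units)
    vms = step(load, vms)
```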
ISBN:
(Print) 9798350377477; 9798350377460
The voltage source converter (VSC) is widely used in modern power systems, but its electromagnetic transient simulation suffers from low simulation speed and high resource consumption. In this paper, a small-step real-time simulator for the VSC based on a field programmable gate array (FPGA) is established. Firstly, a fixed-admittance model of the VSC is established to improve the calculation speed of the primary system, and a parallel computing process for the primary system is designed to achieve small-step real-time computation on the FPGA's parallel computing architecture. Secondly, following the logic of the VSC control system, a serial-parallel computing process for the control system is designed and the VSC controller is implemented on the FPGA. Finally, the performance of the FPGA simulator is verified through comparison experiments against PSCAD/EMTDC.
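The abstract does not give the model equations, but fixed-admittance switch modeling conventionally treats a closed switch as a small inductor and an open switch as a small capacitor, discretized with the trapezoidal rule so that both states share a single companion admittance; the nodal admittance matrix then stays constant across switching events, which is what suits FPGA pipelining. A minimal sketch of that update, under the assumption that the paper follows this standard scheme:

```python
# Assumed fixed-admittance switch model (not the paper's FPGA code).
# Closed switch -> small inductor, open switch -> small capacitor,
# both with companion admittance Gs = 2C/dt = dt/(2L), so the nodal
# matrix never changes; only the history current source is updated.
GS = 1.0  # companion admittance, a per-design choice

def switch_history(state_on, v_prev, i_prev):
    """History current source for the next time step."""
    h = GS * v_prev + i_prev
    return h if state_on else -h     # inductor (+) vs. capacitor (-)

def switch_current(v_now, hist):
    # i(t) = Gs * v(t) + i_hist holds in both switch states
    return GS * v_now + hist
```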
ISBN:
(Print) 9798350363074; 9798350363081
Several parallel and distributed data mining algorithms have been proposed in the literature to perform large-scale data analysis, overcoming the bottleneck of traditional methods on a single machine. However, although the master-worker approach greatly simplifies synchronization, since only the master is responsible for it, it also presents several problematic issues for large-scale data analysis tasks involving thousands or millions of nodes. This paper presents a hierarchical (or multi-level) master-worker framework for iterative parallel data analysis algorithms that overcomes the scalability issues affecting classic master-worker solutions. Specifically, the framework is composed of multiple merger nodes and worker nodes organized in a k-ary tree structure, in which the workers occupy the leaves and the mergers occupy the root and the internal nodes of the tree.
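A minimal sketch of the idea, with the merge operation reduced to a distributed sum/count for computing a global mean; the node classes and fan-out value are assumptions for illustration, not the paper's API:

```python
# Hedged sketch: a k-ary merger/worker tree for one reduction step.
# Workers sit at the leaves and produce partial results; each merger
# combines the results of its <= k children, so no node ever has to
# synchronize with more than k others.
from dataclasses import dataclass, field
from typing import List, Optional

K = 4  # fan-out of the merger tree (assumed)

@dataclass
class Node:
    data: Optional[List[float]] = None        # only workers hold data
    children: List["Node"] = field(default_factory=list)

    def compute(self):
        if not self.children:                  # worker at a leaf
            return (sum(self.data), len(self.data))
        parts = [c.compute() for c in self.children]
        return (sum(s for s, _ in parts), sum(n for _, n in parts))

def build_tree(chunks, k=K):
    """Pack worker leaves under mergers, level by level."""
    nodes = [Node(data=c) for c in chunks]
    while len(nodes) > 1:
        nodes = [Node(children=nodes[i:i + k]) for i in range(0, len(nodes), k)]
    return nodes[0]

root = build_tree([[1.0, 2.0], [3.0], [4.0, 5.0], [6.0], [7.0]])
s, n = root.compute()
print("global mean:", s / n)
```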
ISBN:
(Print) 9798350363074; 9798350363081
3D surface reconstruction is critical for various applications, demanding efficient computational approaches. Traditional Radial Basis Function (RBF) methods scale poorly with increasing numbers of data points, leading to slower execution times. Addressing this, our study introduces an experimental parallelization effort using Julia, a language well known for high-performance scientific computing. We developed an initial sequential RBF algorithm in Julia, then expanded it to a parallel model, exploiting multi-threading to enhance execution speed while maintaining accuracy. This initial exploration of Julia's parallel computing capabilities shows marked performance gains in 3D surface reconstruction, offering promising directions for future research. Our findings affirm Julia's potential in computationally intensive tasks, with test results confirming the expected time-efficiency improvements.
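The paper's implementation is in Julia; for consistency with the other sketches here, the following Python analogue shows the same structure: a sequential dense RBF fit followed by parallel evaluation of the interpolant over chunks of query points. The Gaussian kernel and shape parameter are illustrative assumptions.

```python
# Hedged Python analogue of a multi-threaded RBF pipeline.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def rbf(r, eps=1.0):
    return np.exp(-(eps * r) ** 2)            # Gaussian kernel (assumed)

def fit(centers, values):
    # Dense interpolation system: the O(N^3) bottleneck the paper targets.
    r = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    return np.linalg.solve(rbf(r), values)

def evaluate_chunk(queries, centers, weights):
    r = np.linalg.norm(queries[:, None] - centers[None, :], axis=-1)
    return rbf(r) @ weights

def evaluate_parallel(queries, centers, weights, n_threads=8):
    # Evaluation is embarrassingly parallel across query chunks.
    chunks = np.array_split(queries, n_threads)
    with ThreadPoolExecutor(n_threads) as pool:
        parts = pool.map(evaluate_chunk, chunks,
                         [centers] * n_threads, [weights] * n_threads)
    return np.concatenate(list(parts))
```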
With the increasing use of high-performance computing, users are turning to concurrent execution to enhance the speed and overall performance of large-scale programs. This trend is supported through the use ...
This special issue is dedicated to examining the rapidly evolving fields of artificial intelligence, mathematical modeling, and optimization, with particular emphasis on their growing importance in computational science. It features the most notable papers from the "Mathematical Modeling and Problem Solving" workshop at PDPTA'24, the 30th International Conference on Parallel and Distributed Processing Techniques and Applications. The issue showcases pioneering research in areas such as natural language processing, system optimization, and high-performance computing. The nine selected studies include novel AI-driven methods for chemical compound generation, historical text recognition, and music recommendation, along with advancements in hardware optimization through reconfigurable accelerators and vector register sharing. Additionally, evolutionary and hyper-heuristic algorithms are explored for sophisticated problem-solving in engineering design, and innovative techniques are introduced for high-speed numerical methods in large-scale systems. Collectively, these contributions demonstrate the significance of AI, supercomputing, and advanced algorithms in driving the next generation of scientific discovery.
ISBN:
(Print) 9798350395679; 9798350395662
Deep learning has emerged as a powerful method for extracting valuable information from large volumes of data. However, when new training data arrives continuously (i.e., is not fully available from the beginning), incremental training suffers from catastrophic forgetting (i.e., new patterns are reinforced at the expense of previously acquired knowledge). Training from scratch each time new training data becomes available would result in extremely long training times and massive data accumulation. Rehearsal-based continual learning has shown promise for addressing the catastrophic forgetting challenge, but research to date has not addressed performance and scalability. To fill this gap, we propose an approach based on a distributed rehearsal buffer that efficiently complements data-parallel training on multiple GPUs, allowing us to achieve short runtime and scalability while retaining high accuracy. It leverages a set of buffers (local to each GPU) and uses several asynchronous techniques for updating these local buffers in an embarrassingly parallel fashion, all while handling the communication overheads necessary to augment input mini-batches (groups of training samples fed to the model) using unbiased, global sampling. In this paper we explore the benefits of this approach for classification models. We run extensive experiments on up to 128 GPUs of the ThetaGPU supercomputer to compare our approach with baselines representative of training-from-scratch (the upper bound in terms of accuracy) and incremental training (the lower bound). Results show that rehearsal-based continual learning achieves a top-5 classification accuracy close to the upper bound, while simultaneously exhibiting a runtime close to the lower bound.
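A single-process sketch of the rehearsal idea (the paper's buffers are per-GPU and updated asynchronously; the class below is an assumption for illustration): reservoir sampling keeps the buffer an approximately unbiased sample of the whole stream, and each incoming mini-batch is augmented with replayed samples before being fed to the model.

```python
# Hedged, simplified rehearsal buffer; not the paper's distributed version.
import random

class RehearsalBuffer:
    def __init__(self, capacity):
        self.capacity, self.seen, self.buf = capacity, 0, []

    def update(self, samples):
        # Reservoir sampling: every sample seen so far has equal
        # probability of residing in the buffer.
        for s in samples:
            self.seen += 1
            if len(self.buf) < self.capacity:
                self.buf.append(s)
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.buf[j] = s

    def augment(self, batch, replay_fraction=0.5):
        # Mix fresh samples with replayed ones to counter forgetting.
        k = min(int(len(batch) * replay_fraction), len(self.buf))
        return batch + random.sample(self.buf, k)
```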
ISBN:
(Print) 9798350391961; 9798350391954
WhisperLink is an innovative anonymous messaging service that aims to enhance privacy in digital communication. Hosted on the secure and robust infrastructure of the Google Cloud Platform, WhisperLink enables users to create secure, temporary chat rooms that self-destruct after 24 hours. By not requiring logins, WhisperLink ensures confidentiality, enabling users to communicate without the risks associated with exposure of personal data. WhisperLink places a strong emphasis on confidentiality and anonymity by employing end-to-end encryption, which ensures that messages can be read only by their intended recipients. Additional authentication features, such as security questions, allow only authorized users to access chat rooms. The platform deletes messages after 24 hours and stores no residual data, keeping conversations private and transient. These properties suit a wide range of social, professional, and private scenarios, making WhisperLink a flexible, secure, and user-friendly platform for immediate private communication.
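WhisperLink's protocol is not specified at this level of detail, but the self-destruct property can be sketched with off-the-shelf symmetric encryption: Fernet tokens from the cryptography package embed a creation timestamp, so decryption can reject anything older than 24 hours even before storage-level deletion. The room-key handling below is a hypothetical simplification, not WhisperLink's actual design.

```python
# Hedged sketch: time-limited symmetric encryption for an ephemeral room.
from cryptography.fernet import Fernet, InvalidToken

ROOM_TTL = 24 * 60 * 60            # 24 hours, in seconds

room_key = Fernet.generate_key()   # shared only with room participants
f = Fernet(room_key)

token = f.encrypt(b"meet at 9")    # token carries its creation timestamp

try:
    plaintext = f.decrypt(token, ttl=ROOM_TTL)  # raises if token too old
except InvalidToken:
    plaintext = None               # treat as expired / self-destructed
```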
ISBN:
(Print) 9781728190549
The rise of edge computing has shifted computing resources closer to end-users, benefiting numerous delay-sensitive, computation-intensive applications. To speed up computation, distributed computing is a promising technique that allows parallel execution of computation tasks across multiple compute nodes. However, current research predominantly revolves around the master-worker paradigm, limiting resource sharing to one-hop neighborhoods. This limitation can render distributed computing ineffective in scenarios with limited nearby resources or constrained/dynamic connectivity. In this paper, we address this limitation by introducing a new distributed computing strategy that extends resource sharing beyond one-hop neighborhoods by exploiting layered network structures and multi-hop routing. Our approach transforms the network graph into a sink tree and solves a joint task allocation and scheduling optimization problem formulated on the layered tree structure. Simulation results demonstrate a significant improvement over traditional distributed computing and computation offloading strategies.
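The sink-tree construction can be illustrated with a breadth-first search rooted at the task-originating node, giving every device a unique shortest (hop-count) route to the sink and a layer index for per-layer allocation; the adjacency-list representation and example graph below are assumptions.

```python
# Hedged sketch: BFS sink tree over the connectivity graph.
from collections import deque

def sink_tree(adjacency, sink):
    """adjacency: dict node -> iterable of neighbors."""
    parent, layer = {sink: None}, {sink: 0}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:            # first visit = fewest hops
                parent[v], layer[v] = u, layer[u] + 1
                queue.append(v)
    return parent, layer

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
parent, layer = sink_tree(adj, sink=0)
print(parent, layer)   # node 4 reaches the sink via 3 -> 1 -> 0
```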
Simulations of reacting multiphase flows tend to display an inhomogeneously distributed computational intensity over the spatial and temporal domains. The time-to-solution of chemical reaction rates can span multiple orders of magnitude due to the emergence of combustible kernels and thin turbulent reaction zones. Similarly, the time to solve the equation of state (EoS) for non-ideal fluid mixtures deviates substantially between grid cells. These effects result in a performance profile that is unbalanced and rapidly changing in transient simulations, and therefore beyond the capabilities of traditional (quasi-)static mesh partitioning methods. We analyse this loss of parallel efficiency for large-eddy simulations of the ECN Spray-A benchmark with the multi-physics solver INCA and propose to mitigate the problem by introducing two independent repartitioning stages in addition to the classic domain decomposition for fluid transport: one for the EoS and one for the chemical reactions. We explore various scalable repartitioning strategies in this context and observe that rebalancing the computational load yields a significant speedup that is robust across mesh resolutions and process counts. Dynamic multi-stage load balancing thus effectively removes obstacles to good parallel scaling of INCA and similar solvers for reacting and/or multiphase flows.
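The abstract does not name the repartitioning strategies explored, but one scalable baseline in this spirit is greedy longest-processing-time (LPT) assignment: cells are ordered by the previous step's measured chemistry cost and given to the currently least-loaded rank, independently of the transport decomposition. A sketch under those assumptions:

```python
# Hedged sketch: cost-based LPT repartitioning of chemistry work.
import heapq

def repartition(cell_costs, n_ranks):
    """cell_costs: list of (cell_id, measured_seconds) from the last step.
    Returns a dict rank -> list of assigned cells."""
    heap = [(0.0, r) for r in range(n_ranks)]   # (accumulated load, rank)
    heapq.heapify(heap)
    assignment = {r: [] for r in range(n_ranks)}
    for cell, cost in sorted(cell_costs, key=lambda c: -c[1]):
        load, rank = heapq.heappop(heap)        # least-loaded rank first
        assignment[rank].append(cell)
        heapq.heappush(heap, (load + cost, rank))
    return assignment

print(repartition([(0, 9.0), (1, 0.1), (2, 4.5), (3, 4.4)], n_ranks=2))
```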