Attack graphs (AGs) are graphical tools to analyze the security of computer networks. By connecting the exploitation of individual vulnerabilities, AGs expose possible multi-step attacks against target networks, allow...
ISBN: (Digital) 9783982633619
Modeling the crucial dynamic properties of tissue simulations is computationally expensive due to the complex interactions between cells and their surrounding extracellular matrix (ECM). This work extends the Cellular Potts Model (CPM) with a dynamic ECM model to simulate viscoelastic mechanics. Our implementation leverages the NAStJA framework, a highly scalable system for distributed simulations. We assess the performance of our implementations on different HPC setups and analyze the scalability challenges of large-scale tissue simulations. GPU-accelerated simulations significantly reduce computation time but are limited by host-device communication.
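The core of a CPM simulation is a Metropolis loop that proposes copying a neighboring site's cell ID and accepts or rejects the copy based on the energy change. A minimal sketch, assuming an adhesion-only Hamiltonian on a small 2D lattice; the actual NAStJA implementation also includes volume constraints and the dynamic ECM coupling described above, and all names here are illustrative:

```python
import math
import random

def neighbors(i, j, n):
    """4-neighborhood of site (i, j), clamped to the n x n lattice."""
    return [(x, y) for x, y in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            if 0 <= x < n and 0 <= y < n]

def adhesion_energy(lattice, n, J=1.0):
    """Boundary energy: J per pair of neighboring sites with different cell IDs."""
    e = 0.0
    for i in range(n):
        for j in range(n):
            for x, y in neighbors(i, j, n):
                if lattice[i][j] != lattice[x][y]:
                    e += J / 2  # each unordered pair is visited twice
    return e

def metropolis_step(lattice, n, T=1.0):
    """Attempt one spin copy: copy a random neighbor's cell ID into a random site.
    For clarity this recomputes the global energy; real codes compute a local delta."""
    i, j = random.randrange(n), random.randrange(n)
    x, y = random.choice(neighbors(i, j, n))
    if lattice[i][j] == lattice[x][y]:
        return False
    e_before = adhesion_energy(lattice, n)
    old = lattice[i][j]
    lattice[i][j] = lattice[x][y]
    d_e = adhesion_energy(lattice, n) - e_before
    if d_e <= 0 or random.random() < math.exp(-d_e / T):
        return True          # accept the copy
    lattice[i][j] = old      # reject: revert
    return False
```

Parallelizing this loop is what makes large-scale CPM hard: neighboring copy attempts conflict, which is why distributed implementations such as NAStJA partition the lattice into blocks updated with halo exchange.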
ISBN: (Print) 9783319698342
This book presents the latest, innovative research findings on P2P, parallel, Grid, Cloud, and Internet computing. It gathers the Proceedings of the 12th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, held on November 8-10, 2017 in Barcelona, Spain. These computing technologies have rapidly established themselves as breakthrough paradigms for solving complex problems by enabling the aggregation and sharing of an increasing variety of distributed computational resources at large scale. Grid computing originated as a paradigm for high-performance computing, offering an alternative to expensive supercomputers through different forms of large-scale distributed computing, while P2P computing emerged as a new paradigm after client-server and web-based computing and has proven useful in the development of social networking, B2B (Business to Business), B2C (Business to Consumer), B2G (Business to Government), B2E (Business to Employee), and so on. Cloud computing has been defined as a computing paradigm where the boundaries of computing are determined by economic rationale rather than technical limits. Cloud computing has quickly been adopted in a broad range of application domains and provides utility computing at large scale. Lastly, Internet computing is the basis of any large-scale distributed computing paradigm; it has very rapidly developed into a flourishing field with an enormous impact on today's information societies, serving as a universal platform comprising a large variety of computing forms such as Grid, P2P, Cloud and Mobile computing. The aim of the book Advances on P2P, Parallel, Grid, Cloud and Internet Computing is to provide the latest findings, methods and development techniques from both theoretical and practical perspectives, and to reveal synergies between these large-scale computing paradigms.
ISBN: (Digital) 9798331524937
ISBN: (Print) 9798331524944
Flow models over flat porous surfaces have applications in natural and industrial processes, such as material, food, and chemical processing, or mountain mudflow simulations. Simplified analytical or numerical models can predict characteristics such as velocity, pressure, deviation length, and even temperature of such flows for geophysical and engineering purposes. In this context, there is considerable interest in theoretical and experimental models, and mathematical models representing such phenomena in fluid mechanics have continuously been developed and implemented. Given this, we propose a mathematical and simulation model to describe a free flow parallel to a porous material and its transition zone. The objective is to analyze the influence of the porous matrix on the flow under different matrix properties. We implement a Computational Fluid Dynamics scheme using the Finite Volume Method to simulate and compute the numerical solutions for case studies. However, computational applications of this type demand high performance, requiring parallel execution techniques, which makes it necessary to modify the sequential version of the code. We therefore propose a methodology describing the steps required to adapt and improve the code. This approach decreases the execution time of the sequential version by 5.3%. Next, we adopt OpenMP for the parallel versions and instantiate parallel code flows and executions on multi-core processors, achieving a speedup of 10.4 with 12 threads. The paper provides simulations that support the correct understanding, modeling, and construction of abrupt transitions between free flow and porous media. The process presented here could extend to simulations of other porous-media problems. Furthermore, customized simulations require little processing time, thanks to parallel processing.
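The reported figures imply a parallel efficiency of roughly 87% on 12 threads. A short sketch of that arithmetic, using a hypothetical baseline time since the abstract reports only relative numbers:

```python
def speedup(t_seq, t_par):
    """Classic speedup: sequential time over parallel time."""
    return t_seq / t_par

def parallel_efficiency(s, n_threads):
    """Fraction of ideal linear scaling achieved."""
    return s / n_threads

# Hypothetical 1000 s baseline for illustration; absolute times are not reported.
t0 = 1000.0
t_opt = t0 * (1 - 0.053)   # sequential version improved by 5.3%
t_par = t_opt / 10.4       # reported OpenMP speedup of 10.4 on 12 threads

eff = parallel_efficiency(10.4, 12)   # ~= 0.87
combined = t0 / t_par                 # ~= 11x over the unoptimized sequential code
```

The combined figure shows why the sequential tuning step matters: the 5.3% reduction compounds with the OpenMP speedup.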
ISBN: (Print) 9798400710735
Sparse General Matrix Multiply (SpGEMM) is key for various High-Performance Computing (HPC) applications such as genomics and graph analytics. Using the semiring abstraction, many algorithms can be formulated as SpGEMM, allowing redefinition of addition, multiplication, and numeric types. Today, large input matrices require distributed-memory parallelism to avoid disk I/O, and modern HPC machines with GPUs can greatly accelerate linear algebra. In this paper, we implement a GPU-based distributed-memory SpGEMM routine on top of the CombBLAS library. Our implementation achieves a speedup of over 2× compared to the CPU-only CombBLAS implementation and up to 3× compared to PETSc for large input matrices. Furthermore, we note that communication between processes can be optimized by either direct host-to-host or device-to-device communication, depending on the message size. To exploit this, we introduce a hybrid communication scheme that dynamically switches data paths depending on the message size, thus improving runtimes in communication-bound scenarios.
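The semiring abstraction means an SpGEMM kernel only needs "add" and "multiply" as parameters. A compact, illustrative Python sketch of this idea (not the CombBLAS API, which operates on distributed 2D-partitioned matrices):

```python
def spgemm(A, B, add=lambda x, y: x + y, mul=lambda x, y: x * y):
    """Sparse-sparse multiply over a user-supplied semiring.
    A and B are row-major dicts: {row: {col: value}}."""
    C = {}
    for i, row in A.items():
        acc = {}
        for k, a in row.items():              # nonzeros of row i of A
            for j, b in B.get(k, {}).items(): # matching nonzeros of row k of B
                prod = mul(a, b)
                acc[j] = add(acc[j], prod) if j in acc else prod
        if acc:
            C[i] = acc
    return C

# Usage: the (min, +) semiring turns SpGEMM into one relaxation step of
# all-pairs shortest paths over the adjacency matrix A.
A = {0: {1: 2.0}, 1: {2: 3.0}}
two_hop = spgemm(A, A, add=min, mul=lambda x, y: x + y)
# two_hop[0][2] is 5.0: the length of the two-hop path 0 -> 1 -> 2
```

Swapping the `add`/`mul` callables is exactly what lets graph algorithms reuse one tuned SpGEMM kernel.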
ISBN: (Digital) 9798331531195
ISBN: (Print) 9798331531201
Edge computing has transformed machine learning by moving computation closer to the data sources, thereby reducing latency. The ever-increasing volume of data has necessitated forming clusters of edge devices, possibly with heterogeneous capabilities. Managing heterogeneous resources such as computation and memory remains challenging, and given the capabilities of edge devices, a simple technique suitable for an Edge computing cluster is needed. We introduce a scheduling mechanism that leverages Integer Linear Programming (ILP) to optimize the overall computation time of ML-based tasks. Our scheduling mechanism efficiently allocates resources, ensuring tasks are executed in parallel across the cores of edge devices in a cluster while minimizing computation time. For tasks too large to fit on any single core, we leverage distributed learning to train the model in pieces and later combine them. We employ our ILP-based scheduler for efficient task allocation and compare its performance with a simple Greedy approach based on a best-fit technique. We evaluate our approach on three sets of tasks spanning a spectrum of uniformity and size. Our results demonstrate a two-fold speed gain for the ILP-based approach over Greedy for the category with the least uniformity and largest size.
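The scheduling objective can be illustrated with a toy makespan minimization: the ILP seeks the assignment of task runtimes to cores that minimizes the maximum core load, while a greedy baseline places each task on the currently least-loaded core. A small sketch with an exhaustive search standing in for the ILP solver (all names and numbers are illustrative, not from the paper):

```python
from itertools import product

def makespan(assignment, tasks, n_cores):
    """Maximum total runtime on any core under the given task-to-core assignment."""
    loads = [0.0] * n_cores
    for core, t in zip(assignment, tasks):
        loads[core] += t
    return max(loads)

def optimal_schedule(tasks, n_cores):
    """Exhaustive stand-in for the ILP: minimize makespan over all assignments."""
    best = min(product(range(n_cores), repeat=len(tasks)),
               key=lambda a: makespan(a, tasks, n_cores))
    return best, makespan(best, tasks, n_cores)

def greedy_schedule(tasks, n_cores):
    """Best-fit baseline: place each task (largest first) on the least-loaded core."""
    loads = [0.0] * n_cores
    for t in sorted(tasks, reverse=True):
        loads[loads.index(min(loads))] += t
    return max(loads)
```

For example, tasks of [5, 4, 3, 3, 3] on 2 cores admit a perfect 9/9 split that the exhaustive search finds, while largest-first best-fit ends at a makespan of 10; this is the gap an ILP formulation closes, at the cost of solver runtime.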
The proceedings contain 147 papers from the 16th IASTED International Conference on Parallel and Distributed Computing and Systems. The topics discussed include: a grid simulation infrastructure supporting advance reservation; auction-based resource allocation protocols in grids; effectiveness of grid configurations on application performance; a constant-time shortest-path routing algorithm for pyramid networks; wormhole routers for network-on-chip; communication optimization on broadcast-based clusters; a localization algorithm extension for the evolvable sensor network; and migration algorithms for automated load balancing.
ISBN: (Digital) 9798331509859
ISBN: (Print) 9798331509866
Over the past few years, large language models (LLMs) have evolved to enable a wide range of applications, from natural language understanding to real-time conversational agents. However, deploying LLMs in production presents significant challenges, especially with regard to the low-latency responses that real-time interactions require. This work investigates multi-node inference architectures for optimized deployment using open-source frameworks, with scalability, flexibility, and cost-effectiveness in mind. We investigate methods such as microbatching, tensor and pipeline parallelism, and sophisticated load balancing that effectively distribute inference workloads across multiple nodes. We conduct extensive evaluations using popular open-source tools such as Kubernetes, Ray, and Envoy to benchmark the performance of these architectures in terms of latency, throughput, and resource utilization under diverse workloads. We also analyze the trade-offs between model replication and model partitioning, giving insights into the most appropriate configuration for various deployment scenarios. As our results show, a well-orchestrated multi-node setup can greatly reduce inference latency while preserving high throughput, enabling the deployment of sophisticated LLMs in latency-sensitive applications. This paper provides a detailed analysis of multi-node inference strategies and their integration into open-source ecosystems, and serves as a practical guide for practitioners seeking to deploy LLMs at scale. In summary, this work underlines how distributed architectures overcome some of the inherent limitations of single-node deployments and are crucial for achieving more efficient and responsive AI-driven services.
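Microbatching, one of the techniques evaluated above, reduces to splitting a request queue into bounded batches that can be dispatched to replicas independently. A minimal, illustrative sketch; production serving stacks such as Ray Serve add timeouts and dynamic batch sizing on top of this idea:

```python
from collections import deque

def microbatch(requests, max_batch):
    """Yield batches of at most max_batch requests from the queue, in order,
    so each batch can be dispatched without head-of-line blocking."""
    queue = deque(requests)
    while queue:
        batch = [queue.popleft() for _ in range(min(max_batch, len(queue)))]
        yield batch
```

With pipeline parallelism, keeping `max_batch` small is what lets successive microbatches overlap across pipeline stages instead of idling them.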