ISBN (digital): 9798331524937
ISBN (print): 9798331524944
Finding the Maximal Independent Set (MIS) in a graph is a well-known problem with applications in resource allocation, load balancing, and routing optimization. This task is particularly challenging for large graphs as it requires multiple iterations over the entire set of vertices. Recently, there has been significant interest in developing techniques to maintain the MIS dynamically in evolving graphs rather than re-computing from scratch. In this paper, we propose new data structures and techniques for computing MIS in parallel on dynamic graphs. We specifically propose techniques to handle insertions and deletions in a batched setting. We conducted detailed experiments on shared memory multicore CPUs using graphs ranging from 50 million to 1.2 billion edges. Our results show that using our technique for insertions and deletions can provide up to 15.64x and 10.57x speedups on average over comparable baselines. Additionally, the final MIS we produce varies by only about 0.18% in cardinality compared to the existing state-of-the-art.
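For readers unfamiliar with the problem, the following is a minimal sketch of a sequential greedy MIS computation on a static graph. It is not the parallel, batch-dynamic technique proposed in the paper; the function names `greedy_mis` and `is_maximal_independent` and the adjacency-dictionary representation are assumptions made for illustration only.

```python
def greedy_mis(adj):
    """Return a maximal independent set of an undirected graph.

    adj maps each vertex to the set of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    mis = set()
    blocked = set()  # vertices already in the set or adjacent to it
    for v in sorted(adj):
        if v not in blocked:
            mis.add(v)
            blocked.add(v)
            blocked.update(adj[v])
    return mis


def is_maximal_independent(adj, mis):
    """Check that no two chosen vertices are adjacent and none can be added."""
    independent = all(adj[v].isdisjoint(mis) for v in mis)
    maximal = all(v in mis or not adj[v].isdisjoint(mis) for v in adj)
    return independent and maximal


# Path graph 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
result = greedy_mis(path)
```

A static pass like this touches every vertex, which is the cost that dynamic maintenance under batched insertions and deletions aims to avoid.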
ISBN (print): 9783030953911; 9783030953904
Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows often change dynamically, which may cause computation imbalance and back-pressure. We introduce AutoFlow, an automatic, hotspot-aware dynamic load-balancing system for streaming dataflows. It incorporates a centralized scheduler that dynamically monitors the load balance across the entire dataflow and performs state migrations accordingly. The scheduler achieves these two tasks using a simple asynchronous distributed control message mechanism and a hotspot-diminishing algorithm. The timing mechanism supports implicit barriers and highly efficient state migration without global barriers or pauses to operators. It also supports a time-window-based load-balance measurement and feeds it to the hotspot-diminishing algorithm without user intervention. We implemented AutoFlow on top of Ray, an actor-based distributed execution framework. Our evaluation based on various streaming benchmark datasets shows that AutoFlow achieves good load balance and incurs a low latency overhead in a highly data-skewed workload.
When it comes to medical image analysis, problems arise due to the scarce amount of data and computational resources in medical environments. This is because, as earlier stated, cloud settings demand efficient models ...
ISBN (digital): 9798331524937
ISBN (print): 9798331524944
Network-based applications rely on the underlying network infrastructure to reliably forward packets between nodes. The way packets are forwarded has a significant impact on service quality. Therefore, it is important to gain a better understanding of data packet routes. To obtain detailed information about network paths, continuous and long-term packet analysis is required. To achieve this, we present our open-source framework HiPerConTracer 3.0 for large-scale IP trace analysis. It performs Ping and Traceroute measurements to provide detailed insights into packet routes and packet timing by tracing routes between senders and receivers in public and private networks. In particular, it runs its own measurements, without needing data from, or the cooperation of, the underlying network service providers or remote server owners. Our tool supports large-scale data collection, storage, and post-processing stages. It supports easy-to-understand route visualization, round-trip time measurements, and hop counts. A proof-of-concept analysis revealed that packet route lengths can change drastically when traveling through unexpected countries, regions, and network operators.
K-Means algorithm is one of the most common clustering algorithms widely applied in various data analysis applications. Yinyang K-Means algorithm is a popular enhanced K-Means algorithm that avoids most unnecessary ca...
ISBN (digital): 9798331524937
ISBN (print): 9798331524944
The number of large language models for code generation is rising. However, comprehensive evaluations that focus on reliability and security remain sparse. This study evaluated the quality of Python code generated by five large language models: GPT-4-Turbo, DeepSeek-Coder-33B-Instruct, Gemini Pro 1.0, Codex, and CodeLlama-70B-Instruct. The evaluation considered three diverse application domains with varying prompt lengths for fair comparison. We found that GPT-4-Turbo generated (on average) 4.5% more secure code than a Python developer with three years of experience.
Geo-location, also known as measurement report (MR) location, is a technique to determine the geographic location of user equipment (UE) and the behaviour attribute of telephone traffic based on wireless signals measu...
ISBN (print): 9783030967727; 9783030967710
The current trend of using artificial neural networks to solve computationally intensive problems is omnipresent. In this scope, DeepQ learning is a common choice for agent-based problems. DeepQ combines the concept of Q-Learning with (deep) neural networks to learn different Q-values/matrices based on environmental conditions. Unfortunately, DeepQ learning requires hundreds of thousands of iterations/Q-samples that must be generated and learned for large-scale problems. Gathering data sets for such challenging tasks is extremely time consuming and requires large data-storage containers. Consequently, a common solution is the automatic generation of input samples for agent-based DeepQ networks. However, a usual workflow is to create the samples separately from the training process in either a (set of) pre-processing step(s) or interleaved with the training process. This requires the input Q-samples to be materialized in order to be fed into the training step of the attached neural network. In this paper, we propose a new GPU-focussed method for on-the-fly generation of training samples tightly coupled with the training process itself. This allows us to skip the materialization process of all samples (e.g. avoid dumping them to disk), as they are (re)constructed when needed. Our method significantly outperforms usual workflows that generate the input samples on the CPU in terms of runtime performance and memory/storage consumption.
Cloud computing and edge computing are distinct architectures that process data. Cloud computing is a centralized model of computing where data is stored and processed in a remote server, while edge computing is a distributed model of computing where data is stored and processed at the network’s edge. Both models have advantages and disadvantages, making them suitable for different types of applications. This article comprehensively reviews cloud and edge computing technologies, their differences, and their applications. Cloud computing has become a popular technology for storing and processing large amounts of data, while edge computing is gaining popularity due to its ability to process data closer to the source. The article discusses the advantages and disadvantages of both technologies and their potential applications in various industries, such as healthcare, transportation, and manufacturing. The article also highlights some challenges associated with implementing these technologies, including security concerns and the need for reliable connectivity. Overall, this review provides valuable insights into the current state of cloud and edge computing technologies and their potential impact on various industries. Additionally, this article will explore the potential benefits of combining both technologies for various applications.
Authors: Li, Taoshen; Lu, Mingyu
Affiliations: Nanning Univ, China ASEAN Int Join Lab Integrate Transport, 8 Longting Rd, Nanning, Peoples R China; Guangxi Univ, Sch Comp Elect & Informat, 100 Daxue Rd, Nanning, Peoples R China
ISBN (print): 9783030967727; 9783030967710
To address the optimization problem in simultaneous wireless information and power transfer (SWIPT), an optimal energy efficiency strategy for millimeter-wave cooperative communication small cells based on SWIPT was proposed to maximize the link energy efficiency, with the receivers of the user equipment devices working in power splitting mode. Under the constraints of minimum link transmission rate and minimum harvested energy, the strategy maximized the link energy efficiency of the system by jointly optimizing the transmit power control and the power splitting factor. Since the original problem is a non-convex fractional programming problem and NP-hard, the strategy used the Dinkelbach method to transform it into a tractable convex optimization problem, which was then solved with the Lagrange dual method. Finally, a cross-iteration algorithm was designed to obtain the optimal solution. Simulation results show that the proposed strategy is more effective than the traditional power control method and the maximum transmit power method.
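The Dinkelbach step mentioned above can be illustrated on a toy problem. The sketch below is not the paper's system model: it assumes a single link with rate log2(1 + h·p), a fixed circuit power, and a transmit power bound, and it omits the power splitting factor and the rate and energy-harvesting constraints. The function name and all parameters (`h`, `p_circuit`, `p_max`) are hypothetical. The fractional objective rate(p) / (p + p_circuit) is replaced by a sequence of concave problems rate(p) − λ·(p + p_circuit), each solved in closed form.

```python
import math


def dinkelbach_energy_efficiency(h, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Maximise log2(1 + h*p) / (p + p_circuit) over 0 <= p <= p_max.

    Toy single-link model assumed for illustration only.
    """
    def rate(p):
        return math.log2(1.0 + h * p)

    p = p_max
    lam = rate(p) / (p + p_circuit)  # current energy-efficiency estimate
    for _ in range(max_iter):
        # Inner problem: maximise rate(p) - lam * (p + p_circuit).
        # It is concave in p, so the stationary point clipped to the
        # feasible interval is the maximiser.
        p = min(max(1.0 / (lam * math.log(2)) - 1.0 / h, 0.0), p_max)
        gap = rate(p) - lam * (p + p_circuit)
        lam = rate(p) / (p + p_circuit)  # Dinkelbach update of the ratio
        if abs(gap) < tol:  # gap reaches zero exactly at the optimal ratio
            break
    return p, lam


p_opt, ee_opt = dinkelbach_energy_efficiency(h=2.0, p_circuit=0.5, p_max=5.0)
```

In the paper's setting the inner problem has more variables and constraints, which is where the Lagrange dual method and the cross-iteration algorithm come in; here a closed form suffices because there is only one bounded variable.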