ISBN (print): 9798350339864
Despite the popularity of Serverless computing, relatively little effort has been dedicated to Serverless workflows (i.e., Serverless function orchestration), particularly for Serverless edge computing. In this paper, we first identify the challenges of deploying state-of-the-art cloud-oriented Serverless workflow scheduling on resource-constrained edge devices, then propose to model Serverless workflows with behavior trees, and finally present our key observations and preliminary results for behavior tree-based Serverless workflow scheduling.
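To make the behavior-tree idea concrete, here is a minimal sketch (not the paper's implementation; all names are illustrative) of a workflow built from the two classic composites, Sequence and Fallback, where each leaf invokes one Serverless function:

```python
# Hypothetical sketch: a Serverless workflow as a behavior tree. Sequence
# runs children in order until one fails; Fallback tries children until
# one succeeds. These are the two classic behavior-tree composites.
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2

class Invoke:
    """Leaf node: calls one Serverless function (stubbed here)."""
    def __init__(self, fn_name):
        self.fn_name = fn_name
    def tick(self, ctx):
        print(f"invoking {self.fn_name} with {ctx}")
        return Status.SUCCESS  # a real leaf would call the FaaS platform

class Sequence:
    def __init__(self, *children):
        self.children = children
    def tick(self, ctx):
        for child in self.children:
            if child.tick(ctx) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS

class Fallback:
    def __init__(self, *children):
        self.children = children
    def tick(self, ctx):
        for child in self.children:
            if child.tick(ctx) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE

# An image pipeline: resize, then try GPU inference, falling back to CPU.
workflow = Sequence(Invoke("resize"),
                    Fallback(Invoke("infer-gpu"), Invoke("infer-cpu")))
workflow.tick({"image": "cat.jpg"})
```

Ticking such a tree is cheap and reactive, which suggests why behavior trees are attractive on resource-constrained edge devices.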
ISBN (print): 9798350368543; 9798350368536
Current IaaS providers have deployed data centers worldwide, with resources continually increasing. Meanwhile, both the concurrency of user requests and the diversity of request types are rising. To achieve better resource allocation, various complex scheduling architectures have been proposed. However, because real-world experiments at this scale are impractical, simulation systems are needed to build experimental environments for related research. As existing systems fall short in this setting, we built LGDCloudSim, designed around the characteristics of large-scale, geographically distributed cloud data center scenarios. To support large-scale simulations, we propose state-management and operation-process optimizations. Experiments show that LGDCloudSim can simulate up to 5×10^8 hosts and 10^7 concurrent requests, and that it supports diverse scheduling architectures and different request types.
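The abstract does not spell out the state-management optimization; one plausible reading, sketched below purely as an assumption, is to hold host state in flat per-data-center arrays rather than per-host objects, so that hundreds of millions of hosts fit in memory and allocation becomes a vectorized scan. All names here are hypothetical.

```python
# Hypothetical sketch of aggregate host state for very large simulations:
# instead of allocating 10^8 host objects, keep one NumPy array of free
# capacities per data center and allocate with a vectorized first-fit scan.
import numpy as np

class DataCenterState:
    def __init__(self, num_hosts, cpu_per_host):
        # One entry per host; further arrays could hold RAM, bandwidth, etc.
        self.free_cpu = np.full(num_hosts, cpu_per_host, dtype=np.int32)

    def first_fit(self, cpu_demand):
        """Return the first host index with enough free CPU, or -1."""
        candidates = np.flatnonzero(self.free_cpu >= cpu_demand)
        if candidates.size == 0:
            return -1
        host = candidates[0]
        self.free_cpu[host] -= cpu_demand
        return host

dc = DataCenterState(num_hosts=1_000_000, cpu_per_host=64)
print(dc.first_fit(cpu_demand=16))  # -> 0 on an empty data center
```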
ISBN (print): 9798350386066; 9798350386059
With growing network service demands, rational and efficient multi-domain resource allocation is paramount. Research aims to develop intelligent Traffic Engineering (TE) algorithms that can dynamically allocate resources, adapt to changing conditions, and meet user needs. TE algorithms based on Software-Defined Networking (SDN) have proven effective for this goal by leveraging SDN's centralized control plane and programmable data plane, which enable flexible, dynamic optimization of routing and resource allocation across multiple domains. However, domains operated by different service providers may exhibit non-cooperative behavior due to conflicts of interest and competition. Some domains may act selfishly by hiding bandwidth or exaggerating inter-domain requests to reserve more resources for themselves. This complicates TE design, as algorithms can no longer assume universal cooperation in multi-domain networks. Game theory provides a framework to model competition between domains through behaviors such as request forwarding and dropping, and to study cooperation strategies. This paper presents GameTE, a game-theoretic distributed TE algorithm for multi-domain SDN environments without trusted relationships between domains. By incorporating incentives and punishments, the algorithm suppresses selfish behaviors and promotes efficient resource utilization. Evaluation results demonstrate that GameTE is effective in curbing deception, enhancing resource sharing between domains, and improving overall network performance compared to baseline schemes.
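As an illustration of the incentive-and-punishment idea (GameTE's concrete mechanism may differ), the sketch below has each domain track how often a peer honored its forwarding requests and reciprocate in kind, a tit-for-tat flavored policy; the `PeerLedger` class and the 0.5 threshold are hypothetical.

```python
# Hypothetical incentive/punishment rule between SDN domains: each domain
# tracks how often a peer honored its forwarding requests and reciprocates,
# so persistent selfishness is punished with reduced service.
class PeerLedger:
    def __init__(self):
        self.honored = 1  # Laplace smoothing so a new peer starts trusted
        self.total = 1

    def record(self, was_honored):
        self.total += 1
        self.honored += int(was_honored)

    def cooperation_rate(self):
        return self.honored / self.total

def should_forward(ledger, threshold=0.5):
    """Forward a peer's inter-domain request only if it cooperates enough."""
    return ledger.cooperation_rate() >= threshold

peer = PeerLedger()
for honored in [True, False, False, False]:  # peer starts dropping requests
    peer.record(honored)
print(should_forward(peer))  # False: the selfish peer is now punished
```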
ISBN (print): 9798350368543; 9798350368536
Internet of Things (IoT) applications span the edge-cloud continuum to form multiscale distributed systems. The heterogeneity that defines this architecture, coupled with the asynchronous, event-triggered, and failure-prone nature of these deployments, creates significant programming and maintenance challenges for developers of IoT applications. To address this impediment to innovation, we present LAMINAR, a dataflow programming model for IoT applications implemented using a novel log-based, concurrent runtime system that spans all resource scales. We describe the properties that underpin LAMINAR's design and compare it to a lower-level event-based approach. We show that LAMINAR's dataflow model hides many of the complexities of "lock-free" event-driven programming. Through an empirical evaluation of LAMINAR, we find its design and implementation are both more straightforward for developers and more performant.
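A minimal sketch of the log-backed dataflow idea (illustrative only, not LAMINAR's API): producers append to an immutable per-edge log and consumers read at their own cursor, so a slow or restarted node simply resumes from its last durable position.

```python
# Illustrative log-backed dataflow edge: producers append, consumers read
# at their own cursor, so failure recovery is just resuming from a cursor.
class EdgeLog:
    def __init__(self):
        self._entries = []

    def append(self, value):
        self._entries.append(value)

    def read_from(self, cursor):
        """Return (new_entries, new_cursor) without mutating shared state."""
        return self._entries[cursor:], len(self._entries)

# A two-stage flow: a sensor node feeds an averaging node.
raw = EdgeLog()
for reading in [21.0, 21.4, 35.0]:
    raw.append(reading)

cursor = 0
values, cursor = raw.read_from(cursor)
average = sum(values) / len(values)   # the downstream node's computation
print(average, cursor)                # restart-safe: the cursor is durable
```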
ISBN (print): 9798350386066; 9798350386059
Large language models (LLMs) are computationally intensive. The computation workload and the memory footprint grow quadratically with the dimension (layer width). Most of an LLM's parameters come from the linear layers of the transformer structure and are highly redundant; these linear layers contribute more than 80% of the computation workload and 99% of the model size. To pretrain and finetune LLMs efficiently, three major challenges must be addressed: 1) reducing the redundancy of the linear layers; 2) reducing the GPU memory footprint; and 3) improving GPU utilization in distributed training. Prior methods, such as LoRA and QLoRA, used low-rank structure and quantization to reduce the number of trainable parameters and the model size, respectively. However, the resulting models still consume a large amount of GPU memory. In this paper, we present high-performance GPU-based methods for both pretraining and finetuning quantized LLMs with low-rank structures. We replace a single linear layer in the transformer structure with two narrower linear layers, reducing the number of parameters by several orders of magnitude. By quantizing the pretrained parameters to low precision (8-bit and 4-bit), the memory consumption of the resulting model is further reduced. Compared with existing LLMs, our methods achieve a speedup of 1.3x and a model compression ratio of 2.64x for pretraining without an accuracy drop. For finetuning, our methods achieve average accuracy score increases of 6.3 and 24.0 on general tasks and financial tasks, respectively, while reducing GPU memory consumption by 6.3x. Our models are smaller than 0.59 GB, allowing inference on a smartphone.
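The structural change is easy to see in code. The sketch below (in PyTorch, with illustrative dimensions; quantization omitted) replaces one d×d linear layer with two narrow layers of rank r:

```python
# Sketch of the low-rank replacement described above: one d x d linear
# layer (d^2 parameters) becomes two narrow layers d x r and r x d
# (2*d*r parameters), a large saving when r << d.
import torch
import torch.nn as nn

d, r = 4096, 64                      # hidden width and low-rank width

dense = nn.Linear(d, d, bias=False)  # baseline: d*d = 16.8M parameters
low_rank = nn.Sequential(            # replacement: 2*d*r = 0.52M parameters
    nn.Linear(d, r, bias=False),
    nn.Linear(r, d, bias=False),
)

x = torch.randn(2, d)
print(dense(x).shape, low_rank(x).shape)        # both (2, 4096)
n = lambda m: sum(p.numel() for p in m.parameters())
print(f"compression: {n(dense) / n(low_rank):.1f}x")  # ~32x fewer params
```

The saving is d^2 versus 2dr, so it grows with the ratio d/r; the quantization step then shrinks each remaining parameter to 8 or 4 bits.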
ISBN (print): 9798350328127
Byzantine fault detection (BFD) techniques are promising approaches for better scalability and practicality, even though they cannot mask Byzantine faults the way Byzantine fault tolerance (BFT) techniques do. However, the existing BFD protocol for database systems suffers from long latency because it synchronously agrees on the order of transactions across replicas while executing them. In this paper, we explore an alternative BFD approach for database systems that defers the expensive agreement and detects Byzantine faults lazily rather than in real time. We discuss the challenges and give a design overview of our approach, and we present preliminary experimental results showing its benefit.
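Since the protocol itself is only overviewed here, the following is a speculative sketch of the lazy-detection idea: replicas execute transactions immediately, log a digest of each transaction's effects, and a background verifier later cross-checks the logs. All names and the digest scheme are hypothetical.

```python
# Hypothetical sketch of lazy Byzantine fault detection: each replica logs
# a digest of every executed transaction, and a background verifier later
# compares the logs instead of agreeing on order synchronously.
import hashlib

def digest(txn_id, write_set):
    h = hashlib.sha256(f"{txn_id}:{sorted(write_set.items())}".encode())
    return h.hexdigest()

def lazy_verify(logs):
    """Cross-check replica logs; return txn ids where replicas diverge."""
    suspects = []
    for txn_id in logs[0]:
        digests = {log[txn_id] for log in logs}
        if len(digests) > 1:       # replicas disagree on this txn's effects
            suspects.append(txn_id)
    return suspects

honest = {1: digest(1, {"x": 5}), 2: digest(2, {"y": 7})}
byzantine = {1: digest(1, {"x": 5}), 2: digest(2, {"y": 999})}  # tampered
print(lazy_verify([honest, honest, byzantine]))  # -> [2]
```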
ISBN (print): 9798350369458; 9798350369441
Occupancy refers to the presence of people in rooms and buildings. It is an essential input for IoT applications, including controlling lighting, heating, and access, and monitoring space-limitation policies. Occupancy information can also be used to improve users' comfort and to reduce energy waste in buildings. This paper evaluates the performance and resource consumption of recent machine learning techniques for occupancy detection and measurement using data from distributed environmental sensors. The evaluation is founded on a dataset captured by our dedicated sensor network for indoor monitoring, comprising temperature, humidity, and carbon dioxide (CO2) sensors. Using different sensor modalities and spatio-temporal data selections, we compare eight classification algorithms on the accuracy achieved and the required runtimes. Binary classification for occupancy detection (OD) achieves accuracies over 90% for individual modalities and close to 100% for modality combinations. Multi-class classification for occupancy measurement (OM) shows a clear ranking of the sensor modalities, and gradient boosting algorithms are superior when combining sensor modalities and fusing data from multiple sensors.
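As a sketch of the OD task with one of the evaluated algorithm families, the snippet below trains scikit-learn's gradient boosting classifier; the data is a synthetic stand-in for the paper's temperature/humidity/CO2 dataset, with made-up effect sizes.

```python
# Sketch of binary occupancy detection (OD) with gradient boosting; the
# synthetic features mimic the environmental modalities described above.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000
occupied = rng.integers(0, 2, n)                 # ground-truth labels
X = np.column_stack([
    21 + 2 * occupied + rng.normal(0, 1, n),     # temperature rises
    40 + 5 * occupied + rng.normal(0, 3, n),     # humidity rises
    420 + 300 * occupied + rng.normal(0, 80, n), # CO2 rises with people
])

X_tr, X_te, y_tr, y_te = train_test_split(X, occupied, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print(f"accuracy: {accuracy_score(y_te, clf.predict(X_te)):.2f}")
```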
ISBN (print): 9798350369458; 9798350369441
Adopting Digital Twin (DT) technology in vehicular edge computing (VEC) enables efficient capture of real-time application state, thereby addressing complex task scheduling problems. Existing studies have considered only minimizing service latency for task offloading; however, there is room for strategies that enhance user Quality of Experience (QoE) in the timeliness and reliability domains. In this paper, we develop an optimization framework using Mixed-Integer Linear Programming (MILP), namely QuETOD, which minimizes service latency by allocating task execution to highly reliable and reputed vehicles in a DT-enabled VEC environment. QuETOD clusters the vehicles based on the demand-supply theory of economics by considering computing resources, and it uses multi-weighted subjective logic to keep vehicle reputations properly updated. Experimental results show that QuETOD improves QoE and reliability by up to 15% and 25%, respectively, compared with state-of-the-art works.
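The multi-weighted subjective-logic update is not detailed in the abstract; below is a sketch of the basic single-weight subjective-logic opinion, which maps a vehicle's positive and negative evidence to a (belief, disbelief, uncertainty) triple and a scalar reputation. Constants and names are illustrative.

```python
# Illustrative subjective-logic reputation update (the paper's
# multi-weighted variant is more elaborate): evidence counts about a
# vehicle map to a (belief, disbelief, uncertainty) opinion.
def opinion(r, s, W=2.0):
    """Map r positive and s negative observations to (b, d, u), b+d+u=1."""
    total = r + s + W
    return r / total, s / total, W / total

def expected_reputation(b, d, u, base_rate=0.5):
    return b + base_rate * u   # the standard probability expectation

b, d, u = opinion(r=18, s=2)   # vehicle completed 18 of 20 tasks well
print(f"b={b:.2f} d={d:.2f} u={u:.2f}")
print(f"reputation: {expected_reputation(b, d, u):.2f}")  # ~0.86
```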
ISBN (print): 9798350386066; 9798350386059
Hierarchical Federated Learning (HFL) is a practical implementation of federated learning in mobile edge computing, employing edge servers as intermediaries between mobile devices and the cloud server for device coordination and cloud communication. However, the devices are usually mobile users with unpredictable trajectories and statistical heterogeneity, so edge models are optimized along shifting edge data distributions, which leads to instability and slow convergence of the global model. In this work, we propose a Mobility-Aware deviCe sampling algorithm for HFL, namely MACH, which dynamically maintains the device sampling strategy at each edge to accelerate the convergence of the global model. First, we analyze the convergence bound of HFL with mobile devices under arbitrary device sampling probabilities. Based on this bound, we formalize the mobility-aware device sampling problem, aiming to minimize the convergence error under time-averaged cost constraints while taking the limited device-edge wireless channel capacity into account. Next, we introduce the MACH algorithm, which consists of two components: experience updating and edge sampling. Experience updating uses an upper confidence bound method to estimate device statistics online, and edge sampling customizes a sampling strategy at each edge based on those estimates. Finally, extensive experiments on real-world mobile device trajectories validate that MACH can reduce the time required to reach a target accuracy by 25.00%-56.86%.
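A hypothetical sketch of the experience-updating/edge-sampling split: each edge keeps per-device utility estimates, adds a UCB exploration bonus, and greedily picks the top-scoring devices that fit a per-round cost budget. The scoring constant and the greedy budget rule are assumptions, not MACH's exact policy.

```python
# Hypothetical UCB-style device sampling at one edge: estimated utility
# plus an exploration bonus, then a greedy pick under a cost budget.
import math

def ucb_scores(mean_utility, pulls, round_t, c=1.0):
    return [m + c * math.sqrt(math.log(round_t + 1) / (n + 1))
            for m, n in zip(mean_utility, pulls)]

def sample_devices(mean_utility, pulls, cost, budget, round_t):
    scores = ucb_scores(mean_utility, pulls, round_t)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    chosen, spent = [], 0.0
    for i in order:                      # greedy pick under the budget
        if spent + cost[i] <= budget:
            chosen.append(i)
            spent += cost[i]
    return chosen

# Four devices: device 3 is rarely sampled, so exploration favors it.
print(sample_devices(mean_utility=[0.6, 0.5, 0.4, 0.3],
                     pulls=[50, 40, 30, 1],
                     cost=[1.0, 1.0, 1.0, 1.0],
                     budget=2.0, round_t=100))   # -> [3, 0]
```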
ISBN (print): 9798350369458; 9798350369441
Fog computing, an evolution of cloud computing, has become increasingly popular for its ability to lessen the burden of the centralized cloud by distributing tasks generated by IoT devices across fog layers. Effectively managing real-time, delay-sensitive, and diverse IoT applications to enhance the Quality of Experience (QoE) presents significant challenges due to the dispersed nature and limited resources of fog nodes. Previous studies of task offloading in fog computing have typically focused on either energy consumption or service delay. This paper introduces an optimization framework for task offloading within fog computing environments that balances improved user QoE against reduced energy consumption, formulated as a Mixed-Integer Linear Program (MILP). Given the NP-hard nature of this formulation, we devise a Deep Q-Learning (DQL) based task offloading model, termed ELTO-DQL, which targets near-optimal solutions in polynomial time. Experimental results indicate that the ELTO-DQL model improves energy efficiency and QoE by up to 19% and 15%, respectively, outperforming contemporary benchmarks.
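To illustrate the Q-learning update at the heart of a DQL offloader, the toy below uses a lookup table in place of the deep network and a one-state environment with made-up latency/energy rewards; it is a stand-in for the idea, not ELTO-DQL itself.

```python
# Toy Q-learning offloader: one state, three placement actions, and a
# noisy reward standing in for negated energy-plus-delay cost. A DQL
# system would replace the table Q with a neural network over rich state.
import random

ACTIONS = ["local", "fog", "cloud"]                   # where to run the task
reward = {"local": -5.0, "fog": -2.0, "cloud": -3.5}  # fog is cheapest here

Q = {a: 0.0 for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.2

random.seed(0)
for step in range(500):
    a = random.choice(ACTIONS) if random.random() < eps \
        else max(Q, key=Q.get)                 # epsilon-greedy exploration
    r = reward[a] + random.gauss(0, 0.5)       # noisy observed cost
    # Bellman update; next state == current state in this one-state toy.
    Q[a] += alpha * (r + gamma * max(Q.values()) - Q[a])

print(max(Q, key=Q.get))                       # converges to "fog"
```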