ISBN (digital): 9798350362244
ISBN (print): 9798350362251
In the vehicle edge computing network (VECN), how to cope with the shortage of computation and energy resources that roadside units (RSUs) encounter when performing delay-sensitive computation tasks is an important issue, especially during peak hours and under dynamic network conditions. To complete the computation tasks on time with minimum expenditure, in this paper we investigate the problem of information-energy collaboration among RSUs, in which spectrum management is also involved. For the considered scenario, the RSUs' strategies of spectrum selection, computation task offloading and energy sharing are derived from the formulated optimization problem. Since the proposed problem is a highly complex mixed-integer nonlinear programming problem and the strategies are coupled with each other, a multi-agent deep deterministic policy gradient (MADDPG) based algorithm is proposed to find sub-optimal solutions quickly in a dynamic environment. The simulation results show that our approach is superior to existing schemes in terms of total system expenditure and spectral efficiency.
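The coupling between the RSUs' spectrum-selection and offloading strategies can be illustrated with a much-simplified sketch. This is not the paper's MADDPG formulation: instead of deep actor-critic networks, each RSU here is a tabular stateless learner over a tiny joint action space (spectrum band, offload target), and the collision/expenditure reward below is an illustrative assumption.

```python
import random

# Simplified stand-in for multi-agent learning among RSUs (NOT MADDPG):
# each agent picks (spectrum band, offload target) and all agents share a
# reward that penalizes spectrum collisions and offloading expenditure.
N_AGENTS, N_BANDS, N_TARGETS = 3, 2, 2
ACTIONS = [(b, t) for b in range(N_BANDS) for t in range(N_TARGETS)]

def reward(joint):
    # Illustrative reward: 2 units per spectrum collision, plus a
    # hypothetical per-target energy expenditure equal to the target index.
    bands = [a[0] for a in joint]
    collisions = len(bands) - len(set(bands))
    cost = sum(t for _, t in joint)
    return -(2 * collisions + cost)

q = [{a: 0.0 for a in ACTIONS} for _ in range(N_AGENTS)]
alpha, eps = 0.2, 0.1
random.seed(0)
for step in range(2000):
    # Epsilon-greedy joint action; bandit-style update on the shared reward.
    joint = [random.choice(ACTIONS) if random.random() < eps
             else max(q[i], key=q[i].get) for i in range(N_AGENTS)]
    r = reward(joint)
    for i, a in enumerate(joint):
        q[i][a] += alpha * (r - q[i][a])

greedy = [max(q[i], key=q[i].get) for i in range(N_AGENTS)]
```

With three agents and two bands, at least one collision is unavoidable, so the learners mainly trade off which agent shares a band while keeping the cheap offload target; the point is only that coupled strategies can still be learned from a shared expenditure signal.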
ISBN (digital): 9781728173061
ISBN (print): 9781728173207
Applying intelligence to a group of simple robots, known as swarm robots, has become an exciting technology for assisting or replacing humans in complex, dangerous and harsh missions. However, building a strategy for a swarm to thrive in a dynamic environment is challenging because of control decentralisation and interactions between agents. The decision-making process in a robotic task commonly takes place in sequential stages. By understanding the subsequent action-reaction process, a strategy for making optimal decisions in the respective environment can be learnt. Hence, using the concept of epigenetic inheritance, novel evolutionary-learning mechanisms for a swarm are discussed in this paper: Reinforcement Evolutionary Learning using Epigenetic Inheritance (RELEpi) is proposed. This method uses reward, temporal difference and epigenetic inheritance to approximate optimal action and behaviour policies. The proposed method opens the possibility of combining reward-based learning and evolutionary methods as a stacked process in which a histone value is used rather than a fitness function. The formulation consists of methylation and epigenetic mechanisms, inspired by epigenome studies. The methylation process accumulates the reward into the histone value of a gene. Epigenetic mechanisms give the ability to mate genetic information along with its histone value.
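The two mechanisms named in the abstract can be sketched directly: methylation accumulates reward into a per-gene histone value, and mating passes genes on together with their histone values. The bit-matching toy task, population size, and the 0.1 methylation rate below are illustrative assumptions, not the paper's formulation.

```python
import random

def methylate(histone, reward, rate=0.1):
    # Accumulate reward into the gene's histone value (moving-average style).
    return histone + rate * (reward - histone)

def mate(parent_a, parent_b):
    # Epigenetic crossover: per gene, the child inherits the gene AND its
    # histone value, biased toward the parent with the higher histone.
    return [(ga, ha) if ha >= hb else (gb, hb)
            for (ga, ha), (gb, hb) in zip(parent_a, parent_b)]

random.seed(1)
target = [1, 0, 1, 1]                 # toy task: match this bit string
pop = [[(random.randint(0, 1), 0.0) for _ in target] for _ in range(6)]

for gen in range(30):
    for genome in pop:
        for i, (g, h) in enumerate(genome):
            r = 1.0 if g == target[i] else -1.0   # per-gene reward signal
            genome[i] = (g, methylate(h, r))      # methylation step
    # Next generation mates random parents, carrying histone values along.
    pop = [mate(random.choice(pop), random.choice(pop)) for _ in pop]
```

Genes that earn reward build up positive histone values and therefore tend to survive crossover, which is the "stacked" reward-then-evolution process the abstract describes, with histone values replacing a fitness function.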
Controlling 6 Degrees-of-Freedom (DoF) robotic manipulators in an online, model-free manner poses significant challenges due to their complex coupling, non-linearities, and the need to account for unmodeled dynamics. This paper introduces a model-free adaptive approach for real-time control of a 6 DoF “EPSON” robotic manipulator, without requiring any prior knowledge of the manipulator’s dynamics. Initially, we lay out the framework for an optimal control solution. A performance index is introduced, leveraging error dynamics and correction control signals, offering the capability to incorporate high-order error dynamics without the need to explicitly derive error trajectories. The order of the error dynamics is determined by the chosen number of error samples. We assume a kernel-based solution structure aligned with the performance index, resulting in a temporal difference equation. This equation can be optimized to formulate a model-free control strategy. Subsequently, a reinforcement learning approach is adopted to approximate the underlying strategy. Infeasible exact solutions are overcome by employing a value iteration mechanism to adapt the actor-critic structures within an adaptive critics framework. To validate the proposed approach, it is compared against a conventional proportional-integral controller. A Unified Robot Description Format file is generated to facilitate importing the robotic manipulator into the MATLAB Simulink environment, enabling its control. Ultimately, the proposed method yields superior results in terms of the dynamic characteristics of the response, demonstrating its effectiveness over the conventional approach.
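The value iteration mechanism at the heart of adaptive critics can be illustrated in its simplest model-based form: for a scalar linear plant with quadratic cost, iterating the Bellman backup on a quadratic value function V(x) = P·x² reduces to the recursion below, and the greedy actor is a linear feedback gain. The paper's scheme approximates this same fixed point model-free via actor-critic structures; the plant numbers here are illustrative assumptions, not the EPSON manipulator.

```python
# Scalar plant x' = a*x + b*u with stage cost q*x^2 + r*u^2 (toy values).
a, b, q, r = 0.9, 0.5, 1.0, 1.0

P = 0.0                                   # value-iteration initialization
for _ in range(200):
    # V_{k+1}(x) = min_u [ q x^2 + r u^2 + V_k(a x + b u) ], with
    # V_k(x) = P x^2, collapses to this scalar Riccati recursion.
    P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)

K = a * b * P / (r + b * b * P)           # greedy (actor) feedback u = -K x
closed_loop_pole = a - b * K              # |pole| < 1 means stabilized
```

Starting from P = 0, the recursion converges monotonically to the stabilizing solution, which is the convergence behavior the value-iteration mechanism exploits when exact solutions are infeasible.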
ISBN (digital): 9798350362015
ISBN (print): 9798350362022
The Artificial Intelligence-Generated Content (AIGC) technique has gained significant popularity in creating diverse content. However, AIGC services are currently deployed in a centralized framework, leading to high response times. To address this issue, we propose a diffusion-based task scheduling method that integrates the diffusion model, Deep Reinforcement Learning (DRL), and the Mobile Edge Computing (MEC) technique to improve AIGC efficiency. This raises the challenge of efficient server selection without prior information in dynamic MEC systems. We formulate our problem as an online integer linear programming problem aiming to minimize task offloading delay. Furthermore, we propose a novel AIGC Task Scheduling (DDRL-ATS) algorithm based on Diffusion DRL (DDRL) that effectively addresses this problem. The DDRL-ATS algorithm achieves efficient AIGC task scheduling tailored to heterogeneous MEC environments. Additionally, an online Adaptive Multi-server Selection and Allocation (DDRL-AMSA) algorithm based on DDRL is proposed to further enhance AIGC efficiency. Moreover, our DDRL-AMSA algorithm achieves near-optimal solutions within approximately linear time complexity bounds. Finally, experimental results validate the effectiveness of our method, showing a reduction of at least 13.54% in task offloading delay compared to state-of-the-art methods.
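The online server-selection problem being minimized can be made concrete with a simple greedy baseline (not the paper's DDRL policy): each arriving AIGC task is assigned to the heterogeneous edge server that minimizes its own completion time, given current queue backlogs. Task workloads, server speeds, and simultaneous arrivals are illustrative assumptions.

```python
def schedule(tasks, speeds):
    # tasks: workloads arriving online, one at a time; speeds: per-server
    # compute speed of heterogeneous MEC servers.
    busy = [0.0] * len(speeds)            # time at which each server frees up
    finish = []
    for work in tasks:
        # Greedy online rule: minimize this task's completion time
        # (current backlog + processing time on that server).
        s = min(range(len(speeds)), key=lambda i: busy[i] + work / speeds[i])
        busy[s] += work / speeds[s]
        finish.append(busy[s])
    return finish

# Server 0 is twice as fast as server 1 in this toy instance.
delays = schedule([4.0, 2.0, 6.0, 2.0], speeds=[2.0, 1.0])
```

The greedy rule routes the second and fourth tasks to the slower, idle server rather than queueing on the fast one, which is exactly the backlog-versus-speed trade-off a learned scheduler must capture.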
ISBN (digital): 9781728192901
ISBN (print): 9781728192918
With the emergence of various types of applications such as delay-sensitive applications, future communication networks are expected to be increasingly complex and dynamic. Network Function Virtualization (NFV) provides the necessary support for efficient management of such complex networks by removing the dependency on hardware devices: network functions are virtualized and placed on shared data centres. However, one of the main challenges of the NFV paradigm is the resource allocation problem, known as NFV-Resource Allocation (NFV-RA). NFV-RA is a method of deploying software-based network functions on substrate nodes, subject to the constraints imposed by the underlying infrastructure and the agreed Service Level Agreement (SLA). This work investigates the potential of reinforcement learning (RL) as a fast yet accurate means (compared to integer linear programming) of deploying softwarized network functions onto substrate networks under several Quality of Service (QoS) constraints. In addition to the regular resource constraints and latency constraints, we introduce the concept of a complete outage of certain nodes in the network. This outage can be due either to a disaster or to the unavailability of network topology information because of proprietary and ownership issues. We have analyzed the network performance on different network topologies, different capacities of the nodes and the links, and different degrees of nodal outage. The computational time escalated with the increase in network density when seeking optimal solutions, because Q-learning is an iterative process that results in slow exploration. Our results also show that for certain topologies and a certain combination of resources, we can achieve a 70-90% service acceptance rate even with a 40% nodal outage.
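A minimal tabular Q-learning sketch of the NFV-RA setting, assuming a toy substrate: four nodes with unit capacities, one node in complete outage, and a chain of three VNFs to place one at a time. The topology, rewards, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
import random

random.seed(0)
NODES = [0, 1, 2, 3]
CAPACITY = {0: 2, 1: 1, 2: 2, 3: 2}
OUTAGE = {3}                              # one node completely out
CHAIN_LEN = 3                             # VNFs to place, one unit each
Q = {}                                    # (stage, usage) -> {node: q-value}

def key(stage, used):
    return (stage, tuple(sorted(used.items())))

def step(used, node):
    # Placing on an outage or overloaded node is rejected with a penalty;
    # otherwise pay a unit deployment cost.
    if node in OUTAGE or used[node] >= CAPACITY[node]:
        return None, -10.0
    nxt = dict(used); nxt[node] += 1
    return nxt, -1.0

alpha, gamma, eps = 0.3, 0.9, 0.2
for episode in range(3000):
    used = {n: 0 for n in NODES}
    for stage in range(CHAIN_LEN):
        k = key(stage, used)
        Q.setdefault(k, {n: 0.0 for n in NODES})
        node = (random.choice(NODES) if random.random() < eps
                else max(Q[k], key=Q[k].get))
        nxt, r = step(used, node)
        if nxt is None:                   # rejected placement ends episode
            Q[k][node] += alpha * (r - Q[k][node])
            break
        k2 = key(stage + 1, nxt)
        Q.setdefault(k2, {n: 0.0 for n in NODES})
        future = gamma * max(Q[k2].values()) if stage + 1 < CHAIN_LEN else 0.0
        Q[k][node] += alpha * (r + future - Q[k][node])
        used = nxt

# Greedy rollout of the learned placement policy.
used = {n: 0 for n in NODES}
placement = []
for stage in range(CHAIN_LEN):
    node = max(Q.get(key(stage, used), {n: 0.0 for n in NODES}),
               key=lambda n: Q.get(key(stage, used), {n: 0.0})[n])
    placement.append(node)
    used[node] += 1
```

After training, the greedy policy routes all VNFs away from the outage node, mirroring the paper's observation that acceptance can stay high despite nodal outage, at the cost of the slow iterative exploration noted in the abstract.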
This paper concerns a novel generalized policy iteration (GPI) algorithm with approximation errors. Approximation errors are explicitly considered in the GPI algorithm, and the properties of the stable GPI algorithm with approximation errors are analyzed. The convergence of the developed algorithm is established, showing that the iterative value function converges to a finite neighborhood of the optimal performance index function. Finally, numerical examples and comparisons are presented.
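The "finite neighborhood" claim can be demonstrated numerically on a toy problem: if each Bellman backup is perturbed by an error bounded by eps, the iterates stay within eps/(1 − gamma) of the optimal value function. The two-state MDP, eps, and the alternating-sign error below are illustrative assumptions, not the paper's analysis.

```python
gamma, eps = 0.9, 0.05
# Two states x two actions: R[s][a] is the reward, T[s][a] the next state.
R = [[1.0, 0.0], [0.0, 2.0]]
T = [[0, 1], [0, 1]]

def bellman(V):
    # Exact Bellman optimality backup.
    return [max(R[s][a] + gamma * V[T[s][a]] for a in (0, 1))
            for s in (0, 1)]

# Error-free iteration converges to the optimal value function V*.
V = [0.0, 0.0]
for _ in range(500):
    V = bellman(V)
V_star = V

# Approximate iteration: every backup is perturbed, |error| <= eps.
V = [0.0, 0.0]
for i in range(500):
    V = [v + eps * (-1) ** i for v in bellman(V)]

gap = max(abs(V[s] - V_star[s]) for s in (0, 1))
bound = eps / (1 - gamma)                 # radius of the neighborhood
```

On this instance V* = (18, 20), and the perturbed iterates settle well inside the eps/(1 − gamma) = 0.5 neighborhood, illustrating the kind of convergence guarantee the abstract states.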
ISBN (digital): 9798400705854
ISBN (print): 9798350363838
Self-healing systems depend on following a set of predefined instructions to recover from a known failure state. Failure states are generally detected based on domain-specific specialized metrics, and failure fixes are applied at predefined application hooks that are not sufficiently expressive to manage different failure types. Self-healing is usually applied in the context of distributed systems, where the detection of failures is constrained to communication problems, and resolution strategies often consist of replacing complete components. However, current complex systems may reach failure states at a fine granularity not anticipated by developers (for example, value range changes for data streaming in IoT systems), making them unsuitable for existing self-healing techniques. To counter these problems, in this paper we propose a new self-healing framework that learns recovery strategies for healing fine-grained system behavior at run time. Our proposal targets complex reactive systems, defining monitors as predicates specifying satisfiability conditions of system properties. Such monitors are functionally expressive and can be defined at run time to detect failure states at any execution point. Once failure states are detected, we use a reinforcement learning-based technique to learn a recovery strategy based on users' corrective sequences. Finally, to execute the learned strategies, we extract them as Context-oriented programming variations that activate dynamically whenever the failure state is detected, overwriting the base system behavior with the recovery strategy for that state. We validate the feasibility and effectiveness of our framework through a prototypical reactive application for tracking mouse movements, and the DeltaIoT exemplar for self-healing systems. Our results demonstrate that with just the definition of monitors, the system is effective in detecting and recovering from failures in 55%-92% of the cases in the first application, and at pa...
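The monitor-then-learn loop described above can be sketched minimally: a monitor is a predicate over system state, and on violation a tabular learner selects corrective actions until the predicate holds again. The toy state (a sensor value that drifts out of range), the action set, and the reward shape are illustrative assumptions, not the framework's implementation.

```python
import random

random.seed(0)
LOW, HIGH = 0, 10
monitor = lambda v: LOW <= v <= HIGH      # satisfiability predicate

ACTIONS = {"dec": -3, "inc": +3, "noop": 0}   # hypothetical corrective ops
Q = {}

def recover(value, train=True, eps=0.3):
    # Apply corrective actions until the monitor is satisfied (bounded).
    steps = []
    for _ in range(10):
        if monitor(value):
            return value, steps
        s = "high" if value > HIGH else "low"   # coarse failure state
        Q.setdefault(s, {a: 0.0 for a in ACTIONS})
        a = (random.choice(list(ACTIONS)) if train and random.random() < eps
             else max(Q[s], key=Q[s].get))
        nxt = value + ACTIONS[a]
        r = 1.0 if monitor(nxt) else -0.1       # reward restoring the property
        if train:
            Q[s][a] += 0.5 * (r - Q[s][a])
        value, steps = nxt, steps + [a]
    return value, steps

for _ in range(500):                      # learn from injected failures
    recover(random.choice([-6, 15]))

v, strategy = recover(16, train=False)    # extracted recovery behavior
```

After training, the learned variation for the "high" failure state is a sequence of decrements, which stands in for the Context-oriented programming variation that would overwrite the base behavior when the monitor fires.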