Because national defense resources are limited, allocating them dynamically across overall project construction helps improve construction efficiency. To this end, this paper constructs a multi-objective, multi-stage fuzzy-optimization model of dynamic resource allocation based on the Markov decision process (MDP). On this basis, the paper adopts the q-learning algorithm to optimize the allocation strategy. Finally, through an example analysis, the paper determines the resource allocation strategy that maximizes the total benefit of construction projects. The results reveal that the MDP-based multi-objective, multi-stage fuzzy-optimization model can optimize the project construction objectives and maximize construction output.
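The multi-stage allocation loop this abstract describes can be sketched with a tabular q-learning update. The stage count, resource levels, and square-root benefit curve below are illustrative assumptions, not the paper's actual fuzzy model:

```python
import random

N_STAGES = 4     # construction stages (illustrative)
N_LEVELS = 3     # discretised amount of resource still available: 0..2
N_ACTIONS = 3    # units of resource to allocate this stage: 0, 1 or 2
alpha, gamma, eps = 0.1, 0.95, 0.1

# one Q-table per stage, since the horizon is finite and multi-stage
Q = [[[0.0] * N_ACTIONS for _ in range(N_LEVELS)] for _ in range(N_STAGES)]

def stage_benefit(level, a):
    a = min(a, level)             # cannot allocate more than remains
    return level - a, a ** 0.5    # diminishing returns (toy benefit curve)

random.seed(0)
for episode in range(3000):
    level = N_LEVELS - 1
    for t in range(N_STAGES):
        if random.random() < eps:
            a = random.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda i: Q[t][level][i])
        level_next, r = stage_benefit(level, a)
        future = max(Q[t + 1][level_next]) if t + 1 < N_STAGES else 0.0
        # standard q-learning update
        Q[t][level][a] += alpha * (r + gamma * future - Q[t][level][a])
        level = level_next

# greedy allocation for the first stage at full resources
best_first_action = max(range(N_ACTIONS), key=lambda i: Q[0][N_LEVELS - 1][i])
```

Under this toy benefit curve, spreading the two resource units over two stages beats spending both at once, so the learned first-stage action allocates a single unit.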
In this paper, we study an intelligent secure communication scheme for cognitive networks with multiple primary transmit power levels, where a secondary Alice transmits secret data to a secondary Bob while threatened by a secondary attacker. The secondary nodes limit their transmit power to one of several levels in order to maintain the quality of service of the primary network. The attacker can work in an eavesdropping, spoofing, jamming, or silent mode, which can be viewed as the action in the traditional q-learning algorithm. In turn, the system can adaptively choose among the transmit power levels to suppress the intelligent attacker, which can be viewed as the state of the q-learning algorithm. Accordingly, we first formulate this secure communication problem as a static secure communication game between the main link and the attacker with a Nash equilibrium (NE), and then employ the q-learning algorithm to select the transmit power level. Finally, simulation results verify that the proposed scheme effectively suppresses the intelligent attacker.
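As a sketch of the power-level selection this abstract describes, the fragment below runs tabular q-learning with the attacker's mode as the state and the power level as the action. The power values, the utility function, and the attacker's uniformly random mode switching are all illustrative assumptions, not the paper's game model:

```python
import random

MODES = ["eavesdrop", "spoof", "jam", "silent"]   # attacker modes = q-learning states
POWER_LEVELS = [0.1, 0.5, 1.0]                    # allowed transmit powers (illustrative)
alpha, gamma, eps = 0.2, 0.8, 0.1
Q = {m: [0.0] * len(POWER_LEVELS) for m in MODES}

def secrecy_utility(mode, p):
    # toy utility: jamming punishes low power, eavesdropping punishes high power
    if mode == "jam":
        return p - 0.2
    if mode == "eavesdrop":
        return 0.6 - p
    return 0.4

random.seed(1)
state = "silent"
for t in range(2000):
    if random.random() < eps:
        a = random.randrange(len(POWER_LEVELS))
    else:
        a = max(range(len(POWER_LEVELS)), key=lambda i: Q[state][i])
    r = secrecy_utility(state, POWER_LEVELS[a])
    state_next = random.choice(MODES)   # attacker's policy is unknown; modelled as random
    Q[state][a] += alpha * (r + gamma * max(Q[state_next]) - Q[state][a])
    state = state_next
```

After training, the table prefers high power when the attacker jams and low power when it eavesdrops, mirroring the adaptive suppression the abstract claims.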
Most recent research on agent-based production scheduling has focused on developing negotiation schemata for agent cooperation. However, successful implementation of agent-based approaches relies not only on cooperation among the agents but also on each individual agent's ability to make good decisions. Learning is one mechanism by which an agent can increase its intelligence while in operation. This paper presents a study examining the implementation of the q-learning algorithm, one of the most widely used reinforcement learning approaches, by job agents making routing decisions in a job shop environment. A factorial experiment design is carried out to study the settings used when applying q-learning to the job routing problem. This study not only investigates the effects of this q-learning application but also provides recommendations for factor settings and useful guidelines for future applications of q-learning to agent-based production scheduling.
Fog computing is a developing paradigm for bringing cloud computing capabilities closer to end-users. It plays an important role in improving resource utilization and decreasing delay for internet of things (IoT) applications. At the same time, it faces many challenges, including energy consumption, scheduling, and resource overload. Load balancing helps to reduce delay, increase user satisfaction, and increase system efficiency by allocating tasks among computing resources efficiently and fairly. Fair load distribution among fog nodes is a difficult challenge due to the increasing number of IoT devices. In this research, we propose a new approach for fair load distribution in a fog environment: a q-learning-based load balancing method executed in the fog layer. The objective of this method is to improve load balancing and delay simultaneously. In this technique, a fog node uses reinforcement learning to choose whether to handle a task it receives from IoT devices directly, or to send it to a nearby fog node or the cloud. The simulation findings demonstrate that our approach yields a suitable technique for fair load distribution among fog nodes, improving the delay, run time, network utilization, and standard deviation of load on nodes compared with the other techniques considered. In the case where the number of fog nodes is considered to be 4, the delay in the proposed method is reduced by around 8.44% in comparison to the load balancing and optimization strategy (LBOS) method, 26.65% in comparison to the secure authentication and load balancing (SALB) method, 29.15% in comparison to the proportional method, 7.75% in comparison to the fog cluster-based load-balancing (FCBLB) method, and 36.22% in comparison to the random method. In the case where the number of fog nodes is considered to be 10, the delay in the proposed method is reduced by around 13.80% in comparison t
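The offloading decision this abstract describes (process locally, forward to a neighbor, or send to the cloud) can be sketched as tabular q-learning over a discretised load level. The delay model, load dynamics, and parameters below are illustrative assumptions, not the paper's simulation setup:

```python
import random

ACTIONS = ["local", "neighbor", "cloud"]   # where the fog node executes an incoming task
N_LOAD_LEVELS = 4                          # discretised load of the fog node (illustrative)
alpha, gamma, eps = 0.3, 0.9, 0.15
Q = [[0.0] * len(ACTIONS) for _ in range(N_LOAD_LEVELS)]

def environment(load, action):
    # toy delay model: local is cheap until the node is loaded, cloud pays WAN latency
    if action == "local":
        delay = 1.0 + 2.0 * load
        load = min(N_LOAD_LEVELS - 1, load + 1)
    elif action == "neighbor":
        delay = 2.0
        load = max(0, load - 1)
    else:  # "cloud"
        delay = 4.0
        load = max(0, load - 1)
    if random.random() < 0.5:              # background IoT arrivals raise the load
        load = min(N_LOAD_LEVELS - 1, load + 1)
    return load, -delay                    # reward is negative delay

random.seed(0)
load = 0
for t in range(5000):
    if random.random() < eps:
        a = random.randrange(len(ACTIONS))
    else:
        a = max(range(len(ACTIONS)), key=lambda i: Q[load][i])
    load_next, r = environment(load, ACTIONS[a])
    Q[load][a] += alpha * (r + gamma * max(Q[load_next]) - Q[load][a])
    load = load_next
```

With this toy model the node learns to offload to a neighbor once it is saturated, since local queueing delay then dominates and the cloud's WAN latency is higher than one fog hop.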
Along with emerging mobile Internet applications and the tremendous growth of computing demand, mobile edge computing (MEC) can effectively address the compute-intensive and latency-sensitive computation imposed on mobile terminals by performing computation offloading. However, how to find optimal decisions on transmission power, computing capacity demand, and offloading demand at the end-user, and how to determine resource pricing and allocation at an MEC server with limited computing capacity, remain challenging issues in operating the MEC system optimally. For multiple users in a single-cell network with MEC, this article investigates a dynamic pricing-based computation offloading solution. Using a q-learning algorithm that comprehensively considers sensitive factors such as time cost, energy consumption, and dynamic pricing, the offloading decision at the end-user is obtained under time-varying wireless channel conditions. According to the relationship between resource supply and demand, a dynamic pricing algorithm for the MEC server is designed to adjust the pricing strategy and achieve a win-win situation. Simulation results demonstrate the efficiency of the approach in making offloading decisions while the wireless channel is fast fading and the resource pricing is adjusted dynamically, and in enhancing the utilities of both end-users and the MEC server.
This brief investigates a barrier Lyapunov function based discrete-time control with q-learning-based gains for double-integrator systems with a state constraint. The stability proof reveals the high conservatism of the analysis for stabilizing a discrete-time system with a state constraint, under which the explicit selection of a constant high gain is challenging. To address this problem, the fuzzy q-learning algorithm is employed to search for nearly optimal control gains that achieve both fast response and low steady-state error from a long-term performance perspective. Numerical and experimental results verify the effectiveness of the proposed method, and the varying gains based on fuzzy-approximation q-learning help reduce the steady-state error while responding quickly to the reference motion trajectories.
This study proposes a novel multi-agent method for electric vehicle (EV) owners who take part in the electricity market. Each EV is considered an agent, and all EVs have vehicle-to-grid capability. These agents aim to minimise the charging cost and to increase the privacy of EV owners by omitting the aggregator role from the system. Each agent has two independent decision cores, one for buying and one for selling energy. These cores are developed based on a reinforcement learning (RL) algorithm, namely the q-learning algorithm, owing to its high efficiency and good performance in multi-agent settings. With the proposed method, agents can buy and sell energy with the goal of cost minimisation while always keeping enough energy for the trip, considering the uncertain behaviour of EV owners. Numerical simulations on an illustrative example with one agent and on a test system with 500 agents demonstrate the effectiveness of the proposed method.
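A minimal sketch of the two independent decision cores this abstract describes, assuming a toy market with three price levels, a discretised battery state of charge, and a fixed trip reserve (all illustrative, not the paper's model):

```python
import random

PRICES = [0.1, 0.3, 0.5]   # discretised market price levels (illustrative)
SOC_LEVELS = 5             # battery state of charge: 0..4
TRIP_RESERVE = 2           # energy that must stay available for trips (illustrative)
VALUE = 0.3                # assumed value of one stored energy unit
alpha, gamma, eps = 0.2, 0.9, 0.1

# two independent decision cores, one per trading direction,
# each with actions {0: stay idle, 1: trade one unit}
Q_buy = [[[0.0, 0.0] for _ in range(SOC_LEVELS)] for _ in PRICES]
Q_sell = [[[0.0, 0.0] for _ in range(SOC_LEVELS)] for _ in PRICES]

def act(Q, p, soc):
    if random.random() < eps:
        return random.randrange(2)
    return max((0, 1), key=lambda a: Q[p][soc][a])

random.seed(7)
soc = 2
for t in range(5000):
    p = random.randrange(len(PRICES))
    a_buy = act(Q_buy, p, soc) if soc < SOC_LEVELS - 1 else 0
    # never sell below the trip reserve, and not in the same step as a buy
    a_sell = act(Q_sell, p, soc) if (soc > TRIP_RESERVE and not a_buy) else 0
    soc_next = soc + a_buy - a_sell
    r_buy = (VALUE - PRICES[p]) * a_buy      # buying pays off below VALUE
    r_sell = (PRICES[p] - VALUE) * a_sell    # selling pays off above VALUE
    p_next = random.randrange(len(PRICES))   # next price sampled independently
    Q_buy[p][soc][a_buy] += alpha * (r_buy + gamma * max(Q_buy[p_next][soc_next]) - Q_buy[p][soc][a_buy])
    Q_sell[p][soc][a_sell] += alpha * (r_sell + gamma * max(Q_sell[p_next][soc_next]) - Q_sell[p][soc][a_sell])
    soc = soc_next
```

Each core bootstraps only from its own table, a deliberate simplification of the paper's independent-core design; the buying core learns to buy at the low price and stay idle at the high one.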
Supply chain efficiency is critical to enterprises and can affect their competitiveness. The supply chain operates in an uncertain and complex external market environment, where traditional methods for optimizing supply chain efficiency are often ineffective. Reinforcement learning, by contrast, can better cope with this environment and its problems; it has advantages in optimizing supply chain efficiency and has been widely applied. This paper first expounds the current state of supply chain management, the limitations of traditional supply chain management methods, and the application of reinforcement learning to supply chain optimization. Then, through experiments, it analyses reinforcement learning, supply chain optimization problems, and the design of related algorithms, with the optimal algorithm focusing on inventory management optimization. Finally, the paper points out future research directions and development trends for reinforcement-learning-based supply chain efficiency optimization algorithms.
There are many studies on the flexible job shop scheduling problem with fuzzy processing times and on deteriorating scheduling, but most scholars neglect the connection between them, even though the purpose of both models is to simulate a more realistic factory. From this perspective, the solutions can be more precise and practical if both issues are considered together. Therefore, the deterioration effect is treated as a part of the fuzzy job shop scheduling problem in this paper, which means the linear increase of a certain processing time is transformed into an internal linear shift of a triangular fuzzy processing time. Apart from that, the other contributions can be stated as follows. A new algorithm called the reinforcement learning based biased bi-population evolutionary algorithm (RB2EA) is proposed, which utilizes the q-learning algorithm to adjust the size of the two populations and the interaction frequency according to the quality of the populations. A local enhancement method which combines multiple local search strategies is presented. An interaction mechanism is designed to promote the convergence of the two populations. Extensive experiments are designed to evaluate the efficacy of RB2EA, and the conclusion can be drawn that RB2EA is able to solve the energy-efficient fuzzy flexible job shop scheduling problem with deteriorating jobs (EFFJSPD) efficiently.
Reinforcement learning (RL) has received some attention in recent years from agent-based researchers because it deals with the problem of how an autonomous agent can learn to select proper actions for achieving its goals through interacting with its environment. Each time an agent performs an action, the environment's response, as indicated by its new state, is used by the agent to reward or penalize that action. The agent's goal is to maximize the total reward it receives over the long run. Although there have been several successful examples demonstrating the usefulness of RL, its application to manufacturing systems has not been fully explored. In this study, a single machine agent employs the q-learning algorithm to develop a decision-making policy for selecting the appropriate dispatching rule from among three given dispatching rules. The system objective is to minimize mean tardiness. This paper presents a factorial experiment design for studying the settings used to apply q-learning to the single machine dispatching rule selection problem. The factors considered include two related to the design of the agent's policy table and three related to developing its reward function. This study not only investigates the main effects of this q-learning application but also provides recommendations for factor settings and useful guidelines for future applications of q-learning to agent-based production scheduling. (C) 2004 Elsevier Ltd. All rights reserved.
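The dispatching-rule selection this abstract describes can be sketched as tabular q-learning where the action is the rule applied in the current shop state. The rule names, the discretised states, and the toy tardiness table below are illustrative assumptions standing in for the simulated job shop:

```python
import random

RULES = ["SPT", "EDD", "FIFO"]   # candidate dispatching rules (illustrative names)
N_STATES = 3                     # discretised shop states, e.g. queue-length levels
alpha, gamma, eps = 0.1, 0.8, 0.1
Q = [[0.0] * len(RULES) for _ in range(N_STATES)]

# toy stand-in for the simulated job shop: mean tardiness per (rule, state);
# SPT is assumed best in heavily loaded states, EDD in lightly loaded ones
TARDINESS = {"SPT": [3.0, 2.0, 1.0], "EDD": [1.0, 2.5, 3.0], "FIFO": [2.5, 2.2, 2.5]}

def shop(state, rule):
    next_state = random.randrange(N_STATES)      # shop load evolves unpredictably
    return next_state, -TARDINESS[rule][state]   # reward = negative mean tardiness

random.seed(3)
s = 0
for t in range(4000):
    if random.random() < eps:
        a = random.randrange(len(RULES))
    else:
        a = max(range(len(RULES)), key=lambda i: Q[s][i])
    s_next, r = shop(s, RULES[a])
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
    s = s_next

policy = [RULES[max(range(len(RULES)), key=lambda i: Q[st][i])] for st in range(N_STATES)]
```

There is one Q-table row per shop state; because the toy next state is independent of the chosen rule, the learned policy simply settles on the rule with the lowest mean tardiness in each state.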