ISBN:
(Print) 9798331540661; 9798331540678
Improving efficiency and patient satisfaction through better appointment scheduling is a challenge for healthcare systems across the globe. This research proposes a new method for improving healthcare appointment scheduling that combines cloud computing with reinforcement learning (RL) algorithms. The approach maximizes healthcare provider efficiency while minimizing patient wait times, resource usage, and operational costs by dynamically learning and adapting scheduling rules with RL, taking advantage of the scalability and computing capacity of cloud infrastructure. To train the RL agents, we create a simulation environment that mimics real-world healthcare conditions. Rigorous testing and evaluation show that RL-based scheduling strategies schedule appointments more efficiently than conventional techniques. We further demonstrate that the approach handles dynamic healthcare contexts with varying patient volumes and resource restrictions, confirming its flexibility and resilience. The results show that cloud-hosted RL approaches can change healthcare appointment scheduling for the better, leading to more responsive and flexible healthcare systems. By exploiting the complementary strengths of cloud computing and RL, our method provides a data-driven, scalable solution to the hard problem of healthcare schedule optimization, improving both patient care and operational efficiency.
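The abstract does not specify the environment or the learning rule, but the ingredients it names (a simulated clinic, an agent that learns scheduling rules that reduce waiting) fit a standard tabular Q-learning loop. The sketch below is a minimal illustration under that assumption; the toy dynamics, the reward, and all names are hypothetical, not the paper's actual system.

```python
import random
from collections import defaultdict

N_SLOTS = 8             # appointment slots per day in the toy clinic
EPISODES = 5000
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = defaultdict(float)  # tabular Q-values keyed by (state, action)

def policy(state):
    """Epsilon-greedy choice among slot-assignment actions."""
    if random.random() < EPS:
        return random.randrange(N_SLOTS)
    return max(range(N_SLOTS), key=lambda a: Q[(state, a)])

def simulate_day():
    """One simulated day: arriving patients are assigned to slots.

    State is the tuple of current slot loads; the negative reward
    penalizes assigning a patient to an already crowded slot, a crude
    stand-in for patient waiting time.
    """
    loads = [0] * N_SLOTS
    for _ in range(N_SLOTS * 2):
        state = tuple(loads)
        slot = policy(state)
        reward = -loads[slot]
        loads[slot] += 1
        yield state, slot, reward, tuple(loads)

for _ in range(EPISODES):
    for s, a, r, s2 in simulate_day():
        best_next = max(Q[(s2, a2)] for a2 in range(N_SLOTS))
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```

In a cloud deployment of the kind the paper describes, many such simulated days could be rolled out in parallel, which is where the scalability argument comes in.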
Against the backdrop of accelerating global integration, the role of English as a global language is becoming increasingly significant worldwide. With advances in technology, the development of chatbots (Chat R...
Safe reinforcement learning (RL) is an area of research focused on developing algorithms and methods that ensure the safety of RL agents during learning and decision-making. The goal is to enable RL agents to interact with their environments and learn optimal decisions while avoiding actions that can lead to harmful or undesirable outcomes. This dissertation provides a comprehensive study of model-free, simulator-free reinforcement learning algorithms for Constrained Markov Decision Processes (CMDPs) with sublinear regret and zero constraint violation, focusing on three settings: (1) episodic CMDPs; (2) infinite-horizon average-reward CMDPs; and (3) non-stationary episodic CMDPs. The first part presents the first model-free, simulator-free safe-RL algorithm with sublinear regret and zero constraint violation. The algorithm is named Triple-Q because it includes three key components: a Q-function (also called an action-value function) for the cumulative reward, a Q-function for the cumulative utility of the constraint, and a virtual queue that (over-)estimates the cumulative constraint violation. Under Triple-Q, at each step an action is chosen based on a pseudo-Q-value that combines the three "Q" values. The algorithm updates the reward and utility Q-values with learning rates that depend on the visit counts of the corresponding (state, action) pairs and are periodically reset. In the episodic CMDP setting, Triple-Q achieves $\tilde{O}\!\left(\frac{1}{\delta} H^4 S^{1/2} A^{1/2} K^{4/5}\right)$ regret when $K$ is large enough, where $K$ is the total number of episodes, $H$ is the number of steps in each episode, $S$ is the number of states, $A$ is the number of actions, and $\delta$ is Slater's constant. Furthermore, Triple-Q guarantees zero constraint violation, both in expectation and with high probability, when $K$ is sufficiently large. Finally, the computational complexity of Triple-Q is similar to that of SARSA for unconstrained MDPs, making it computationally efficient. In Chapter III, the results are extended...
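To help fix the three named components, here is a schematic sketch of how they fit together. The dissertation's actual learning-rate schedule, optimism bonuses, periodic resets, and virtual-queue dynamics are more involved; everything below (including `eta`, the toy sizes, and the update forms) is an illustrative assumption, not the published algorithm.

```python
import numpy as np

S, A, H = 10, 4, 5       # toy numbers of states, actions, and steps
eta = 1.0                # assumed trade-off weight in the pseudo-Q-value

Q_r = np.full((H + 1, S, A), float(H))  # optimistic reward Q-function
Q_c = np.full((H + 1, S, A), float(H))  # optimistic utility Q-function
Q_r[H] = 0.0                            # terminal values are zero
Q_c[H] = 0.0
Z = 0.0                                 # virtual queue, the third "Q"
visits = np.zeros((H, S, A), dtype=int)

def choose_action(h, s):
    # Greedy with respect to the pseudo-Q-value combining the three Qs.
    pseudo_q = Q_r[h, s] + (Z / eta) * Q_c[h, s]
    return int(np.argmax(pseudo_q))

def td_update(h, s, a, r, c, s_next):
    # Learning rate depends on the visit count of the (state, action)
    # pair, as the abstract describes (resets omitted in this sketch).
    visits[h, s, a] += 1
    alpha = (H + 1) / (H + visits[h, s, a])
    Q_r[h, s, a] += alpha * (r + Q_r[h + 1, s_next].max() - Q_r[h, s, a])
    Q_c[h, s, a] += alpha * (c + Q_c[h + 1, s_next].max() - Q_c[h, s, a])

def queue_update(estimated_utility, rho, eps_slack):
    # The virtual queue grows whenever estimated utility falls short of
    # the constraint threshold rho; eps_slack tightens the constraint,
    # which is what drives the violation to zero.
    global Z
    Z = max(Z + rho + eps_slack - estimated_utility, 0.0)
```

The key design idea visible even in this sketch: when the queue $Z$ is large (the constraint looks at risk), the pseudo-Q-value weights the utility Q-function more heavily, steering the agent back toward feasible behavior.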
Deep reinforcement learning is one of the most exciting fields in artificial intelligence, combining reinforcement learning with the power of deep neural networks to understand the world and act on that understanding. In the past few years, deep reinforcement learning has been studied extensively, with remarkable progress and widespread success across different fields. For robot control, deep reinforcement learning algorithms hold the promise of performing human-like tasks or surpassing human performance. This paper reviews the state of research on reinforcement learning algorithms in the field of robot control. The basic theory and mathematical background of reinforcement learning, together with the open problems facing current robotics applications, are also covered in this review. Finally, future research directions for reinforcement learning are discussed.
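As a point of reference for the "basic theory" such a review covers, the central object in (deep) reinforcement learning is the optimal action-value function, characterized by the standard Bellman optimality equation (textbook notation, not specific to this review):

\[
Q^*(s,a) \;=\; \sum_{s'} P(s' \mid s, a)\Bigl[\, r(s,a,s') + \gamma \max_{a'} Q^*(s',a') \Bigr],
\]

where $P$ is the transition kernel, $r$ the reward, and $\gamma \in [0,1)$ the discount factor; deep RL approximates $Q^*$ (or a policy) with a neural network rather than a table.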
ISBN:
(Print) 9781479938407
In this paper we investigate the use of Work Domain Analysis (WDA), a technique from the field of cognitive engineering, to inform the creation of options and constraints for Reinforcement Learning (RL) algorithms. The micro-world of Pac-Man, a classic arcade game, is used as a tractable and representative work domain. WDA was conducted with individuals familiar with Pac-Man, and an Abstraction Hierarchy (AH), a means-ends representation of their understanding of the game, was created for each individual. The abstraction hierarchies of the best-performing and worst-performing individuals were then combined to illustrate the differences between the two groups. Several differences were found: high performers used offensive as well as defensive strategies whereas poor performers relied on defense alone, and high performers exhibited greater context sensitivity, additional goals, and more sophisticated constraints. These differences were translated into an options-and-constraints paradigm suitable for incorporation into RL algorithms.
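To make the closing sentence concrete, one common way to encode such findings is the options framework (temporally extended actions with initiation and termination conditions) plus state-action constraints that prune unsafe choices. The sketch below is an illustrative reading of that paradigm, assuming hypothetical Pac-Man state helpers (`energizer_active`, `direction_toward_nearest_ghost`, `action_moves_toward_ghost`); it is not the authors' actual formalization.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Option:
    """A temporally extended action in the options framework."""
    name: str
    initiation: Callable[[Any], bool]   # states where the option may start
    policy: Callable[[Any], int]        # primitive action to take meanwhile
    termination: Callable[[Any], bool]  # when the option ends

@dataclass
class Constraint:
    """A state-action predicate that prunes choices during learning."""
    name: str
    forbids: Callable[[Any, int], bool]

# Example drawn from the high-performer strategies the study reports:
# pursue ghosts (offense) only while an energizer is active.
hunt_ghosts = Option(
    name="hunt_ghosts",
    initiation=lambda s: s.energizer_active,
    policy=lambda s: s.direction_toward_nearest_ghost(),
    termination=lambda s: not s.energizer_active,
)

# And the defensive behavior shared by both groups, expressed as a
# constraint: never move toward a ghost when no energizer is active.
avoid_ghosts = Constraint(
    name="avoid_ghosts",
    forbids=lambda s, a: (not s.energizer_active)
                         and s.action_moves_toward_ghost(a),
)
```

An RL agent would then select among options whose initiation condition holds, masking out any primitive action a constraint forbids.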