Search Results - Inner Mongolia University Library

2nd International Conference on Vision, Image and Signal Processing (ICVISP)

Authors: Li, Wei; Meleis, Waleed (Northeastern Univ, 360 Huntington Ave, Boston, MA 02115, USA)

ISBN: (print) 9781450365291

A major challenge in reinforcement learning (RL) is the use of a tabular representation for learned policies with a large number of states or state-action pairs. Function approximation is a promising tool to overcome this deficiency. This approach uses parameterized functions instead of a table to represent learned knowledge and enables generalization. However, existing schemes cannot solve realistic RL problems, with their rapidly increasing demands for approximation accuracy and efficiency. In this paper, we extend the architecture of Sparse Distributed Memories (SDMs) and propose a novel on-line methodology, similarity-aware Kanerva coding (SAK), that closely represents the learned knowledge for very large-scale problems with significantly fewer parameterized components. SAK directly measures the state variables' real distances in all dimensions and formulates a new state similarity metric with an improved definition of state closeness. As a result, our scheme accurately distributes and generalizes knowledge among related states. We further enhance SAK's efficiency by allowing only a limited number of prototype states that have certain similarities to be activated for value approximation, so that the risk of over-generalization is reduced. In addition, SAK eliminates size tuning and prototype reallocation for the prototype set, resulting in not only broader scalability but also significant savings in the number of prototypes and the computational overhead needed for RL. Our extensive experimental results show that SAK achieves more than 48% improvement over existing schemes in learning quality, and reveal that SAK is able to consistently learn good policies for RL with small overhead and short training times, even given roughly tuned scheme parameters.

Keywords: Reinforcement learning function approximation Kanerva coding
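
As a rough illustration of the ideas in the abstract above, the following Python sketch approximates a state value from a fixed prototype set, measuring real-valued distances and activating only the few closest prototypes. It is not the authors' SAK algorithm: the Euclidean distance, the Gaussian-style similarity weight, and the names `KanervaApproximator`, `k`, `width`, and `alpha` are all assumptions made for illustration.

```python
import math


class KanervaApproximator:
    """Toy prototype-based value approximator (illustrative sketch only)."""

    def __init__(self, prototypes, k=3, width=1.0, alpha=0.1):
        self.prototypes = [tuple(p) for p in prototypes]
        self.theta = [0.0] * len(self.prototypes)  # one weight per prototype
        self.k = k          # assumed cap on the number of activated prototypes
        self.width = width  # assumed length scale of the similarity measure
        self.alpha = alpha  # assumed learning rate

    def _active(self, state):
        """Return (index, normalized similarity) for the k nearest prototypes."""
        dists = [(math.dist(state, p), i) for i, p in enumerate(self.prototypes)]
        nearest = sorted(dists)[: self.k]
        sims = [(i, math.exp(-((d / self.width) ** 2))) for d, i in nearest]
        total = sum(s for _, s in sims)
        if total == 0.0:
            # All similarities underflowed: fall back to equal weights.
            return [(i, 1.0 / len(sims)) for i, _ in sims]
        return [(i, s / total) for i, s in sims]

    def value(self, state):
        """Similarity-weighted sum of the activated prototypes' weights."""
        return sum(self.theta[i] * w for i, w in self._active(state))

    def update(self, state, target):
        """Move the activated weights toward a target value."""
        active = self._active(state)
        error = target - sum(self.theta[i] * w for i, w in active)
        for i, w in active:
            self.theta[i] += self.alpha * error * w
```

Capping activation at `k` prototypes means each update touches only `k` weights, which is one simple way to limit how far a single experience generalizes.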


Dynamic Generalization Kanerva Coding in Reinforcement Learning for TCP Congestion Control Design

16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)

Authors: Li, Wei; Zhou, Fan; Meleis, Waleed; Chowdhury, Kaushik (Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115, USA)

ISBN: (print) 9781510855076

Traditional reinforcement learning (RL) techniques often encounter limitations when solving problems with large or continuous state-action spaces. Training times needed to explore the very large space are impractically long, and it can be difficult to generalize learned knowledge. A compact representation of the state space is usually generated to solve both problems. However, simple state abstraction often cannot achieve the desired learning quality, while expert state representations usually involve costly hand-crafted strategies. We propose a new technique, generalization-based Kanerva coding, that automatically generates and optimizes state abstractions for learning. When applied to adapting the congestion window of the highly complex TCP congestion control protocol, a standard Internet protocol, this technique outperforms the current standard, TCP New Reno, by 59.5% in throughput and 6.5% in delay. Our technique also achieves a 35.2% improvement in throughput over the best previously proposed Kanerva coding technique when applied in the same context.

Keywords: state abstraction TCP congestion control dynamic generalization Kanerva coding


Dynamic Generalization Kanerva Coding in Reinforcement Learning for TCP Congestion Control Design

Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems

Authors: Wei Li; Fan Zhou; Waleed Meleis; Kaushik Chowdhury (Northeastern University, Boston, MA, USA)

Traditional reinforcement learning (RL) techniques often encounter limitations when solving problems with large or continuous state-action spaces. Training times needed to explore the very large space are impractically long, and it can be difficult to generalize learned knowledge. A compact representation of the state space is usually generated to solve both problems. However, simple state abstraction often cannot achieve the desired learning quality, while expert state representations usually involve costly hand-crafted strategies. We propose a new technique, generalization-based Kanerva coding, that automatically generates and optimizes state abstractions for learning. When applied to adapting the congestion window of the highly complex TCP congestion control protocol, a standard Internet protocol, this technique outperforms the current standard, TCP New Reno, by 59.5% in throughput and 6.5% in delay. Our technique also achieves a 35.2% improvement in throughput over the best previously proposed Kanerva coding technique when applied in the same context.

Keywords: state abstraction TCP congestion control dynamic generalization Kanerva coding


Sparse Distributed Memory Approach for Reinforcement Learning Driven Efficient Routing in Mobile Wireless Network System


INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, Vol. 12, No. 11, pp. 144-152

Authors: Vidyadhar, Varshini; Nagaraj, R.; Sudha, G. (Bangalore Inst Technol, Dept Comp Sci & Engn, Bangalore, Karnataka, India; Bangalore Inst Technol, Dept Informat Sci & Engn, Bangalore, Karnataka, India; Bangalore Inst Technol, Dept Elect & Elect, Bangalore, Karnataka, India)

In recent years, researchers have explored the applicability of Q-learning, a model-free reinforcement learning technique, to designing QoS-aware, resource-efficient, and reliable routing techniques in a dynamically changing network environment. However, Q-learning relies on a tabular representation of learned policies, which frequently runs into the curse of dimensionality when applied to an uncertain and dynamically changing network environment. In addition, the time required for agent learning in the training phase is too long, which makes it difficult for the agent to generalize the observation state efficiently. To this end, this paper attempts to overcome the memory overhead problems encountered in Q-learning-based routing techniques. The study presents a novel memory-efficient intelligent routing mechanism based on adaptive Kanerva coding, which minimizes the storage cost of large action and state values. Unlike existing schemes, the proposed method optimizes memory requirements. It also enables better generalization by storing the learnable parameters of the agent's function approximator in a Kanerva-coding data structure. Kanerva coding is a sparse memory with a distributed reading and writing mechanism, which enables optimal compression and state abstraction for learning with fewer parameterized components, making it highly memory efficient. The proposed technique is designed and implemented with the Anaconda tool. Simulation results demonstrate that the proposed technique can adaptively adjust the routing policy according to the varying network environment to meet the transmission requirements of different services with low memory requirements.

Keywords: Mobile wireless network reinforcement learning Q-learning Kanerva coding routing memory optimization


QTCP: Adaptive Congestion Control with Reinforcement Learning


IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2019, Vol. 6, No. 3, pp. 445-458

Authors: Li, Wei; Zhou, Fan; Chowdhury, Kaushik Roy; Meleis, Waleed (Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115, USA)

Next-generation network access technologies and Internet applications have increased the challenge of providing satisfactory quality of experience for users with traditional congestion control protocols. Efforts to optimize the performance of TCP by modifying the core congestion control method for specific network architectures or apps do not generalize well under a wide range of network scenarios. This limitation arises from the rule-based design principle, where the performance is linked to a pre-decided mapping from the observed state of the network to the corresponding actions. Therefore, these protocols are unable to adapt their behavior in new environments or learn from experience for better performance. We address this problem by integrating a reinforcement-based Q-learning framework with TCP design in our approach called QTCP. QTCP enables senders to gradually learn the optimal congestion control policy in an on-line manner. QTCP does not need hard-coded rules, and can therefore generalize to a variety of different networking scenarios. Moreover, we develop a generalized Kanerva coding function approximation algorithm, which reduces the computational complexity of value functions and the searchable size of the state space. We show that QTCP outperforms the traditional rule-based TCP by providing 59.5 percent higher throughput while maintaining low transmission latency.

Keywords: Reinforcement learning TCP congestion control function approximation dynamic generalization Kanerva coding


Learning-based and Data-driven TCP Design for Memory-constrained IoT

12th IEEE Annual International Conference on Distributed Computing in Sensor Systems (DCOSS)

Authors: Li, Wei; Zhou, Fan; Meleis, Waleed; Chowdhury, Kaushik (Northeastern Univ, Dept Elect & Comp Engn, Boston, MA, USA)

ISBN: (print) 9781509014590

Advances in wireless technology have resulted in pervasive deployment of devices with high variability in form factor, memory, and computational ability. The need for maintaining continuous connections that deliver data with high reliability necessitates rethinking the conventional design of the transport layer protocol. This paper investigates the use of Q-learning in TCP cwnd adaptation during the congestion avoidance state, wherein the classical alternation of the window is replaced, thereby allowing the protocol to immediately respond to previously seen network conditions. Furthermore, it demonstrates how memory plays a critical role in building the exploration space, and proposes ways to reduce this overhead through function approximation. The superior performance of the learning-based approach over TCP New Reno is demonstrated through a comprehensive simulation study, revealing 33.8% and 12.1% improvement in throughput and delay, respectively, for the evaluated topologies. We also show how function approximation can be used to dramatically reduce the memory requirements of a learning-based protocol while maintaining the same throughput and delay.

Keywords: TCP IoT Q-learning function approximation Kanerva coding
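
The tabular Q-learning update that the abstract above builds on can be sketched in Python as follows. This is a generic illustration rather than the protocol from the paper: the state labels, the three cwnd actions in `ACTIONS`, the reward signal, and the hyperparameter values are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set: shrink, keep, or grow cwnd by one segment.
ACTIONS = (-1, 0, 1)


class QLearner:
    """Tabular Q-learning with epsilon-greedy action selection."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated return
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def choose(self, state):
        """Explore with probability epsilon, otherwise act greedily."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

The table `q` holds one entry per (state, action) pair it has touched, so its size grows with the number of distinct network states observed. That growth is the memory overhead the abstract proposes to reduce through function approximation.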


Rough Sets-based Prototype Optimization in Kanerva-based Function Approximation

IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)

Authors: Wu, Cheng; Li, Wei; Meleis, Waleed (Soochow Univ, Sch Urban Rail Transportat, Suzhou, Peoples R China; Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115, USA)

ISBN: (print) 9781467396189

Problems involving multi-agent systems can be complex and involve huge state-action spaces, making such problems difficult to solve. Function approximation schemes such as Kanerva coding with dynamic, frequency-based prototype selection can improve performance. However, selecting the number of prototypes is difficult and the approach often still gives poor performance. In this paper, we solve a collection of hard instances of the predator-prey pursuit problem and argue that poor performance is caused by inappropriate selection of the prototypes for Kanerva coding, including the number and allocation of these prototypes. We use rough sets theory to reformulate the selection of prototypes and their implementation in Kanerva coding. We introduce the equivalence class structure to explain how prototype collisions occur, use a reduct of the set of prototypes to eliminate unnecessary prototypes, and generate new prototypes to split the equivalence classes causing prototype collisions. The Rough Sets-based approach increases the fraction of predator-prey test instances solved by up to 24.5% over frequency-based Kanerva coding. We conclude that prototype optimization based on rough set theory can adaptively explore the optimal number of prototypes and greatly improve a Kanerva-based reinforcement learner's ability to solve large-scale multi-agent problems.

Keywords: reinforcement learning Kanerva coding rough sets predator-prey pursuit problem
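
The equivalence-class view mentioned in the abstract above can be illustrated with a short Python sketch. It assumes, for illustration only, that states are binary tuples, that a prototype is activated when its Hamming distance to the state is within a hypothetical `radius`, and that two states collide when they activate exactly the same prototypes. The paper's own definitions may differ.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def equivalence_classes(states, prototypes, radius=1):
    """Group states that share the same prototype activation pattern."""
    classes = defaultdict(list)
    for s in states:
        pattern = tuple(hamming(s, p) <= radius for p in prototypes)
        classes[pattern].append(s)
    return list(classes.values())


states = [(0, 0, 0), (0, 0, 1), (1, 1, 1)]

# With a single prototype, the first two states are indistinguishable.
coarse = equivalence_classes(states, [(0, 0, 0)])

# Adding a second prototype splits that class into singletons.
fine = equivalence_classes(states, [(0, 0, 0), (0, 1, 1)])
```

Here `coarse` contains one class with two states and `fine` contains three singleton classes, which mirrors the idea of generating new prototypes to split the classes that cause collisions.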


Adaptive Kanerva-based function approximation for multi-agent systems

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3

Authors: Cheng Wu; Waleed M. Meleis

ISBN: (print) 9780981738123

In this paper, we show how adaptive prototype optimization can be used to improve the performance of function approximation based on Kanerva coding when solving large-scale instances of classic multi-agent problems. We apply our techniques to the predator-prey pursuit problem. We first demonstrate that Kanerva coding applied within a reinforcement learner does not give good results. We then describe our new adaptive Kanerva-based function approximation algorithm, based on prototype deletion and generation. We show that probabilistic prototype deletion with random prototype generation increases the fraction of test instances that are solved from 45% to 90%, and that prototype splitting increases that fraction to 94%. We also show that optimizing prototypes reduces the number of prototypes, and therefore the number of features, needed to achieve a 90% solution rate by up to 87%. These results demonstrate that our approach can dramatically improve the quality of the results obtained and reduce the number of prototypes required. We conclude that adaptive prototype optimization can greatly improve a Kanerva-based reinforcement learner's ability to solve large-scale multi-agent problems.

Keywords: function approximation reinforcement learning Kanerva coding pursuit
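
A minimal Python sketch of probabilistic prototype deletion followed by random prototype generation, as named in the abstract above, might look like this. The deletion rule `exp(-beta * visits)`, the choice to refill the set back to its original size, and the names `refresh_prototypes`, `beta`, and `new_prototype` are assumptions for illustration, not details taken from the paper.

```python
import math
import random


def refresh_prototypes(prototypes, visits, new_prototype, beta=0.5, rng=None):
    """Drop rarely visited prototypes at random and generate replacements.

    Assumed rule: a prototype with `count` visits is deleted with
    probability exp(-beta * count), so an unvisited prototype is always
    deleted and a heavily visited one almost never is.
    """
    rng = rng or random.Random()
    kept, kept_visits = [], []
    for proto, count in zip(prototypes, visits):
        if rng.random() < math.exp(-beta * count):
            continue  # prototype deleted
        kept.append(proto)
        kept_visits.append(count)
    n_deleted = len(prototypes) - len(kept)
    # Assumption: refill with random prototypes up to the original set size.
    fresh = [new_prototype(rng) for _ in range(n_deleted)]
    return kept + fresh, kept_visits + [0] * n_deleted
```

A caller would supply `new_prototype` as a function that draws a random point in the state space, for example `lambda rng: (rng.random(), rng.random())` for a two-dimensional unit square.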


KaBaGe-RL: Kanerva-based generalisation and reinforcement learning for possession football

IEEE Conference on Intelligent Robots and Systems (IROS 2001)

Authors: Kostiadis, K; Hu, HS (Univ Essex, Dept Comp Sci, Colchester CO4 3SQ, Essex, England)

ISBN: (print) 0780366123

The complexity of most modern systems prohibits a hand-coded approach to decision making. In addition, many problems have continuous or large discrete state spaces; some have large or continuous action spaces. The problem of learning in large spaces is tackled through generalisation techniques, which allow compact representation of learned information and transfer of knowledge between similar states and actions. In this paper, Kanerva coding and reinforcement learning are combined to produce the KaBaGe-RL decision-making module. The purpose of KaBaGe-RL is twofold. Firstly, Kanerva coding is used as a generalisation method to produce a feature vector from the raw sensory input. Secondly, the reinforcement learning uses this feature vector in order to learn an optimal policy. The efficiency of KaBaGe-RL is tested using the "3 versus 2 possession football" challenge, a sub-problem of the RoboCup domain. The results demonstrate that the learning approach outperforms a number of benchmark policies including a hand-coded one.

Keywords: Kanerva coding reinforcement learning RoboCup decision-making generalisation

