Owing to virtues such as simplicity, high generalization capability, and low training cost, the K-Nearest-Neighbor (KNN) classifier is widely used in pattern recognition and machine learning. However, the computation c...
Reinforcement learning has become established as a framework that allows an autonomous agent to automatically acquire, in a trial-and-error manner, a behavior policy based on a specification of the desired behav...
ISBN: (print) 9780982252925
We consider a stochastic extension of the loop-free shortest path problem with adversarial rewards. In this episodic Markov decision problem an agent traverses an acyclic graph with random transitions: at each step of an episode the agent chooses an action, receives some reward, and arrives at a random next state, where the reward and the distribution of the next state depend on the current state and the chosen action. We consider the bandit setting, in which only the reward of the just-visited state-action pair is revealed to the agent. For this problem we develop algorithms that perform asymptotically as well as the best stationary policy in hindsight. Assuming that all states are reachable with probability α > 0 under all policies, we give an algorithm and prove that its regret is O(L² √(T|A|)/α), where T is the number of episodes, A denotes the (finite) set of actions, and L is the length of the longest path in the graph. Variants of the algorithm are given that improve the dependence on the transition probabilities under specific conditions. The results are also extended to variations of the problem, including the case when the agent competes with time-varying policies.
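The interaction protocol described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch of the problem setting, not the authors' algorithm: the graph `TRANSITIONS`, the helper names `run_episode` and `uniform_policy`, and the randomly drawn rewards are all hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical two-layer acyclic graph: s0 -> {s1a, s1b} -> goal (terminal).
# TRANSITIONS[state][action] is a list of (next_state, probability) pairs.
TRANSITIONS = {
    "s0": {"left": [("s1a", 0.7), ("s1b", 0.3)],
           "right": [("s1a", 0.2), ("s1b", 0.8)]},
    "s1a": {"left": [("goal", 1.0)], "right": [("goal", 1.0)]},
    "s1b": {"left": [("goal", 1.0)], "right": [("goal", 1.0)]},
}


def run_episode(transitions, rewards, policy, start, rng):
    """Simulate one episode and return only the bandit feedback.

    The agent sees the reward of each (state, action) pair it actually
    visits; rewards of all other pairs stay hidden.
    """
    feedback = []
    state = start
    while state in transitions:  # terminal states have no outgoing actions
        action = policy(state)
        feedback.append((state, action, rewards[(state, action)]))
        next_states, probs = zip(*transitions[state][action])
        state = rng.choices(next_states, weights=probs, k=1)[0]
    return feedback


def uniform_policy(rng):
    """Stationary policy that picks an available action uniformly at random."""
    return lambda state: rng.choice(sorted(TRANSITIONS[state]))


rng = random.Random(0)
# Rewards for this one episode; in the adversarial setting they may change
# arbitrarily from episode to episode. Here they are drawn at random.
rewards = {(s, a): rng.random() for s, acts in TRANSITIONS.items() for a in acts}
feedback = run_episode(TRANSITIONS, rewards, uniform_policy(rng), "s0", rng)
episode_return = sum(r for _, _, r in feedback)
```

In this toy graph every path has length L = 2, so each episode yields exactly two observed rewards. A learning algorithm for the setting would have to estimate the unobserved rewards from such partial feedback.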
Recommender systems are an important component of many websites. Two of the most popular approaches are based on matrix factorization (MF) and Markov chains (MC). MF methods learn the general taste of a user by factor...
Sensitivity analysis can help to construct a tight neural network. There are several methods to define the sensitivity of inputs and weights to perturbations of the trained neural network. This paper proposes a s...
We report results for trace gases and major atmospheric constituents obtained by the Vehicle Cabin Atmosphere Monitor (VCAM) during operations aboard the International Space Station (ISS). VCAM is an autonomous environme...
One of the most important issues in fuzzy decision tree learning is the fuzzification of input data. This paper proposes a self-adaptive data fuzzification algorithm based on the self-organizing map (SOM) technology, ...
The decision tree is one of the most popular and widely used classification models in machine learning. The discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper...
Reinforcement learning suffers from inefficiency when the number of potential solutions to be searched is large. This paper describes a method of improving reinforcement learning by applying rule induction in multi-ag...
A wireless sensor network for home appliance energy management based on ZigBee technology is introduced in this paper. The aim of this research is to develop a real-time, low-cost, low power consumption and better reli...