Reinforcement Learning Toolbox 2.0

CPrioritizedSweeping Class Reference

Class for model-based Prioritized Sweeping.

#include <cprioritizedsweeping.h>

Inheritance diagram for CPrioritizedSweeping (image not reproduced); classes shown: CSemiMDPListener, CValueIteration, CStateObject, CParameterObject, CParameters.


Public Member Functions

  CPrioritizedSweeping (CFeatureQFunction *qFunction, CStateModifier *discretizer, CAbstractFeatureStochasticModel *model, CFeatureRewardFunction *rewardFunction, int kSteps)
  CPrioritizedSweeping (CFeatureVFunction *vFunction, CStateModifier *discretizer, CAbstractFeatureStochasticModel *model, CFeatureRewardFunction *rewardFunction, int kSteps)
virtual  ~CPrioritizedSweeping ()
virtual void  nextStep (CStateCollection *oldState, CAction *action, CStateCollection *newState)
  Updates the features of the old State.

CFeatureCalculator *  getFeatureCalculator ()


Protected Attributes

int  kSteps
  kSteps updates are done each step.


Detailed Description

Class for model-based Prioritized Sweeping.

Prioritized Sweeping is used when the model is learned during the training trial, and it is very similar to value iteration. Since running a full value iteration every time the model changes would be too expensive, only the first k states from the priority list are updated each step, starting with the current features.

CPrioritizedSweeping is a subclass of CValueIteration. Its only extension is that each time a nextStep event occurs, the current features are updated (which adds the backward states of the current features to the priority list), and then k states from the list are updated. As a subclass of CValueIteration it provides the full functionality for state updates. The class always learns the value function of the greedy policy, that is, the optimal value function with respect to the current model.
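The update scheme described above can be sketched on a small tabular problem. This is a minimal illustration only, not the toolbox implementation: TabularSweeper and all of its members are hypothetical names, the model is a plain table of transition counts, and the priority of a backward state is assumed to be the absolute change of the value that was just updated.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical tabular stand-in for illustration; none of these names come from the toolbox.
struct TabularSweeper {
    double discount;                                     // discount factor
    int kSteps;                                          // sweeps per nextStep call
    std::vector<std::vector<std::map<int, int> > > counts; // counts[s][a][s2]: observed transitions
    std::vector<std::vector<double> > rewardSum;         // summed reward per (s, a)
    std::vector<std::vector<int> > visits;               // visit count per (s, a)
    std::vector<std::set<int> > predecessors;            // backward states of each state
    std::vector<double> priority;                        // current priority of each state
    std::vector<double> V;                               // value estimates
    std::priority_queue<std::pair<double, int> > pq;     // (priority, state), largest first

    TabularSweeper(int numStates, int numActions, double discountFactor, int k)
        : discount(discountFactor), kSteps(k),
          counts(numStates, std::vector<std::map<int, int> >(numActions)),
          rewardSum(numStates, std::vector<double>(numActions, 0.0)),
          visits(numStates, std::vector<int>(numActions, 0)),
          predecessors(numStates), priority(numStates, 0.0), V(numStates, 0.0) {
        assert(numStates > 0 && numActions > 0 && k >= 0);
    }

    // Greedy backup of state s under the learned model; queues its backward states.
    void updateState(int s) {
        double best = 0.0;
        bool any = false;
        for (int a = 0; a < static_cast<int>(counts[s].size()); ++a) {
            if (visits[s][a] == 0) continue;  // no model data for this action yet
            double q = rewardSum[s][a] / visits[s][a];
            for (const auto& kv : counts[s][a])
                q += discount * (static_cast<double>(kv.second) / visits[s][a]) * V[kv.first];
            best = any ? std::max(best, q) : q;
            any = true;
        }
        if (!any) return;
        const double delta = std::fabs(best - V[s]);
        V[s] = best;
        priority[s] = 0.0;
        if (delta <= 1e-9) return;            // value did not change, nothing to propagate
        for (int p : predecessors[s]) {
            if (delta > priority[p]) {
                priority[p] = delta;
                pq.push(std::make_pair(delta, p));
            }
        }
    }

    // Called once per observed transition (s, a, r, s2).
    void nextStep(int s, int a, double r, int s2) {
        counts[s][a][s2] += 1;                // 1. update the learned model
        rewardSum[s][a] += r;
        visits[s][a] += 1;
        predecessors[s2].insert(s);
        updateState(s);                       // 2. update the current state first
        for (int i = 0; i < kSteps && !pq.empty();) {  // 3. at most kSteps sweeps
            const std::pair<double, int> top = pq.top();
            pq.pop();
            if (top.first != priority[top.second]) continue;  // skip stale queue entry
            updateState(top.second);
            ++i;
        }
    }
};
```

On a three-state chain 0 -> 1 -> 2 with a reward of 1 on the last transition, the second call to nextStep updates state 1 and then, if kSteps is at least 1, propagates the change back to state 0 within the same step. With kSteps set to 0, state 0 keeps its old value until it is visited again.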

The class only provides Q-Function learning because the Q-Function is needed for the policies. In addition to the Q-Function, it takes a model (CAbstractFeatureStochasticModel) and a feature reward function (CFeatureRewardFunction) as parameters. The model (and the reward function, if it is itself a reward model) has to be added to the agent's listener list before the prioritized sweeping algorithm is added.
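The ordering requirement can be illustrated with a toy listener list. The sketch below assumes that listeners are notified in the order in which they were added, which is what the requirement suggests. StepListener, CountingModel, Planner and ToyAgent are hypothetical stand-ins, not toolbox classes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal listener interface; a stand-in, not the toolbox's CSemiMDPListener.
struct StepListener {
    virtual ~StepListener() {}
    virtual void nextStep(int oldState, int action, int newState) = 0;
};

// Stand-in for a learned model: it only counts the transitions it has seen.
struct CountingModel : StepListener {
    int transitionsSeen = 0;
    void nextStep(int, int, int) override { ++transitionsSeen; }
};

// Stand-in for the planner: records how much model data was visible when it ran.
struct Planner : StepListener {
    const CountingModel* model;
    int transitionsVisibleAtLastStep = -1;
    explicit Planner(const CountingModel* m) : model(m) {}
    void nextStep(int, int, int) override {
        transitionsVisibleAtLastStep = model->transitionsSeen;
    }
};

// Toy agent that notifies its listeners in insertion order (an assumption of this sketch).
struct ToyAgent {
    std::vector<StepListener*> listeners;
    void addListener(StepListener* listener) { listeners.push_back(listener); }
    void step(int oldState, int action, int newState) {
        for (StepListener* listener : listeners)
            listener->nextStep(oldState, action, newState);
    }
};
```

If the model is added first, the planner sees the newest transition in the same step. If the planner is added first, it plans with a model that is one transition behind.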


Constructor & Destructor Documentation

CPrioritizedSweeping::CPrioritizedSweeping (CFeatureQFunction * qFunction,
CStateModifier * discretizer,
CAbstractFeatureStochasticModel * model,
CFeatureRewardFunction * rewardFunction,
int  kSteps)
 

The class only provides Q-Function learning because the Q-Function is needed for the policies. In addition to the Q-Function, the constructor takes a model (CAbstractFeatureStochasticModel) and a feature reward function (CFeatureRewardFunction) as parameters. The model (and the reward function, if it is itself a reward model) has to be added to the agent's listener list before the prioritized sweeping algorithm is added.

CPrioritizedSweeping::CPrioritizedSweeping (CFeatureVFunction * vFunction,
CStateModifier * discretizer,
CAbstractFeatureStochasticModel * model,
CFeatureRewardFunction * rewardFunction,
int  kSteps)
 
virtual CPrioritizedSweeping::~CPrioritizedSweeping ()  [virtual]
 

Member Function Documentation

CFeatureCalculator* CPrioritizedSweeping::getFeatureCalculator ()
 
virtual void CPrioritizedSweeping::nextStep (CStateCollection * oldState,
CAction * action,
CStateCollection * newState)  [virtual]
 

Updates the features of the old State.

Retrieves the state from the state collection (using the modifier pointer given to the constructor) and updates each of its features, which also updates the priorities of the backward states. After that, the first kSteps states in the priority list are updated.

Reimplemented from CSemiMDPListener.


Member Data Documentation

int CPrioritizedSweeping::kSteps [protected]
 

kSteps updates are done each step.

