Reinforcement Learning Toolbox 2.0

CFeatureRewardModel Class Reference

#include <crewardmodel.h>

[Inheritance diagram for CFeatureRewardModel: CFeatureRewardFunction, CSemiMDPRewardListener, CActionObject, CLearnDataObject, CRewardFunction, CStateObject, CSemiMDPListener, CParameterObject, CParameters]


Public Member Functions

  CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CAbstractFeatureStochasticEstimatedModel *model, CStateModifier *discretizer)
  Creates a reward model, which uses an estimated model for the calculation of the transition visits.

  CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CStateModifier *discretizer)
  Creates a reward model that keeps its own visit table for the transition visits.

virtual  ~CFeatureRewardModel ()
virtual double  getReward (int oldState, CAction *action, int newState)
  Returns the reward for a specific discrete state transition.

virtual void  nextStep (CStateCollection *oldState, CAction *action, double reward, CStateCollection *newState)
  virtual function, to be implemented by subclass

virtual void  saveData (FILE *stream)
  Saves the reward model.

virtual void  loadData (FILE *stream)
virtual void  resetData ()


Protected Member Functions

double  getTransitionVisits (int oldState, int action, int newState)
  Returns the number of visits of the specified transition.



Protected Attributes

CMyArray2D< CFeatureMap * > *  rewardTable
  Table of the rewards, summed up during the whole training trial, for a transition.

CMyArray2D< CFeatureMap * > *  visitTable
  Table of the transition visits, so the reward can be calculated as the mean (sum of rewards / sum of visits).

CAbstractFeatureStochasticEstimatedModel *  model
  Estimated model used to look up the visits; when it is set, no visit table is needed.

bool  bExternVisitSparse

Detailed Description

Model-based learning needs a reward function that assigns a reward to transitions between feature indices, not to state objects. If no such feature reward function is at your disposal and you only have a normal reward function (e.g. for the model state), the reward of a transition can be estimated instead; this is what CFeatureRewardModel does. It stores the sum of the rewards already received whenever the same transition occurred, together with the number of visits of that transition, so it can calculate the mean reward. For the visits, an estimated model can be used instead of a separate visit table to save memory. Since the reward model must learn from the training trial, it has to be added to the agent's listener list. Since it implements the CFeatureRewardFunction interface, it can also be used as a normal reward function. Semi-MDP support has not been added yet.
See also:
CFeatureRewardFunction
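
The estimation described above can be illustrated with a minimal, self-contained sketch. It is not the toolbox's implementation: MeanRewardEstimator, observe and exampleMeanReward are hypothetical names, a std::map replaces the CMyArray2D< CFeatureMap * > tables, and returning 0.0 for a transition that was never observed is an assumption made for this example.

```cpp
#include <map>
#include <tuple>

// Hypothetical stand-in for the idea behind CFeatureRewardModel;
// this is NOT the toolbox's code.
class MeanRewardEstimator {
public:
    // Record one observed transition and the reward received for it.
    void observe(int oldState, int action, int newState, double reward) {
        Entry &e = table_[std::make_tuple(oldState, action, newState)];
        e.rewardSum += reward;
        e.visits += 1.0;
    }

    // Mean reward of the transition (sum of rewards / sum of visits).
    // Assumption: an unvisited transition yields 0.0.
    double getReward(int oldState, int action, int newState) const {
        auto it = table_.find(std::make_tuple(oldState, action, newState));
        if (it == table_.end() || it->second.visits == 0.0) {
            return 0.0;
        }
        return it->second.rewardSum / it->second.visits;
    }

private:
    struct Entry {
        double rewardSum = 0.0;
        double visits = 0.0;
    };
    std::map<std::tuple<int, int, int>, Entry> table_;
};

// Usage: two observations of the same transition are averaged.
double exampleMeanReward() {
    MeanRewardEstimator estimator;
    estimator.observe(0, 1, 2, 1.0);
    estimator.observe(0, 1, 2, 3.0);
    return estimator.getReward(0, 1, 2);  // (1.0 + 3.0) / 2 visits
}
```

The sketch keeps the reward sum and the visit count side by side; the real class splits them into rewardTable and visitTable so that the visits can alternatively come from an estimated model.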

Constructor & Destructor Documentation

CFeatureRewardModel::CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CAbstractFeatureStochasticEstimatedModel *model, CStateModifier *discretizer)
 

Creates a reward model, which uses an estimated model for the calculation of the transition visits.

It is very important that the estimated model contains the same transitions as the reward table; otherwise a division by zero would occur. The estimated model therefore has to be added to the agent's listener list before the reward model.
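
Why the order matters can be sketched as follows. The types VisitModel and RewardSums, the listener vector and exampleOrderedListeners are hypothetical and only mimic the situation described here; the assert marks the point where a missing visit count would otherwise lead to a division by zero.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <tuple>
#include <vector>

using Transition = std::tuple<int, int, int>;  // (oldState, action, newState)

// Hypothetical external visit counter, standing in for the estimated model.
struct VisitModel {
    std::map<Transition, double> visits;
    void nextStep(const Transition &t) { visits[t] += 1.0; }
    double getVisits(const Transition &t) const {
        auto it = visits.find(t);
        return it == visits.end() ? 0.0 : it->second;
    }
};

// Hypothetical reward model that stores only reward sums and borrows the visits.
struct RewardSums {
    const VisitModel *model;
    std::map<Transition, double> sums;
    explicit RewardSums(const VisitModel *m) : model(m) {}
    void nextStep(const Transition &t, double reward) { sums[t] += reward; }
    double getReward(const Transition &t) const {
        auto it = sums.find(t);
        if (it == sums.end()) return 0.0;  // assumption: unknown transition
        double n = model->getVisits(t);
        assert(n > 0.0 && "visit model must know every rewarded transition");
        return it->second / n;
    }
};

double exampleOrderedListeners() {
    VisitModel model;
    RewardSums rewards(&model);

    // Listeners are called in insertion order: the visit model comes first.
    std::vector<std::function<void(const Transition &, double)>> listeners;
    listeners.push_back([&](const Transition &t, double) { model.nextStep(t); });
    listeners.push_back([&](const Transition &t, double r) { rewards.nextStep(t, r); });

    Transition t{3, 0, 4};
    for (double r : {2.0, 4.0, 6.0}) {
        for (auto &listener : listeners) listener(t, r);
    }
    return rewards.getReward(t);  // 12.0 / 3 visits
}
```

With the visit counter updated first, every transition that has a reward sum also has at least one visit by the time the mean is requested.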

CFeatureRewardModel::CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CStateModifier *discretizer)
 

Creates a reward model that keeps its own visit table for the transition visits.

virtual CFeatureRewardModel::~CFeatureRewardModel ()  [virtual]
 

Member Function Documentation

virtual double CFeatureRewardModel::getReward (int oldState, CAction *action, int newState)  [virtual]
 

Returns the reward for a specific discrete state transition.

Calculates the mean reward for that transition, i.e. sum of rewards / sum of visits.

Implements CFeatureRewardFunction.

double CFeatureRewardModel::getTransitionVisits (int oldState, int action, int newState)  [protected]
 

Returns the number of visits of the specified transition.

Returns the visits from the visit table or, if an estimated model is assigned, retrieves them from the model.
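
The choice between the two sources can be pictured with the sketch below. ExternalVisits, ConstantVisits and VisitSource are hypothetical types invented for this illustration; in particular, the interface of CAbstractFeatureStochasticEstimatedModel is not shown on this page and is not reproduced here.

```cpp
#include <map>
#include <tuple>

using VisitKey = std::tuple<int, int, int>;  // (oldState, action, newState)

// Hypothetical interface standing in for an external estimated model.
struct ExternalVisits {
    virtual ~ExternalVisits() {}
    virtual double getVisits(int oldState, int action, int newState) const = 0;
};

// Trivial hypothetical model that reports the same count for every transition.
struct ConstantVisits : ExternalVisits {
    double n;
    explicit ConstantVisits(double count) : n(count) {}
    double getVisits(int, int, int) const override { return n; }
};

class VisitSource {
public:
    // A null model means: keep and use an own visit table.
    explicit VisitSource(const ExternalVisits *model = nullptr) : model_(model) {}

    // Count a visit in the own table; skipped when a model supplies the visits.
    void count(int oldState, int action, int newState) {
        if (!model_) own_[VisitKey(oldState, action, newState)] += 1.0;
    }

    double getTransitionVisits(int oldState, int action, int newState) const {
        if (model_) return model_->getVisits(oldState, action, newState);
        auto it = own_.find(VisitKey(oldState, action, newState));
        return it == own_.end() ? 0.0 : it->second;
    }

private:
    const ExternalVisits *model_;
    std::map<VisitKey, double> own_;
};
```

In this sketch the own table stays empty whenever a model is assigned, which corresponds to the memory saving mentioned in the detailed description.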

virtual void CFeatureRewardModel::loadData (FILE *stream)  [virtual]
 

Loads the reward table and, if no estimated model was passed to the constructor, the visit table.

Implements CLearnDataObject.

virtual void CFeatureRewardModel::nextStep (CStateCollection *oldState, CAction *action, double reward, CStateCollection *newState)  [virtual]
[virtual]
 

virtual function, to be implemented by subclass

Reimplemented from CSemiMDPRewardListener.

virtual void CFeatureRewardModel::resetData ()  [virtual]
 

Implements CLearnDataObject.

virtual void CFeatureRewardModel::saveData (FILE *stream)  [virtual]
 

Saves the reward model.

Saves the reward table and, if it is used, the visit table.

Implements CLearnDataObject.
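
A save/load round trip over a FILE * stream might look like the following sketch. The text format (an entry count followed by one "oldState action newState value" line per entry) and the helpers saveTable, loadTable and exampleRoundTrip are assumptions made for illustration; the file format actually written by the toolbox is not documented on this page.

```cpp
#include <cstdio>
#include <map>
#include <tuple>

using TransitionKey = std::tuple<int, int, int>;  // (oldState, action, newState)
using Table = std::map<TransitionKey, double>;

// Hypothetical text format: entry count, then one line per transition.
void saveTable(const Table &table, FILE *stream) {
    std::fprintf(stream, "%lu\n", static_cast<unsigned long>(table.size()));
    for (const auto &kv : table) {
        // %.17g keeps enough digits to restore a double exactly.
        std::fprintf(stream, "%d %d %d %.17g\n", std::get<0>(kv.first),
                     std::get<1>(kv.first), std::get<2>(kv.first), kv.second);
    }
}

// Reads a table written by saveTable; returns false on malformed input.
bool loadTable(Table &table, FILE *stream) {
    table.clear();
    unsigned long count = 0;
    if (std::fscanf(stream, "%lu", &count) != 1) return false;
    for (unsigned long i = 0; i < count; ++i) {
        int oldState, action, newState;
        double value;
        if (std::fscanf(stream, "%d %d %d %lf", &oldState, &action, &newState,
                        &value) != 4) {
            return false;
        }
        table[TransitionKey(oldState, action, newState)] = value;
    }
    return true;
}

// Writes a reward table and a visit table to a temporary file and reads them back.
bool exampleRoundTrip() {
    Table rewards, visits;
    rewards[TransitionKey(0, 1, 2)] = 4.5;
    visits[TransitionKey(0, 1, 2)] = 3.0;

    FILE *stream = std::tmpfile();
    if (!stream) return false;
    saveTable(rewards, stream);
    saveTable(visits, stream);  // would be skipped when an estimated model is used
    std::rewind(stream);

    Table loadedRewards, loadedVisits;
    bool ok = loadTable(loadedRewards, stream) && loadTable(loadedVisits, stream);
    std::fclose(stream);
    return ok && loadedRewards == rewards && loadedVisits == visits;
}
```

Loading has to mirror saving: if the visit table is not written because an estimated model supplies the visits, it must not be expected when reading the stream back.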


Member Data Documentation

bool CFeatureRewardModel::bExternVisitSparse [protected]
 
CAbstractFeatureStochasticEstimatedModel* CFeatureRewardModel::model [protected]
 

Estimated model used to look up the visits; when it is set, no visit table is needed.

CMyArray2D<CFeatureMap *>* CFeatureRewardModel::rewardTable [protected]
 

Table of the rewards, summed up during the whole training trial, for a transition.

CMyArray2D<CFeatureMap *>* CFeatureRewardModel::visitTable [protected]
 

Table of the transition visits, so the reward can be calculated as the mean (sum of rewards / sum of visits).

Only used if no external estimated model is assigned.


The documentation for this class was generated from the following file: