Reinforcement Learning Toolbox 2.0

CAbstractFeatureStochasticEstimatedModel Class Reference

Base class for all estimated models.

#include <ctheoreticalmodel.h>

Inheritance diagram for CAbstractFeatureStochasticEstimatedModel:

[Inheritance diagram. Classes shown: CFeatureStochasticModel, CSemiMDPListener, CStateObject, CLearnDataObject, CAbstractFeatureStochasticModel, CParameterObject, CActionObject, CParameters, CDiscreteStochasticEstimatedModel, CFeatureStochasticEstimatedModel.]


Public Member Functions

  CAbstractFeatureStochasticEstimatedModel (CStateProperties *properties, CFeatureQFunction *stateActionVisits, CActionSet *actions, int numFeatures)
  Creates a new estimated model.

  CAbstractFeatureStochasticEstimatedModel (CStateProperties *properties, CFeatureQFunction *stateActionVisits, CActionSet *actions, int numFeatures, FILE *file)
  Loads an estimated model from a file.

virtual  ~CAbstractFeatureStochasticEstimatedModel ()
virtual void  nextStep (CStateCollection *oldState, CAction *action, CStateCollection *nextState)=0
  The nextStep method; must be implemented by subclasses.

virtual void  intermediateStep (CStateCollection *oldState, CAction *action, CStateCollection *nextState)
  Intermediate steps can be treated as normal steps in the model-based case.

virtual void  saveData (FILE *stream)
virtual void  loadData (FILE *stream)
virtual void  resetData ()
double  getTransitionsVisits (int oldFeature, CAction *action, int newFeature)
  Returns the visits of the specified transition.

double  getStateActionVisits (int Feature, int action)
  Returns the state-action visits.

double  getStateVisits (int Feature)
  Returns how often the agent visited the given state.



Protected Member Functions

virtual void  updateStep (int oldFeature, CAction *action, int newFeature, double Faktor)
  Updates the probabilities of the transitions from oldFeature for the given action.



Protected Attributes

CFeatureQFunction *  stateActionVisits

Detailed Description

Base class for all estimated models.

Estimated models estimate the probability of a state transition by counting the number of transitions from a specific state-action pair to a specific state and the number of visits of that state-action pair. The estimated model is therefore built on the fly, during learning. This is done by the class CAbstractFeatureStochasticEstimatedModel. The class is a subclass of CFeatureStochasticModel, so it stores the transition probabilities in the Transition list. In addition, it has a double array which stores the visits of each state-action pair (double is needed because feature visits can be double valued). The transition visits are not stored explicitly but can be recovered by multiplying the probability by the visits of the state-action pair.
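
Written as formulas, the counting estimate and the recovery of the transition visits look as follows. The symbols are notation introduced here for illustration, not toolbox names: N(s, a) stands for the visits of a state-action pair and N(s, a, s') for the visits of a transition.

```latex
\hat{P}(s' \mid s, a) = \frac{N(s, a, s')}{N(s, a)},
\qquad
N(s, a, s') = \hat{P}(s' \mid s, a) \, N(s, a)
```

For example, if a state-action pair has 4.0 visits and the stored probability of some successor is 0.25, that transition accounts for 1.0 visit.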

The class CAbstractFeatureStochasticEstimatedModel provides the function updateStep for updating the transitions and the visit table when a specific feature is visited (with a given factor). The function first recovers the visits of the transitions (by multiplying the state-action visits by the transition probabilities). It then adds the feature factor to the visits of the specified transition, creating a new Transition object if the transition does not exist yet, and adds the same factor to the visits of the state-action pair. Finally it recalculates the probabilities of the transitions (by dividing the transition visits by the state-action visits).
The class can also forget transitions from the past, so the probabilities adapt more quickly to changing models. This is controlled by the timeFaktor. Each time an update occurs, the state-action visits are multiplied by the timeFaktor before updating. By default the time factor is 1.0, so nothing is forgotten.
There are additional functions for retrieving the transition visits, the state-action visits and the state visits.
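
The update and forgetting scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the toolbox implementation: TransitionRow and updateRow are hypothetical names, a std::map stands in for the Transition list, and the sketch assumes the time factor discounts the recovered transition visits together with the state-action visits.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the data of one (feature, action) pair.
// None of these names come from the toolbox.
struct TransitionRow {
    std::map<int, double> probability; // newFeature -> estimated probability
    double visits = 0.0;               // possibly fractional state-action visits
};

// One update for an observed transition to newFeature, weighted by factor.
// timeFactor < 1.0 discounts old experience; 1.0 forgets nothing.
void updateRow(TransitionRow &row, int newFeature, double factor,
               double timeFactor = 1.0) {
    assert(factor > 0.0 && timeFactor > 0.0 && timeFactor <= 1.0);

    // 1. Discount the old state-action visits.
    const double oldVisits = row.visits * timeFactor;

    // 2. Recover the transition visits from the stored probabilities.
    std::map<int, double> counts;
    for (const auto &entry : row.probability) {
        counts[entry.first] = entry.second * oldVisits;
    }

    // 3. Credit the observed transition; operator[] creates a missing entry.
    counts[newFeature] += factor;

    // 4. Add the factor to the state-action visits and renormalise.
    row.visits = oldVisits + factor;
    for (const auto &entry : counts) {
        row.probability[entry.first] = entry.second / row.visits;
    }
}
```

With a time factor of 1.0 the probabilities are plain relative frequencies. With a smaller value, recent transitions weigh more than old ones, which is what lets the estimate follow a changing model.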

The subclasses of CAbstractFeatureStochasticEstimatedModel only have to implement the function nextStep(...) from the CSemiMDPListener interface. Intermediate steps don't need special treatment and are updated like normal steps.
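
The division of labour between base class and subclass might look like the sketch below. It uses simplified, hypothetical stand-ins: EstimatedModelSketch and StepCounter are not toolbox classes, plain ints replace the CStateCollection and CAction pointers, and forwarding intermediateStep to nextStep is an assumption about how "updated like normal steps" could be realised.

```cpp
#include <cassert>

// Simplified, hypothetical base class; the toolbox passes CStateCollection*
// and CAction* where this sketch uses plain ints.
class EstimatedModelSketch {
public:
    virtual ~EstimatedModelSketch() = default;

    // Subclasses must supply the handling of a normal step.
    virtual void nextStep(int oldState, int action, int nextState) = 0;

    // Assumption: intermediate steps are forwarded to nextStep unchanged.
    virtual void intermediateStep(int oldState, int action, int nextState) {
        nextStep(oldState, action, nextState);
    }
};

// A minimal subclass that only counts the steps it has seen.
class StepCounter : public EstimatedModelSketch {
public:
    void nextStep(int oldState, int action, int nextState) override {
        assert(oldState >= 0 && action >= 0 && nextState >= 0);
        ++steps;
    }

    int steps = 0;
};
```

A subclass written this way overrides a single method and still receives both kinds of steps.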


Constructor & Destructor Documentation

CAbstractFeatureStochasticEstimatedModel::CAbstractFeatureStochasticEstimatedModel (CStateProperties *properties, CFeatureQFunction *stateActionVisits, CActionSet *actions, int numFeatures)

Creates a new estimated model.

CAbstractFeatureStochasticEstimatedModel::CAbstractFeatureStochasticEstimatedModel (CStateProperties *properties, CFeatureQFunction *stateActionVisits, CActionSet *actions, int numFeatures, FILE *file)

Loads an estimated model from a file.

virtual CAbstractFeatureStochasticEstimatedModel::~CAbstractFeatureStochasticEstimatedModel ()  [virtual]

Member Function Documentation

double CAbstractFeatureStochasticEstimatedModel::getStateActionVisits (int Feature, int action)

Returns the state-action visits.

Returns how often the given action was chosen in the given state. The state-action visits are stored in saVisits.

Reimplemented in CDiscreteStochasticEstimatedModel.

double CAbstractFeatureStochasticEstimatedModel::getStateVisits (int Feature)

Returns how often the agent visited the given state.

This is calculated by summing the state-action visits of the state over all actions.

Reimplemented in CDiscreteStochasticEstimatedModel.
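
The summation behind the state visits can be illustrated with a short sketch. The function name stateVisits and the vector holding one visit count per action are assumptions made for this example; the toolbox keeps these values in its own data structure.

```cpp
#include <cassert>
#include <vector>

// visitsPerAction[a] holds the state-action visits of one feature for action a.
// Hypothetical helper, not a toolbox function.
double stateVisits(const std::vector<double> &visitsPerAction) {
    double total = 0.0;
    for (double visits : visitsPerAction) {
        assert(visits >= 0.0); // visit counts are never negative
        total += visits;
    }
    return total;
}
```

Because feature visits can be fractional, the result is a double rather than an integer count.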

double CAbstractFeatureStochasticEstimatedModel::getTransitionsVisits (int oldFeature, CAction *action, int newFeature)

Returns the visits of the specified transition.

The transition visits show how often a specific transition has occurred. They are calculated by multiplying the probability of the transition by the visits of the state-action pair.

virtual void CAbstractFeatureStochasticEstimatedModel::intermediateStep (CStateCollection *oldState, CAction *action, CStateCollection *nextState)  [virtual]

Intermediate steps can be treated as normal steps in the model-based case.

Reimplemented from CSemiMDPListener.

virtual void CAbstractFeatureStochasticEstimatedModel::loadData (FILE *stream)  [virtual]

Implements CLearnDataObject.

virtual void CAbstractFeatureStochasticEstimatedModel::nextStep (CStateCollection *oldState, CAction *action, CStateCollection *nextState)  [pure virtual]

The nextStep method; must be implemented by subclasses.

Reimplemented from CSemiMDPListener.

Implemented in CDiscreteStochasticEstimatedModel, and CFeatureStochasticEstimatedModel.

virtual void CAbstractFeatureStochasticEstimatedModel::resetData ()  [virtual]

Implements CLearnDataObject.

virtual void CAbstractFeatureStochasticEstimatedModel::saveData (FILE *stream)  [virtual]

Implements CLearnDataObject.

virtual void CAbstractFeatureStochasticEstimatedModel::updateStep (int oldFeature, CAction *action, int newFeature, double Faktor)  [protected, virtual]

Updates the probabilities of the transitions from oldFeature for the given action.

The function first recovers the visits of the transitions (by multiplying the state-action visits by the transition probabilities). It then adds the feature factor to the visits of the specified transition, creating a new Transition object if the transition does not exist yet, and adds the same factor to the visits of the state-action pair. Finally it recalculates the probabilities of the transitions (by dividing the transition visits by the state-action visits). For the SemiMDP case the duration is added to the transition after the updates.


Member Data Documentation

CFeatureQFunction* CAbstractFeatureStochasticEstimatedModel::stateActionVisits [protected]
 

The documentation for this class was generated from the following file: