Reinforcement Learning Toolbox 2.0

CAbstractVETraces Class Reference

Class representing E-Traces for a V-Function.

#include <cvetraces.h>

Inheritance diagram for CAbstractVETraces:

Base classes: CParameterObject, CParameters
Derived classes: CGradientVETraces, CStateVETraces, CFeatureVETraces


Public Member Functions

  CAbstractVETraces (CAbstractVFunction *vFunction)
  Creates an ETrace for the given V-Function.

virtual void  resetETraces ()=0
  Interface for clearing the Etraces.

virtual void  addETrace (CStateCollection *State, double factor=1.0)=0
  Interface for adding an ETrace.

virtual void  updateETraces (int duration=1)=0
  Interface for updating the ETraces.

virtual void  updateVFunction (double td)=0
  Update the V-Function.

void  setLambda (double lambda)
double  getLambda ()
void  setTreshold (double treshold)
double  getTreshold ()
void  setReplacingETraces (bool bReplace)
  Sets the use of Replacing V-Etraces.

bool  getReplacingETraces ()
CAbstractVFunction *  getVFunction ()


Protected Attributes

CAbstractVFunction *  vFunction
  pointer to the V-Function


Detailed Description

Class representing E-Traces for a V-Function.

V-ETraces store the "trace" of each state, i.e. a record of the states visited in the past. Many learning algorithms use E-Traces in their value updates, which considerably improves most of these algorithms. A V-ETraces object stores E-Traces only for states, not for actions (see CAbstractQETraces); what exactly gets stored in the E-Traces depends on the kind of value function. The class CAbstractVETraces is the interface for V-ETraces objects. It has four pure virtual functions that implement the E-Traces functionality:

  • addETrace(CStateCollection *State, double factor = 1.0): adds the E-Trace of the specified state with the given factor.
  • updateVFunction(double td): updates the V-Function using the E-Traces. For each state in the E-Traces, the update value "td" is multiplied by the state's trace factor, and the V-value of that state is then updated by the result.
  • resetETraces(): resets the E-Traces, i.e. all states are cleared from the E-Traces.
  • updateETraces(int duration): multiplies all E-Traces by (lambda * gamma)^duration. The duration is the duration of the current step, so the time attenuation factor has to be raised to the power of that duration.
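The interplay of the four functions can be sketched with a small standalone class. This is a minimal illustration, not the toolbox's implementation: ToyVFunction and ToyVETraces are hypothetical names, states are reduced to plain integer indices instead of CStateCollection objects, traces accumulate when a state is added twice, and the decay follows the (lambda * gamma)^duration form given in the list above.

```cpp
#include <cmath>
#include <map>

// Hypothetical tabular value function; NOT the toolbox's CAbstractVFunction.
struct ToyVFunction {
    std::map<int, double> values;  // state index -> V(s)
};

// Hypothetical trace container mirroring the four interface operations.
class ToyVETraces {
public:
    ToyVETraces(ToyVFunction *vFunction, double lambda, double gamma)
        : vFunction(vFunction), lambda(lambda), gamma(gamma) {}

    // Clears all stored traces.
    void resetETraces() { traces.clear(); }

    // Adds `factor` to the trace of `state` (accumulating behaviour, an assumption).
    void addETrace(int state, double factor = 1.0) { traces[state] += factor; }

    // Decays every trace by (lambda * gamma)^duration.
    void updateETraces(int duration = 1) {
        const double decay = std::pow(lambda * gamma, duration);
        for (auto &entry : traces) entry.second *= decay;
    }

    // Adds td * trace to the value of every traced state.
    void updateVFunction(double td) {
        for (const auto &entry : traces)
            vFunction->values[entry.first] += td * entry.second;
    }

    // Returns the current trace of `state`, or 0 if the state is not traced.
    double getTrace(int state) const {
        auto it = traces.find(state);
        return it == traces.end() ? 0.0 : it->second;
    }

private:
    ToyVFunction *vFunction;
    double lambda;
    double gamma;
    std::map<int, double> traces;  // state index -> trace factor
};
```

A plausible per-step call order, again an assumption rather than something this page prescribes, is updateETraces() to decay the old traces, addETrace() for the current state, and then updateVFunction(td) with the current temporal-difference value.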
Each ETraces object has the parameter "Lambda", which is an attenuation factor. updateVFunction updates every past state with the factor lambda^N * gamma^N, where N is the time elapsed since the state was active and gamma is the discount factor of the learning problem (parameter "DiscountFactor").
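As a worked illustration of this factor, assume a trace that starts at 1 when the state is visited and is not added again afterwards. The symbol e_N for the trace after N steps is introduced here for illustration only; the numbers use the default parameter values of this class (Lambda = 0.9, DiscountFactor = 0.95).

```latex
e_N = \lambda^N \gamma^N = (\lambda\gamma)^N,
\qquad
\lambda\gamma = 0.9 \cdot 0.95 = 0.855,
\qquad
e_3 = 0.855^3 \approx 0.625
```

Under these assumptions, a state visited three steps ago receives roughly 62.5% of the update that the current state receives.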
The RL Toolbox contains several implementations of V-ETraces. CStateVETraces stores the states directly in a state list and maintains a separate list for the factors, CFeatureVETraces stores only the features of a state in a table, and CGradientVETraces always stores the current gradient in the E-Trace object. To determine which E-Trace object should be used for a value function, the class CAbstractVFunction provides the method getStandardETraces, which returns a new V-ETraces object that is best suited to that class of V-Functions.
The class CAbstractVETraces has the following parameters:
  • "Lambda", 0.9 : attenuation factor
  • "DiscountFactor", 0.95 : gamma
  • "ReplacingETraces" : how replacing E-Traces are handled depends on the kind of E-Traces
  • "ETraceTreshold", 0.001 : smallest value of an E-Trace; an E-Trace is deleted from the list if it is lower than this value. Used for performance reasons.
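The effect of the threshold and of replacing E-Traces can be sketched as follows. This is a hedged illustration under stated assumptions, not the toolbox code: TraceParams, addTrace and decayAndPrune are hypothetical names, the default of `replacing` is chosen arbitrarily because this page does not state one, and "replacing" is modelled with the common convention of overwriting the trace instead of accumulating it (the actual handling depends on the kind of E-Traces, as noted above).

```cpp
#include <cmath>
#include <map>

// Hypothetical parameter bundle using the documented default values.
struct TraceParams {
    double lambda = 0.9;           // "Lambda"
    double discountFactor = 0.95;  // "DiscountFactor"
    bool replacing = true;         // "ReplacingETraces" (default assumed here)
    double threshold = 0.001;      // "ETraceTreshold"
};

using Traces = std::map<int, double>;  // state index -> trace factor

// Replacing: overwrite the trace with `factor`. Accumulating: add `factor`.
void addTrace(Traces &traces, const TraceParams &p, int state, double factor = 1.0) {
    if (p.replacing) {
        traces[state] = factor;
    } else {
        traces[state] += factor;
    }
}

// Decays all traces and erases those that fall below the threshold.
void decayAndPrune(Traces &traces, const TraceParams &p, int duration = 1) {
    const double decay = std::pow(p.lambda * p.discountFactor, duration);
    for (auto it = traces.begin(); it != traces.end();) {
        it->second *= decay;
        if (std::fabs(it->second) < p.threshold) {
            it = traces.erase(it);  // erase returns the next valid iterator
        } else {
            ++it;
        }
    }
}
```

With the default values in this sketch, a trace that starts at 1.0 is pruned on the 45th single-step decay, which bounds the number of past states that have to be kept and updated.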

Constructor & Destructor Documentation

CAbstractVETraces::CAbstractVETraces (CAbstractVFunction *vFunction)
 

Creates an ETrace for the given V-Function.


Member Function Documentation

virtual void CAbstractVETraces::addETrace (CStateCollection *State, double factor = 1.0) [pure virtual]
 

Interface for adding an ETrace.

Implemented in CStateVETraces, CGradientVETraces, and CFeatureVETraces.

double CAbstractVETraces::getLambda ()

bool CAbstractVETraces::getReplacingETraces ()

double CAbstractVETraces::getTreshold ()

CAbstractVFunction* CAbstractVETraces::getVFunction ()

virtual void CAbstractVETraces::resetETraces () [pure virtual]
 

Interface for clearing the Etraces.

Implemented in CStateVETraces and CGradientVETraces.

void CAbstractVETraces::setLambda (double lambda)

void CAbstractVETraces::setReplacingETraces (bool bReplace)
 

Sets the use of Replacing V-Etraces.

void CAbstractVETraces::setTreshold (double treshold)

virtual void CAbstractVETraces::updateETraces (int duration = 1) [pure virtual]
 

Interface for updating the ETraces.

All E-Trace factors are multiplied by lambda * gamma. For a multi-step action the update has to be lambda * gamma^N, where N is the duration of the action, so the duration can be passed as a parameter.

Implemented in CStateVETraces and CGradientVETraces.

virtual void CAbstractVETraces::updateVFunction (double td) [pure virtual]
 

Update the V-Function.

For every state in the E-Traces, the "td" value is multiplied by the state's E-Trace factor (e.g. lambda^N * gamma^N for replacing E-Traces, where N is the time elapsed since the state was active), and the value of the state is then updated by the result.

Implemented in CStateVETraces and CGradientVETraces.


Member Data Documentation

CAbstractVFunction* CAbstractVETraces::vFunction [protected]
 

pointer to the V-Function


The documentation for this class was generated from the following file: