CFeatureRewardModel Class Reference
#include <crewardmodel.h>
Public Member Functions
CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CAbstractFeatureStochasticEstimatedModel *model, CStateModifier *discretizer)
    Creates a reward model that uses an estimated model to calculate the transition visits.

CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CStateModifier *discretizer)
    Creates a reward model that keeps its own visit table for the visits.

virtual ~CFeatureRewardModel ()

virtual double getReward (int oldState, CAction *action, int newState)
    Returns the reward for a specific discrete state transition.

virtual void nextStep (CStateCollection *oldState, CAction *action, double reward, CStateCollection *newState)
    Virtual function, to be implemented by subclass.

virtual void saveData (FILE *stream)
    Saves the reward model.

virtual void loadData (FILE *stream)

virtual void resetData ()

Protected Member Functions
double getTransitionVisits (int oldState, int action, int newState)
    Returns the transition visits of the specified state.

Protected Attributes
CMyArray2D< CFeatureMap * > * rewardTable
    Table of the rewards for each transition, summed up over the whole training trial.

CMyArray2D< CFeatureMap * > * visitTable
    Table of the transition visits, so the reward can be calculated as the mean (sum of rewards / sum of visits).

CAbstractFeatureStochasticEstimatedModel * model
    Used for calculating the visits; if it is set, no visit table is needed.

bool bExternVisitSparse

Detailed Description
For model-based learning you need a reward function that assigns
a reward to transitions between feature indices, not to state objects.
But what if you have no reward function for features at your
disposal, only a normal reward function (e.g. for the model state)?
In that case the reward for a transition can be estimated, which is
what CFeatureRewardModel does. It stores the rewards already received
whenever the same transition occurred, together with the visits of
that transition, so it can calculate the mean reward. For the visits
of the transition, an estimated model can be used instead to save
memory. Since the reward model must learn from the training trial, it
has to be added to the agent's listener list. Since the reward model
implements the CFeatureRewardFunction interface, it can also be used
as a normal reward function. Semi-MDP support has not been added
yet.
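The bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not the Toolbox implementation: `MeanRewardSketch` and its `std::map` storage are hypothetical stand-ins for the reward and visit tables, states are plain integer indices, and returning 0.0 for a transition that was never visited is an assumption.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical stand-in for the reward and visit tables; not a Toolbox class.
class MeanRewardSketch {
private:
    struct Entry {
        double rewardSum = 0.0;  // rewards summed over all visits
        double visits = 0.0;     // number of times the transition occurred
    };
    // Key: (oldFeature, action, newFeature).
    std::map<std::tuple<int, int, int>, Entry> table_;

public:
    // Record one observed transition together with the reward received.
    void nextStep(int oldFeature, int action, double reward, int newFeature) {
        Entry &e = table_[std::make_tuple(oldFeature, action, newFeature)];
        e.rewardSum += reward;
        e.visits += 1.0;
    }

    // Mean reward of the transition (sum of rewards / sum of visits).
    // Assumption: transitions that were never visited yield 0.0.
    double getReward(int oldFeature, int action, int newFeature) const {
        auto it = table_.find(std::make_tuple(oldFeature, action, newFeature));
        if (it == table_.end() || it->second.visits <= 0.0) {
            return 0.0;
        }
        return it->second.rewardSum / it->second.visits;
    }
};
```

Recording the same transition twice with rewards 1.0 and 3.0 makes `getReward` return their mean for that transition.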
See also:
    CFeatureRewardFunction
Constructor & Destructor Documentation
CFeatureRewardModel::CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CAbstractFeatureStochasticEstimatedModel *model, CStateModifier *discretizer)

Creates a reward model that uses an estimated model to calculate
the transition visits.
It is very important that the estimated model contains the same
transitions as the reward table, because otherwise a division by
zero would occur. The estimated model therefore has to be added to
the listener list before the reward model.
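The ordering requirement can be illustrated with a small self-contained sketch. All names here (`StepListenerSketch`, `VisitModelSketch`, `RewardModelSketch`, `notifyAll`) are hypothetical and do not belong to the Toolbox; the sketch also assumes that listeners are notified in the order in which they were registered.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <vector>

using Transition = std::tuple<int, int, int>;  // (oldState, action, newState)

// Hypothetical listener interface, standing in for the agent's listener list.
struct StepListenerSketch {
    virtual ~StepListenerSketch() = default;
    virtual void nextStep(int oldState, int action, double reward, int newState) = 0;
};

// Stand-in for the estimated model: it only counts transition visits.
struct VisitModelSketch : StepListenerSketch {
    std::map<Transition, double> visits;

    void nextStep(int oldState, int action, double, int newState) override {
        visits[Transition(oldState, action, newState)] += 1.0;
    }
    double getVisits(const Transition &t) const {
        auto it = visits.find(t);
        return it == visits.end() ? 0.0 : it->second;
    }
};

// Stand-in for the reward model: sums rewards, takes visits from the model.
struct RewardModelSketch : StepListenerSketch {
    const VisitModelSketch *model;
    std::map<Transition, double> rewardSums;
    bool sawZeroVisits = false;  // set when a mean would divide by zero

    explicit RewardModelSketch(const VisitModelSketch *m) : model(m) {}

    void nextStep(int oldState, int action, double reward, int newState) override {
        Transition t(oldState, action, newState);
        rewardSums[t] += reward;
        // If the model has not counted this transition yet, the mean
        // (sum of rewards / visits) is undefined at this point.
        if (model->getVisits(t) <= 0.0) {
            sawZeroVisits = true;
        }
    }
};

// Notifies the listeners in registration order (an assumption of this sketch).
inline void notifyAll(const std::vector<StepListenerSketch *> &listeners,
                      int oldState, int action, double reward, int newState) {
    for (StepListenerSketch *listener : listeners) {
        listener->nextStep(oldState, action, reward, newState);
    }
}
```

With the visit model registered first, the reward model always finds at least one visit for the transition it has just recorded. With the order reversed, `sawZeroVisits` is set on the first step.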

CFeatureRewardModel::CFeatureRewardModel (CActionSet *actions, CRewardFunction *function, CStateModifier *discretizer)

Creates a reward model that keeps its own visit table for the
visits.

virtual CFeatureRewardModel::~CFeatureRewardModel () [virtual]

Member Function Documentation
virtual double CFeatureRewardModel::getReward (int oldState, CAction *action, int newState) [virtual]

Returns the reward for a specific discrete state transition.
Calculates the mean reward of that transition, i.e. sum of
rewards / sum of visits.
Implements CFeatureRewardFunction.

double CFeatureRewardModel::getTransitionVisits (int oldState, int action, int newState) [protected]

Returns the transition visits of the specified state.
Returns the visits from the visit table or, if an estimated model
is assigned, retrieves them from the model.
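The choice between the two visit sources might look roughly like the following sketch. The names `VisitSourceSketch`, `VisitLookupSketch` and `ConstantVisitsSketch` are hypothetical, the visit table is reduced to a `std::map`, and a null model pointer is assumed to mean "no estimated model assigned".

```cpp
#include <cassert>
#include <map>
#include <tuple>

using Transition = std::tuple<int, int, int>;  // (oldState, action, newState)

// Hypothetical interface for an external estimated model that reports visits.
struct VisitSourceSketch {
    virtual ~VisitSourceSketch() = default;
    virtual double getVisits(const Transition &t) const = 0;
};

class VisitLookupSketch {
public:
    // Assumption: a null model means the object keeps its own visit table.
    explicit VisitLookupSketch(const VisitSourceSketch *model = nullptr)
        : model_(model) {}

    // Only meaningful when the own visit table is in use.
    void addVisit(const Transition &t, double weight = 1.0) {
        ownVisits_[t] += weight;
    }

    double getTransitionVisits(int oldState, int action, int newState) const {
        Transition t(oldState, action, newState);
        if (model_ != nullptr) {
            return model_->getVisits(t);  // delegate to the estimated model
        }
        auto it = ownVisits_.find(t);     // otherwise use the own table
        return it == ownVisits_.end() ? 0.0 : it->second;
    }

private:
    const VisitSourceSketch *model_;
    std::map<Transition, double> ownVisits_;
};

// Trivial model for illustration: reports a fixed count for every transition.
struct ConstantVisitsSketch : VisitSourceSketch {
    double count;
    explicit ConstantVisitsSketch(double c) : count(c) {}
    double getVisits(const Transition &) const override { return count; }
};
```

The return type is `double` rather than an integer count, matching the signature documented above.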

virtual void CFeatureRewardModel::loadData (FILE *stream) [virtual]

Loads the reward table, and also the visit table if no estimated
model was passed to the constructor.
Implements CLearnDataObject.

virtual void CFeatureRewardModel::resetData () [virtual]

virtual void CFeatureRewardModel::saveData (FILE *stream) [virtual]

Saves the reward model.
Saves the reward table, and also the visit table if it is used.
Implements CLearnDataObject.
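The conditional handling of the visit table in saveData and loadData can be sketched as below. The file format used here (an entry count followed by one line per transition) is purely hypothetical, since this page does not describe the Toolbox's actual format; `RewardDataSketch`, `saveTable`, `loadTable` and `usesOwnVisitTable` are illustrative names as well.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <tuple>

using Transition = std::tuple<int, int, int>;  // (oldState, action, newState)
using Table = std::map<Transition, double>;

// Hypothetical format: "<count>" followed by "old action new value" lines.
inline void saveTable(std::FILE *stream, const Table &table) {
    std::fprintf(stream, "%zu\n", table.size());
    for (const auto &entry : table) {
        std::fprintf(stream, "%d %d %d %.17g\n",
                     std::get<0>(entry.first), std::get<1>(entry.first),
                     std::get<2>(entry.first), entry.second);
    }
}

// Reads a table written by saveTable; returns false on malformed input.
inline bool loadTable(std::FILE *stream, Table &table) {
    std::size_t count = 0;
    if (std::fscanf(stream, "%zu", &count) != 1) {
        return false;
    }
    table.clear();
    for (std::size_t i = 0; i < count; ++i) {
        int oldState = 0, action = 0, newState = 0;
        double value = 0.0;
        if (std::fscanf(stream, "%d %d %d %lf",
                        &oldState, &action, &newState, &value) != 4) {
            return false;
        }
        table[Transition(oldState, action, newState)] = value;
    }
    return true;
}

struct RewardDataSketch {
    Table rewardTable;
    Table visitTable;
    bool usesOwnVisitTable = true;  // false when an estimated model is assigned

    void saveData(std::FILE *stream) const {
        saveTable(stream, rewardTable);
        if (usesOwnVisitTable) {    // the visit table is written only if used
            saveTable(stream, visitTable);
        }
    }

    bool loadData(std::FILE *stream) {
        if (!loadTable(stream, rewardTable)) {
            return false;
        }
        return !usesOwnVisitTable || loadTable(stream, visitTable);
    }
};
```

In this sketch, a file written without a visit table has to be loaded by an object that also has `usesOwnVisitTable` set to false, which mirrors the condition stated for loadData.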
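
Member Data Documentation
CAbstractFeatureStochasticEstimatedModel * CFeatureRewardModel::model [protected]

Used for calculating the visits; if it is set, no visit table is
needed.

CMyArray2D< CFeatureMap * > * CFeatureRewardModel::rewardTable [protected]

Table of the rewards for each transition, summed up over the whole
training trial.

CMyArray2D< CFeatureMap * > * CFeatureRewardModel::visitTable [protected]

Table of the transition visits, so the reward can be calculated
as the mean (sum of rewards / sum of visits).
Only used if no external estimated model is assigned.
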