CAdaptiveParameterFromAverageRewardCalculator Class Reference
Adaptive Parameter Calculator which calculates the parameter's value from the current average reward.
#include <cagentlistener.h>
Public Member Functions

    CAdaptiveParameterFromAverageRewardCalculator (CParameters *targetObject, string targetParameter, CRewardFunction *reward, int nStepsPerUpdate, int functionKind, double paramMin, double paramMax, double targetMin, double targetMax, double alpha)

    ~CAdaptiveParameterFromAverageRewardCalculator ()

    virtual void nextStep (CStateCollection *oldState, CAction *action, double reward, CStateCollection *newState)
        Virtual function, to be implemented by the subclass.

    virtual void onParametersChanged ()
        Updates all data elements that represent parameters.

    virtual void resetCalculator ()
        Reset the targetValue.

Protected Attributes

    double alpha
    double targetValue
    int nSteps
    int nStepsPerUpdate

Detailed Description
Adaptive Parameter Calculator which calculates the parameter's value from the current average reward.
The target value in this class is the current average reward. The target value is reset to the minimum expected reward when a new learning trial starts. This adaptive parameter has to be added to the agent's listener list so that it can calculate the average reward. The average reward is updated dynamically with the formula

    averageReward_{t+1} = averageReward_t * alpha + reward_{t+1} * (1 - alpha)

Alpha can be set with the parameter "APRewardUpdateRate" and defines the update rate of the average reward. Because alpha weights the previous average, values close to 1 make the average change slowly. Alpha should be chosen close to 0.99 to get good results. The average reward is not reset when a new episode begins. For more details see the super class.
Parameters of CAdaptiveParameterFromNStepsCalculator:
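As an illustration of the average-reward update formula given in the description above, the following is a minimal standalone sketch. The function names `updateAverageReward` and `averageRewardAfter` are hypothetical helpers invented for this example; they are not part of the documented class, and the actual implementation may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the documented class): one step of the
// update averageReward_{t+1} = averageReward_t * alpha + reward_{t+1} * (1 - alpha).
double updateAverageReward(double averageReward, double reward, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0); // alpha is a weight on the old average
    return averageReward * alpha + reward * (1.0 - alpha);
}

// Hypothetical helper: applies the update 'steps' times with a constant reward,
// to show how slowly the average moves when alpha is close to 1.
double averageRewardAfter(double start, double reward, double alpha, int steps) {
    double averageReward = start;
    for (int t = 0; t < steps; ++t) {
        averageReward = updateAverageReward(averageReward, reward, alpha);
    }
    return averageReward;
}
```

With alpha = 0.99 each new reward contributes only 1% to the average, so under a constant reward the remaining gap to that reward shrinks by a factor of 0.99 per step. This is one way to read the recommendation to choose alpha close to 0.99: the average is smooth, but it needs several hundred steps to follow a change in reward level.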
Constructor & Destructor Documentation
CAdaptiveParameterFromAverageRewardCalculator::CAdaptiveParameterFromAverageRewardCalculator (CParameters *targetObject,
        string targetParameter,
        CRewardFunction *reward,
        int nStepsPerUpdate,
        int functionKind,
        double paramMin,
        double paramMax,
        double targetMin,
        double targetMax,
        double alpha)

CAdaptiveParameterFromAverageRewardCalculator::~CAdaptiveParameterFromAverageRewardCalculator ()

Member Function Documentation
virtual void CAdaptiveParameterFromAverageRewardCalculator::onParametersChanged () [virtual]

virtual void CAdaptiveParameterFromAverageRewardCalculator::resetCalculator () [virtual]
    Reset the targetValue.
    This function is used to reset state such as the step count or the number of episodes when learning is restarted (used for parameter evaluation).
    Implements CAdaptiveParameterCalculator.

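To make the reset behaviour of resetCalculator concrete, here is a small sketch under stated assumptions. `AverageRewardTrackerSketch` is a hypothetical stand-in that mirrors only the member names listed in this reference (alpha, targetValue, nSteps); it does not derive from the real base class. Resetting targetValue to the minimum expected reward follows the detailed description, while clearing nSteps at the same time is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the documented class, for illustration only.
struct AverageRewardTrackerSketch {
    double alpha;        // weight kept on the previous average
    double targetMin;    // minimum expected reward
    double targetValue;  // current average reward
    int nSteps;          // steps seen since the last reset

    AverageRewardTrackerSketch(double alpha_, double targetMin_)
        : alpha(alpha_), targetMin(targetMin_), targetValue(targetMin_), nSteps(0) {
        assert(alpha >= 0.0 && alpha <= 1.0);
    }

    // Called once per agent step: fold the new reward into the running average.
    void nextStep(double reward) {
        targetValue = targetValue * alpha + reward * (1.0 - alpha);
        ++nSteps;
    }

    // Called when a new learning trial starts: fall back to the minimum
    // expected reward. Clearing nSteps here is an assumption.
    void resetCalculator() {
        targetValue = targetMin;
        nSteps = 0;
    }
};
```

In this sketch the tracker is never reset between episodes, only when resetCalculator is called explicitly, which matches the statement that the average reward is kept when a new episode begins.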
Member Data Documentation
The documentation for this class was generated from the following
file: