CQStochasticPolicy Class Reference
Stochastic policy which computes its probabilities from the Q-values of a given Q-function.
#include <cpolicies.h>
Public Member Functions
CQStochasticPolicy(CActionSet *actions, CActionDistribution *distribution, CAbstractQFunction *qfunction)

~CQStochasticPolicy()

virtual void getActionValues(CStateCollection *state, CActionSet *availableActions, double *actionValues, CActionDataSet *actionDataSet=NULL)
    Interface function for calculating the action ratings, has to be implemented by the subclasses.

virtual void getActionGradient(CStateCollection *state, CAction *action, CActionData *data, CFeatureList *gradientState)
    Interface function for calculating the derivative of an action factor.

virtual bool isDifferentiable()

virtual CAbstractQFunction *getQFunction()

Protected Member Functions
virtual void getActionStatistics(CStateCollection *state, CAction *action, CActionStatistics *stat)
    Returns the action statistics object from the Q-function.

Protected Attributes
CAbstractQFunction *qfunction
    Q-function of the policy, needed for action decision.

Detailed Description
Stochastic policy which computes its probabilities from the Q-values of a given Q-function.
This stochastic policy calculates its action ratings from the given
Q-function. The getActionValues function writes the Q-values of the
available actions into the actionValues array. Q-stochastic policies also
support gradient calculation. The policy is differentiable if both the
distribution and the Q-function are differentiable. The gradient
d_actionratings(action)/dw calculated in the function
getActionGradient is the same as dQ(s,a)/dw.
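The two steps described above (Q-values as action ratings, then a distribution over those ratings) can be illustrated with a small standalone sketch. It does not use the library classes: LinearQ, phi, weights, softMaxInPlace and actionProbabilities are hypothetical names, the Q-function is assumed to be linear in a feature vector, and a softmax with inverse temperature beta is assumed as the distribution. The actual policy accepts any CActionDistribution and CAbstractQFunction.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Q-function (not a library type).
// Assumes a linear form: Q(s, a) = sum_j weights[a][j] * phi[j].
struct LinearQ {
    std::vector<std::vector<double>> weights;  // one weight vector per action

    // Mirrors the role of getActionValues(): writes Q(s, a_i) into actionValues[i].
    void getActionValues(const std::vector<double>& phi, double* actionValues) const {
        for (std::size_t a = 0; a < weights.size(); ++a) {
            assert(weights[a].size() == phi.size());
            double q = 0.0;
            for (std::size_t j = 0; j < phi.size(); ++j) {
                q += weights[a][j] * phi[j];
            }
            actionValues[a] = q;
        }
    }
};

// Assumed distribution: softmax with inverse temperature beta, applied in place.
// Subtracting the largest rating first avoids overflow in exp() and leaves
// the resulting probabilities unchanged.
void softMaxInPlace(double* values, std::size_t n, double beta) {
    assert(n > 0);
    double maxValue = values[0];
    for (std::size_t i = 1; i < n; ++i) {
        if (values[i] > maxValue) maxValue = values[i];
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = std::exp(beta * (values[i] - maxValue));
        sum += values[i];
    }
    for (std::size_t i = 0; i < n; ++i) {
        values[i] /= sum;
    }
}

// Ratings first, then the distribution.
std::vector<double> actionProbabilities(const LinearQ& q,
                                        const std::vector<double>& phi,
                                        double beta) {
    std::vector<double> p(q.weights.size());
    q.getActionValues(phi, p.data());
    softMaxInPlace(p.data(), p.size(), beta);
    return p;
}
```

With two actions whose Q-values are equal, this sketch assigns each action the same probability; raising beta concentrates the probability mass on the action with the larger Q-value.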
Constructor & Destructor Documentation
CQStochasticPolicy::~CQStochasticPolicy()

Member Function Documentation
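virtual void CQStochasticPolicy::getActionGradient(CStateCollection *state, CAction *action, CActionData *data, CFeatureList *gradientState) [virtual]

Interface function for calculating the derivative of an action
factor.
The function has to calculate d_actionratings(action)/dw, which
is, for example, dQ(s,a)/dw.
Reimplemented from CStochasticPolicy.
Reimplemented in CVMStochasticPolicy.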
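As an illustration of what dQ(s,a)/dw looks like in a simple case, the following standalone sketch assumes a Q-function that is linear in a feature vector phi(s), so the gradient with respect to the weights of action a is phi(s) itself and zero everywhere else. FlatLinearQ, SparseGradient and finiteDifference are hypothetical names introduced for this example; SparseGradient only plays the role of the gradientState output list and is not the library's CFeatureList.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sparse gradient: (weight index, partial derivative) pairs.
using SparseGradient = std::vector<std::pair<std::size_t, double>>;

// Assumed linear Q-function with all weights in one flat vector.
// The weight of feature j for action a sits at index a * numFeatures + j.
struct FlatLinearQ {
    std::size_t numFeatures;
    std::vector<double> w;

    double value(const std::vector<double>& phi, std::size_t action) const {
        assert(phi.size() == numFeatures);
        assert((action + 1) * numFeatures <= w.size());
        double q = 0.0;
        for (std::size_t j = 0; j < numFeatures; ++j) {
            q += w[action * numFeatures + j] * phi[j];
        }
        return q;
    }

    // dQ(s,a)/dw for the linear case: phi(s) in the slots of the chosen
    // action. Zero entries are omitted from the sparse list.
    void getActionGradient(const std::vector<double>& phi, std::size_t action,
                           SparseGradient& gradient) const {
        assert(phi.size() == numFeatures);
        for (std::size_t j = 0; j < numFeatures; ++j) {
            if (phi[j] != 0.0) {
                gradient.emplace_back(action * numFeatures + j, phi[j]);
            }
        }
    }
};

// Central finite difference of Q(s,a) with respect to one weight.
// Takes the Q-function by value so the caller's weights stay untouched.
// Used only to check the analytic gradient above.
double finiteDifference(FlatLinearQ q, const std::vector<double>& phi,
                        std::size_t action, std::size_t index, double eps) {
    q.w[index] += eps;
    const double plus = q.value(phi, action);
    q.w[index] -= 2.0 * eps;
    const double minus = q.value(phi, action);
    return (plus - minus) / (2.0 * eps);
}
```

Comparing the entries written by getActionGradient with finiteDifference for the same weight index is a cheap way to validate a gradient implementation before using it in a policy-gradient learner.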
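virtual void CQStochasticPolicy::getActionStatistics(CStateCollection *state, CAction *action, CActionStatistics *stat) [protected, virtual]

Returns the action statistics object from the Q-function.
Reimplemented from CStochasticPolicy.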
virtual bool CQStochasticPolicy::isDifferentiable() [virtual]

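The class description states that the policy is differentiable only if both the distribution and the Q-function are differentiable. A minimal sketch of that rule follows, using hypothetical stand-in types (SketchDistribution, SketchQFunction, SketchQPolicy) rather than the library classes; how the real components report differentiability is not shown in this reference and is assumed away here as a plain flag.

```cpp
#include <cassert>

// Hypothetical stand-ins: each component simply carries a flag.
struct SketchDistribution { bool differentiable; };
struct SketchQFunction    { bool differentiable; };

struct SketchQPolicy {
    const SketchDistribution* distribution;
    const SketchQFunction* qfunction;

    // Differentiable only when both parts are; either one being
    // non-differentiable makes the gradient of the policy unavailable.
    bool isDifferentiable() const {
        assert(distribution != nullptr && qfunction != nullptr);
        return distribution->differentiable && qfunction->differentiable;
    }
};
```

For example, a smooth distribution combined with a Q-function that provides gradients yields a differentiable policy, while a hard argmax-style distribution does not, regardless of the Q-function.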
Member Data Documentation
CAbstractQFunction *CQStochasticPolicy::qfunction [protected]

Q-function of the policy, needed for action decision.
