CPrioritizedSweeping Class Reference
Class for model-based Prioritized Sweeping.
#include <cprioritizedsweeping.h>
Public Member Functions
CPrioritizedSweeping (CFeatureQFunction *qFunction, CStateModifier *discretizer, CAbstractFeatureStochasticModel *model, CFeatureRewardFunction *rewardFunction, int kSteps)
CPrioritizedSweeping (CFeatureVFunction *vFunction, CStateModifier *discretizer, CAbstractFeatureStochasticModel *model, CFeatureRewardFunction *rewardFunction, int kSteps)
virtual ~CPrioritizedSweeping ()
virtual void nextStep (CStateCollection *oldState, CAction *action, CStateCollection *newState)
    Updates the features of the old State.
CFeatureCalculator * getFeatureCalculator ()
Protected Attributes
int kSteps
    kSteps updates are done each step.
Detailed Description
Class for model-based Prioritized Sweeping.
Prioritized Sweeping is used when the model is learned during the
training trial, and it is very similar to value iteration. Since
running full value iteration every time the model changes would be
too expensive, only the first k states from the priority list are
updated each step, starting with the current features.
CPrioritizedSweeping is a subclass of CValueIteration; its only
extension is that, each time a nextStep event occurs, the current
features are updated (which adds the backward states of the current
features to the priority list) and then the first k states from the
list are updated. As a subclass of CValueIteration, it provides the
full functionality for state updates. The class always learns the
value function of the greedy policy, so it is the optimal value
function.
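The update cycle described above can be illustrated with a small, self-contained sketch. It does not use the library's classes: ToyPrioritizedSweeping, Transition and every member name below are hypothetical, the model is a deterministic table rather than a CAbstractFeatureStochasticModel, and plain state indices stand in for features. It only shows the general pattern of backing up the current state, raising the priority of its backward states, and then spending a budget of kSteps updates on the highest-priority states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy types; none of these names come from the library.
struct Transition {
    int nextState;
    double reward;
};

class ToyPrioritizedSweeping {
public:
    ToyPrioritizedSweeping(int numStates, int numActions, double gamma, int kSteps)
        : q_(numStates, std::vector<double>(numActions, 0.0)),
          priority_(numStates, 0.0),
          predecessors_(numStates),
          gamma_(gamma),
          kSteps_(kSteps) {
        assert(numStates > 0 && numActions > 0);
    }

    // Analogue of a nextStep event: record the observed transition, back up
    // the old state, then update up to kSteps states from the priority list.
    void nextStep(int oldState, int action, double reward, int newState) {
        model_[std::make_pair(oldState, action)] = Transition{newState, reward};
        predecessors_[newState].insert(oldState);
        updateState(oldState);
        for (int i = 0; i < kSteps_; ++i) {
            int s = popMaxPriority();
            if (s < 0) break;  // nothing left worth updating
            updateState(s);
        }
    }

    // Value of a state under the greedy policy: max over its Q-values.
    double value(int state) const {
        return *std::max_element(q_[state].begin(), q_[state].end());
    }

private:
    // Backs up every modeled action of one state and raises the priority of
    // its backward (predecessor) states by the size of the value change.
    void updateState(int state) {
        double oldValue = value(state);
        for (std::size_t a = 0; a < q_[state].size(); ++a) {
            auto it = model_.find(std::make_pair(state, static_cast<int>(a)));
            if (it == model_.end()) continue;  // action not observed yet
            q_[state][a] = it->second.reward + gamma_ * value(it->second.nextState);
        }
        double change = std::fabs(value(state) - oldValue);
        for (int p : predecessors_[state])
            priority_[p] = std::max(priority_[p], change);
    }

    // Returns the state with the highest priority and resets it, or -1 if
    // every priority is (numerically) zero. A linear scan keeps the sketch short.
    int popMaxPriority() {
        auto it = std::max_element(priority_.begin(), priority_.end());
        if (*it <= 1e-9) return -1;
        *it = 0.0;
        return static_cast<int>(it - priority_.begin());
    }

    std::vector<std::vector<double>> q_;
    std::vector<double> priority_;
    std::vector<std::set<int>> predecessors_;
    std::map<std::pair<int, int>, Transition> model_;
    double gamma_;
    int kSteps_;
};
```

On a chain 0 -> 1 -> 2 -> 3 with a reward of 1 on the last transition and gamma = 0.5, a budget of kSteps = 2 lets the reward reach state 0 during the very step in which it is first observed, while kSteps = 1 only propagates it as far as state 1. A real implementation would typically replace the linear scan with a proper priority queue.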
The class only provides Q-Function learning because the
Q-Function is needed for the policies. Additionally, it takes a
model (CAbstractFeatureStochasticModel)
and a feature reward function as parameters. The model (and the
reward function, if it is a reward model itself) has to be added to
the agent's listener list before the prioritized sweeping algorithm
is added.
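The ordering requirement can be made concrete with a toy sketch. It assumes that listeners are notified in the order in which they were added, which is a plausible reading of the requirement but is not stated here. StepListener, ToyModel, ToyPlanner, ToyAgent and addListener are hypothetical stand-ins and not the library's classes or methods.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; they only mimic the roles named in the text.
struct StepListener {
    virtual ~StepListener() = default;
    virtual void nextStep(int oldState, int action, int newState) = 0;
};

// Plays the role of the learned model: it counts observed transitions.
struct ToyModel : StepListener {
    int observedTransitions = 0;
    void nextStep(int, int, int) override { ++observedTransitions; }
};

// Plays the role of the planner: it reads the model on every step and
// records how many transitions the model knew about at that moment.
struct ToyPlanner : StepListener {
    explicit ToyPlanner(const ToyModel* model) : model_(model) {
        assert(model != nullptr);
    }
    void nextStep(int, int, int) override {
        seenByPlanner.push_back(model_->observedTransitions);
    }
    std::vector<int> seenByPlanner;

private:
    const ToyModel* model_;
};

// Assumption: listeners are called in registration order.
struct ToyAgent {
    void addListener(StepListener* listener) { listeners_.push_back(listener); }
    void step(int oldState, int action, int newState) {
        for (StepListener* l : listeners_) l->nextStep(oldState, action, newState);
    }

private:
    std::vector<StepListener*> listeners_;
};

// Returns how many transitions the model had seen when the planner ran
// on the first step, for either registration order.
int transitionsVisibleToPlanner(bool modelFirst) {
    ToyModel model;
    ToyPlanner planner(&model);
    ToyAgent agent;
    if (modelFirst) {
        agent.addListener(&model);
        agent.addListener(&planner);
    } else {
        agent.addListener(&planner);
        agent.addListener(&model);
    }
    agent.step(0, 0, 1);
    return planner.seenByPlanner.front();
}
```

Under that assumption, registering the model first means the planner already sees the newest transition when it runs; with the reverse order it plans on a model that is one step out of date.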
Constructor & Destructor Documentation
virtual CPrioritizedSweeping::~CPrioritizedSweeping () [virtual]
Member Function Documentation
virtual void CPrioritizedSweeping::nextStep (CStateCollection *oldState, CAction *action, CStateCollection *newState) [virtual]
Updates the features of the old State.
Retrieves the state from the state collection (using the modifier
pointer given to the constructor) and updates each feature, so the
priorities of the backward states are updated as well. After that,
the first kSteps states in the list are updated.
Reimplemented from CSemiMDPListener.
Member Data Documentation
int CPrioritizedSweeping::kSteps [protected]
kSteps updates are done each step.