Class VIcore

java.lang.Object
  extended by VIcore

public class VIcore
extends java.lang.Object

This class is part of an applet that demonstrates value iteration for a particular grid-world problem. It is not designed to be general or reusable. It contains the core code that performs the value iteration.

The code is available at VIcore.java. You also need VIgui.java.

Copyright (C) 2006-2007 David Poole.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.


Field Summary
 boolean absorbing
           
 double discount
           
 double[][][] qvalues
          qvalues[x][y][a] gives the Q-value for doing action a in the (x,y) state
 double[][] values
          values[x][y] gives the Value for the (x,y) state
 
Constructor Summary
VIcore()
           
 
Method Summary
 double contribution(int xval, int yval, int dir)
          determines the contribution to the q-value if the agent actually went in direction dir from the (xval,yval) location.
 void doreset(double initVal)
          resets the Q-values.
 void dostep(double newdiscount)
          does one step of value iteration
 double q(int xval, int yval, int action)
          computes the next Q-value from the previous value function
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

values

public double[][] values
values[x][y] gives the Value for the (x,y) state


qvalues

public double[][][] qvalues
qvalues[x][y][a] gives the Q-value for doing action a in the (x,y) state
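
The relationship between the two arrays can be sketched as follows, assuming the usual value-iteration convention that the value of a state is the largest of its Q-values over all actions. The class ValuesFromQSketch and the method valuesFromQ are hypothetical names used only for illustration; they are not part of VIcore.

```java
// Hypothetical helper, not part of VIcore: derives state values from Q-values.
final class ValuesFromQSketch {
    /** Returns v where v[x][y] is the largest q[x][y][a] over all actions a. */
    static double[][] valuesFromQ(double[][][] q) {
        double[][] v = new double[q.length][];
        for (int x = 0; x < q.length; x++) {
            v[x] = new double[q[x].length];
            for (int y = 0; y < q[x].length; y++) {
                double best = Double.NEGATIVE_INFINITY;
                for (double qa : q[x][y]) {
                    best = Math.max(best, qa);
                }
                v[x][y] = best;
            }
        }
        return v;
    }
}
```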


discount

public double discount

absorbing

public boolean absorbing
Constructor Detail

VIcore

public VIcore()
Method Detail

dostep

public void dostep(double newdiscount)
does one step of value iteration

Parameters:
newdiscount - the discount to use
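
A minimal sketch of what one step of value iteration might look like, assuming a synchronous update in which every Q-value is first recomputed from the previous value function and the values are refreshed afterwards. This is not the VIcore source: the class SweepSketch, the QBackup interface, and the step method are hypothetical, with QBackup standing in for a backup such as the q method.

```java
// Hypothetical sketch, not the VIcore source: one synchronous value-iteration step.
final class SweepSketch {
    /** Assumed stand-in for a Q-value backup such as VIcore.q. */
    interface QBackup {
        double q(double[][] values, int x, int y, int action, double discount);
    }

    /** Recomputes every Q-value from the old values, then refreshes the values. */
    static void step(double[][] values, double[][][] qvalues,
                     double discount, QBackup backup) {
        // Phase 1: all Q-values are computed from the previous value function.
        for (int x = 0; x < values.length; x++) {
            for (int y = 0; y < values[x].length; y++) {
                for (int a = 0; a < qvalues[x][y].length; a++) {
                    qvalues[x][y][a] = backup.q(values, x, y, a, discount);
                }
            }
        }
        // Phase 2: each state's value becomes its best Q-value.
        for (int x = 0; x < values.length; x++) {
            for (int y = 0; y < values[x].length; y++) {
                double best = Double.NEGATIVE_INFINITY;
                for (double qa : qvalues[x][y]) {
                    best = Math.max(best, qa);
                }
                values[x][y] = best;
            }
        }
    }
}
```

Computing all Q-values before touching the values keeps the step synchronous. Updating the values in place inside the first loop would give an asynchronous variant instead.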

q

public double q(int xval,
                int yval,
                int action)
computes the next Q-value from the previous value function


contribution

public double contribution(int xval,
                           int yval,
                           int dir)
determines the contribution to the q-value if the agent actually went in direction dir from the (xval,yval) location.

Parameters:
xval - the x-position
yval - the y-position
dir - the direction the agent goes (not the action)
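
The distinction between the chosen action and the direction actually travelled suggests a stochastic grid world. The sketch below shows one way q and contribution could fit together under assumptions that are not stated in this documentation: four directions encoded as 0 to 3, a 0.7 probability of moving in the intended direction with the remainder split evenly among the other three, a reward of -1 for bumping into the grid boundary, and no other rewards. The class ContributionSketch and its constants are hypothetical.

```java
// Hypothetical sketch, not the VIcore source: a Q-value as a probability-weighted
// sum of per-direction contributions.
final class ContributionSketch {
    static final int[] DX = {0, 1, 0, -1};    // assumed encoding: up, right, down, left
    static final int[] DY = {-1, 0, 1, 0};
    static final double P_INTENDED = 0.7;     // assumed chance of going the intended way
    static final double WALL_PENALTY = -1.0;  // assumed reward for hitting the boundary

    /** Reward plus discounted value of the state reached by actually moving in dir. */
    static double contribution(double[][] values, double discount,
                               int x, int y, int dir) {
        int nx = x + DX[dir];
        int ny = y + DY[dir];
        boolean offGrid = nx < 0 || nx >= values.length
                       || ny < 0 || ny >= values[0].length;
        if (offGrid) {
            // The agent stays where it is and receives the penalty.
            return WALL_PENALTY + discount * values[x][y];
        }
        return discount * values[nx][ny];
    }

    /** Expected contribution over all directions the agent might end up taking. */
    static double q(double[][] values, double discount, int x, int y, int action) {
        double pOther = (1.0 - P_INTENDED) / 3.0;
        double total = 0.0;
        for (int dir = 0; dir < 4; dir++) {
            double p = (dir == action) ? P_INTENDED : pOther;
            total += p * contribution(values, discount, x, y, dir);
        }
        return total;
    }
}
```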

doreset

public void doreset(double initVal)
resets the Q-values. Sets all of the Q-values to initVal and all of the visit counts to 0.

Parameters:
initVal - the initial value to set all values to
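
A reset of this kind can be sketched with java.util.Arrays.fill. The class ResetSketch and its reset method are hypothetical; the sketch assumes that the values array is reset together with the Q-values, and it omits visit counts because no such field is documented for this class.

```java
// Hypothetical sketch, not the VIcore source: resets every value and Q-value.
final class ResetSketch {
    /** Sets every entry of values and qvalues to initVal. */
    static void reset(double[][] values, double[][][] qvalues, double initVal) {
        for (double[] column : values) {
            java.util.Arrays.fill(column, initVal);
        }
        for (double[][] column : qvalues) {
            for (double[] actionValues : column) {
                java.util.Arrays.fill(actionValues, initVal);
            }
        }
    }
}
```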