Discretization
Had this talk first with Soham about discretizing lines into a discrete space, and I was like, this is hard?
Discretization is really an important problem that I must understand how to solve.
The hard part about discretizing an MDP is figuring out this new transition model.
There are 2 ways to do it, described in CS287 - Advanced Robotics:
- Nearest neighbour
- Stochastic Interpolation
Nearest Neighbour
Find the closest point, snapping to the nearest neighbor (round it off) There are 2 issues:
- Cheat by moving 0.5 (the bare minimum), and you can snap to the next one
- It seems like you are not moving, when you really are, just a little bit
Finer and finer grid can help solve these problems.
From an implementation perspective, you define your discrete grid self.state_points
. And then find the nearest points to you cont_state
by taking the argmin of the absolute difference values.
- Note: In the above we assume a transition matrix of size (S+1, A, S+1), see Value Iteration code as well
Another fastest approach instead to find the closest_i
is to use a KD Tree, which finds it in instead of which is the case for argmin
.
Ahh Serendipity: This was what Soham was doing with the KD Tree.
Stochastic Interpolation
Another solution is to be a little bit more stochastic.
In a 2D grid system,You take the 4 nearest discrete points, and use a transition probability by how close the continuous point is to each of the 4 discrete points.
- NOTE: The above is wrong, the and should switch positions in the diagram.
Implementation
I actually implement discretization in homework 1, which is super cool, because I get hands-on experience now.
Some examples of continuous spaces that I need to solve:
- Mountain Car
- Double Integral