You can formulate pole balancing either as a Markov decision process (MDP) or as a classical control problem.
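To make the two views concrete, here is a minimal sketch, assuming the standard cart-pole setup (a pole hinged on a cart that is pushed left or right). The physical constants, the termination thresholds, and the names `step`, `feedback_policy`, and `rollout` are illustrative assumptions, not taken from any particular library. The MDP view is the `step` function (state, action, next state, reward, done); the control view is the hand-written feedback rule that picks the force from the measured angle.

```python
import math

# Assumed physical constants (common textbook cart-pole values).
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
HALF_LENGTH = 0.5          # half the pole length, in metres
FORCE_MAG = 10.0           # magnitude of the push, in newtons
DT = 0.02                  # integration time step, in seconds
X_LIMIT = 2.4              # cart position limit (assumed)
THETA_LIMIT = math.radians(12)  # pole angle limit (assumed)


def step(state, action):
    """MDP transition: returns (next_state, reward, done).

    state  = (x, x_dot, theta, theta_dot)
    action = 0 (push left) or 1 (push right)
    """
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    total_mass = MASS_CART + MASS_POLE
    pole_mass_length = MASS_POLE * HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)

    # Standard cart-pole equations of motion (frictionless).
    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    # Explicit Euler integration.
    next_state = (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )
    done = abs(next_state[0]) > X_LIMIT or abs(next_state[2]) > THETA_LIMIT
    reward = 1.0  # +1 for every step taken, as in the usual MDP formulation
    return next_state, reward, done


def feedback_policy(state, k_theta=1.0, k_theta_dot=1.0):
    """Control view: push toward the side the pole is falling to.

    The gains are illustrative, untuned assumptions.
    """
    _, _, theta, theta_dot = state
    return 1 if k_theta * theta + k_theta_dot * theta_dot > 0 else 0


def rollout(policy, state=(0.0, 0.0, 0.05, 0.0), max_steps=200):
    """Run one episode and return the number of steps taken."""
    steps = 0
    for _ in range(max_steps):
        state, _, done = step(state, policy(state))
        steps += 1
        if done:
            break
    return steps


if __name__ == "__main__":
    print("steps survived:", rollout(feedback_policy))
```

In the MDP view you would replace `feedback_policy` with a policy learned from the reward signal. In the control view you would design the feedback law from the dynamics directly, for example by linearising around the upright position.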