A Markov decision process (MDP) with n states can be solved using the value iteration algorithm in O(n^2) time, as opposed to the Omega(n^3) time required if one uses a linear program. Adding expected budget constraints on the MDP policy breaks the value iteration algorithm, but such constraints can be easily captured in the linear program formulation. We show the fastest known algorithms for MDPs with expected budget constraints, achieving running times comparable to those of value iteration. In specific, we show two algorithms for solving MDPs with m expected budget constraints, giving the exact solution in O(poly(m) n^2) time or an approximately feasible solution in O(log(m) n^2) time.

Comments

Warning: Creating default object from empty value in /home/nedd/neddimitrov.org/modules/CGExtensions/CGExtensions.module.php on line 630

## Add A Comment