Fast Budgeted MDPs

A Markov decision process (MDP) with n states can be solved by the value iteration algorithm in O(n^2) time, as opposed to the Omega(n^3) time required by a linear program. Adding expected budget constraints on the MDP policy breaks the value iteration algorithm, but such constraints are easily captured in the linear programming formulation. We present the fastest known algorithms for MDPs with expected budget constraints, achieving running times comparable to those of value iteration. Specifically, we give two algorithms for solving MDPs with m expected budget constraints: one computes the exact solution in O(poly(m) n^2) time, and the other an approximately feasible solution in O(log(m) n^2) time.
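To make the running-time comparison concrete, a minimal sketch of the value iteration algorithm the abstract refers to is below. Each Bellman sweep costs O(|A| n^2), i.e. O(n^2) for a fixed action set; the toy transition and reward numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration for a finite discounted MDP.

    P: shape (A, n, n); P[a, s, t] = Pr(next state t | state s, action a).
    R: shape (A, n);    R[a, s]    = expected reward of action a in state s.
    Each sweep is a dense matrix-vector pass costing O(A * n^2).
    """
    A, n, _ = P.shape
    V = np.zeros(n)
    for _ in range(max_iters):
        # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)

# Toy 2-state, 2-action MDP (hypothetical numbers).
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
V, policy = value_iteration(P, R)
```

The returned `V` satisfies the Bellman optimality equation up to the tolerance, and `policy` is the greedy action in each state.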

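The claim that budget constraints are easily captured in the linear program can be illustrated with the standard occupation-measure LP for a discounted MDP, where an expected budget constraint becomes one extra linear row. This is a sketch using `scipy.optimize.linprog`, not the paper's algorithm; all numbers (costs `C`, budget `B`, initial distribution `mu`) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

gamma = 0.9
n, nA = 2, 2                       # states, actions
P = np.array([[[0.8, 0.2],         # P[a, s, t] = Pr(t | s, a)
               [0.1, 0.9]],
              [[0.5, 0.5],
               [0.6, 0.4]]])
R = np.array([[1.0, 0.0],          # R[a, s]: expected reward
              [0.5, 2.0]])
C = np.array([[2.0, 0.0],          # C[a, s]: expected cost (hypothetical)
              [1.0, 3.0]])
mu = np.array([0.5, 0.5])          # initial state distribution
B = 5.0                            # expected discounted budget

# Variable x[s, a]: discounted occupation measure, flattened as s * nA + a.
nvar = n * nA
# Flow-conservation rows: sum_a x[t, a] - gamma * sum_{s,a} P[a,s,t] x[s,a] = mu[t]
A_eq = np.zeros((n, nvar))
for t in range(n):
    for s in range(n):
        for a in range(nA):
            A_eq[t, s * nA + a] = (1.0 if s == t else 0.0) - gamma * P[a, s, t]
b_eq = mu

# The budget constraint is a single extra inequality row:
# sum_{s,a} C[a, s] * x[s, a] <= B.
A_ub = np.array([[C[a, s] for s in range(n) for a in range(nA)]])
b_ub = np.array([B])

# Maximize expected discounted reward, i.e. minimize its negation.
obj = np.array([-R[a, s] for s in range(n) for a in range(nA)])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * nvar)
x = res.x.reshape(n, nA)           # optimal (possibly randomized) policy
```

Normalizing each row of `x` gives the action probabilities of the optimal budget-feasible policy; the cost of this generality is solving an LP, which is exactly the Omega(n^3) overhead the abstract's algorithms aim to avoid.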

© Copyright 2004-2020 - Ned Dimitrov