Trading performance for stability in Markov decision processes

Authors

BRÁZDIL Tomáš, CHATTERJEE Krishnendu, FOREJT Vojtěch, KUČERA Antonín

Year of publication 2017
Type Article in Periodical
Magazine / Source Journal of Computer and System Sciences
MU Faculty or unit

Faculty of Informatics

Citation
DOI http://dx.doi.org/10.1016/j.jcss.2016.09.009
Field Informatics
Keywords Markov decision processes; Mean payoff; Stability; Stochastic systems; Controller synthesis
Description We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the literature). We argue that the basic notion of expressing the stability using the statistical variance of the mean payoff is sometimes insufficient, and propose an alternative definition. We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions. We then show that the problem of finding such a strategy can be expressed as a set of constraints.
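To illustrate why the variance of the mean payoff alone can miss instability, the following minimal sketch compares two hypothetical deterministic reward sequences. The function names, the reward values, and the "per-step spread" measure are assumptions made for illustration only; they are not taken from the article and need not coincide with the alternative definition it proposes.

```python
from statistics import mean, pvariance


def mean_payoff(rewards):
    """Average reward per step over a finite prefix of a run."""
    return mean(rewards)


def per_step_spread(rewards):
    """Average squared deviation of the step rewards from the run's own
    mean payoff (an illustrative measure, assumed here, not the paper's)."""
    mp = mean_payoff(rewards)
    return mean((r - mp) ** 2 for r in rewards)


# Two hypothetical runs with the same mean payoff of 1 per step.
steady = [1] * 1000          # reward 1 at every step
alternating = [0, 2] * 500   # rewards oscillate between 0 and 2

runs = {"steady": steady, "alternating": alternating}

# Variance of the mean payoff taken across the runs: both runs have the
# same mean payoff, so this quantity cannot tell them apart.
across_runs_variance = pvariance([mean_payoff(r) for r in runs.values()])

for name, rewards in runs.items():
    print(name, mean_payoff(rewards), per_step_spread(rewards))
print("variance of mean payoff across runs:", across_runs_variance)
```

Both sequences have the same mean payoff, so the variance of the mean payoff is zero, yet only the per-step measure separates the steady sequence from the oscillating one.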