Utility theory, as presented here, was developed by von Neumann and Morgenstern [1953] and further developed by Savage [1972]. Keeney and Raiffa [1976] discuss utility theory, concentrating on multi-attribute (feature-based) utility functions. The axioms for discounting are by Koopmans [1972]; Bleichrodt et al. [2008] provide a debugged version and a proof. For work on graphical models of utility and preferences, see Bacchus and Grove [1995] and Boutilier et al. [2004]. Walsh [2007] and Rossi et al. [2011] give overviews of the use of preferences in AI.
Kahneman [2011] discusses the psychology behind how people make decisions under uncertainty and motivates prospect theory. Wakker [2010] provides a textbook overview of utility and prospect theories.
Decision networks, also called influence diagrams, were invented by Howard and Matheson [1984]. A method using dynamic programming for solving influence diagrams can be found in Shachter and Peot [1992]. The value of information and control is discussed by Matheson [1990].
MDPs were invented by Bellman [1957] and are discussed by Puterman [1994] and Bertsekas [2017]. Mausam and Kolobov [2012] give an overview of MDPs in AI. Boutilier et al. [1999] review lifting MDPs to features, known as decision-theoretic planning.