foundations of computational agents
Bayes classifiers are discussed by Duda et al. and Langley et al. Friedman and Goldszmidt [1996a] discuss how the naive Bayes classifier can be generalized to allow for more appropriate independence assumptions. TAN networks are described by Friedman et al. Latent tree models are described by Zhang.
Overviews of Bayesian learning are given by Loredo, Jaynes, MacKay [2003], and Howson and Urbach. See also books on Bayesian statistics such as Gelman et al. and Bernardo and Smith. Bayesian learning of decision trees is described by Buntine. Grünwald discusses the MDL principle. Ghahramani reviews how Bayesian probability is used in AI.
For an overview of learning belief networks, see Heckerman, Darwiche, and Koller and Friedman. Structure learning using decision trees is based on Friedman and Goldszmidt [1996b]. The Bayesian information criterion is due to Schwarz. Note that our definition is slightly different; the definition of Schwarz is justified by a more complex Bayesian argument. Modeling missing data is discussed by Marlin et al. and Mohan and Pearl.