Artificial Intelligence - foundations of computational agents -- 9.3 Sequential Decisions

9.3 Sequential Decisions

Generally, agents do not make decisions in the dark without observing something about the world, nor do they make just a single decision. A more typical scenario is that the agent makes an observation, decides on an action, carries out that action, makes observations in the resulting world, then makes another decision conditioned on the observations, and so on. Subsequent actions can depend on what is observed, and what is observed can depend on previous actions. In this scenario, the sole reason for carrying out an action is often to provide information for future actions.
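The observe, decide, act cycle described above can be sketched as a short loop. This is only an illustrative sketch: the names run_agent, make_world, and look_then_open, and the toy world with a prize hidden on one side, are assumptions made for this example and are not part of the text.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (observation, action) pairs seen so far


def run_agent(policy: Callable[[History, str], str],
              step: Callable[[str], str],
              initial_observation: str,
              num_decisions: int) -> Tuple[History, str]:
    """Run an observe-decide-act loop; return the history and the final observation."""
    history: History = []
    observation = initial_observation
    for _ in range(num_decisions):
        # The decision is conditioned on everything observed so far.
        action = policy(history, observation)
        history.append((observation, action))
        # Acting determines what is observed next.
        observation = step(action)
    return history, observation


def make_world(prize_side: str) -> Callable[[str], str]:
    """Hypothetical toy world: a prize is hidden on the 'left' or the 'right'."""
    def step(action: str) -> str:
        if action == "look":
            return prize_side  # an action whose only effect is information
        return "prize" if action == "open_" + prize_side else "empty"
    return step


def look_then_open(history: History, observation: str) -> str:
    """Look first, then open whichever side was observed."""
    if observation in ("left", "right"):
        return "open_" + observation
    return "look"


history, final_observation = run_agent(look_then_open, make_world("right"), "nothing", 2)
print(history, final_observation)
```

In this toy world the "look" action changes nothing except what the agent knows, which is the kind of action whose sole purpose is to inform a later decision.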

A sequential decision problem is a sequence of decisions, where for each decision you should consider:

what actions are available to the agent;
what information is, or will be, available to the agent when it has to act;
the effects of the actions; and
the desirability of these effects.
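One possible way to record the first two items of this list in code is sketched below. The Decision class, its field names, and the "sense"/"act" decisions are hypothetical and chosen only for illustration. The effects of actions and their desirability are left out here; they would be given separately, for instance by a probabilistic model and a utility function.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List


@dataclass
class Decision:
    """Hypothetical record of one decision in a sequential decision problem."""
    name: str
    actions: List[str]                 # what actions are available
    information: Dict[str, List[str]]  # observed variable -> its possible values

    def contexts(self) -> List[Dict[str, str]]:
        """Every combination of observed values the agent may face when it acts."""
        names = list(self.information)
        domains = [self.information[n] for n in names]
        return [dict(zip(names, values)) for values in product(*domains)]


# A two-stage problem: the second decision can see the outcome of the first.
sense = Decision("sense", ["probe", "skip"], {})
act = Decision("act", ["go", "stay"], {"reading": ["high", "low", "none"]})

print(sense.contexts())  # a single empty context: nothing is observed yet
print(act.contexts())    # one context per possible reading
```

Listing the contexts makes explicit that the agent has to choose an action for each situation it might observe, not one action per decision.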

Example 9.10: Consider a simple case of diagnosis where a doctor first gets to choose some tests and then gets to treat the patient, taking into account the outcome of the tests. The reason the doctor may decide to do a test is so that some information (the test results) will be available at the next stage when treatment may be performed. The test results are available when the treatment is decided, but not when the test is decided. It is often a good idea to test, even if testing itself can harm the patient.

The actions available are the possible tests and the possible treatments. When the test decision is made, the information available will be the symptoms exhibited by the patient. When the treatment decision is made, the information available will be the patient's symptoms, what tests were performed, and the test results. The effect of the test is the test result, which depends on what test was performed and what is wrong with the patient. The effect of the treatment is some function of the treatment and what is wrong with the patient. The utility includes, for example, costs of tests and treatments, the pain and inconvenience to the patient in the short term, and the long-term prognosis.
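To make the trade-off in this example concrete, the following sketch compares testing with not testing by expected utility. Every number in it (the prior probability of disease, the test accuracy, the utilities, and the test cost) is invented for illustration, and the model is simplified to a single test, a single disease, and two treatments ("treat" or "wait"). None of this comes from the text.

```python
# All numbers below are made-up assumptions for illustration only.
P_DISEASE = 0.3                          # prior probability the patient has the disease
P_POSITIVE = {True: 0.9, False: 0.2}     # P(test positive | disease present?)
TEST_COST = 5.0                          # cost and harm of testing, in utility units
OUTCOME_UTILITY = {                      # (disease, treatment) -> utility before test cost
    (True, "treat"): 80.0, (True, "wait"): 0.0,
    (False, "treat"): 70.0, (False, "wait"): 100.0,
}


def joint(disease: bool, do_test: bool, result: str) -> float:
    """P(disease, result) given the test decision; result is 'none' if no test is done."""
    p_d = P_DISEASE if disease else 1 - P_DISEASE
    if not do_test:
        return p_d if result == "none" else 0.0
    if result == "none":
        return 0.0
    p_pos = P_POSITIVE[disease]
    return p_d * (p_pos if result == "positive" else 1 - p_pos)


def utility(disease: bool, do_test: bool, treatment: str) -> float:
    return OUTCOME_UTILITY[(disease, treatment)] - (TEST_COST if do_test else 0.0)


def evaluate_plans():
    """For each test decision, pick the best treatment per test result.

    Returns {do_test: (expected utility, {result: best treatment})}.
    """
    plans = {}
    for do_test in (False, True):
        results = ("positive", "negative") if do_test else ("none",)
        total, rule = 0.0, {}
        for result in results:
            # Expected utility of each treatment, weighted by P(disease, result).
            scores = {
                t: sum(joint(d, do_test, result) * utility(d, do_test, t)
                       for d in (True, False))
                for t in ("treat", "wait")
            }
            rule[result] = max(scores, key=scores.get)
            total += scores[rule[result]]
        plans[do_test] = (total, rule)
    return plans


for do_test, (value, rule) in evaluate_plans().items():
    print(f"test={do_test}: expected utility {value:.1f}, treatment rule {rule}")
```

With these invented numbers, testing has an expected utility of about 82.4 against 73.0 for not testing, even though the test itself costs 5 units, because the treatment can then depend on the result: treat on a positive result and wait on a negative one. Different numbers could reverse the conclusion.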