5.5 Knowledge-Level Debugging

The explicit use of semantics allows explanation and debugging at the knowledge level. To make a system usable by people, the system cannot just give an answer and expect the user to believe it. Consider the case of a system advising doctors who are legally responsible for the treatment that they carry out based on the diagnosis. The doctors must be convinced that the diagnosis is appropriate. The system must be able to justify that its answer is correct. The same mechanism can be used to explain how the system found a result and to debug the knowledge base; a good explanation should convince someone there are no bugs.

Knowledge-level debugging is the process of finding errors in knowledge bases with reference only to what the symbols mean and what is true in the world, not the reasoning steps.

Three types of non-syntactic errors arise in rule-based systems:

  • An incorrect answer is produced; that is, some atom that is false in the intended interpretation was derived.

  • An answer that was not produced; that is, the proof failed on a particular true atom when it should have succeeded.

  • The program gets into an infinite loop. These can be handled for the top-down proof procedure in a similar way to cycle pruning, but where only the selected atoms need to be checked for cycles, and not the whole answer clause. The bottom-up proof procedure never gets into an infinite loop.

Ways to debug the first two of these types of error are examined below.

5.5.1 Incorrect Answers

An incorrect answer, or false-positive error, is an answer that has been proved yet is false in the intended interpretation. An incorrect answer is only produced by a sound proof procedure if a false clause was used in the proof. The aim is to find a false clause from an incorrect answer.

Suppose atom g was proved yet is false in the intended interpretation. There must be a clause ga1ak in the knowledge base that was used to prove g. Either all of the ai are true, in which case the buggy clause has been found, or one of the ai is false. This ai can be debugged in the same way.

This leads to an algorithm, presented in Figure 5.6 to debug false positives. It can find a false clause in a knowledge base when an atom that is false in the intended interpretation is derived. It only requires the person debugging the knowledge base to be able to answer true–false questions.

1: procedure Debug_false(g,KB)
2:      Inputs
3:          KB a knowledge base
4:          g an atom: KBg and g is false in intended interpretation      
5:      Output
6:          clause in KB that is false
7:      Find the definite clause ga0akKB used to prove g
8:      Present the rule to the user and ask if each ai is true
9:      if user specifies ai is false then
10:          return Debug_false(ai,KB)
11:      else if user specifies all ai are true then
12:          return ga0ak      
Figure 5.6: An algorithm to debug false positive answers
Example 5.15.

Consider Example 5.8, involving the electrical domain, but assume there is a bug in the knowledge base. Suppose that the domain expert or user had inadvertently said that whether w1 is connected to w3 depends on the status of s3 instead of s1 (see Figure 5.2). Thus, the knowledge includes the following incorrect rule:

live_w1live_w3up_s3

instead of the rule with up_s1. All of the other axioms are the same as in Example 5.8. The atom lit_l1 can be derived, which is false in the intended interpretation.

The atom lit_l1 was derived using the following rule:

lit_l1light_l1live_l1ok_l1.

The atoms light_l1 and ok_l1 are true in the intended interpretation, but live_l1 is false in the intended interpretation. The rule used to derive this atom is

live_l1live_w0.

The atom live_w0 is false in the intended interpretation. It was proved using the clause

live_w0up_s2live_w1.

The atom live_w1 is false in the intended interpretation, and was proved using the clause

live_w1up_s3live_w3.

Both elements of the body are true in the intended interpretation, so this is a buggy rule.

5.5.2 Missing Answers

The second type of error occurs when an expected answer is not produced. This manifests itself by a failure when an answer is expected. An atom g that is true in the domain, but is not a consequence of the knowledge base, is a false-negative error. The preceding algorithm does not work in this case; there is no proof.

An appropriate answer is not produced only if a definite clause or clauses are missing from the knowledge base. By knowing the intended interpretation of the symbols and by knowing what queries should succeed (i.e., what is true in the intended interpretation), a domain expert can debug a missing answer. Figure 5.7 shows how to debug false negatives. Given g, a true atom for which there is no proof, it returns an atom for which there is a missing clause (or clauses).

1: procedure Debug_missing(g,KB)
2:      Inputs
3:          KB a knowledge base
4:          g an atom: KB⊬g and g is true in the intended interpretation      
5:      Output
6:          atom for which there is a clause missing
7:      if there is a definite clause ga1akKB
8:                   such that all ai are true in the intended interpretation then
9:          select ai that cannot be proved
10:          return Debug_missing(ai,KB)
11:      else
12:          return g      
Figure 5.7: An algorithm for debugging missing answers (false negatives)

It searches the space of plausible proofs until it finds an atom where there is no appropriate clause in the knowledge base.

Example 5.16.

Suppose that, for the axiomatization of the electrical domain in Example 5.8, the world of Figure 5.2 actually had s2 down. Thus, it is missing the definite clause specifying that s2 is down. The axiomatization of Example 5.8 fails to prove lit_l1 when it should succeed. Consider how to find the bug.

There is one clause with lit_l1 in the head:

lit_l1light_l1live_l1ok_l1.

All of the elements of the body are true. The atoms light_l1 and ok_l1 can both be proved, but live_l1 fails, so the algorithm recursively debugs this atom. There is one rule with live_l1 in the head:

live_l1live_w0.

The atom live_w0 is true in the intended interpretation and cannot be proved. The clauses for live_w0 are

live_w0live_w1up_s2.
live_w0live_w2down_s2.

The user can determine that the body of the second rule is true. There is a proof for live_w2. There are no clauses for down_s2, so this atom is returned. The correction is to add an appropriate clause, by stating it as a fact or providing a rule for it.