14.2 Flexible Representations


14.2.3 Classes

Typically, you know more about a domain than a database of facts; you know general rules from which other facts can be derived. Which facts are explicitly given and which are derived is a choice to be made when designing and building a knowledge base.

Primitive knowledge is knowledge that is specified explicitly in terms of facts. Derived knowledge is knowledge that can be inferred from other knowledge. Derived knowledge is typically specified using rules.

The use of rules allows for a more compact representation of knowledge. Derived relations allow for conclusions to be drawn from observations of the domain. This is important because you do not directly observe everything about a domain. Much of what is known about a domain is inferred from the observations and more general knowledge.

A standard way to use derived knowledge is to put individuals into classes, and then give general properties to classes so that individuals inherit the properties of classes. Grouping individuals into classes enables a more concise representation because the members of a class can share the attributes they have in common (see the box). This is the same issue that was discussed in the context of probabilistic classifiers.

A class is the set of those actual and potential individuals that would be members of the class. This is typically an intensional set, defined by a characteristic function that is true of members of the set and false of other individuals. The alternative to an intensional set is an extensional set, which is defined by listing its elements.

For example, the class chair is the set of all things that would be chairs. We do not want the definition to be the set of things that are chairs, because chairs that have not yet been built also fall into the class of chairs. We do not want two classes to be equivalent just because they have the same members. For example, the class of green unicorns and the class of chairs that are exactly 124 meters high are different classes, even though they may contain the same elements; they are both empty. A 124-meter-high chair would not be a green unicorn.
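The distinction between intensional and extensional sets can be sketched in Python. This is only an illustration: modelling individuals as dictionaries of attributes, and the names `is_green_unicorn`, `is_chair_124m`, and `actual_individuals`, are assumptions made for this sketch and are not part of the text.

```python
# Individuals are modelled as dicts of attributes (an assumption of this sketch).

def is_green_unicorn(x):
    """Characteristic function of the (intensional) class of green unicorns."""
    return x.get("kind") == "unicorn" and x.get("color") == "green"

def is_chair_124m(x):
    """Characteristic function of the class of chairs exactly 124 meters high."""
    return x.get("kind") == "chair" and x.get("height_m") == 124

# A made-up list of the individuals that actually exist.
actual_individuals = [
    {"kind": "chair", "height_m": 1},
    {"kind": "horse", "color": "brown"},
]

# Extensional view: list the actual members. Both lists are empty,
# so the two classes cannot be told apart by their current members.
green_unicorns = [x for x in actual_individuals if is_green_unicorn(x)]
chairs_124m = [x for x in actual_individuals if is_chair_124m(x)]

# Intensional view: a potential individual distinguishes the two classes.
potential_chair = {"kind": "chair", "height_m": 124}
in_chair_class = is_chair_124m(potential_chair)       # True
in_unicorn_class = is_green_unicorn(potential_chair)  # False
```

The two characteristic functions agree on every actual individual yet differ on a potential one, which is why the classes are not treated as equivalent.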

The definition of class allows any set that can be described to be a class. For example, the set consisting of the number 17, the Tower of London, and the Prime Minister of Canada’s left foot may be a class, but it is not very useful. A natural kind is a class such that describing individuals using the class is more succinct than describing individuals without the class. For example, “mammal” is a natural kind, because describing the common attributes of mammals makes a knowledge base that uses “mammal” more succinct than one that does not use “mammal” and instead repeats the attributes for every individual.

Saying that class S is a subclass of class C means that S is a subset of C; that is, every individual of type S is also of type C.

Example 14.8.

Example 14.7 explicitly specified that the logo for computer comp_2347 was a lemon icon. You may, however, know that all Lemon brand computers have this logo. An alternative representation is to associate the logo with lemon_computer and derive the logo of comp_2347. The advantage of this representation is that if you find another Lemon brand computer, you can infer its logo. Similarly, each Lemon Laptop 10000 may weigh 1.1 kg.

An extended example is shown in Figure 14.2, where the shaded rectangles are classes, and arcs from classes are not the properties of the class but properties of the members of the class. For example, the weight of 1.1 kg is a property of each Lemon Laptop 10000, not of the class itself; the class of all Lemon Laptop 10000s would weigh much more than 1.1 kg.

Figure 14.2: A semantic network allowing inheritance. Shaded nodes are classes.

The relationship between types and subclasses can be written as a definite clause:

prop(X,type,C) ←
    prop(S,subClassOf,C) ∧
    prop(X,type,S).
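As a rough illustration of how this clause behaves, the following Python sketch applies it repeatedly to a set of (individual, property, value) triples until nothing new can be derived. The function name `derive_types` and the extra `computer` class are assumptions introduced for this sketch; this is a naive fixed-point loop, not a description of how a Datalog system evaluates the clause.

```python
def derive_types(facts):
    """Repeatedly apply
         prop(X,type,C) <- prop(S,subClassOf,C) & prop(X,type,S)
    to a set of (subject, property, value) triples until no new fact is added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        # Snapshot the relevant facts so the set can be extended while looping.
        sub_class = [(s, c) for (s, p, c) in facts if p == "subClassOf"]
        typed = [(x, s) for (x, p, s) in facts if p == "type"]
        for s, c in sub_class:
            for x, s2 in typed:
                if s2 == s and (x, "type", c) not in facts:
                    facts.add((x, "type", c))
                    changed = True
    return facts

kb = {
    ("lemon_laptop_10000", "subClassOf", "lemon_computer"),
    ("lemon_computer", "subClassOf", "computer"),  # hypothetical link, for illustration
    ("comp_2347", "type", "lemon_laptop_10000"),
}
closed = derive_types(kb)
```

After the loop finishes, `closed` contains the derived facts that comp_2347 is of type lemon_computer and, through the second link, of type computer. The clause derives only type facts; it does not by itself add new subClassOf facts.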

You can treat type and subClassOf as special properties that allow property inheritance. Property inheritance occurs when a value for a property is specified at the class level and inherited by the members of the class. If all members of class c have value v for property p, this can be written in Datalog as

prop(Ind,p,v) ←
    prop(Ind,type,c).

which, together with the aforementioned rule that relates types and subclasses, can be used for property inheritance.

Example 14.9.

All Lemon computers have a lemon icon as a logo and have color yellow and color green (see the logo and color arcs in Figure 14.2). All Lemon Laptop 10000s have a weight of 1.1 kg. Lemon Laptop 10000 is a subclass of Lemon computer. Computer comp_2347 is a Lemon Laptop 10000. This knowledge can be represented by the following Datalog program:

prop(X,has_logo,lemon_icon) ←
    prop(X,type,lemon_computer).
prop(X,has_color,green) ←
    prop(X,type,lemon_computer).
prop(X,has_color,yellow) ←
    prop(X,type,lemon_computer).
prop(X,weight_kg,1.1) ←
    prop(X,type,lemon_laptop_10000).
prop(lemon_laptop_10000,subClassOf,lemon_computer).
prop(comp_2347,type,lemon_laptop_10000).

From this Datalog program, and the clause involving subClassOf above, the logo, colors and weight of comp_2347 can be derived. With the structured representation, to incorporate a new Lemon Laptop 10000, you need only declare that it is a Lemon Laptop 10000; its colors, logo and weight can then be derived through inheritance.
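The derivation in this example can be mimicked with a small Python sketch. The data mirror the Datalog program of Example 14.9; the helper names `types_of` and `props_of`, the table layout, and the second individual `comp_9999` are assumptions made for illustration, and the code is one simple way to compute the same conclusions rather than a Datalog interpreter.

```python
# Class-level rules: every member of the class has this (property, value) pair.
class_props = [
    ("lemon_computer", "has_logo", "lemon_icon"),
    ("lemon_computer", "has_color", "green"),
    ("lemon_computer", "has_color", "yellow"),
    ("lemon_laptop_10000", "weight_kg", 1.1),
]
sub_class_of = {("lemon_laptop_10000", "lemon_computer")}
direct_type = {("comp_2347", "lemon_laptop_10000")}

def types_of(ind, direct_type, sub_class_of):
    """All classes of ind: its declared types plus their superclasses."""
    found = {c for (x, c) in direct_type if x == ind}
    frontier = list(found)
    while frontier:
        s = frontier.pop()
        for sub, sup in sub_class_of:
            if sub == s and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found

def props_of(ind, direct_type, sub_class_of, class_props):
    """(property, value) pairs that ind inherits from all of its classes."""
    classes = types_of(ind, direct_type, sub_class_of)
    return {(p, v) for (c, p, v) in class_props if c in classes}

derived = props_of("comp_2347", direct_type, sub_class_of, class_props)

# Incorporating a new laptop takes a single type fact (comp_9999 is a made-up name).
direct_type.add(("comp_9999", "lemon_laptop_10000"))
derived_new = props_of("comp_9999", direct_type, sub_class_of, class_props)
```

Here `derived` holds the logo, both colors, and the weight of comp_2347, and `derived_new` holds the same pairs for the newly declared individual without any further facts being added.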

Some general guidelines are useful for deciding what should be primitive and what should be derived:

  • When associating an attribute with an individual, select the most general class C that the individual is in, where all members of C have that attribute, and associate the attribute with class C. Inheritance can be used to derive the attribute for the individual and all other members of class C. This representation methodology tends to make knowledge bases more concise, and it means that it is easier to incorporate new individuals because members of C automatically inherit the attribute.

  • Do not associate a contingent attribute of a class with the class. A contingent attribute is one whose value changes when circumstances change. For example, it may be true of the current computer environment that all of the computers come in cardboard boxes. However, it may not be a good idea to put that as an attribute of the computer class, because it would not be expected to be true as other computers are bought.

  • Axiomatize in the causal direction. If a choice exists between making the cause primitive or the effect primitive, make the cause primitive. The information is then more likely to be stable when the domain changes. See Example 5.36.

Classes in Knowledge Bases and Object-Oriented Programming

The use of “individuals” and “classes” in knowledge-based systems is very similar to the use of “objects” and “classes” in object-oriented programming (OOP) languages such as Smalltalk, Python or Java. This should not be too surprising because they have an interrelated history. But there are important differences that often make the direct analogy more confusing than helpful:

  • Objects in OOP are computational objects; they are data structures and associated programs. A “person” object in Java is not a person. However, individuals in a knowledge base (KB) are (typically) things in the real world. A “person” individual in a KB can be a real person. A “chair” individual can be a real chair you can actually sit in; it can hurt you if you bump into it. You can send a message to, and get answers from, a “chair” object in Java, whereas a chair in the real world tends to ignore what you tell it. A KB is not typically used to interact with a chair, but to reason about a chair. A real chair stays where it is unless it is moved by a physical agent.

  • In a KB, a representation of an object is only an approximation at one (or a few) levels of abstraction. Real objects tend to be much more complicated than what is represented. You typically do not represent the individual fibers in the fabric of a chair. In an OOP system, there are only the represented properties of an object. The system can know everything about a Java object, but not about a real individual.

  • The class structure of Java is intended to represent designed objects. A systems analyst or a programmer gets to create a design. For example, in Java, an object is only a member of one lowest-level class. There is no multiple inheritance. Real objects are not so well behaved. The same person could be a football coach, a mathematician, and a mother.

  • A computer program cannot be uncertain about its data structures; it has to select particular data structures to use. However, you can be uncertain about the types of things in the world.

  • The representations in a KB do not actually do anything. In an OOP system, objects do computational work. In a KB, they just represent – that is, they just refer to objects in the world.

  • While an object-oriented modeling language, like UML, may be used for representing KBs, it may not be the best choice. A good OO modeling tool has facilities to help build good designs. However, the world being modeled may not have a good design at all. Trying to force a good design paradigm on a messy world may not be productive.
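One of the contrasts above, that a real individual can belong to several unrelated classes at once, can be sketched with triples in the prop(Individual, Property, Value) style of this section. The individual `sam`, the class names, and the helper `direct_types` are all hypothetical and serve only as illustration.

```python
# Facts about one individual, written as (individual, property, value) triples.
# Nothing here "does" anything; the triples only describe a person.
kb = {
    ("sam", "type", "football_coach"),
    ("sam", "type", "mathematician"),
    ("sam", "type", "mother"),
}

def direct_types(ind, kb):
    """Classes that the knowledge base explicitly states ind belongs to."""
    return {c for (x, p, c) in kb if x == ind and p == "type"}

sam_types = direct_types("sam", kb)

# Learning something new about sam is just one more fact; no class
# hierarchy has to be redesigned to accommodate the combination.
kb.add(("sam", "type", "chess_player"))
sam_types_later = direct_types("sam", kb)
```

In a design where each object is an instance of exactly one lowest-level class, the same situation would need a dedicated class for every combination of roles, or some other workaround.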