12.6.7 Limitations

Thus far, we have assumed a very simple form of natural language. Our aim was to show what can be easily accomplished with simple tools, rather than to present a comprehensive study of natural language. Useful front ends to databases can be built with the tools presented, for example, by constraining the domain sufficiently and by asking the user, if necessary, which of multiple competing interpretations is intended.
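The strategy of asking the user to choose among competing interpretations can be sketched as follows. This is only a minimal illustration in Python: the function names, and the representation of interpretations as strings, are assumptions made for the example and are not part of the tools presented earlier.

```python
from typing import Callable, Sequence


def choose_interpretation(
    interpretations: Sequence[str],
    ask: Callable[[Sequence[str]], int],
) -> str:
    """Return the intended reading, consulting the user only when necessary."""
    if not interpretations:
        raise ValueError("the query has no interpretation in this domain")
    if len(interpretations) == 1:
        # Unambiguous in the constrained domain: no question needed.
        return interpretations[0]
    # Several competing readings: let the user pick one by index.
    index = ask(interpretations)
    if not 0 <= index < len(interpretations):
        raise ValueError("choice is out of range")
    return interpretations[index]


def ask_on_console(options: Sequence[str]) -> int:
    """Hypothetical console prompt; any other user interface would do."""
    for number, option in enumerate(options):
        print(f"{number}: {option}")
    return int(input("Which interpretation did you intend? "))
```

Passing the question-asking step in as the argument ask keeps the choice of user interface separate from the disambiguation logic; a front end would call choose_interpretation(readings, ask_on_console) only after parsing has produced its candidate readings.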

This discussion of natural language processing assumes that natural language is compositional; that is, the meaning of the whole can be derived from the meaning of the parts. Compositionality is, in general, a false assumption. You usually must know the context in the discourse and the situation in the world to discern what is meant by an utterance. Many types of ambiguity exist that can only be resolved by understanding the context of the words.

For example, you cannot always determine the correct reference of a description without knowledge of the context and the situation. A description does not always refer to a uniquely determined individual.

Example 12.40: Consider the following paragraph:
The student took many courses. Two computer science courses and one mathematics course were particularly difficult. The mathematics course...

The referent is defined by the context and not just the description "The mathematics course." There could be more mathematics courses, but we know from context that the phrase is referring to the particularly difficult one taken by the student.
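One simple way to approximate this use of context is to look for a unique match among the entities already mentioned in the discourse before consulting the whole database. The following Python sketch rests on strong assumptions: the Entity class, the course identifiers, and the exact-match test on descriptions are all hypothetical, and a realistic system would need a much richer model of discourse.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    name: str   # hypothetical identifier
    kind: str   # the noun phrase that describes the entity


def resolve_definite(
    description: str, discourse: List[Entity], database: List[Entity]
) -> Optional[Entity]:
    """Resolve "the <description>", preferring entities already mentioned."""
    for candidates in (discourse, database):
        matches = [e for e in candidates if e.kind == description]
        if len(matches) == 1:
            return matches[0]      # uniquely determined at this level
        if matches:
            return None            # still ambiguous: more context is needed
    return None                    # nothing fits the description


# Hypothetical data: the database holds two mathematics courses, but
# only one of them has been mentioned in the discourse so far.
database = [
    Entity("cs110", "computer science course"),
    Entity("cs220", "computer science course"),
    Entity("math101", "mathematics course"),
    Entity("math202", "mathematics course"),
]
discourse = [database[0], database[1], database[3]]

in_context = resolve_definite("mathematics course", discourse, database)
out_of_context = resolve_definite("mathematics course", [], database)
```

With the discourse available, the description picks out math202; with an empty discourse the same description matches two courses in the database, so the function returns None rather than guessing.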

Many problems of reference arise in database applications if the use of "the" or "it" is allowed or if words that have more than one meaning are permitted. Context is used to disambiguate references in natural language. Consider:

Who is the head of the mathematics department?
Who is her secretary?

It is clear from the previous sentence whom "her" refers to, as long as the reader understands that heads are people, who have a gender, whereas departments do not.
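The background knowledge used here, that a pronoun such as "her" can denote a person but not a department, can be made explicit as a type constraint on candidate antecedents. The Python sketch below is illustrative only: the Mention class, the PRONOUN_KINDS table, and the coarse "person" and "department" types are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    phrase: str
    kind: str   # coarse semantic type, e.g. "person" or "department"


# Assumed background knowledge: which kinds of thing each pronoun can denote.
PRONOUN_KINDS = {
    "her": {"person"},
    "his": {"person"},
    "its": {"department"},
}


def resolve_pronoun(pronoun: str, mentions: List[Mention]) -> Optional[Mention]:
    """Return the most recent earlier mention the pronoun could refer to."""
    allowed = PRONOUN_KINDS.get(pronoun, set())
    for mention in reversed(mentions):
        if mention.kind in allowed:
            return mention
    return None


# Mentions from "Who is the head of the mathematics department?", in order.
mentions = [
    Mention("the head of the mathematics department", "person"),
    Mention("the mathematics department", "department"),
]
antecedent = resolve_pronoun("her", mentions)
```

Recency alone would select the department, which is the most recent mention; it is the type constraint that directs "her" to the head. Without that knowledge about people and departments, the reference could not be resolved this way.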