10.2 Unsupervised Learning

This chapter has so far considered supervised learning, where target features are observed in the training data. In unsupervised learning, the target features are not given in the training examples. The aim is to construct a natural classification for the data.

One general method for unsupervised learning is clustering, which partitions the examples into clusters or classes. Each class predicts feature values for the examples in the class. Each clustering has a prediction error, measured by how far these predictions are from the observed feature values. The best clustering is the one that minimizes this error.
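The idea of scoring a clustering by its prediction error can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes numeric features, assumes each class predicts the mean of its members, and assumes sum-of-squares error, none of which is fixed by the text above. The function name `clustering_error` and the sample data are hypothetical.

```python
def clustering_error(examples, assignment):
    """Sum-of-squares prediction error of a hard clustering.

    examples:   list of equal-length tuples of numeric feature values
    assignment: assignment[i] is the class label of examples[i]
    Assumption: each class predicts the mean of its members' feature values.
    """
    # Group the examples by their assigned class.
    classes = {}
    for example, label in zip(examples, assignment):
        classes.setdefault(label, []).append(example)

    error = 0.0
    for members in classes.values():
        n_features = len(members[0])
        # The class's prediction for each feature is the mean over its members.
        prediction = [
            sum(m[j] for m in members) / len(members) for j in range(n_features)
        ]
        # Add the squared difference between each member and the prediction.
        error += sum(
            (m[j] - prediction[j]) ** 2 for m in members for j in range(n_features)
        )
    return error


# Hypothetical data: two tight groups of two examples each.
examples = [(1.0, 2.0), (1.0, 4.0), (9.0, 8.0), (11.0, 8.0)]
good = clustering_error(examples, [0, 0, 1, 1])  # groups nearby examples
bad = clustering_error(examples, [0, 1, 0, 1])   # mixes distant examples
```

Under these assumptions, the partition that groups nearby examples together has the lower error, so it is the better of the two clusterings.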

Example 10.8.

A diagnostic assistant may want to cluster treatments into groups that predict the desirable and undesirable effects of each treatment. For instance, the assistant may decide against giving a patient a drug because similar drugs have had disastrous effects on similar patients.

An intelligent tutoring system may want to cluster students’ learning behavior so that strategies that work for one member of a class may work for other members.

In hard clustering, each example is placed definitively in a class. The class is then used to predict the feature values of the example. The alternative to hard clustering is soft clustering, in which each example has a probability distribution over the classes. The prediction of the values for the features of an example is then the weighted average of the predictions of the classes, where each class is weighted by the probability that the example is in that class. Soft clustering is described in Section 10.2.2.
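The difference between the two prediction rules can be sketched as follows. This is an illustrative sketch only: it assumes numeric features and assumes the per-class predictions and class probabilities are already known. The function names and all numbers are hypothetical.

```python
def hard_prediction(class_predictions, label):
    """Hard clustering: the example's single class makes the prediction."""
    return class_predictions[label]


def soft_prediction(class_predictions, class_probs):
    """Soft clustering: probability-weighted average of the class predictions.

    class_predictions: one list of predicted feature values per class
    class_probs:       probability of the example being in each class
                       (assumed to sum to 1)
    """
    n_features = len(class_predictions[0])
    return [
        sum(p * pred[j] for p, pred in zip(class_probs, class_predictions))
        for j in range(n_features)
    ]


# Hypothetical predictions of two classes for two numeric features.
class_predictions = [[1.0, 3.0], [10.0, 8.0]]

hard = hard_prediction(class_predictions, 0)
soft = soft_prediction(class_predictions, [0.75, 0.25])
```

Here `hard` is simply the prediction of class 0, while `soft` lies between the two class predictions and is pulled toward class 0 because that class has the higher probability.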

10.2.1 k-Means

10.2.2 Expectation Maximization for Soft Clustering

Artificial Intelligence 2E
