Self-Organizing Maps and Case-Based Reasoning

Timo Honkela
Media Lab
University of Art and Design Helsinki

Kohonen's Self-Organizing Map (SOM) is a means for automatically arranging high-dimensional statistical data. The map attempts to represent the input samples with optimal accuracy using a restricted set of models. The models also become ordered on the map grid so that similar models are close to each other and dissimilar models far from each other. The SOM is useful in clustering, abstraction, and visualization through dimensionality reduction. The unsupervised learning scheme of the SOM makes it suited for applications in which the input data cannot be labeled.
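
As a minimal sketch of the idea described above, the following Python code trains a small rectangular map with the common online update rule (find the best-matching model, then move it and its grid neighbours toward the sample). The function names, the grid size, the linear decay schedules, and the Gaussian neighbourhood are illustrative assumptions, not details taken from the talk.

```python
import math
import random


def best_matching_unit(models, x):
    """Return the grid position whose model vector is closest to sample x."""
    return min(models,
               key=lambda node: sum((a - b) ** 2 for a, b in zip(models[node], x)))


def train_som(samples, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) -> model vector.

    All hyperparameters are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    # Each grid node starts with a random model vector in [0, 1).
    models = {(r, c): [rng.random() for _ in range(dim)]
              for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks
            bmu = best_matching_unit(models, x)
            for node, m in models.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Move the model toward the sample, weighted by grid proximity.
                for i in range(dim):
                    m[i] += lr * h * (x[i] - m[i])
            step += 1
    return models


# Two well-separated groups of unlabeled two-dimensional samples.
data = [(0.10, 0.10), (0.12, 0.08), (0.90, 0.90), (0.88, 0.92)]
som = train_som(data, rows=3, cols=3, epochs=40)
print(best_matching_unit(som, (0.10, 0.10)), best_matching_unit(som, (0.90, 0.90)))
```

Because neighbouring grid nodes are updated together, nearby nodes end up with similar models, which is what produces the ordering on the map grid. No labels are used at any point.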

In Case-Based Reasoning (CBR), problem solving is based on a collection of past cases rather than on knowledge encoded in generic rules or other knowledge descriptions. Each case typically contains a description of a problem and a solution. To solve a current problem, the problem is matched against the cases in the case base, and similar cases are retrieved. The retrieved cases are used to suggest a solution, which is reused and tested for success. The solution may be revised if necessary. The current problem and the final solution are then stored as part of a new case.
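
The cycle described above (retrieve, reuse, test, revise, store) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: problems are represented as numeric feature vectors compared by Euclidean distance, and the `Case` class and the `is_acceptable` and `revise` callbacks are hypothetical placeholders for domain-specific logic.

```python
import math
from dataclasses import dataclass


@dataclass
class Case:
    problem: tuple   # numeric feature vector describing the problem (assumption)
    solution: str    # the solution recorded for that problem


def retrieve(case_base, problem, k=1):
    """Return the k stored cases whose problems are most similar to `problem`."""
    return sorted(case_base, key=lambda c: math.dist(c.problem, problem))[:k]


def solve(case_base, problem, is_acceptable, revise):
    """Run one CBR cycle and store the outcome as a new case."""
    nearest = retrieve(case_base, problem, k=1)[0]
    solution = nearest.solution                    # reuse the retrieved solution
    if not is_acceptable(problem, solution):       # test it for success
        solution = revise(problem, solution)       # revise if necessary
    case_base.append(Case(tuple(problem), solution))  # store the new case
    return solution


# Hypothetical case base with two past cases.
case_base = [Case((0.0, 0.0), "plan A"), Case((10.0, 10.0), "plan B")]
suggested = solve(case_base, (1.0, 1.0),
                  is_acceptable=lambda p, s: True,
                  revise=lambda p, s: s)
print(suggested, len(case_base))
```

Note that no general rule is ever extracted here: the knowledge of the system lives entirely in the stored cases and in the similarity measure used for retrieval.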

Self-Organizing Maps and Case-Based Reasoning share several features as methods for building intelligent systems. The main similarity is the data-driven approach. In this talk, the similarities between the two approaches and potential ways to combine them are considered.

For instance, the following problem is considered in the talk. When analyzing a collection of samples or cases of any kind, one should consider whether they come from essentially similar sources and are thus comparable with each other. Consider, e.g., a process monitoring system of a factory in which the machinery is changed to a large extent. Afterwards, the comparison between the "old" and the "new" data is questionable. The same kind of question can be asked when medical measurements are considered. If a set of people is studied, one can question whether all the persons are similar enough with respect to the phenomenon being studied. It is important to discover patterns and hidden variables that are relevant only for a small portion of the whole set of individuals or cases.