Describing artistic digital content:
creating and using connectionist metadata

(C) Timo Honkela

University of Art and Design Helsinki
Media Lab

timo@mlab.uiah.fi

Presented at the CIRCUS workshop
"Integration of Content, Style and Context",
Angoulême, France, 29.3.- 1.4.1999.

Metadata

The use of standardised formats is clearly beneficial. For instance, the widespread use of the World Wide Web would not have been possible without the adoption of HTML. Similarly, there are serious attempts to create standards for metadata (data about data), so that a piece of art stored in electronic form would include information about, e.g., its creator, source, identification and possible access restrictions. Moreover, metadata usually also includes a textual summary of the contents, a content description that provides information for the organisation and search of the data.
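As a rough illustration of such a record, the following sketch models the metadata elements listed above as a small Python data class. The class name, field names and sample values are hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ArtworkMetadata:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str                      # identification of the work
    creator: str                         # who made it
    source: str                          # where the electronic copy comes from
    access_restrictions: str = "none"    # possible limits on use
    summary: str = ""                    # free-text content description
    keywords: List[str] = field(default_factory=list)


# Invented sample values, for illustration only.
record = ArtworkMetadata(
    identifier="example-0001",
    creator="Unknown artist",
    source="Example digital archive",
    summary="Abstract animation with slowly shifting blue tones and ambient sound.",
)
print(asdict(record))
```

The fields for creator, source and identifier can usually be filled in unambiguously. The summary and the keywords are the parts where the individual choices of the describer come into play.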

Artistic content and metadata

A textual description is, of course, highly valuable, especially when pictorial or sound data is considered. Often the description is based on a pre-defined classification or a list of keywords, i.e., on a terminology base or a thesaurus. However, even if the identity of the artist or the place of publishing can be determined unambiguously rather easily, the same is not true for the description of the contents. For instance, in the domain of information retrieval and databases of text documents, Furnas et al. (1987) have found that in spontaneous word choice for objects in five domains, two people favored the same term with less than 20% probability. Bates (1986) has shown that different indexers, well trained in an indexing scheme, might assign index terms for a given document differently. It has also been observed that an indexer might use different terms for the same document at different times. The meaning of an expression (a query or a description) in any domain is graded and changing, biased by the particular context.
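To see why agreement can be so low, consider a small worked example. If two people choose a term independently from the same distribution, the probability that they pick the same term is the sum of the squared term probabilities. The distribution below is hypothetical and is not data from Furnas et al. (1987); the independence assumption is also a simplification.

```python
def agreement_probability(term_probabilities):
    """Probability that two independent choosers pick the same term."""
    return sum(p * p for p in term_probabilities.values())


# Hypothetical distribution of spontaneous term choices for one object.
choices = {
    "portrait": 0.30,
    "face": 0.25,
    "person": 0.20,
    "woman": 0.15,
    "figure": 0.10,
}
print(agreement_probability(choices))
```

Even though the most popular term is chosen by almost a third of the describers in this invented example, two describers agree in fewer than one case out of four.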

Information retrieval based on classifications and keywords

The traditional keyword-based approach with Boolean logic has three basic problems. First, for Boolean queries there is no simple way of controlling the size of the output, and the output is not ranked in order of relevance. Second, the results of a query do not reveal what was not found, especially if the collection is unfamiliar. Third, if the domain of the query is not well known, it is difficult to select the appropriate keywords. Thus, even if the indexer or the metadata creator is able to find accurate descriptions of the content, the user of the metadata may not succeed in doing the same, i.e., in using the same words or phrases.
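The problems can be made concrete with a minimal sketch of Boolean keyword retrieval over an inverted index. The document identifiers, keywords and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical keyword metadata for four works.
documents = {
    "d1": {"portrait", "oil", "blue"},
    "d2": {"landscape", "oil", "sea"},
    "d3": {"portrait", "photograph"},
    "d4": {"seascape", "watercolour", "blue"},
}


def build_index(docs):
    """Map each keyword to the set of documents that carry it."""
    index = defaultdict(set)
    for doc_id, keywords in docs.items():
        for keyword in keywords:
            index[keyword].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of documents containing every term, as an unranked set."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_index(documents)
print(sorted(boolean_and(index, "portrait", "oil")))
print(sorted(boolean_and(index, "sea", "blue")))
```

The result is a plain set: it carries no ranking, and its size depends entirely on the chosen terms. The second query returns nothing, although the work "d4" (indexed as "seascape" and "blue") might well interest the user, who has no way of knowing that it was missed.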

Inevitable individuality in language use

It is a very basic problem in text document management that different words and phrases are used for expressing similar objects of interest. Natural languages are used for communication between human beings, i.e., individuals with varying backgrounds, knowledge, and ways of expressing themselves. When artistic content is considered, this phenomenon should be more than evident. Therefore, if the content description is based on a rather small selection of keywords, searching the contents may not be efficient.

Potential solutions

There are multiple potential solutions for the problems outlined above.

"Freedom of speech"
It may be useful not to define any artificial limitations for the descriptions. For instance, when the domain develops into directions which did not exist when the classification system was developed, problems arise.
Longer descriptions
If the content is described using a large enough body of text, recall improves, i.e., the likelihood of finding the information is greater. However, tools for ensuring precision are needed. Precision refers to the number of relevant retrieved documents over the total number of retrieved documents.
More context
If a word or an expression is seen without its context, there are more possibilities for misunderstanding. Thus, for a human reader the contextual information is often very beneficial. It can be both textual and multimodal.
Similarly, the methods that are used to manage data should be able to deal with contextual information, or even provide context; the Self-Organizing Map (see Kohonen 1982, 1995, and below) is an example of such a method. Document maps can provide context for the information retrieval process.
Data speaks for itself
Often it is even possible to find relevant features from the data itself (see, e.g., Kohonen et al. 1997). However, a computerised method, using some kind of autonomous agent, does not provide an "objective" classification of the data, since any process of feature extraction, human or artificial, is based on selections for which there are well-grounded alternatives.
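The notions of precision and recall used under "Longer descriptions" above can be stated compactly in code. The sketch below uses invented document identifiers; recall is computed in the usual way, as the number of relevant retrieved documents over the total number of relevant documents.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieval result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant documents that were found
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: four documents returned, three truly relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Longer descriptions tend to enlarge the retrieved set, which helps recall, but every additional irrelevant hit lowers precision.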

More on Context

The following illustrations illuminate the effect of context on the interpretation of data, be it in the mathematical, linguistic, or narrative domain.

Addition of one more dimension may bring a different view on the clustering of the data.
Ambiguity is an inherent feature of all natural languages. Short expressions such as 'open' are prone to have several interpretations. Only context (or cotext, the neighborhood in the text) can reveal the intended meaning, which in this case may be 'open' as an adjective or as a verb. Even inside the traditional word classes there are several interpretations, which can be characterised by the following examples: "the door is open", "she is open to suggestions", "open the door", "open your heart". These examples do not highlight the need for logical, unambiguous formalisms. They do emphasise the need for methods that are able to deal with context, ambiguity and metaphors. These issues are discussed in detail by George Lakoff in his Women, Fire and Dangerous Things.
The interpretation of a situation in time is often very much dependent on the historical context that is taken into account.
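The first point, on the effect of one additional dimension, can be illustrated with a tiny numerical example. The three points below are invented; the sketch only shows that the nearest neighbour of a point, and hence any clustering based on distances, may change when a further coordinate is taken into account.

```python
import math

# Hypothetical data points with three coordinates each.
points = {
    "a": (0.0, 0.0, 0.0),
    "b": (0.1, 0.0, 5.0),
    "c": (1.0, 0.0, 0.1),
}


def nearest(name, dims):
    """Nearest neighbour of a point when only the first `dims` coordinates are used."""
    origin = points[name][:dims]
    others = (n for n in points if n != name)
    return min(others, key=lambda n: math.dist(origin, points[n][:dims]))


print(nearest("a", 2))  # only two dimensions are visible
print(nearest("a", 3))  # the third dimension is taken into account
```

Seen in two dimensions, "a" and "b" belong together; once the third dimension is included, "a" is much closer to "c".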

Connectionist metadata

The connectionist models, or artificial neural networks, often combined with the principles of the so-called vector space model, appear to be a promising alternative to traditional keyword-based methods.
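For readers unfamiliar with the vector space model, here is a minimal sketch: each description becomes a vector of word counts, and descriptions are ranked by their cosine similarity to the query. The sample texts and function names are hypothetical, and raw word counts are used instead of any term weighting scheme.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as a sparse vector of word counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical free-text content descriptions.
descriptions = {
    "d1": "dark portrait of an old sailor in oil",
    "d2": "bright abstract animation with ambient sound",
    "d3": "photograph of a sailor looking at the sea",
}


def rank(query, docs):
    """Return (similarity, id) pairs, most similar description first."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), doc_id) for doc_id, text in docs.items()]
    return sorted(scored, reverse=True)


ranking = rank("old sailor at sea", descriptions)
for score, doc_id in ranking:
    print(doc_id, round(score, 3))
```

Unlike a Boolean query, this returns a graded, ranked list, and a description can score well without containing every query word.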

Kohonen's Self-Organizing Map (SOM) for data organisation

Perhaps the most typical notion of the SOM is to consider it as an artificial neural network model of the brain, especially of the experimentally found ordered "maps" in the cortex. There is a fair amount of neurophysiological evidence to support the idea that the SOM captures some of the fundamental processing principles of the brain.

The SOM is nowadays often used as a statistical tool for multivariate analysis. The SOM is both a projection method, which maps a high-dimensional data space into a low-dimensional space, and a clustering method, so that similar data samples tend to be mapped to nearby map units.
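The following is a minimal sketch of the basic SOM training loop, written in plain Python. It assumes a rectangular grid, a Gaussian neighbourhood function, and a linearly decaying learning rate and radius; these choices, the parameter values, the function names and the toy data are illustrative assumptions and are not taken from the text or from any particular SOM software.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid position of the unit whose model vector is closest to x."""
    return min(weights, key=lambda unit: math.dist(weights[unit], x))


def train_som(data, rows, cols, epochs=200, lr0=0.5, seed=0):
    """Train a small SOM and return a dict mapping (row, col) to a model vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2
    # One model vector per map unit, initialised randomly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):      # samples in random order
            frac = step / total_steps
            lr = lr0 * (1 - frac)                  # learning rate decays to 0
            radius = max(radius0 * (1 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_dist2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_dist2 / (2 * radius ** 2))  # neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # move towards the sample
            step += 1
    return weights


# Toy data: two well separated groups of three-dimensional samples.
data = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.05, 0.05, 0.05],
        [1.0, 0.9, 1.0], [0.9, 1.0, 0.9], [0.95, 0.95, 0.95]]
som = train_som(data, rows=4, cols=4)
for x in data:
    print(x, "->", best_matching_unit(som, x))
```

Each sample is projected from the three-dimensional input space to a position on the two-dimensional grid, and samples from different groups end up on different map units.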

An ordered view on the information space can be provided using the SOM (see, e.g., http://www.cis.hut.fi/nnrc/som.html or http://www.mlab.uiah.fi/~timo/som/).

By virtue of the Self-Organizing Map algorithm, text documents can also be mapped onto a two-dimensional grid so that related documents appear close to each other. The largest maps of document collections have been created in the WEBSOM project of the Neural Networks Research Center at Helsinki University of Technology. Even millions of documents have been automatically positioned on a map.

The organisation of a map directly reflects the overall contents of the collection. Therefore the view on the texts is not skewed by any predetermined classification (see the illustration below in which a traditional classification system is used).

References

Bates, M. J. (1986).
Subject access in online catalogs: a design model. Journal of the American Society for Information Science, 37(6):357-376.

Furnas, G. W., Landauer, T. K., Gomez, L. M., and Dumais, S. T. (1987).
The vocabulary problem in human-system communication. Communications of the ACM, 30(11):964-971.

Honkela, T. (1997).
Self-Organizing Maps in Natural Language Processing. Thesis for the degree of Doctor of Philosophy, Helsinki University of Technology, Department of Computer Science and Engineering.
URL: http://www.cis.hut.fi/~tho/thesis/

Kohonen, T. (1982).
Self-organizing formation of topologically correct feature maps. Biological Cybernetics, 43(1):59-69.

Kohonen, T. (1995).
Self-Organizing Maps. Springer, Berlin, Heidelberg.

Kohonen, T., Kaski, S., and Lappalainen, H. (1997).
Self-organized formation of various invariant-feature filters in the adaptive-subspace SOM. Neural Computation, 9:1321-1344.

Timo Honkela, April 28, 1999
