Institut für Informatik

Technical Report No. 137 - Abstract

Matthias O. Will
Metadata and library mining: Analyzing the usage of a distributed electronic library

Common log file analyzers are not sufficient for analyzing the usage of Web-based digital libraries, because they examine the user's surfing behaviour only at a very shallow level. As digital document repositories gain importance for researchers and other interested parties, it is crucial to obtain more specific information about the types of documents consulted by specific users and about their interactions with a given repository. For these purposes, it is not sufficient to analyze the data stored in the server log files; more knowledge about the server's content is needed. Furthermore, if user interactions that go beyond requesting and retrieving a specific document are to be traced, suitable action types have to be implemented in the system so that additional logging data is produced whenever a user selects them. Many publishing houses are reluctant to offer electronic publications on-line because conceiving and implementing interactive multimedia products is a costly process, and making them available to end-users is a service that publishers cannot offer for free. Besides charging and billing mechanisms, it is therefore also necessary to be able to determine whether an electronic product is successful or should be discontinued. In the traditional publishing business this can be determined by looking at sales figures; in a digital library, it is possible to determine how specific modules are accessed, provided that suitable metadata is available. Thus, a successful analysis combines Web mining with metadata retrieval. In this paper, we first propose a suitable architecture for incorporating metadata into a digital library. Reviewing the logging behaviour of traditional Web servers, we then discuss the requirements for library mining, which cannot be met by available Web mining tools. The main section discusses the design and implementation of a suitable analysis tool for digital libraries. We conclude by pointing out various applications for library mining.
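
The combination of Web mining and metadata retrieval described above can be illustrated with a minimal sketch. It is not the tool presented in the report: it only assumes that the server writes Common Log Format entries and that a metadata catalogue maps each document path to a module and a document type. The catalogue `METADATA`, the sample entries in `SAMPLE_LOG`, and the function names are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)

# Hypothetical metadata catalogue: document path -> descriptive metadata.
METADATA = {
    "/lib/course1/intro.html": {"module": "course1", "doc_type": "text"},
    "/lib/course1/anim.mov": {"module": "course1", "doc_type": "video"},
    "/lib/course2/intro.html": {"module": "course2", "doc_type": "text"},
}

# Hypothetical log entries, invented for this example.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/1999:13:55:36 +0200] "GET /lib/course1/intro.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/1999:13:56:01 +0200] "GET /lib/course1/anim.mov HTTP/1.0" 200 90211',
    '192.0.2.7 - - [10/Oct/1999:14:02:10 +0200] "GET /lib/course2/intro.html HTTP/1.0" 200 1804',
    '192.0.2.7 - - [10/Oct/1999:14:02:15 +0200] "GET /missing.html HTTP/1.0" 404 210',
    '192.0.2.9 - - [10/Oct/1999:14:05:00 +0200] "GET /index.html HTTP/1.0" 200 512',
]


def parse_log_line(line):
    """Return the fields of one log entry as a dict, or None if it does not parse."""
    match = CLF_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def accesses_by(log_lines, field):
    """Count successful requests, grouped by a metadata field such as 'module'."""
    counts = Counter()
    for line in log_lines:
        entry = parse_log_line(line)
        if entry is None or entry["status"] != "200":
            continue  # skip malformed lines and unsuccessful requests
        meta = METADATA.get(entry["path"])
        # Paths without a catalogue entry are counted under "unknown".
        counts[meta[field] if meta else "unknown"] += 1
    return counts


if __name__ == "__main__":
    print(accesses_by(SAMPLE_LOG, "module"))
    print(accesses_by(SAMPLE_LOG, "doc_type"))
```

The log file alone would yield only hit counts per path; grouping by module or document type becomes possible only after the join with the catalogue, which is the point the abstract makes about needing knowledge of the server's content.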

Report No. 137 (PostScript)