An Analysis of Search Failures in Online Library Catalogs

Abstract

Yaşar Ahmet Tonta
Doctor of Philosophy in Library and Information Studies
University of California at Berkeley
Professor Michael D. Cooper, Chair

This study investigates the causes of search failures in online library catalogs. It develops a conceptual model of search failures and examines the retrieval performance of an experimental online catalog by means of transaction logs, questionnaires, and the critical incident technique. It analyzes the retrieval effectiveness of 228 queries from 45 users by employing precision and recall measures, identifying user-designated ineffective searches, and comparing them quantitatively and qualitatively with precision and recall ratios for the corresponding searches. The dissertation tests two hypotheses: that users' assessments of retrieval effectiveness differ from retrieval performance as measured by precision and recall, and that increasing the match between the users' vocabulary and that of the system by means of clustering and relevance feedback techniques will improve performance and help reduce failures in online catalogs.

In the experiment, half of the records retrieved were judged relevant by the users (precision) before relevance feedback searches. Yet the system retrieved only about 25% of the relevant documents in the database (recall). As should be expected, precision ratios decreased (18%) while recall ratios increased (45%) as users performed relevance feedback searches. A multiple linear regression model, which was developed to examine the relationship between retrieval effectiveness and users' judgments of search performance, found that users' assessments of the effectiveness of their searches were the most significant factor in explaining precision and recall ratios. Yet there was no strong correlation between precision and recall ratios on the one hand and user characteristics (i.e., frequency of online catalog use and knowledge of online searching) or users' own assessments of search performance (i.e., search effectiveness, finding what is wanted) on the other. Thus, user characteristics and users' assessments of retrieval effectiveness are not adequate measures for predicting system performance as measured by precision and recall ratios.
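
For readers unfamiliar with the two measures, the following minimal Python sketch shows how precision and recall are conventionally computed for a single search. It is an illustration only, not the procedure used in the study: the function name `precision_recall`, the record identifiers, and the choice to report 0.0 when a search retrieves nothing are all assumptions made for this example, and the data are invented.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers of the records the system returned
    relevant:  identifiers of all records in the database judged relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant records that were actually retrieved

    # Assumed convention: a ratio with an empty denominator is reported as 0.0.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented data: 4 records retrieved, 8 relevant records exist, 2 overlap.
retrieved = ["r1", "r2", "r3", "r4"]
relevant = ["r1", "r2", "r5", "r6", "r7", "r8", "r9", "r10"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this invented case, half of the retrieved records are relevant while only a quarter of the relevant records are retrieved, which is the kind of gap between the two ratios that the abstract describes.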

The qualitative analysis showed that search failures due to zero retrievals and vocabulary mismatch occurred much less frequently in the online catalog studied. It was concluded that classification clustering and relevance feedback techniques that are available in some probabilistic online catalogs help decrease the number of search failures considerably.

Michael D. Cooper, Chair

December 1, 1992
