Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

Datenbanksysteme in Business, Technologie und Web (BTW) P-144, 107-116 (2009).

Gesellschaft für Informatik, Bonn

Copyright © Gesellschaft für Informatik, Bonn


Easy Tasks Dominate Information Retrieval Evaluation Results

Th. Mandl


The evaluation of information retrieval systems involves creating potential user needs (topics) for which systems try to find relevant documents. The difficulty of these topics varies greatly, and final system scores are typically based on the arithmetic mean over all topics. However, topics that are relatively easy to solve have a much larger impact on the final system ranking than hard topics. This paper presents a novel methodology to measure that effect. The results of a large evaluation experiment with 100 topics from the Cross Language Evaluation Forum (CLEF) allow the topics to be split into four groups according to difficulty. The easy topics have a larger impact, especially for multilingual retrieval. Nevertheless, the internal test reliability as measured by Cronbach's alpha is higher for more difficult topics. We show how alternative, robust measures such as the geometric average distribute the effect of the topics more evenly.
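As a rough illustration of why the choice of average matters, the following sketch compares the arithmetic and geometric mean of per-topic scores for two systems. It is not the paper's methodology: the score values, the system names, and the smoothing constant `eps` are all invented for this example.

```python
import math

def arithmetic_mean(scores):
    """Plain mean of per-topic scores; high-scoring (easy) topics dominate."""
    return sum(scores) / len(scores)

def geometric_mean(scores, eps=1e-5):
    """Geometric mean of per-topic scores.

    eps is an assumed floor that keeps a zero score from
    collapsing the whole product to zero.
    """
    logs = [math.log(max(s, eps)) for s in scores]
    return math.exp(sum(logs) / len(logs))

# Hypothetical per-topic scores: two easy topics followed by two hard topics.
system_a = [0.90, 0.80, 0.02, 0.01]  # strong on easy topics, weak on hard ones
system_b = [0.70, 0.60, 0.10, 0.08]  # weaker on easy topics, better on hard ones

for name, scores in [("A", system_a), ("B", system_b)]:
    print(name, round(arithmetic_mean(scores), 4), round(geometric_mean(scores), 4))
```

With these made-up numbers, system A ranks first under the arithmetic mean because of its lead on the easy topics, while system B ranks first under the geometric mean, which gives relative differences on hard topics more weight.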

ISBN 978-3-88579-238-3
