J Am Med Inform Assoc doi:10.1136/amiajnl-2011-000293
  • Research and applications

Leveraging medical thesauri and physician feedback for improving medical literature retrieval for case queries

  1. Martin S Kohn2
  1. 1Department of Computer Science, University of Illinois at Urbana–Champaign, Urbana, Illinois, USA
  2. 2IBM Thomas J Watson Research Center, Yorktown Heights, New York, USA
  1. Correspondence to Parikshit Sondhi, Department of Computer Science, University of Illinois at Urbana–Champaign, 201 North Goodwin Avenue, Urbana, IL 61801-2302, USA; sondhi1{at}illinois.edu
  • Received 31 March 2011
  • Accepted 19 February 2012
  • Published Online First 21 March 2012

Abstract

Objective This paper presents a study of methods for medical literature retrieval for case queries, in which the goal is to retrieve literature articles similar to a given patient case. In particular, it focuses on analyzing the performance of state-of-the-art general retrieval methods and improving them by the use of medical thesauri and physician feedback.

Materials and Methods The Kullback–Leibler divergence retrieval model with Dirichlet smoothing is used as the state-of-the-art general retrieval method. Pseudorelevance feedback and term weighting methods that leverage the MeSH and UMLS thesauri are proposed. Evaluation is performed on a test collection recently created for the ImageCLEF medical case retrieval challenge.
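As a rough illustration of the baseline named above, the following is a minimal sketch of Kullback–Leibler divergence scoring with Dirichlet-smoothed document language models. It is not the authors' implementation: the function names, the toy documents, and the smoothing value `mu=10.0` are assumptions chosen for illustration, and the query model is assumed to be the plain maximum-likelihood estimate with no feedback or thesaurus-based weighting.

```python
import math
from collections import Counter


def collection_model(docs):
    """Maximum-likelihood unigram model p(w | C) over the whole collection."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def kl_score(query_tokens, doc_tokens, collection_probs, mu=10.0):
    """Rank-equivalent form of -KL(query || doc): sum_w p(w|Q) * log p(w|D).

    p(w|D) uses Dirichlet smoothing:
        (count(w, D) + mu * p(w|C)) / (|D| + mu)
    The value of mu is an illustrative assumption, not a tuned setting.
    """
    q_counts = Counter(query_tokens)
    q_len = len(query_tokens)
    d_counts = Counter(doc_tokens)
    d_len = len(doc_tokens)
    score = 0.0
    for word, count in q_counts.items():
        p_coll = collection_probs.get(word, 0.0)
        if p_coll == 0.0:
            continue  # word unseen in the collection: skipped in this sketch
        p_doc = (d_counts[word] + mu * p_coll) / (d_len + mu)
        score += (count / q_len) * math.log(p_doc)
    return score


# Toy data (hypothetical): two tiny "articles" and a short case query.
docs = [["fever", "rash", "child"], ["fracture", "femur", "surgery"]]
query = ["fever", "rash"]
coll = collection_model(docs)
scores = [kl_score(query, d, coll) for d in docs]
```

In this toy setting the first document shares both query terms and therefore receives the higher (less negative) score; the second is scored only through the smoothing term.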

Results Experimental results show that a well-tuned state-of-the-art general retrieval model achieves a mean average precision of 0.2754, and that the proposed methods improve this by over 40%, to 0.3980.
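For readers unfamiliar with the metric, the sketch below shows one common way to compute average precision and mean average precision, and checks the relative improvement implied by the two reported scores. The function names and the toy ranking are assumptions for illustration; the exact evaluation tooling used in the study is not described here.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def relative_improvement(baseline, improved):
    """Fractional gain of `improved` over `baseline`."""
    return (improved - baseline) / baseline


# Toy ranking (hypothetical): relevant items at ranks 1 and 3.
toy_ap = average_precision(["a", "b", "c"], {"a", "c"})

# Scores reported in the abstract.
gain = relative_improvement(0.2754, 0.3980)
```

With the reported values, `gain` is roughly 0.445, which is consistent with the stated improvement of over 40%.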

Discussion The results on the ImageCLEF test collection, which is currently the best collection available for the task, are encouraging. There are, however, limitations due to the small size of the evaluation set. The analysis shows that further refinement of the methods is necessary before they can be useful in a clinical setting.

Conclusion Medical case-based literature retrieval is a critical search application that presents a number of unique challenges. This analysis shows that the state-of-the-art general retrieval models are reasonably good for the task, but the performance can be significantly improved by developing new task-specific retrieval models that incorporate medical thesauri and physician feedback.

Footnotes

  • Funding This paper is based upon work supported in part by an IBM faculty award and by the National Science Foundation under grants IIS-0347933, IIS-0713581, IIS-0713571 and CNS-0834709. The sponsors had no role in any of the following: the study design; the collection, analysis and interpretation of data; the writing of the report; or the decision to submit the paper for publication.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
