Search filters to identify geriatric medicine in Medline
- Esther M M van de Glind1,2,
- Barbara C van Munster1,3,
- René Spijker2,4,
- Rob J P M Scholten2,
- Lotty Hooft2
- 1Department of Internal Medicine, section of Geriatric Medicine, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
- 2Dutch Cochrane Centre, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
- 3Gelre Hospitals, Department of Geriatric Medicine, Apeldoorn, The Netherlands
- 4Medical Library, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
- Correspondence to Dr Lotty Hooft, Dutch Cochrane Centre, J1B-108-2, Academic Medical Center, PO Box 22660, 1100 DD Amsterdam, The Netherlands
- Received 18 April 2011
- Accepted 29 August 2011
- Published Online First 23 September 2011
Objectives To create user-friendly search filters with high sensitivity, specificity, and precision to identify articles on geriatric medicine in Medline.
Design A diagnostic test assessment framework was used. A reference set of 2255 articles was created by hand-searching 22 biomedical journals in Medline, and each article was labeled as ‘relevant’, ‘not relevant’, or ‘possibly relevant’ for geriatric medicine. From the relevant articles, search terms were identified to compile different search strategies. The search strategies served as the index test: the articles they retrieved were compared with the articles in the reference set to create the search filters.
Measures Sensitivity, specificity, precision, accuracy, and number-needed-to-read (NNR) were calculated by comparing the results retrieved by the different search strategies with the reference set.
Results The most sensitive search filter had a sensitivity of 94.8%, a specificity of 88.7%, a precision of 73.0%, and an accuracy of 90.2%. It had an NNR of 1.37. The most specific search filter had a specificity of 96.6%, a sensitivity of 69.1%, a precision of 86.6%, and an accuracy of 89.9%. It had an NNR of 1.15.
Conclusion These geriatric search filters simplify searching for relevant literature and therefore contribute to better evidence-based practice. The filters are useful to both the clinician who wants to find a quick answer to a clinical question and the researcher who wants to find as many relevant articles as possible without retrieving too many irrelevant articles.
- Evidence-based practice
- search filters
- sensitivity and specificity
- evidence-based medicine
The aging population is increasing demand on healthcare. Geriatric patients often have multiple chronic conditions, use many medications, and may have cognitive and functional impairments. A study on prevalence of morbidities in older people showed that 82% of patients aged 65 and over had at least one chronic condition, and 24% had four or more.1 Owing to deteriorating organ functions, they are prone to medication-related side effects.2 Consequently, care of older people is complicated. To provide the best care, doctors need to be able to find relevant information quickly and easily.
Information specific to geriatric patients is hard to find for several reasons. Geriatric medicine overlaps with, among others, psychiatry, internal medicine, and neurology, and therefore information relating to geriatrics is published in a wide range of journals. In addition, the amount of available information is increasing at a rapid rate, and time for searching is limited. Even though bibliographic databases often provide tools to improve the performance of searching (eg, Medical Subject Heading (MeSH) terms in Medline), using these correctly can be challenging. Moreover, indexing in Medline is not always consistent.3–5 Furthermore, there is a time lag between articles being published and being indexed with MeSH terms in Medline, with the result that recently published articles will not be found when only MeSH terms are used.
With a search strategy or ‘filter’ focused on geriatrics, clinicians, policy-makers, librarians, and information specialists would be able to find the answers to clinical questions more quickly than with a general search in the whole database. Searchers could, for example, combine ‘cardiac failure’ with a geriatrics search filter to improve the precision of retrieving articles relevant to the case at hand.
Researchers have previously developed Medline search filters for other branches of medicine. Search filters consist of MeSH terms and text words in titles and abstracts that are related to the subject of the intended search. Iansavichus et al developed search filters for renal information for Embase,6 Gehanno and colleagues created a search filter to identify studies on return-to-work,7 and Boluyt et al tested the sensitivity and precision of search filters for retrieving child health systematic reviews.8
In 2006, Kastner et al developed search strategies to identify relevant articles for several age-specific categories.9 These strategies were constructed from search terms concerning age groups, whereas we aim to create search filters that identify not only articles on older people, but also geriatric topics in general.
The objective of this study was to systematically develop and test various search strategies in order to create search filters that identify articles on geriatric medicine in Medline, and to determine their operating characteristics, namely sensitivity, specificity, precision, and accuracy. Sensitivity is defined as the number of relevant records retrieved divided by the total number of relevant records in the reference set. The relevant records that are missed are referred to as false negatives. A highly sensitive search will result in few relevant records being missed. Specificity is defined as the number of irrelevant records correctly not retrieved divided by the total number of irrelevant records in the reference set. The irrelevant records that are retrieved are referred to as false positives. Consequently, a highly specific search will result in few irrelevant records being retrieved. Precision is defined as the number of relevant records retrieved divided by the total number of records retrieved. This is also known as positive predictive value. Accuracy is defined as the number of records dealt with correctly by the search filter (relevant records retrieved plus irrelevant records not retrieved) divided by the total number of records in the reference set. The number-needed-to-read (NNR) (1/precision) is a measure of the usability of the filter, because it indicates how many records a searcher must screen for each relevant record retrieved.
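The definitions above can be written out as a short calculation. The following is a minimal sketch, not the spreadsheet used in the study; the class name, the function name, and the counts in the example are hypothetical and chosen only for illustration. It assumes that at least one record is retrieved and that the reference set contains both relevant and irrelevant records, so that no denominator is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts obtained by comparing one search filter with the reference set."""
    tp: int  # relevant records retrieved (true positives)
    fp: int  # irrelevant records retrieved (false positives)
    fn: int  # relevant records missed (false negatives)
    tn: int  # irrelevant records correctly not retrieved (true negatives)


def operating_characteristics(c: Confusion) -> dict[str, float]:
    """Compute the operating characteristics as defined in the text."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "nnr": 1 / precision,  # number-needed-to-read
    }


# Hypothetical counts, not taken from the study.
example = operating_characteristics(Confusion(tp=90, fp=10, fn=10, tn=190))
```

With these illustrative counts, 100 records are retrieved and 90 of them are relevant, so the precision is 0.9 and a searcher reads about 1.1 records for each relevant one.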
Our research questions were: (a) Which is the most specific filter? (b) Which is the most sensitive filter? (c) How usable are these search filters (low NNR)? We also compared the operating characteristics of our search filters with the only other existing age-specific search filter that we were aware of, that developed by Kastner et al.9
Construction of the reference set
We created search filters for Medline using the PubMed interface because this database and interface are freely accessible and widely used. We used a diagnostic test analytic framework to develop and test the geriatric search filters.6 To assess the performance of the search filters, we compared their retrieval with a reference standard compiled by hand-searching journal articles. We treated the search filters as diagnostic tests for relevant studies, and the manual review of the literature was considered to be the ‘gold standard’ or reference set.6 10 This reference standard consisted of articles from journals with high impact factor published in the UK and USA, chosen after consulting several geriatricians, neurologists, and psychiatrists. We included articles from these journals published in the last quarter of 2009 to lessen the risk that some articles in the reference set had not yet been indexed in Medline at the time that the searches were conducted.
Two of the authors (EMMvdG and BvM) hand-searched these journals independently of each other, and scored each article as ‘relevant’, ‘not relevant’ or ‘possibly relevant’ to geriatric medicine. Disagreements were discussed with a third author (LH). We categorized articles as relevant for geriatric medicine if they comprised topics that concerned the so-called ‘geriatric giants’ (incontinence, immobility, instability, and cognitive impairment),11 12 described a condition specific to old age, or were on a group of patients whose mean age was over 70. Some articles were not easy to classify as relevant or not. These were articles on a general topic of some relevance to geriatrics and, for example, articles about studies that included patients aged above 70 among others. We labeled these articles as possibly relevant. The remaining articles were labeled as not relevant for geriatric medicine.
The reference set was alphabetized by first author's family name, and we split it halfway into a development set and a validation set. The development set was used to find discriminating text words, phrases, and MeSH terms and to test the operating characteristics of the strategies. The validation set was used to test the strategies' performance independently.
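The splitting step can be sketched as follows. This is only an illustration of an alphabetical halfway split followed by the exclusion of letters; the record fields `first_author` and `publication_type`, the function name, and the sample records are hypothetical, and the study does not describe how ties or name variants were handled.

```python
def split_then_drop_letters(records):
    """Sort by first author's family name, split halfway, then drop letters."""
    ordered = sorted(records, key=lambda r: r["first_author"].casefold())
    midpoint = len(ordered) // 2
    first_half, second_half = ordered[:midpoint], ordered[midpoint:]

    def without_letters(half):
        # Letters are excluded only after the split, so the two sets
        # can end up with different sizes.
        return [r for r in half if r["publication_type"] != "letter"]

    return without_letters(first_half), without_letters(second_half)


# Hypothetical records for illustration only.
sample = [
    {"first_author": "Davis", "publication_type": "article"},
    {"first_author": "Adams", "publication_type": "article"},
    {"first_author": "Clark", "publication_type": "article"},
    {"first_author": "Baker", "publication_type": "letter"},
]
development, validation = split_then_drop_letters(sample)
```

Because the exclusion happens after the split, the development and validation sets need not contain the same number of articles.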
After splitting the set, we excluded the publication type ‘letters’, because they usually refer to a published original study already identified by the search strategies. This prevented a false increase in the prevalence of relevant articles in the reference set, which would have overestimated the precision; when the prevalence of relevant information in a database is high, the positive predictive value (precision) of finding relevant information is also high. All other publication types were included, so the reference set contained various article types labeled as relevant, possibly relevant or not relevant.
Creating search strategies
To create robust search filters, we needed to choose relevant text words and MeSH terms. Two different approaches were used to find discriminating search terms. First, using the program PubReminer,13 we performed a frequency analysis to find the most frequently occurring single-term text words and MeSH terms in the development set, in both the relevant and not relevant articles. This tool was originally developed to refine literature searches by providing the most frequently used keywords in the retrieved articles. In short, PubReminer submits a user query to PubMed and retrieves the full records for all citations matching the query. From these records, publication year, journal title, first author, MeSH terms, substances, country, and text words in titles and abstracts are extracted and used to generate frequency tables. These frequency tables are then presented in an interactive way allowing adaptation of the original query based on the frequency results.
Second, with the program TerMine from the National Centre for Text Mining (NaCTem),14 we analyzed the frequency of phrases in titles and abstracts.
By comparing the most frequently occurring text words, phrases, and MeSH terms in both the relevant and not relevant records, we compiled a list of discriminating search terms to construct the test search filters. A search term was considered discriminating either when it occurred exclusively in the list of relevant articles or when it occurred five times more often in the relevant records than in the records that were not relevant. We chose the factor five because this was a good cut-off in the results. Finally, we had a list of discriminating text words, a list of discriminating MeSH terms, and a list of discriminating phrases. These three lists were combined with the Boolean operator ‘OR’ to create a search filter with high sensitivity. The search filter with high specificity consisted of search terms identified by the frequency analysis that occurred exclusively in the list of relevant records. To improve the sensitivity, we added search terms that were not exclusively in the list of relevant records, but occurred more than 10 times more often in the relevant records than in the records that were not relevant.
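The selection rule can be illustrated with a small sketch. It assumes that raw occurrence counts are compared and that "five times more often" means at least five times; neither detail is specified in the text. The function name, the terms, and the counts are hypothetical and do not come from the online appendix.

```python
from collections import Counter


def discriminating_terms(relevant_counts, not_relevant_counts, ratio=5):
    """Return terms found only in relevant records, or at least `ratio`
    times more often in relevant than in not relevant records."""
    selected = set()
    for term, n_relevant in relevant_counts.items():
        n_not_relevant = not_relevant_counts.get(term, 0)
        if n_not_relevant == 0 or n_relevant >= ratio * n_not_relevant:
            selected.add(term)
    return selected


# Hypothetical frequency tables, for illustration only.
relevant = Counter({"frail": 12, "dementia": 30, "patients": 100})
not_relevant = Counter({"dementia": 4, "patients": 90})

sensitive_terms = discriminating_terms(relevant, not_relevant, ratio=5)
stricter_terms = discriminating_terms(relevant, not_relevant, ratio=10)

# Terms are combined with the Boolean operator OR, as in the text.
sensitive_query = " OR ".join(sorted(sensitive_terms))
```

Raising the ratio from 5 to 10 drops the term that occurs in both lists, which mirrors the stricter threshold used when terms were added to the specific filter.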
With a spreadsheet program, we compared the retrieved results of the different search strategies with the labeled records from the development set. Then we calculated the operating characteristics sensitivity, specificity, precision, and accuracy (table 1).
First, the possibly relevant articles were classified as not relevant records that the search strategies should not identify.
We wanted our search strategies to have either the sensitivity or specificity above 80%. Subsequently, the strategies were applied to the validation set to test their performance independently and to compare their operating characteristics with those of the development set.
Then we labeled the possibly relevant articles as relevant records that the search filters should identify and tested the operating characteristics of the search filters a second time.
Finally, we compared the performance of our search filters with the age-specific search strategies for geriatric medicine developed by Kastner et al. To do so, we tested their search strategy with the best optimized sensitivity and specificity (Aged.sh, not exploded) in our reference set, and compared its operating characteristics with those of our search filter with the highest specificity or sensitivity.
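The two ways of handling the ‘possibly relevant’ records described above can be sketched as a single comparison routine with a switch. This is an illustrative sketch rather than the spreadsheet procedure used in the study; the function name, the record identifiers, and the labels in the example are hypothetical.

```python
def confusion_counts(labels, retrieved, possibly_counts_as_relevant):
    """Compare retrieved record IDs with hand-assigned labels.

    labels: dict mapping record ID to 'relevant', 'possibly relevant',
            or 'not relevant'.
    retrieved: set of record IDs returned by the search strategy.
    Returns (tp, fp, fn, tn).
    """
    tp = fp = fn = tn = 0
    for record_id, label in labels.items():
        is_relevant = label == "relevant" or (
            possibly_counts_as_relevant and label == "possibly relevant"
        )
        is_retrieved = record_id in retrieved
        if is_relevant and is_retrieved:
            tp += 1
        elif is_relevant:
            fn += 1
        elif is_retrieved:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical labels and search result, for illustration only.
labels = {
    1: "relevant",
    2: "relevant",
    3: "possibly relevant",
    4: "not relevant",
    5: "not relevant",
}
retrieved = {1, 4}

strict = confusion_counts(labels, retrieved, possibly_counts_as_relevant=False)
broad = confusion_counts(labels, retrieved, possibly_counts_as_relevant=True)
```

In this toy example, record 3 is a true negative under the first classification and a false negative under the second, which is how reclassification can lower the measured sensitivity.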
The reference set consisted of 3012 articles from 22 journals. After exclusion of the letters, 2255 articles remained (table 2).
A total of 1060 articles formed the development set, and 1195 formed the validation set. In total, 567 (25.1%) articles contained information relevant to geriatric medicine according to our criteria, 142 (6.3%) were classified as possibly relevant, and the remaining 1546 (68.6%) were classified as not relevant. Some articles from the geriatric medicine journals were labeled as not relevant; the majority of these concerned aging studies in animals.
The frequency analysis with PubReminer yielded a total of 20 discriminating free-text search terms and 10 discriminating MeSH terms. With TerMine, we found five sets of multi-word terms. Using these terms, we constructed the sensitive search filter. This search filter had a sensitivity of 92.0% and a specificity of 86.9% in the development set, with similar results in the validation set. It identified 254 out of 276 relevant records correctly and missed only 22 records (false negatives). It had an NNR of 1.41 (table 3).
The search strategies can be found in the online appendix. The search filter with the highest specificity was constructed of search terms that were found exclusively in the list of relevant records. This filter had a specificity of 96.0%, a sensitivity of 69.6%, and an NNR of 1.16, with similar results in the validation set. This filter incorrectly identified only 31 of 784 not relevant records (false positives). To improve the sensitivity, we added several search terms to the search filter. The selected search terms all occurred at least 10 times more often in the relevant set than in the not relevant set. By doing this, we improved the sensitivity to 74.6% at the cost of a slightly lower specificity (95.7%). These operating characteristics were also similar in the validation set (table 3).
Thereafter, we compared the operating characteristics of our search strategies with those of the best optimized age-specific search filter developed by Kastner et al.9 In our reference set, their filter had a lower sensitivity (81.6%) and specificity (79.9%) compared with the performance in their original reference set (sensitivity 93.6%, specificity 82.7%). In our reference set, our best optimized filter (No 1) had better sensitivity, specificity and precision than that of Kastner et al (table 3).
Finally, we analyzed the performance of the strategies when they were expected to retrieve not only the indisputably relevant records but also the articles that were labeled as possibly relevant. This resulted in a slight decrease in the sensitivity of the strategies, while, in contrast, the specificity increased slightly or remained the same. Our search strategies remained usable (table 4).
Because of its high sensitivity (94.8%), our most sensitive search filter (No 1, see online appendix) is appropriate for the clinician or researcher who wishes to find as much relevant information as possible without missing too many articles. As the NNR is low (1.37), the search filter is also user friendly.
The filter with high specificity (No 2, see online appendix) has a higher than expected specificity (96.6%) with a somewhat lower NNR (1.15). Therefore this search filter is more suitable for the physician who has limited time and needs a quick answer to a clinical question. Which filter is best to use depends on the purpose of the searcher.
In our reference set, the search strategy of Kastner et al had a lower performance than in their original reference set. This is probably because of different inclusion criteria for relevant articles in our reference set. Kastner et al only included articles that concerned patients in the age category of choice, whereas we included articles that were more specific for geriatric medicine. These articles were probably not found by the Kastner search filter.
Our most sensitive search filter performed better than that of Kastner et al in our reference set. In the original article, the reported precision of their search filter was lower than in our reference set. This was to be expected, because their reference set contained a lower percentage of articles relevant to geriatric medicine (6.7%) than ours (25%). If our search filter were to be used in the complete Medline database, the precision would be lower too, because the percentage of geriatric information in Medline as a whole is also lower. A lower prevalence automatically lowers the positive predictive value of finding relevant information.
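The dependence of precision on prevalence can be made concrete with the standard relation between positive predictive value, sensitivity, specificity, and prevalence. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the sensitivity (94.8%) and specificity (88.7%) of the most sensitive filter would carry over unchanged to a set with a different prevalence, which has not been tested.

```python
def precision_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value for a given prevalence of relevant records."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Sensitivity and specificity of the most sensitive filter, as reported.
SENSITIVITY = 0.948
SPECIFICITY = 0.887

# Prevalence in our reference set (25%) and in that of Kastner et al (6.7%).
precision_at_25_percent = precision_from_prevalence(SENSITIVITY, SPECIFICITY, 0.25)
precision_at_6_7_percent = precision_from_prevalence(SENSITIVITY, SPECIFICITY, 0.067)
```

Under these assumptions, the expected precision falls from roughly 0.74 at a prevalence of 25% to roughly 0.38 at a prevalence of 6.7%, so the NNR would rise accordingly.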
Furthermore, we used two different cut-offs for the classification ‘relevant’ (true positive search result), which is reflected in the variability of the sensitivity and specificity of the search filters. When ‘possibly relevant’ records were reclassified as ‘relevant’, the criteria for relevancy became broader and the criteria for irrelevancy became stricter. This resulted in a decreased sensitivity (more false negative search hits) and an increased specificity. After reclassification of the ‘possibly relevant’ records as ‘relevant’, the performance of the Kastner filter and ours was more comparable.
Which search strategy is best to use depends on the aim of the search. In geriatrics, it is more useful to use our filter because it is more suitable for finding information on geriatric topics in general. However, if the aim of the search is to find articles that include older people directly or indirectly, the search strategies of Kastner et al are usable too.
Our study has a number of strengths. Because we developed the reference set after consulting specialists, we enhanced the chance that the search terms and MeSH terms we used are relevant for geriatrics. In addition, systematically searching for suitable search terms to construct the search filters improves the operating characteristics of the search filters and thereby the reliability. Furthermore, splitting of the reference set enabled us to test the search filter a second time; it appeared that the retrieval performance of the search filters remained excellent in an independent set of articles.
On the other hand, these strategies have some limitations that are worth noting. We assumed that all articles were indexed in Medline, but we did not check this for the whole set. This could have influenced the performance of the search filter. However, the testing of the performance of the filters was carried out in the last quarter of 2010, and we assume that the majority of articles were indexed by then. If a search filter had failed to identify a relevant article because it was not yet indexed, this would only have lowered the measured performance.
We split the reference set halfway using the alphabetized list of authors. This may have introduced bias because all articles with the same first author would fall into either the development set or the validation set. However, the percentages of relevant information in both sets are comparable, and therefore we assume the bias is limited.
Because we created the search filters using a reference set that consisted mainly of journals with a high prevalence of geriatric information, we may have overestimated the precision. However, a slight decrease in the precision when our search filters are applied to Medline is acceptable, because the precision in our test situation was very high.
The quality of any search depends on all components. Therefore the search for the topic of interest that is combined with our search filter should be methodologically sound. Also, the searcher should determine the methodological quality and appropriateness of the retrieved information before implementing it in daily practice.
Another consideration is the usability and implementation of our best performing search filters. They consist of multiple search statements and may therefore be complex for non-information professionals to use. For that reason, we want to provide these search filters on open access websites of international geriatric societies. In this way, searchers can easily copy and paste the search filter into Medline (eg, into PubMed Filters) and combine it with their topic of interest.
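Combining a topic search with a filter amounts to joining the two with the Boolean operator ‘AND’. The sketch below shows this for the ‘cardiac failure’ example mentioned in the introduction. The function name is hypothetical, and the filter string is a placeholder written for illustration only; it is not one of the filters from the online appendix.

```python
def combine_with_filter(topic_query: str, filter_query: str) -> str:
    """Join a topic search and a search filter with AND, keeping each part
    in parentheses so that the OR terms inside the filter stay grouped."""
    return f"({topic_query}) AND ({filter_query})"


# Placeholder filter, NOT the published geriatric search filter.
PLACEHOLDER_FILTER = '"aged"[MeSH Terms] OR geriatric*[tiab]'

combined_query = combine_with_filter("cardiac failure", PLACEHOLDER_FILTER)
```

The parentheses matter because the filter itself is a series of terms joined by ‘OR’; without them, the topic would be combined with only part of the filter.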
We conclude that our search filters contribute to a more evidence-based treatment for the geriatric patient, because finding relevant literature is the starting point of evidence-based practice. With the filters, searching Medline can readily become more efficient.
Future research should focus on the implementation of these search filters in daily practice and their contribution to decision making and medical knowledge.
We thank Dr J B B Koster for allowing us to use the program PubReminer.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.