The reliability of a quality appraisal tool for studies of diagnostic reliability (QAREL)

Lucas, Nicholas
Macaskill, Petra
Irwig, Les
Moran, Robert
Rickards, Luke
Turner, Robin
Bogduk, Nikolai
Journal Article
Ngā Upoko Tukutuku (Māori subject headings)
quality appraisal
systematic review
evidence-based medicine
Lucas, N., Macaskill, P., Irwig, L., Moran, R., Rickards, L., Turner, R., and Bogduk, N. (2013). The reliability of a quality appraisal tool for studies of diagnostic reliability (QAREL). BMC Medical Research Methodology. 13(1) : 111.
Background
The aim of this project was to investigate the reliability of a new 11-item quality appraisal tool for studies of diagnostic reliability (QAREL). The tool was tested on studies reporting the reliability of any physical examination procedure. The reliability of physical examination is a challenging area to study given the complex testing procedures, the range of tests, and the lack of procedural standardisation.
Methods
Three reviewers used QAREL to independently rate 29 articles, comprising 30 studies, published during 2007. The articles were identified from a search of relevant databases using the following string: “Reproducibility of results (MeSH) OR reliability (t.w.) AND Physical examination (MeSH) OR physical examination (t.w.).” A total of 415 articles were retrieved and screened for inclusion. The reviewers undertook an independent trial assessment prior to data collection, followed by a general discussion about how to score each item. At no time did the reviewers discuss individual papers. Reliability was assessed for each item using multi-rater kappa (κ).
Results
Multi-rater reliability estimates ranged from κ = 0.27 to 0.92 across all items. Six items were recorded with good reliability (κ > 0.60), three with moderate reliability (κ = 0.41–0.60), and two with fair reliability (κ = 0.21–0.40). Raters found it difficult to agree about the spectrum of patients included in a study (Item 1) and the correct application and interpretation of the test (Item 10).
Conclusions
In this study, we found that QAREL was a reliable assessment tool for studies of diagnostic reliability when raters agreed upon criteria for the interpretation of each item. Nine out of 11 items had good or moderate reliability, and two items achieved fair reliability. The heterogeneity in the tests included in this study may have resulted in an underestimation of the reliability of these two items. We discuss these and other factors that could affect our results and make recommendations for the use of QAREL.
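The abstract reports a multi-rater kappa per item and groups the values into good, moderate, and fair bands. As an illustration only, the sketch below computes Fleiss' kappa, one common multi-rater kappa; the abstract does not say which multi-rater statistic the authors used, so that choice is an assumption. The function names `fleiss_kappa` and `reliability_band` and the example ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a fixed number of raters per subject.

    ratings: list of per-study lists, one category label per rater.
    categories: all labels a rater may assign (e.g. "yes", "no", "unclear").
    Assumption: Fleiss' kappa stands in for the paper's "multi-rater kappa".
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # For each study, count how many raters chose each category.
    counts = [Counter(row) for row in ratings]
    # Observed agreement per study, then averaged over studies.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall proportion of each category.
    p_j = [
        sum(c[cat] for c in counts) / (n_subjects * n_raters)
        for cat in categories
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        # Every rating fell in one category; kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


def reliability_band(kappa):
    """Map kappa to the bands quoted in the abstract.

    Values at or below 0.20 are not labelled in the abstract, so the
    final label here is a placeholder, not the authors' wording.
    """
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "below fair (band not given in the abstract)"


# Hypothetical ratings: three raters scoring one QAREL item on four studies.
example = [
    ["yes", "yes", "yes"],
    ["no", "no", "no"],
    ["yes", "yes", "no"],
    ["unclear", "unclear", "unclear"],
]
kappa = fleiss_kappa(example, ["yes", "no", "unclear"])
print(round(kappa, 2), reliability_band(kappa))
```

Running the function once per item, with one row per study and one entry per reviewer, would give per-item estimates of the kind the Results section summarises.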
BioMed Central Ltd
Copyright holder
© 2013 Lucas et al.; licensee BioMed Central Ltd.
Copyright notice
This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.