
Evidence about evidence


Correspondence to:
Dr Barney C Reeves
London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK; barney.reeves@lshtm.ac.uk

The quality of evaluations of diagnostic test performance

Keywords: diagnostic accuracy studies; ophthalmic journals; quality of reporting

In this issue of the BJO (p 261), Siddiqui et al review the compliance of researchers with quality standards for evaluations of diagnostic test performance (DTP). "Standards" were originally set by the McMaster evidence based medicine (EBM) group1,2 and they have continued to evolve over recent years. Unfortunately, the standards appear to have had little impact since reviews of recent evaluations have shown that they tend to be of poor quality, in medicine generally and in ophthalmology and other specialties.3–6 The review of Siddiqui et al confirms this gloomy picture.

In contrast, during the same period, there has been substantial improvement in the quality and reporting of evaluations of the effectiveness of treatments. Why has research to evaluate DTP not benefited in a similar way from the EBM "movement"? Perhaps improving the quality of research about effectiveness was seen as a priority because it was perceived to be important to patients—the "bit" of health care that makes them better—or because the resources wasted from using treatments that don’t work (and not using ones that do) was much easier for the public and media to appreciate. Perhaps the principles of high quality research to evaluate DTP are more difficult to grasp than controlled experiments to assess effectiveness. Or perhaps we can just blame Archie Cochrane!

Whatever the reason, prioritising research about effectiveness might be seen as paradoxical since it is difficult to optimise treatment without first knowing the diagnosis. It is also not clearly justified on an efficiency basis, since substantial (and increasing) amounts of healthcare resources are spent on diagnosis, with new and expensive diagnostic technologies emerging. And the diversity of evidence about DTP is often not appreciated—for example, patients’ responses to standard questions when taking a history and standardised observations of clinical signs all constitute diagnostic "test" information, the value of which can be quantified.7
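To make this concrete, here is a minimal sketch (in Python, with invented counts) of how the diagnostic value of any such "test"—even a single yes/no history question—can be quantified from a 2×2 table:

```python
# Minimal sketch: quantifying the diagnostic value of a "test" (here, any
# yes/no item of clinical information) from a 2x2 table.
# All counts below are hypothetical, chosen only for illustration.

def test_performance(tp: int, fp: int, fn: int, tn: int):
    """Return sensitivity, specificity, and likelihood ratios."""
    sensitivity = tp / (tp + fn)                    # P(test positive | disease present)
    specificity = tn / (tn + fp)                    # P(test negative | disease absent)
    lr_positive = sensitivity / (1 - specificity)   # odds multiplier for a positive result
    lr_negative = (1 - sensitivity) / specificity   # odds multiplier for a negative result
    return sensitivity, specificity, lr_positive, lr_negative

# Hypothetical study: 100 diseased and 200 non-diseased patients.
sens, spec, lr_pos, lr_neg = test_performance(tp=80, fp=30, fn=20, tn=170)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, "
      f"LR+={lr_pos:.1f}, LR-={lr_neg:.2f}")
```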

The relative neglect of evidence about DTP may, at last, be about to change. The Cochrane Collaboration has long appreciated the importance of such evidence—a methods group on the topic was registered in 1995—and, in 2003, the collaboration took the decision to develop a new database of systematic reviews of diagnostic test accuracy. This will be developed in parallel with the existing database of systematic reviews of the effectiveness of healthcare interventions.

This new review of ophthalmic tests might appear to suggest that things are improving compared with the situation during the 1990s.5 All evaluations scored some points, with scores ranging from 8–19/25 compared with 0–5/7 previously. However, although all STARD items are important, they are not all equally important. Failure to report some items may mislead a reader but does not necessarily invalidate the evidence. In contrast, poor compliance in reporting particular items leads (on average) to biased, optimistic estimates of DTP.4 Unfortunately, compliance with these items, about masking/blinding (item 11) and workup bias (item 16), was poorer than for others, with only 6/16 and 4/15 papers respectively judged to be compliant with the standard.

Reporting indeterminate results (and analysing them correctly) is also crucial, since decisions still need to be made about patients who give such results. Failure to comply will almost always cause researchers to overestimate DTP. This item was poorly reported as well (item 22: 5/16).
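A minimal sketch with invented counts shows why exclusion inflates apparent performance: if patients with indeterminate results are simply dropped, sensitivity rises, even though a clinical decision is still needed for those patients.

```python
# Minimal sketch (hypothetical counts): the effect of excluding
# indeterminate results on apparent sensitivity.
# Among 100 diseased patients: 80 test positive, 15 indeterminate, 5 negative.
positive, indeterminate, negative = 80, 15, 5

# Excluding indeterminates: sensitivity appears to be 80/85 = 0.94.
sens_excluded = positive / (positive + negative)

# Counting indeterminates as "test did not detect disease"—which is how
# they function when a decision must still be made: 80/100 = 0.80.
sens_counted = positive / (positive + indeterminate + negative)

print(f"apparent sensitivity, indeterminates excluded: {sens_excluded:.2f}")
print(f"sensitivity, indeterminates counted as misses: {sens_counted:.2f}")
```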


Reviews of evidence about DTP suggest that researchers, and journal editors, compartmentalise their knowledge. At last, the message about confidence intervals seems to have been learnt with respect to estimates of effect. Why, then, are estimates of DTP perceived to be immune (item 21: only 4/16 papers compliant in Siddiqui et al)?5,8
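Attaching a confidence interval to an estimate of DTP is no harder than for an estimate of effect. A minimal sketch using the Wilson score interval for a sensitivity estimate (counts are hypothetical):

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96):
    """Wilson score confidence interval for a proportion (default 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half_width, centre + half_width

# Hypothetical: 42 of 50 diseased patients tested positive.
low, high = wilson_ci(42, 50)
print(f"sensitivity = {42/50:.2f} (95% CI {low:.2f} to {high:.2f})")
```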

The STARD items illustrate the distinction between the quality of reporting and the quality of the research itself. This distinction is also true for randomised controlled trials (RCTs) (cf CONSORT quality standards9) but is less important, perhaps, because the design principles of RCTs and measures to protect against bias are now well known, relatively simple and, hence, straightforward for readers to appraise. This is not yet the case for evidence about DTP. Note the STARD item that requires researchers to describe how the study population was selected. This leaves the reader to judge the appropriateness of the population for the research question/context of interest, which is the key issue in determining the relevance of the evidence.10 The STARD initiative is a very important step forward but users of evidence of DTP need to remain vigilant and hone their appraisal skills.

Although requirements for a good evaluation (study design features to protect against bias, and analysis) are not widely appreciated, in other respects such evaluations are often relatively easy to conduct. Evaluations are typically based on cross sectional studies, often without any need for prolonged follow up. Studies often investigate tests for diagnosing rare conditions, which can cause difficulties in recruiting a representative population that includes sufficient individuals with the condition(s) of interest (also true for evaluations of screening accuracy). However, high quality evidence for common conditions, and very simple "tests" (see above), is often lacking. The lack of evidence about DTP represents an opportunity for medical researchers to make a significant contribution (www.carestudy.com).

Methodology for evaluating DTP is an evolving area. In a recent critique,11 the limitations of the current framework were laid bare and challenges for the future set out. The UK National Health Service recently prioritised the commissioning of a review of evidence about methods for evaluation of DTP when there is no gold standard, a problem that is not uncommon (www.publichealth.bham.ac.uk/nccrm/Invitations_to_tender.htm). This decision highlights the importance of DTP evidence for healthcare services.

REFERENCES

1. Jaeschke R, Guyatt GH, Sackett DL, for the Evidence-Based Medicine Working Group. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? JAMA 1994;271:389–91.

2. Jaeschke R, Guyatt GH, Sackett DL, for the Evidence-Based Medicine Working Group. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? JAMA 1994;271:703–7.

3. Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA 1995;274:645–51.

4. Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061–6.

5. Harper R, Reeves B. Compliance with methodological standards when evaluating ophthalmic diagnostic tests. Invest Ophthalmol Vis Sci 1999;40:1650–7.

6. Patel HRH, Garcia-Montes F, Christopher N, et al. Diagnostic accuracy of flow rate testing in urology. BJU Int 2003;92:58–63.

7. McAlister FA, Straus SE, Sackett DL. Why we need large, simple studies of the clinical examination: the problem and a proposed solution. Lancet 1999;354:1721–4.

8. Harper R, Reeves B. Reporting of precision of estimates for diagnostic accuracy: a review. BMJ 1999;318:1322–3.

9. The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomized trials. JAMA 2001;285:1987–91.

10. Harper RA, Henson D, Reeves BC. Appraising evaluations of screening/diagnostic tests: the importance of the study populations. Br J Ophthalmol 2000;84:1198–202.

11. Feinstein AR. Misguided efforts and future challenges for research on "diagnostic tests". J Epidemiol Community Health 2002;56:330–2.

Author: B C Reeves