Potential Consequences for Recruitment, Power, and External Validity of Requirements for Additional Risk Factors for Eligibility in Randomized Controlled Trials

2006, Volume 37, Issue 1

    Stroke Prevention Research Unit (S.C.H., P.M.R.), Department of Clinical Neurology, University of Oxford, United Kingdom
    Rudolf Magnus Institute of Neuroscience (A.A.), Department of Neurology, and Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, The Netherlands
    Department of Clinical Neurosciences (C.P.W.), Western General Hospital, Edinburgh, United Kingdom

Abstract

Background and Purpose— Eligibility criteria determine the external validity (generalizability) of the results of randomized controlled trials. To increase the number of outcome events, and hence statistical power, some recent stroke prevention trials have required additional vascular risk factors for eligibility.

Methods— To assess the merits of additional eligibility criteria in stroke prevention trials, we analyzed data from 3 trials and 1 hospital-referred series of patients with a transient ischemic attack or minor ischemic stroke. Patients were stratified according to 2 sets of additional risk factors similar to those used in recent trials (MATCH, SPORTIF and PRoFESS); risk of stroke, myocardial infarction, or vascular death was calculated in relation to the number of risk factors.

Results— Although the observed risk during follow-up did increase with the number of risk factors present (P<0.01 for both sets), the risks in patients with ≥1 risk factor were not substantially greater than those in all patients. Consequently, although the proportions of patients with no risk factors in the 4 cohorts differed substantially between the 2 sets of eligibility criteria (21% to 28% versus 56% to 73%), in neither case could their exclusion be justified on statistical grounds.

Conclusions— The degree of patient selection introduced by use of additional vascular risk factors as eligibility criteria for trials can differ substantially between apparently similar sets of risk factors. Given that the potential for additional eligibility criteria to undermine generalizability and prolong recruitment outweighs any benefits in terms of statistical power, the exclusion of patients with no risk factors is difficult to justify.

Key Words: randomized controlled trials; risk factors; secondary prevention

Introduction

Randomized controlled trials (RCTs) are the most reliable methods of determining the effects of treatment. They must be internally valid (ie, design and conduct must eliminate the possibility of bias),1,2 but to be clinically useful, the results must also be relevant to a definable group of patients in a particular clinical setting; this is generally termed external validity or generalizability. Lack of external validity is the most frequent criticism by clinicians of randomized trials and systematic reviews.3–8

Often the most important determinants of the external validity of the results of a trial are the criteria used to determine whether or not patients are eligible.9–11 Eligibility criteria should not be too exclusive in a pragmatic trial if it is intended that the results should be generalizable to routine clinical practice. However, a review of 41 US National Institutes of Health trials found an average exclusion rate of 73%,12 and exclusion rates in stroke trials can be much higher. In acute stroke, 1 study found that of the small proportion of patients admitted to hospital sufficiently quickly to be suitable for thrombolysis,13 96% were ineligible based on the various other criteria of the relevant RCT.14 One center in another acute stroke trial had to screen 192 patients over 2 years to find a single eligible patient.15 In secondary prevention of stroke, trial eligibility criteria tend to be broader, but the effects of interventions can still be very dependent on patient characteristics,16,17 and so eligibility criteria deserve detailed consideration.

One recent innovation in some stroke prevention trials is the requirement for certain risk factors in addition to the presenting clinical syndrome for eligibility. For example, the MATCH trial required a previous stroke or myocardial infarction (MI), angina, peripheral vascular disease (PVD), or diabetes in addition to the recent transient ischemic attack (TIA) or stroke for eligibility,18 and the main SPORTIF trials required ≥1 of the following additional risk factors: hypertension; >75 years of age; previous TIA, stroke, or systemic embolism; left ventricular dysfunction; coronary artery disease; or diabetes.19,20 The PRoFESS study required either "≥55 years of age and ischemic stroke within 90 days before study entry" or ">50 years of age, ischemic stroke within 120 days before study entry, and ≥2 of the following additional risk factors: diabetes, hypertension, smoking, obesity, vascular damage (previous stroke, MI, or PVD), and end organ damage."21 Such additional eligibility criteria are intended to result in higher absolute risks of the trial outcomes and therefore greater statistical power and a reduced sample size. However, they will also decrease availability of patients over a given time period and potentially undermine external validity by resulting in a trial population that is particularly unrepresentative of patients seen in routine clinical practice. In MATCH, for example, diabetes was the easiest additional risk factor to document, and so 70% of recruited patients were diabetic,18 7 times more than in population-based studies of TIA/stroke patients.22

Data from previous studies can be used to provide insights into the design of new studies. For example, in acute stroke studies, the influences of different entry criteria on patient outcome have been studied.23 Given the uncertainty about the merits of additional eligibility criteria in stroke prevention trials, we analyzed data from 3 previous secondary prevention trials24–26 and 1 series of hospital-referred TIA patients27 to determine the distribution of additional risk factors among the study populations and to investigate the relationship between risk and required sample size for a hypothetical trial.

Methods

Patient Populations

The UK-TIA aspirin trial24 was a trial of long-term treatment with aspirin (1200 mg daily versus 300 mg daily versus placebo) in 2435 patients with a TIA or minor ischemic stroke. The Dutch TIA trial25 was a 2×2 factorial RCT involving 2 treatment comparisons in 3150 patients with a TIA or minor ischemic stroke. A total of 3131 patients were randomized to 30 mg aspirin daily versus 283 mg aspirin daily, and 1473 patients were also randomized to 50 mg atenolol daily versus placebo. The European Carotid Surgery Trial (ECST)26 was an RCT of carotid endarterectomy versus best medical treatment alone in 3018 patients with recently symptomatic carotid stenosis. Analyses in this article were based on data from the 1211 patients randomized to medical treatment only. The Oxford TIA cohort was a study of 469 hospital-referred patients with TIA.27

Analysis

Two sets of additional risk factors were used to stratify patients in each of the 4 cohorts into groups. Set 1 consisted of the 5 risk factors used as inclusion criteria in the MATCH trial: previous ischemic stroke (before qualifying event), previous MI, angina, symptomatic PVD, and diabetes.18 These risk factors were also among those used in the SPORTIF trials and PRoFESS, which also used hypertension as an additional risk factor for eligibility. We therefore studied a second set of 5 risk factors similar to the first set but excluded previous stroke and included hypertension. Hypertension was defined by the use of antihypertensive treatment at baseline or a baseline systolic blood pressure of ≥160 mm Hg or a baseline diastolic blood pressure of ≥90 mm Hg.
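The two risk factor sets can be sketched in code as follows. This is an illustrative sketch only, not the authors' analysis code: the `Patient` record and its field names are hypothetical, and the blood pressure thresholds are read as inclusive (at or above 160/90 mm Hg).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical baseline record; field names are illustrative only.
    previous_stroke: bool
    previous_mi: bool
    angina: bool
    pvd: bool
    diabetes: bool
    on_antihypertensives: bool
    systolic_mmhg: float
    diastolic_mmhg: float


def is_hypertensive(p: Patient) -> bool:
    # Treatment at baseline, or systolic/diastolic pressure at or above
    # 160/90 mm Hg (thresholds assumed to be inclusive).
    return (
        p.on_antihypertensives
        or p.systolic_mmhg >= 160
        or p.diastolic_mmhg >= 90
    )


def count_set1(p: Patient) -> int:
    # Set 1: the five MATCH-style risk factors.
    return sum([p.previous_stroke, p.previous_mi, p.angina, p.pvd, p.diabetes])


def count_set2(p: Patient) -> int:
    # Set 2: as set 1, with hypertension in place of previous stroke.
    return sum([is_hypertensive(p), p.previous_mi, p.angina, p.pvd, p.diabetes])
```

A patient with a previous stroke and normal blood pressure, and no other listed condition, therefore counts 1 risk factor under set 1 and 0 under set 2, which is how two apparently similar sets can select quite different populations.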

Within each of the 4 study populations, the proportion of patients with none, 1, 2, or >2 risk factors was determined. Within groups defined by these numbers of risk factors, 3-year risks of the composite outcome of any stroke, MI, or vascular death were calculated. This outcome was chosen because it is common to most stroke prevention trials as either a primary or secondary outcome.
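The grouping and risk calculation can be illustrated with a short sketch. The text does not state which estimator was used for the 3-year risks; the Kaplan-Meier estimator below is an assumption, chosen because it is a common way to handle patients who are censored before 3 years. All function names are hypothetical.

```python
def risk_group(n_risk_factors: int) -> str:
    # Map a risk factor count to the four groups used in the analysis.
    return str(n_risk_factors) if n_risk_factors <= 2 else ">2"


def km_risk(times, events, horizon=3.0):
    """Kaplan-Meier estimate of the risk of an event by `horizon` (years).

    times  -- follow-up time in years for each patient
    events -- True if follow-up ended with the composite outcome,
              False if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Process every patient whose follow-up ends at time t together.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
        at_risk -= n_leaving
    return 1 - survival
```

With no censoring before the horizon, `km_risk` reduces to the simple proportion of patients who had an event within 3 years.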

Univariate hazard ratios for the risk of any stroke, MI, or vascular death were calculated for each of the risk factors studied, using Cox proportional hazards models of the pooled data stratified by study. Heterogeneity between studies, with respect to the univariate effect of each risk factor, was assessed by fitting models including terms for study, the risk factor, and a study by risk factor interaction term.

In the 2 antiplatelet trials, we also explored the relationship between risk in patients included and required sample size by considering requirements for a hypothetical trial of a powerful new antiplatelet treatment. We calculated the required sample sizes to detect a relative risk reduction of 25% with statistical powers of 80% and 90% for the different subsets of patients with increasing numbers of risk factors. We assumed that patients would be randomized into 2 equally sized groups and that the risk of a vascular event in the placebo group for each risk factor subset was equal to the corresponding observed risk in the UK-TIA and Dutch TIA trials. A desired significance level of 5% was assumed. The required sample sizes were compared against the actual number of patients in each group.
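As a rough illustration of this kind of calculation, the sketch below uses the standard normal-approximation formula for comparing 2 proportions with equal allocation. The text does not say which formula the authors used, so this is an assumption, and the function name and the example control-group risk are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(control_risk, rrr=0.25, power=0.80, alpha=0.05):
    """Total patients (2 equal groups) needed to detect a relative risk
    reduction `rrr` from `control_risk` at two-sided significance `alpha`."""
    p1 = control_risk
    p2 = control_risk * (1 - rrr)  # risk in the treated group
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    per_group = (
        (z_alpha + z_beta) ** 2
        * (p1 * (1 - p1) + p2 * (1 - p2))
        / (p1 - p2) ** 2
    )
    return 2 * ceil(per_group)


if __name__ == "__main__":
    # Hypothetical 3-year control-group risks, for illustration only.
    for risk in (0.10, 0.15, 0.20, 0.30):
        print(risk, total_sample_size(risk, power=0.80),
              total_sample_size(risk, power=0.90))
```

Under this formula the required sample size falls as the control-group risk rises, and moving from 80% to 90% power multiplies it by about 1.34 whatever the risk. The sample sizes reported in the Results (for example, 2960 versus 2211 in the UK-TIA aspirin trial) show the same ratio, which is consistent with a formula of this form but does not confirm it.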

Results

Table 1 shows the baseline clinical characteristics and prevalence of the risk factors studied in the 4 study populations. Hypertension was the most prevalent of the risk factors, being present in around two thirds of the population in each study. Table 2 shows the univariate hazard ratios for each of the risk factors studied. All of the risk factors were statistically significantly associated with an increased risk of new vascular events. There was no statistically significant heterogeneity in effect of the risk factors between studies.

Figure 1 shows the proportions of patients with 0, 1, 2, or >2 risk factors and the corresponding observed 3-year risks of any stroke, MI, or vascular death during follow-up. For risk factor set 1, most patients in each population had none of the listed risk factors. In contrast, using the second set of risk factors, most patients had 1 risk factor. Although the numbers of patients in the Oxford TIA cohort were much smaller than those in the 3 trials, the patterns observed in this nontrial population were very similar to the others for both sets of risk factors.

For both sets of risk factors, the risk of vascular events observed during follow-up increased with increasing numbers of risk factors in all 4 populations (Figure 1). The trends in risk were statistically significant at the P<0.01 level in each case.

Figure 2 shows the 3-year risks of any stroke, MI, or vascular death for all patients (ie, ≥0 risk factors) compared with those for patients with ≥1 of the additional risk factors. For risk factor set 1 (those used in the MATCH trial), only around one third of patients in each population had ≥1 risk factor. For risk factor set 2, most patients (between 70% and 80%) had ≥1 risk factor. In both cases, only a modest increase in risk of vascular events during follow-up was observed in patients with ≥1 risk factor compared with all patients. For both sets of risk factors, the patterns were highly consistent across the 4 populations.

We used the data from the 2 antiplatelet trials (UK-TIA aspirin trial and Dutch TIA trial) to calculate the number of patients required to achieve powers of 80% and 90% in hypothetical trials for the risk factor groups shown in Figure 1. Figure 3 shows actual numbers of patients in each group together with the required numbers to detect a relative risk reduction of 25% in the 3-year risk of a vascular event. In both trials, the total number of patients (UK-TIA aspirin trial n=2435; Dutch TIA trial n=3150) was between that required for 80% power (UK-TIA aspirin trial n=2211; Dutch TIA trial n=2662) and that required for 90% power (UK-TIA aspirin trial n=2960; Dutch TIA trial n=3563). For risk factor set 1 (Figure 3i), although the number of patients required decreases with increasing risk, the availability of these patients (as shown by the actual numbers in these groups) decreases more rapidly. Consequently, if patients with ≥1 risk factor were selected, then there would be a large shortfall in the number of patients to achieve even 80% power, with actual and required numbers of 762 versus 1451 in the UK-TIA aspirin trial and 1108 versus 1637 in the Dutch TIA trial. This discrepancy between available and required patients would widen further if patients with >1 or >2 risk factors were selected. It would therefore take much longer to recruit sufficient numbers of these patients to achieve the desired statistical power.
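The size of this shortfall can be made concrete with the numbers quoted above. The sketch below makes one simplifying assumption that is not in the source: patients in every subgroup present at a constant rate, so the time needed to recruit a required sample scales with required divided by actual numbers. The variable and function names are hypothetical.

```python
# (actual patients, patients required for 80% power), as quoted in the text.
cohorts = {
    "UK-TIA aspirin trial": {"all": (2435, 2211), "set1_selected": (762, 1451)},
    "Dutch TIA trial": {"all": (3150, 2662), "set1_selected": (1108, 1637)},
}


def recruitment_multiplier(actual, required):
    # Recruitment time relative to the original trial's recruitment period,
    # assuming a constant presentation rate (an assumption of this sketch).
    return required / actual


for name, groups in cohorts.items():
    m_all = recruitment_multiplier(*groups["all"])
    m_sel = recruitment_multiplier(*groups["set1_selected"])
    actual, required = groups["set1_selected"]
    print(f"{name}: shortfall {required - actual} patients; "
          f"recruitment time x{m_all:.2f} (all) vs x{m_sel:.2f} (selected)")
```

On this assumption, restricting entry to the selected set 1 subgroup would roughly double the recruitment time needed for 80% power in the UK-TIA aspirin trial relative to recruiting all patients, even though fewer patients are required in total.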

A similar effect is observed for the second set of risk factors (Figure 3ii). For patients with ≥1 risk factor, the observed number is only just sufficient for 80% power in the UK-TIA and the Dutch TIA trials. As for risk factor set 1, the decline in the numbers of patients with multiple risk factors is more rapid than the decline in required patients to achieve a specified power, meaning that any potential benefits would be offset by increased recruitment times.

Discussion

Selection criteria for recruitment of patients into RCTs will influence the length of the recruitment period, the statistical power, and the external validity of the results. Observed distributions of risk factors and risk in patient subgroups from previous trials can be used to study the likely impact of additional entry criteria. We analyzed data on risk of vascular events in 4 populations of patients with TIA or minor ischemic stroke and related these observations to sample size requirements for hypothetical trials.

We considered 2 different sets of risk factors and investigated the likely consequences of selecting patients on the basis of the presence of ≥1 of these additional risk factors. In our 4 populations (and for both sets of risk factors), we found that although risk did increase with increasing number of risk factors, there was only a relatively small increase in risk in patients with ≥1 risk factor compared with all patients. As such, any corresponding gains in statistical power from excluding patients with 0 risk factors would only be modest. Another potential advantage of having smaller patient numbers in a trial would be reduced workload associated with follow-up. However, these benefits from recruiting higher-risk patients must be offset against 2 important factors: potential problems with loss of generalizability of the results and decreasing availability of eligible patients over the same time period. For sets of risk factors, such as the MATCH risk factors, in which the majority of patients in routine clinical practice have 0 risk factors, the external validity of the results could clearly be compromised. Recruitment can be further skewed by strict requirements for the definition and documentation of these risk factors before inclusion. In MATCH, for example, diabetes was the easiest additional risk factor to document, and so 70% of recruited patients were diabetic, between 6 and 20 times more than in the studies in our analysis.

The comparison of actual patient numbers with required sample sizes shows that the decline in availability of patients with increasing numbers of risk factors is much steeper than the decline in the required patients to achieve a specified statistical power (Figure 3). On balance, these data therefore suggest that selection of patients with ≥1 additional risk factor would at best not be worthwhile and at worst may undermine the external validity of the results. These findings were very consistent across the 4 different populations and for 2 sets of risk factors.

A potential criticism of these analyses would be the age of these data sets, with more effective preventive interventions becoming available over the last decade. Indeed, the age of the data sets was a factor in determining an appropriate definition for hypertension in these analyses. In using threshold values of 160/90 mm Hg for these data, collected when hypertension was treated less aggressively than it is today, we used a threshold that is likely to be functionally equivalent to a threshold of 140/90 mm Hg today. Although improved preventive interventions may mean that risks in future trials would be lower than those observed here, the observations concerning the trade-off between power and availability will remain valid. Moreover, we found that the results were consistent between the low-risk population in the Dutch TIA trial and the higher risk population in ECST, suggesting that the findings are robust to populations of differing baseline risk, and will therefore continue to be applicable as risks change in the future.

Acknowledgments

S.C.H. is funded by a Beit memorial fellowship, and P.M.R. is funded by the Medical Research Council.

References

1. Pocock SJ. Clinical Trials: A Practical Approach. Chichester, UK: John Wiley; 1983.

2. Friedman LM, Furberg CD, DeMets DL. Fundamentals of Clinical Trials. 3rd ed. New York, NY: Springer; 1998.

3. Black D. The limitations of evidence. J R Coll Physicians Lond. 1998; 32: 23–26.

4. Hampton JR. Size isn’t everything. Statist Med. 2002; 21: 2807–2814.

5. Caplan LR. Evidence based medicine: concerns of a clinical neurologist. J Neurol Neurosurg Psychiatry. 2001; 71: 569–574.

6. Evans JG. Evidence-based and evidence-biased medicine. Age Ageing. 1995; 24: 461–463.

7. Naylor C. Grey zones in clinical practice: some limits to evidence-based medicine. Lancet. 1995; 345: 840–842.

8. Feinstein AR, Horwitz RI. Problems in the "evidence" of "evidence-based medicine". Am J Med. 1997; 103: 529–535.

9. Gurwitz JH, Col NF, Avorn J. The exclusion of elderly and women from clinical trials in acute myocardial infarction. J Am Med Assoc. 1992; 268: 1417–1422.

10. Rothwell PM. External validity of randomised controlled trials: to whom do the results of this trial apply? Lancet. 2005; 365: 82–93.

11. Bugeja G, Kumar A, Banerjee AK. Exclusion of elderly people from clinical research: a descriptive study of published reports. BMJ. 1997; 315: 1059.

12. Charlson ME, Horwitz RI. Applying results of randomised trials to clinical practice: impact of losses before randomisation. BMJ. 1984; 289: 1281–1284.

13. Jorgensen HS, Nakayama H, Kammersgaard LP, Raaschou HO, Olsen TS. Predicted impact of intravenous thrombolysis on prognosis of general population of stroke patients: simulation model. BMJ. 1999; 319: 288–289.

14. National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. Tissue plasminogen activator for acute ischemic stroke. N Engl J Med. 1995; 333: 1581–1587.

15. LaRue LJ, Alter M, Traven ND, Sterman AB, Sobel E, Kleiner J. Acute stroke therapy trials: problems in patient accrual. Stroke. 1988; 19: 950–954.

16. Gorter JW for the Stroke Prevention in Reversible Ischaemia Trial (SPIRIT) and European Atrial Fibrillation Trial (EAFT) groups. Major bleeding during anticoagulation after cerebral ischemia: patterns and risk factors. Neurology. 1999; 53: 1319–1327.

17. Rothwell PM, Eliasziw M, Gutnikov SA, Warlow CP, Barnett HJM for the Carotid Endarterectomy Trialists Collaboration. Effect of endarterectomy for symptomatic carotid stenosis in relation to clinical subgroups and to the timing of surgery. Lancet. 2004; 363: 915–924.

18. Diener HC, Bogousslavsky J, Brass LM, Cimminiello C, Csiba L, Kaste M, Leys D, Matias-Guiu J, Rupprecht HJ; MATCH Investigators. Aspirin and clopidogrel compared with clopidogrel alone after recent ischaemic stroke or transient ischaemic attack in high-risk patients (MATCH): randomised, double-blind, placebo-controlled trial. Lancet. 2004; 364: 331–337.

19. Olsson SB; Executive Steering Committee on behalf of the SPORTIF III Investigators. Stroke prevention with the oral direct thrombin inhibitor ximelagatran compared with warfarin in patients with non-valvular atrial fibrillation (SPORTIF III): randomised controlled trial. Lancet. 2003; 362: 1691–1698.

20. Halperin JL; Executive Steering Committee, SPORTIF III, and V Study Investigators. Ximelagatran compared with warfarin for prevention of thromboembolism in patients with nonvalvular atrial fibrillation: Rationale, objectives, and design of a pair of clinical studies and baseline patient characteristics (SPORTIF III and V). Am Heart J. 2003; 146: 431–438.

21. http://www.profess-study.com (July 2005).

22. Rothwell PM, Coull A, Giles M, Howard SC, Silver L, Bull LM, Gutnikov SA, Edwards P, Mant D, Sackley CM, Farmer A, Sandercock PA, Dennis MS, Warlow CP, Bamford JM, Anslow P. Changes in stroke incidence, mortality, case-fatality, severity, and risk factors in Oxfordshire from 1981–2004: the Oxford Vascular Study. Lancet. 2004; 363: 1925–1933.

23. Uchino K, Billheimer D, Cramer SC. Entry criteria and baseline characteristics predict outcome in acute stroke trials. Stroke. 2001; 32: 909–916.

24. UK-TIA study group. The United Kingdom transient ischaemic attack (UK-TIA) aspirin trial: final results. J Neurol Neurosurg Psychiatry. 1991; 54: 1044–1054.

25. Dutch TIA Trial Study Group. A comparison of two doses of aspirin (30 mg vs. 283 mg a day) in patients after a transient ischemic attack or minor ischemic stroke. N Engl J Med. 1991; 325: 1261–1266.

26. European Carotid Surgery Trialists’ Collaborative Group. Randomised trial of endarterectomy for recently symptomatic carotid stenosis: final results of the MRC European Carotid Surgery Trial (ECST). Lancet. 1998; 351: 1379–1387.

27. Hankey GJ, Slattery JM, Warlow CP. The prognosis of hospital-referred transient ischaemic attacks. J Neurol Neurosurg Psychiatry. 1991; 54: 793–802.

Authors: Sally C. Howard, DPhil; Ale Algra, MD; Charles P. 2007-5-14