Nonsteroidal antiinflammatory drugs (NSAIDs) were introduced in the 1960s and became the most widely prescribed class of drugs in the world, with more than 100 million prescriptions issued annually in the United States alone.1 NSAIDs inhibit cyclooxygenase (COX), which reduces pain and inflammation through the inhibition of prostaglandins. However, the COX enzyme is also present in gastric mucosa, where it stimulates gastroprotective prostaglandins. The identification of two isoforms, COX-1 and COX-2, and the recognition that antiinflammatory and analgesic effects are mediated through COX-2 inhibition — whereas the gastrointestinal toxic effects are linked to COX-1 inhibition — resulted in the development of selective COX-2 inhibitors that offered the potential to retain efficacy while reducing gastrointestinal adverse effects.2
Evidence of adverse cardiovascular outcomes in a placebo-controlled trial resulted in the withdrawal of the selective COX-2 inhibitor rofecoxib in 2004.3 On the basis of a small number of events, the results of another trial suggested that cardiovascular harm may result from the use of higher-than-approved doses of celecoxib.4 Subsequently, the Food and Drug Administration (FDA) allowed continued marketing of celecoxib, the sole remaining selective COX-2 inhibitor, but mandated a cardiovascular safety trial. In the Prospective Randomized Evaluation of Celecoxib Integrated Safety versus Ibuprofen or Naproxen (PRECISION) trial, we sought to assess cardiovascular, gastrointestinal, renal, and other outcomes with celecoxib as compared with two nonselective NSAIDs.
Trial Design and Oversight
PRECISION was a randomized, multicenter, double-blind, noninferiority trial involving patients who were at increased cardiovascular risk and had rheumatoid arthritis or osteoarthritis. Randomization was stratified according to the primary diagnosis (osteoarthritis or rheumatoid arthritis), aspirin use, and geographic region. Detailed methods for the trial have been published previously,5 and both the protocol and the statistical analysis plan are available with the full text of this article at NEJM.org. At each center, either a central institutional review board (Schulman IRB) or the local institutional review board approved the trial, and the patients provided written informed consent. A multidisciplinary executive committee supervised the trial, and an independent data and safety monitoring committee reviewed unblinded data to assess safety. The members of the committees are listed in the Supplementary Appendix, available at NEJM.org. The members of the executive committee agreed not to accept any financial payments from any maker of NSAIDs for the duration of the trial. The trial sponsor (Pfizer) participated in the design of the trial and in the writing of the protocol in collaboration with the executive committee and in consultation with the FDA; the sponsor also assisted with data collection and maintained the trial database. The sponsor shared operational roles with the Cleveland Clinic Coordinating Center for Clinical Research (C5Research) and several contract research organizations. After the conclusion of the trial, the database was transferred to C5Research for statistical analyses. The academic authors wrote the manuscript. The sponsor was allowed to review and comment on the manuscript, but the decision to publish and the final contents were determined by the academic authors, with no contractual limits on the right to publish.
All the authors had access to the final results, approved the manuscript, and assume responsibility for its accuracy and completeness and for the adherence of the trial and this report to the protocol.
Inclusion and Exclusion Criteria
We enrolled patients who were 18 years of age or older and who, as determined by the patient and physician, required daily treatment with NSAIDs for arthritis pain; patients whose arthritis pain was managed adequately with acetaminophen were not eligible. A key inclusion criterion was established cardiovascular disease or an increased risk of the development of cardiovascular disease (defined in the Supplementary Appendix). Other inclusion criteria and the exclusion criteria are provided in the protocol and in a previous publication.5
Patients were randomly assigned, in a 1:1:1 ratio, to receive celecoxib (100 mg twice a day), ibuprofen (600 mg three times a day), or naproxen (375 mg twice a day) with matching placebo. At subsequent visits, for patients with rheumatoid arthritis, investigators could increase the dose of celecoxib to 200 mg twice a day, the dose of ibuprofen to 800 mg three times a day, or the dose of naproxen to 500 mg twice a day for the treatment of symptoms. For patients with osteoarthritis, increases in the doses of ibuprofen and naproxen were permitted; however, regulatory dosing restrictions precluded dose escalation for celecoxib in these patients. Esomeprazole (20 to 40 mg) was provided to all patients for gastric protection. Investigators were encouraged to provide cardiovascular preventive management in accordance with local standards and guidelines. Patients who were taking low-dose aspirin (≤325 mg daily) were permitted to continue this therapy.
Adjudicated and Other Outcomes
The primary composite outcome, in a time-to-event analysis, was the first occurrence of an adverse event that met Antiplatelet Trialists Collaboration (APTC) criteria (i.e., death from cardiovascular causes, including hemorrhagic death; nonfatal myocardial infarction; or nonfatal stroke).6 A secondary composite outcome, major adverse cardiovascular events, included the components of the primary outcome plus coronary revascularization or hospitalization for unstable angina or transient ischemic attack. Secondary outcomes also included clinically significant gastrointestinal events. Tertiary outcomes included clinically significant renal events, iron deficiency anemia of gastrointestinal origin, and hospitalization for heart failure or hypertension. (Definitions are provided in the Supplementary Appendix.) Although it is not described in the protocol, the composite outcome of clinically significant gastrointestinal events or iron deficiency anemia of gastrointestinal origin was designated as the key gastrointestinal safety outcome before the trial data were unblinded. An independent committee of multidisciplinary specialists at C5Research who were unaware of the treatment assignments reviewed and adjudicated events. An assessment of the intensity of arthritis pain with the use of the Visual Analogue Scale for Pain (VAS) (scores range from 0 to 100 mm, with higher scores indicating worse pain) was a nonadjudicated secondary outcome; differences greater than 13.7 mm are considered to be clinically meaningful.7 The incidence of death from any cause was a prespecified tertiary outcome. Other prespecified outcomes are listed in the protocol and statistical analysis plan.
Naproxen was designated as the primary comparator for the assessment of the noninferiority of celecoxib. Noninferiority comparisons of celecoxib versus ibuprofen and of ibuprofen versus naproxen were also prespecified. Noninferiority required four criteria to be met; in the original design, a hazard ratio not exceeding 1.12 was required, with an upper limit of the one-sided 97.5% confidence interval of less than 1.33 in both the intention-to-treat population and the on-treatment population. The assessment of the on-treatment population included events that occurred while patients were taking the study drug and during the 30 days after discontinuation. The trial was event-driven, requiring 762 events to provide 90% power to determine noninferiority. Under the assumption of an annual event rate of 2% and a treatment discontinuation rate of 40%, the required sample size was estimated to be 20,000 patients. The observed event rate was lower, the discontinuation rate higher, and the enrollment rate slower than anticipated. At the recommendation of the data and safety monitoring committee and after consultation with the FDA, the protocol was amended to have the study provide 80% power, and the upper 97.5% confidence limit for noninferiority in the on-treatment population was modified to 1.40, which required 580 events in the intention-to-treat population and 420 events in the on-treatment population. The protocol prespecified a minimum follow-up time of 18 months, with censoring of data from event-free patients after 30 months in the intention-to-treat population and after 43 months in the on-treatment population.
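The event targets described above can be roughly reproduced with a standard approximation. The sketch below uses Schoenfeld's formula for the number of events in a two-group comparison with 1:1 allocation, assuming a true hazard ratio of 1 and a one-sided alpha of 0.025. Scaling by 3/2 to cover three equally sized groups is an assumption made here for illustration, not a step taken from the protocol, and the function names are hypothetical.

```python
from math import log
from statistics import NormalDist


def pairwise_events(margin, power, alpha_one_sided=0.025):
    """Schoenfeld approximation: events needed in a 1:1 two-group comparison
    to reject HR >= margin when the true hazard ratio is 1."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha_one_sided)
    z_beta = norm.inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(margin) ** 2


def three_group_events(margin, power):
    # Assumption: three equally sized groups with similar event rates, so any
    # one pairwise comparison sees about two thirds of all events.
    return 1.5 * pairwise_events(margin, power)


scenarios = [
    ("original design, limit 1.33, 90% power", 1.33, 0.90),
    ("amended, intention-to-treat, limit 1.33, 80% power", 1.33, 0.80),
    ("amended, on-treatment, limit 1.40, 80% power", 1.40, 0.80),
]
for label, margin, power in scenarios:
    print(f"{label}: about {three_group_events(margin, power):.0f} events")
```

Under these assumptions the approximation gives about 775, 579, and 416 events, which is close to the 762, 580, and 420 events stated in the design. Small differences are expected, because the exact method used for the trial is not reproduced here.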
A Cox proportional-hazards model with adjustment for stratification factors was used to calculate the hazard ratios and confidence intervals. A one-sided noninferiority P value of less than 0.025 was considered to indicate statistical significance for the primary end point, with no adjustment for multiple comparisons. P values for secondary analyses in the intention-to-treat population are reported for descriptive purposes; a two-sided P value of less than 0.05 was considered to indicate statistical significance, with no adjustment for multiple comparisons. For the on-treatment analyses, P values for noninferiority are reported for the primary APTC outcome, but P values are not reported for superiority comparisons. Additional details regarding the statistical analyses are provided in the Supplementary Appendix.
We screened 31,857 patients; a total of 24,222 patients underwent randomization at 926 centers in 13 countries between October 23, 2006, and June 30, 2014, and 141 were excluded from the analysis (106 were determined to be fraudulently enrolled, and 35 enrolled more than once), leaving 24,081 participants who could be included in the analysis. There were 8072 patients assigned to the celecoxib group (mean [±SD] daily dose, 209±37 mg), 7969 assigned to the naproxen group (852±103 mg), and 8040 assigned to the ibuprofen group (2045±246 mg). The characteristics of the patients at baseline were similar among the treatment groups.
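The enrollment figures can be checked with simple arithmetic. The short sketch below only restates the counts given in the text; the variable names are illustrative.

```python
screened = 31_857
randomized = 24_222
excluded = {"fraudulently enrolled": 106, "enrolled more than once": 35}
groups = {"celecoxib": 8072, "naproxen": 7969, "ibuprofen": 8040}

# Participants remaining after the exclusions described in the text.
analyzed = randomized - sum(excluded.values())
print("included in the analysis:", analyzed)
print("group sizes sum to that total:", sum(groups.values()) == analyzed)
print(f"randomized among those screened: {randomized / screened:.1%}")

# Share of the analyzed population in each group (target was 1:1:1).
for name, n in groups.items():
    print(f"{name}: {n / analyzed:.1%}")
```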
The mean durations of treatment and follow-up, respectively, were 20.3±16.0 and 34.1±13.4 months for all patients: 20.8±16.0 and 34.2±13.4 months in the celecoxib group, 20.5±15.9 and 34.2±13.3 months in the naproxen group, and 19.6±16.0 and 33.8±13.6 months in the ibuprofen group. During this 10-year trial, 68.8% of patients stopped taking the study drug, and 27.4% of patients discontinued follow-up; 2.5% of patients died, 8.3% withdrew consent in writing, 7.4% verbally expressed unwillingness to continue participation, and 7.2% were lost to follow-up before a final follow-up visit. Details regarding patient disposition, time to study-drug discontinuation, and time to nonretention in the trial are provided in Figs. S1, S2, and S3 in the Supplementary Appendix.
Primary APTC Outcome
In the intention-to-treat population (Table 2 and Figure 1), the primary APTC outcome occurred in 188 patients in the celecoxib group (2.3%), 201 in the naproxen group (2.5%), and 218 in the ibuprofen group (2.7%). The hazard ratio for this outcome in the celecoxib group, as compared with the naproxen group, was 0.93 (95% confidence interval [CI], 0.76 to 1.13; P<0.001 for noninferiority). The hazard ratio for celecoxib versus ibuprofen was 0.85 (95% CI, 0.70 to 1.04; P<0.001 for noninferiority), and the hazard ratio for ibuprofen versus naproxen was 1.08 (95% CI, 0.90 to 1.31; P=0.02 for noninferiority) (Table S1 in the Supplementary Appendix).
In the on-treatment population (Table 3 and Figure 1), the primary APTC outcome occurred in 134 patients in the celecoxib group (1.7%), 144 in the naproxen group (1.8%), and 155 in the ibuprofen group (1.9%). The hazard ratio in the celecoxib group, as compared with the naproxen group, was 0.90 (95% CI, 0.71 to 1.15; P<0.001 for noninferiority); for celecoxib versus ibuprofen, the hazard ratio was 0.81 (95% CI, 0.65 to 1.02; P<0.001 for noninferiority), and for ibuprofen versus naproxen, the hazard ratio was 1.12 (95% CI, 0.89 to 1.40; P=0.025 for noninferiority) (Table S2 in the Supplementary Appendix).
Celecoxib, as compared with either naproxen or ibuprofen, met all four prespecified noninferiority requirements (P<0.001 for noninferiority in both comparisons). Ibuprofen, as compared with naproxen, just met the noninferiority criteria (P=0.025).
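The four noninferiority requirements can be checked against the reported estimates for the two celecoxib comparisons. The sketch below applies the limits from the design (hazard ratio not exceeding 1.12; upper confidence limit below 1.33 in the intention-to-treat population and below the amended 1.40 in the on-treatment population). It also approximates the one-sided noninferiority P value with a Wald test, taking the standard error from the rounded 95% confidence interval, whose upper limit coincides with that of a one-sided 97.5% interval. This back-calculation is an approximation for illustration, not the Cox-model analysis used in the trial, and the function names are hypothetical.

```python
from math import log
from statistics import NormalDist

HR_LIMIT = 1.12
UPPER_LIMIT = {"intention-to-treat": 1.33, "on-treatment": 1.40}  # 1.40 as amended

# (hazard ratio, lower 95% limit, upper 95% limit) as reported in the text
reported = {
    "celecoxib vs naproxen": {
        "intention-to-treat": (0.93, 0.76, 1.13),
        "on-treatment": (0.90, 0.71, 1.15),
    },
    "celecoxib vs ibuprofen": {
        "intention-to-treat": (0.85, 0.70, 1.04),
        "on-treatment": (0.81, 0.65, 1.02),
    },
}


def meets_criteria(results):
    """True when the point estimate and the upper limit pass in both populations."""
    return all(
        hr <= HR_LIMIT and upper < UPPER_LIMIT[population]
        for population, (hr, _lower, upper) in results.items()
    )


def approx_noninferiority_p(hr, lower, upper, margin):
    """One-sided Wald P value for H0: HR >= margin (approximation from rounded CI)."""
    norm = NormalDist()
    se = (log(upper) - log(lower)) / (2 * norm.inv_cdf(0.975))
    return norm.cdf((log(hr) - log(margin)) / se)


for comparison, results in reported.items():
    print(comparison, "meets all four criteria:", meets_criteria(results))
    for population, (hr, lower, upper) in results.items():
        p = approx_noninferiority_p(hr, lower, upper, UPPER_LIMIT[population])
        print(f"  {population}: approximate one-sided P = {p:.5f}")
```

The ibuprofen versus naproxen comparison is left out on purpose: its on-treatment estimate (1.12) and upper limit (1.40) sit exactly on the thresholds after rounding, so the published figures are not precise enough to re-check that result.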
Major Adverse Cardiovascular Events and Mortality Outcomes
The results of the intention-to-treat analyses for the composite outcome of major adverse cardiovascular events and for the components of the outcome are reported in Table 2 and Figure 1. The hazard ratio for celecoxib versus naproxen was 0.97 (95% CI, 0.83 to 1.12; P=0.64), and the hazard ratio for celecoxib versus ibuprofen was 0.87 (95% CI, 0.75 to 1.01; P=0.06). In pairwise comparisons for the components of the primary outcome, the differences between celecoxib and naproxen and between celecoxib and ibuprofen were not significant. The hazard ratio for death from any cause was 0.80 for celecoxib versus naproxen (95% CI, 0.63 to 1.00; P=0.052) (Table 2 and Figure 1). The rate of nonfatal myocardial infarction was higher in the ibuprofen group than in the naproxen group (hazard ratio, 1.39; 95% CI, 1.01 to 1.91; P=0.04) (Table S1 in the Supplementary Appendix).
Gastrointestinal and Renal Outcomes
The results of the intention-to-treat analyses of gastrointestinal and renal outcomes are provided in Table 2 and Figure 1. The event rate for the composite outcome of serious gastrointestinal events was lower in the celecoxib group than in the naproxen group (hazard ratio, 0.71; 95% CI, 0.54 to 0.93; P=0.01) and was lower in the celecoxib group than in the ibuprofen group (hazard ratio, 0.65; 95% CI, 0.50 to 0.85; P=0.002). The hazard ratio for gastrointestinal events in the ibuprofen group versus the naproxen group was 1.08 (95% CI, 0.85 to 1.39; P=0.53). Serious renal events occurred at a significantly lower rate in the celecoxib group than in the ibuprofen group (hazard ratio, 0.61; 95% CI, 0.44 to 0.85; P=0.004), but the difference in the rate of this outcome in the celecoxib group versus the naproxen group was not significant (hazard ratio, 0.79; 95% CI, 0.56 to 1.12; P=0.19).
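The reported two-sided P values for these comparisons can be cross-checked against the hazard ratios and confidence intervals. The sketch below assumes a normal approximation on the log hazard-ratio scale and recovers the standard error from the rounded 95% confidence interval, so agreement with the published values is only approximate. The function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def approx_two_sided_p(hr, lower, upper):
    """Two-sided Wald P value for H0: HR = 1, from a hazard ratio and its 95% CI."""
    norm = NormalDist()
    se = (log(upper) - log(lower)) / (2 * norm.inv_cdf(0.975))
    z = log(hr) / se
    return 2 * (1 - norm.cdf(abs(z)))


# (label, hazard ratio, lower limit, upper limit, reported P) from the text
comparisons = [
    ("GI events, celecoxib vs naproxen", 0.71, 0.54, 0.93, 0.01),
    ("GI events, celecoxib vs ibuprofen", 0.65, 0.50, 0.85, 0.002),
    ("GI events, ibuprofen vs naproxen", 1.08, 0.85, 1.39, 0.53),
    ("Renal events, celecoxib vs ibuprofen", 0.61, 0.44, 0.85, 0.004),
    ("Renal events, celecoxib vs naproxen", 0.79, 0.56, 1.12, 0.19),
]

for label, hr, lower, upper, p_reported in comparisons:
    p = approx_two_sided_p(hr, lower, upper)
    print(f"{label}: approximate P = {p:.3f} (reported {p_reported})")
```

A confidence interval that excludes 1 corresponds to a two-sided P value below 0.05, which is the pattern seen for the three comparisons reported as significant.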
The rate of hospitalization for hypertension was significantly lower in the celecoxib group than in the ibuprofen group (hazard ratio, 0.60; 95% CI, 0.36 to 0.99; P=0.04) but was not significantly lower in the celecoxib group than in the naproxen group (Table 2). The results of analyses of quality of life and efficacy for the relief of arthritis symptoms are reported in Table S3 in the Supplementary Appendix. In the assessment of pain with the use of the VAS, a significant but small benefit was found for naproxen relative to celecoxib or ibuprofen; the change in VAS score from baseline was −9.3±0.26 mm for celecoxib, −9.5±0.26 mm for ibuprofen, and −10.2±0.26 mm for naproxen (P<0.001 for naproxen versus celecoxib, P=0.01 for naproxen versus ibuprofen). The analyses of the primary composite outcome among prespecified subgroups showed no significant interactions for any pairwise comparison, including among the subgroups that were defined by aspirin use at baseline (Fig. S4 in the Supplementary Appendix). Investigator-reported adverse effects that occurred in 3% or more of the patients in any treatment group are reported in Table S4 in the Supplementary Appendix.
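The size of the between-group differences in pain relief can be set against the 13.7-mm difference that is considered clinically meaningful. The sketch below uses only the mean changes reported in the text; the variable and function names are illustrative.

```python
CLINICALLY_MEANINGFUL_MM = 13.7  # threshold stated in the trial methods

# Mean change in VAS score from baseline (mm), as reported in the text.
change_mm = {"celecoxib": -9.3, "ibuprofen": -9.5, "naproxen": -10.2}


def difference_mm(drug_a, drug_b):
    """Absolute between-group difference in mean VAS change."""
    return abs(change_mm[drug_a] - change_mm[drug_b])


differences = {
    other: difference_mm("naproxen", other) for other in ("celecoxib", "ibuprofen")
}
for other, diff in differences.items():
    share = diff / CLINICALLY_MEANINGFUL_MM
    print(f"naproxen vs {other}: {diff:.1f} mm ({share:.0%} of the threshold)")
```

Both differences are under 1 mm, so they are statistically detectable in a trial of this size but far below the threshold for clinical relevance.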
The PRECISION trial was designed in the aftermath of the withdrawal of rofecoxib during a period of considerable scientific and public controversy about the cardiovascular safety of selective COX-2 inhibitors. Previous knowledge about the relative safety of selective or nonselective COX inhibitors was limited, because NSAIDs received initial approval on the basis of relatively small, short-term studies that typically had been designed to assess pain relief and general safety. The primary clinical concern was that celecoxib might be associated with adverse cardiovascular effects similar to those associated with rofecoxib. The PRECISION trial provides statistically strong evidence that the cardiovascular risk associated with moderate doses of celecoxib is not greater than that associated with nonselective NSAIDs. As compared with two widely used nonselective NSAIDs — naproxen and ibuprofen — celecoxib was associated with numerically fewer cardiovascular events, which resulted in noninferiority P values of less than 0.001. The trial results do not support the widely advocated belief that naproxen treatment, as compared with treatment with other NSAIDs, results in better cardiovascular outcomes.8
To establish noninferiority, the trial design required that prespecified criteria be met in both the intention-to-treat population and the on-treatment population. We selected this approach because these two alternative analyses provide complementary insights into drug safety. The intention-to-treat analysis is the only analysis that preserves the integrity of randomization, but it tends to dilute safety signals when patients do not adhere to the study treatment. The on-treatment analysis considers events that occur only while patients are actually taking the study drug, which can strengthen safety signals. Although both the intention-to-treat and the on-treatment analyses were used to assess noninferiority, superiority comparisons were performed with the intention-to-treat population. The on-treatment analyses are included to provide a complete accounting of outcomes, but the results in this population may have been influenced by between-group differences in rates of treatment discontinuation; therefore, these results are reported without P values and should be considered exploratory (Table 3).
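The difference between the two analysis populations comes down to which events are counted. The sketch below applies the on-treatment rule given in the statistical methods (events while taking the study drug or during the 30 days after discontinuation). The dates describe a hypothetical patient, the function name is illustrative, and treating the boundary days as inclusive is an assumption.

```python
from datetime import date, timedelta

ON_TREATMENT_WINDOW = timedelta(days=30)  # from the on-treatment definition


def counts_in_on_treatment(event_date, first_dose, last_dose):
    """True when an event falls between the first dose and 30 days after the last dose."""
    return first_dose <= event_date <= last_dose + ON_TREATMENT_WINDOW


# Hypothetical patient who stopped the study drug after about 10 months.
first_dose = date(2010, 1, 15)
last_dose = date(2010, 11, 20)

early_event = date(2010, 12, 10)  # 20 days after the last dose
late_event = date(2011, 6, 1)     # more than 6 months after the last dose

print(counts_in_on_treatment(early_event, first_dose, last_dose))  # True
print(counts_in_on_treatment(late_event, first_dose, last_dose))   # False
```

The later event would still be counted in the intention-to-treat analysis, which follows patients according to their assigned group regardless of adherence.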
We also included a broader outcome — major adverse cardiovascular events — as a secondary composite outcome to provide greater power to detect differences among the three treatments. Fewer major adverse cardiovascular events occurred in the celecoxib group than in the ibuprofen group, but the difference did not reach significance in the intention-to-treat population (P=0.06). The rate of death from any cause was lower in the celecoxib group than in the naproxen group, although the difference did not reach significance (P=0.052). We urge caution in interpreting these findings, because major adverse cardiovascular events was a secondary outcome and death from any cause a tertiary outcome, and these outcomes were not adjusted for end-point multiplicity; in addition, major adverse cardiovascular events included more subjective components than did the APTC outcome.
Although the primary purpose of the trial was to assess cardiovascular outcomes, we also adjudicated gastrointestinal and renal outcomes to provide a comprehensive safety evaluation. To compare the three drugs, we constructed a two-component composite of two adjudicated outcomes — clinically significant gastrointestinal events and iron-deficiency anemia of gastrointestinal origin. For this outcome, significantly fewer events occurred in the celecoxib group than in either the naproxen group or the ibuprofen group. These findings were expected on the basis of the theoretical gastrointestinal advantages of selective COX-2 inhibition. The differences were found despite esomeprazole, a proton-pump inhibitor, being provided for all patients, although we do not have information on adherence to this therapy. The rates of renal adverse events and hospitalization for hypertension were also significantly lower in the celecoxib group than in the ibuprofen group, although they did not differ significantly between the celecoxib group and the naproxen group. The pattern we found for investigator-reported adverse effects was similar to that for centrally adjudicated events, with a higher reported incidence of increased creatinine levels in the ibuprofen group than in the celecoxib group and a higher incidence of hypertension in both the naproxen group and the ibuprofen group, as compared with the celecoxib group (Table S4 in the Supplementary Appendix). Although naproxen-treated patients had a slightly greater reduction in pain, as assessed with the use of VAS scores, than did patients treated with celecoxib or ibuprofen, the differences were smaller than the 13.7-mm difference that is considered to be clinically meaningful.
The PRECISION trial had limitations. Adherence and retention were lower than in most trials that assess cardiovascular outcomes, which reflects the challenges of long-term treatment of a painful condition in patients who frequently experience frustration with unrelieved symptoms and switch therapies or leave the trial. Low levels of adherence and retention have also been found in previous pain studies.9 Although the similarity in the results for the intention-to-treat and on-treatment populations suggests that low adherence was unlikely to have influenced the principal conclusions, the high levels of nonretention make interpretation of the findings challenging. Although the rates of nonretention were similar for all three treatments, the possibility of informative censoring (i.e., the bias that is created when participants drop out of a study because of factors related to the study itself) cannot be ruled out. The large number of comparisons without adjustment for multiplicity increases the possibility of false positive findings.
The dose of celecoxib was limited by regulatory restrictions to 200 mg daily for most patients, which may have provided a potential safety advantage for celecoxib, although the mean doses for both nonselective NSAIDs were also submaximal. Three previous trials assessed higher doses of celecoxib (400 to 800 mg per day),4,10,11 one of which showed a significantly higher risk of cardiovascular events in association with the unapproved 800-mg dose than with placebo, although the trial included only a small number of events. Our results provide reassurance regarding the safety of moderate doses of celecoxib but not the safety of high doses of celecoxib. Although ibuprofen and naproxen have been reported to potentially interfere with the antiplatelet effects of aspirin,12 we found no statistical interaction for aspirin use (Fig. S4 in the Supplementary Appendix). However, the trial was not specifically designed to assess the effects of aspirin on the relative safety of NSAIDs. Although enrollment was stratified according to aspirin use to ensure equal distribution of aspirin use among the treatment groups, patients were not randomly assigned to receive or not receive aspirin.
The current results reflect the relative safety of only these three drugs and cannot provide insight into the effects of the more than two dozen other marketed NSAIDs, particularly because each of these drugs may have a unique safety profile. No inferences are possible regarding the effects of NSAIDs as compared with placebo or regarding the safety of intermittent treatment with low-dose over-the-counter preparations. For ethical reasons, a placebo comparison group was not feasible, since we required all patients and physicians to document that participants had required NSAID treatment for at least 6 months for adequate symptom relief. Acetaminophen was not selected as a comparator because previous studies had shown this drug to be ineffective for the treatment of patients with NSAID-dependent arthritis.13
In summary, the PRECISION trial showed the noninferiority of moderate doses of celecoxib, as compared with naproxen or ibuprofen, with regard to the primary APTC cardiovascular outcome. Celecoxib treatment also resulted in lower rates of gastrointestinal events than did either comparator drug and in lower rates of renal adverse events than did ibuprofen.