What differences are detected by superiority trials or ruled out by noninferiority trials? A cross-sectional study on a random sample of two-hundred two-arms parallel group randomized clinical trials
Background: The smallest difference to be detected in superiority trials or the largest difference to be ruled out in noninferiority trials is a key determinant of sample size, but little guidance exists to help researchers in their choice. The objectives were to examine the distribution of differences that researchers aim to detect in clinical trials and to verify that those differences are smaller in noninferiority compared to superiority trials. Methods: Cross-sectional study based on a random sample of two hundred two-arm, parallel group superiority (100) and noninferiority (100) randomized clinical trials published between 2004 and 2009 in 27 leading medical journals. The main outcome measure was the smallest difference in favor of the new treatment to be detected (superiority trials) or largest unfavorable difference to be ruled out (noninferiority trials) used for sample size computation, expressed as standardized difference in proportions, or standardized difference in means. Student t test and analysis of variance were used. Results: The differences to be detected or ruled out varied considerably from one study to the next; e.g., for superiority trials, the standardized difference in means ranged from 0.007 to 0.87, and the standardized difference in proportions from 0.04 to 1.56. On average, superiority trials were designed to detect larger differences than noninferiority trials (standardized difference in proportions: mean 0.37 versus 0.27, P = 0.001; standardized difference in means: 0.56 versus 0.40, P = 0.006). Standardized differences were lower for mortality than for other outcomes, and lower in cardiovascular trials than in other research areas. Conclusions: Superiority trials are designed to detect larger differences than noninferiority trials are designed to rule out. The variability between studies is considerable and is partly explained by the type of outcome and the medical context. A more explicit and rational approach to choosing the difference to be detected or to be ruled out in clinical trials may be desirable.