Sample size and power estimation for studies with health related quality of life outcomes: a comparison of four methods using the SF-36

  • Walters S

We describe and compare four methods for estimating sample size and power when the primary outcome of a study is a Health Related Quality of Life (HRQoL) measure. These methods are: (1) assuming a Normal distribution and comparing two means; (2) using a non-parametric method; (3) Whitehead's method, based on the proportional odds model; and (4) the bootstrap. We illustrate the methods using data from the SF-36. For simplicity, this paper deals with studies designed to compare the effectiveness (or superiority) of a new treatment with that of a standard treatment at a single point in time. The results show that if the HRQoL outcome has a limited number of discrete values (< 7) and/or the expected proportion of cases at the boundaries (scoring 0 or 100) is high, then we recommend Whitehead's method (Method 3). Alternatively, if the HRQoL outcome has a large number of distinct values and the proportion at the boundaries is low, then we recommend Method 1. If a pilot or historical dataset is readily available (to estimate the shape of the distribution), then bootstrap simulation (Method 4) based on these data will provide a more accurate and reliable sample size estimate than conventional methods (Methods 1, 2, or 3). In the absence of a reliable pilot dataset, bootstrapping is not appropriate, and conventional methods of sample size estimation or simulation will need to be used. Fortunately, with the increasing use of HRQoL outcomes in research, historical datasets are becoming more readily available. Strictly speaking, our results and conclusions apply only to the SF-36 outcome measure, and further empirical work is required to see whether they hold for other HRQoL outcomes. However, the SF-36 has many features in common with other HRQoL outcomes: it is multi-dimensional, has ordinal or discrete response categories with upper and lower bounds, and has skewed distributions. We therefore believe that these results and conclusions will be appropriate for other HRQoL measures.
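As a rough illustration of Methods 1, 3, and 4, the Python sketch below uses textbook formulas and a simple resampling loop. It is not taken from the paper. The function names, the pilot scores, the additive shift truncated to the 0-100 scale, and the large-sample z-test used inside the bootstrap are all assumptions made for illustration. The Whitehead formula is the commonly quoted equal-allocation form and takes as input the mean category proportions across the two groups.

```python
import math
import random
from statistics import NormalDist, mean, variance

def _z_sum_sq(alpha, power):
    """(z_{1-alpha/2} + z_{power})^2 for a two-sided test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2

def n_per_group_normal(effect_size, alpha=0.05, power=0.80):
    """Method 1 style: per-group n for comparing two means.
    effect_size is the standardised difference (mean difference / SD)."""
    return math.ceil(2 * _z_sum_sq(alpha, power) / effect_size ** 2)

def n_per_group_whitehead(mean_props, odds_ratio, alpha=0.05, power=0.80):
    """Method 3 style: per-group n for an ordinal outcome, equal allocation.
    mean_props are the category proportions averaged over both groups
    (assumed to be supplied by the caller and to sum to 1)."""
    log_or_sq = math.log(odds_ratio) ** 2
    spread = 1 - sum(p ** 3 for p in mean_props)
    return math.ceil(6 * _z_sum_sq(alpha, power) / (log_or_sq * spread))

def bootstrap_power(pilot, shift, n, alpha=0.05, reps=2000, seed=0):
    """Method 4 style: estimate power by resampling pilot scores.
    Assumption: the treatment adds `shift` points, truncated to 0-100,
    and groups are compared with a large-sample z-test on the means."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        control = rng.choices(pilot, k=n)  # resample with replacement
        treated = [min(100.0, max(0.0, x + shift))
                   for x in rng.choices(pilot, k=n)]
        se = math.sqrt(variance(control) / n + variance(treated) / n)
        if se > 0 and abs(mean(treated) - mean(control)) / se > z_crit:
            rejections += 1
    return rejections / reps

# Hypothetical pilot scores with a ceiling effect (not real SF-36 data).
pilot_scores = [0, 25, 50, 50, 75, 75, 100, 100, 100, 100]

if __name__ == "__main__":
    print(n_per_group_normal(0.5))
    print(n_per_group_whitehead([0.5, 0.5], 2.0))
    print(bootstrap_power(pilot_scores, shift=25, n=200, reps=500))
```

In practice one would repeat `bootstrap_power` over a grid of `n` values and take the smallest `n` that reaches the target power. Swapping the z-test for the analysis actually planned for the trial keeps the estimate consistent with that analysis.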

Author-supplied keywords

  • *Quality of Life
  • *Sample Size
  • *Sickness Impact Profile
  • Algorithms
  • Community Health Workers
  • Data Interpretation, Statistical
  • Female
  • Humans
  • Midwifery
  • Odds Ratio
  • Outcome Assessment (Health Care)/*methods/statisti
  • Postnatal Care
  • Psychometrics/instrumentation/*methods
  • Randomized Controlled Trials as Topic
  • Reproducibility of Results
  • Research Design
  • Social Support
