
Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation.

by Geoffrey R Norman, Jeff A Sloan, Kathleen W Wyrwich
Medical Care (2003)

Abstract

BACKGROUND: A number of studies have computed the minimally important difference (MID) for health-related quality of life instruments. OBJECTIVE: To determine whether there is consistency in the magnitude of MID estimates from different instruments. METHODS: We conducted a systematic review of the literature to identify studies that computed an MID and contained sufficient information to compute an effect size (ES). Thirty-eight studies fulfilled the criteria, resulting in 62 ESs. RESULTS: For all but 6 studies, the MID estimates were close to one half a SD (mean = 0.495, SD = 0.155). There was no consistent relationship with factors such as disease-specific or generic instrument or the number of response options. Negative changes were not associated with larger ESs. Population-based estimation procedures and brief follow-up were associated with smaller ESs, and acute conditions with larger ESs. An explanation for this consistency is that research in psychology has shown that the limit of people's ability to discriminate over a wide range of tasks is approximately 1 part in 7, which is very close to half a SD. CONCLUSION: In most circumstances, the threshold of discrimination for changes in health-related quality of life for chronic diseases appears to be approximately half a SD.


Available from www.ncbi.nlm.nih.gov

Point/Counterpoint

Interpretation of Changes in Health-related Quality of Life: The Remarkable Universality of Half a Standard Deviation

GEOFFREY R. NORMAN, PHD,* JEFF A. SLOAN, PHD,† AND KATHLEEN W. WYRWICH, PHD‡

BACKGROUND. A number of studies have computed the minimally important difference (MID) for health-related quality of life instruments.

OBJECTIVE. To determine whether there is consistency in the magnitude of MID estimates from different instruments.

METHODS. We conducted a systematic review of the literature to identify studies that computed an MID and contained sufficient information to compute an effect size (ES). Thirty-eight studies fulfilled the criteria, resulting in 62 ESs.

RESULTS. For all but 6 studies, the MID estimates were close to one half a SD (mean = 0.495, SD = 0.155). There was no consistent relationship with factors such as disease-specific or generic instrument or the number of response options. Negative changes were not associated with larger ESs. Population-based estimation procedures and brief follow-up were associated with smaller ESs, and acute conditions with larger ESs. An explanation for this consistency is that research in psychology has shown that the limit of people's ability to discriminate over a wide range of tasks is approximately 1 part in 7, which is very close to half a SD.

CONCLUSION. In most circumstances, the threshold of discrimination for changes in health-related quality of life for chronic diseases appears to be approximately half a SD.

Key words: Quality of life; threshold; interpretation; MID; effect size.
(Med Care 2003;41:582–592)

*From McMaster University, Hamilton, Canada.
†From the Mayo Clinic, Rochester, Minnesota.
‡From Saint Louis University, St. Louis, Missouri.
Address correspondence and reprint requests to: Geoffrey R. Norman, PhD, Program for Educational Research and Development, Building T-13, McMaster University, Hamilton, ON, L8S 4K1, Canada. E-mail: Norman@mcmaster.ca
MEDICAL CARE Volume 41, Number 5, pp 582–592 ©2003 Lippincott Williams & Wilkins, Inc.

The interpretation of changes in health-related quality of life (HRQL) has been a research focus for more than a decade.1 More recently, researchers have been devising methods to identify a minimal level of change consistent with real, as opposed to statistically significant, benefit.2

The determination of the minimal level of real change for any HRQL scale can be a daunting task. It may potentially vary for different questionnaires, different diseases, and different demographic groups. Potential influences related to the questionnaire itself include the relative position of the individual on the HRQL scale (ie, floor and ceiling effects3), the number of steps on the scale, the number of items, and so forth. The intent of the analysis (making a diagnosis vs. testing the efficacy of an intervention) and the identity of the individual performing the assessment (patient vs. clinical staff) could potentially result in a different estimate of important change. Collectively, considering all these variables for each measure, this variation represents a prohibitive impediment to the successful implementation of HRQL endpoints in clinical research and practice.

However, there is some evidence that some of these various factors may have a relatively small
impact on the magnitude of the minimal difference. Certainly, some authors have noted that, over a series of studies with a diversity of conditions and age groups using disease-specific measures with 7-point response scales, the minimally important difference (MID) appears to fall consistently close to 0.5 points on the 7-point scale.1,4,5

It is the thesis of the present article that there is more commonality than difference in the variety of approaches. We will show that a multiplicity of methods, using several different scales from time tradeoff to visual analogue, with the number of items ranging from 1 to more than 50, in a diversity of chronic conditions, led to remarkably similar estimates. We will argue that this convergence is not accidental, but is a direct consequence of the limit of human discrimination ability. Finally, we will point out some circumstances that diverge from this consistency.

Approaches to Estimating the Minimally Important Difference

Perhaps the earliest criterion for identifying important change was devised by Cohen,6 who expressed differences as an effect size, the average change divided by the baseline SD. He stated that in the context of comparing group averages, a small effect size was 0.2, a medium was 0.5, and a large effect size was 0.8. His primary intent was to provide some basis for sample size calculations. However, although Cohen6 did indicate the criteria "as convention," they have frequently been referred to in health sciences literature to decide whether a change is important or unimportant, including the assertion that a moderate effect size of half a SD was typically important.

Some authors have disputed the definition by Cohen6 of a moderate effect size as arbitrary, although without substantive arguments for other values. Testa7 suggested that a threshold medium effect size for individual change should be set at 0.6 rather than 0.5, but this definition also appears arbitrary.
Feinstein8 suggested an alternative value of 0.56, which resulted from detailed mathematical derivations related to the correlation coefficient. Still, although there may be a case for these different values, they remain fairly close to a 0.5 effect size. Conversely, Sloan et al9 argued that an effect size of 0.5 corresponds to roughly the same value as the 0.5/7 shown with anchor-based methods1 by assuming that, if the entire range of any scale is considered to span 6 standard deviations, then the 0.5 effect size by Cohen6 would equate to approximately 8%, or almost exactly 0.5 on a 7-point scale.

By contrast, anchor-based methods explicitly examine the relationship between an HRQL measure and an independent criterion (or anchor) to elucidate the meaning of a particular degree of change. The most popular anchor-based approach uses an estimate of the MID, defined as "the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient's management."1 Typically, methods used to assess the MID from patients are based on a retrospective judgment about whether they have improved, stayed the same, or worsened over some period of time. The methods then establish a threshold based on the change in HRQL in patients who report minimal change, either for better or for worse.

An important consideration, in view of what follows, is one of terminology. Nowhere in the operationalization of the MID approach is there a consideration of importance, or of the tradeoff between benefit and side effects or costs, as appears in the frequently cited definition above. Thus, the criterion may be more appropriately thought of as a minimally detectable difference (MDD), not a minimally important difference.10,11 As such, it can be viewed as a threshold of detection, analogous to the just noticeable difference of psychophysics.
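
The effect-size arithmetic discussed above can be illustrated with a minimal Python sketch. It is not taken from the article: the function names and the sample scores are hypothetical, and the 6-SD span is the assumption attributed above to Sloan et al. The sketch computes an effect size as the mean change divided by the baseline SD, then compares half a SD, expressed as a fraction of a 6-SD range, with the 0.5/7 figure from anchor-based studies.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the baseline SD (sample SD is an assumption)."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def es_as_fraction_of_range(es, range_in_sd=6.0):
    """Express an effect size as a fraction of the full scale range,
    assuming the range spans `range_in_sd` standard deviations."""
    return es / range_in_sd


# Hypothetical scores for six patients on a 7-point scale (illustration only).
baseline = [3.0, 4.0, 5.0, 4.0, 6.0, 2.0]
follow_up = [3.5, 4.5, 5.5, 5.0, 6.5, 2.5]

es = effect_size(baseline, follow_up)

# Half a SD as a share of a 6-SD range, versus 0.5 points out of 7.
half_sd_fraction = es_as_fraction_of_range(0.5)  # 0.5 / 6
anchor_fraction = 0.5 / 7

print(f"effect size of the sample data: {es:.3f}")
print(f"0.5 SD as fraction of range:    {half_sd_fraction:.3f}")
print(f"0.5 points on a 7-point scale:  {anchor_fraction:.3f}")
```

The two fractions differ by roughly one percentage point, which is the sense in which the distribution-based and anchor-based values are described as roughly the same.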
Another class of anchor-based method involves longitudinal follow-up to determine whether subgroups can be identified that have clinically different outcomes, such as rehospitalization, relapse of cancer, Medical Research Council grading, or different interventions. Although these approaches clearly yield differences that are clinically important (CID), it is not at all clear that they are, in any sense, minimal.

A final approach, population-based, popularized by Stewart et al,12 identifies subpopulations with minimally different levels of health (for example, on hypertensive therapy vs. normal, 50 vs. 45 years old) and then looks at the differences in scores on a generic HRQL measure. Although these differences have external significance in terms of population differences, the link to clinical significance, or to any estimate of minimal difference at the individual level, is unclear.

To create a uniform nomenclature in the text to follow, we refer to a difference derived from an individual anchor-based method as minimal difference (MD), with the two subclasses defined above as
