Simple guidelines for calculating efficient sample sizes in cluster randomized trials with unknown intraclass correlation (ICC) and varying cluster sizes.
A simple equation is given for the optimal number of clusters and sample size per cluster. Here, optimal means maximizing power for a given budget or minimizing total cost for a given power. The problems of cluster size variation and specification of the ICC of the outcome are solved in a simple yet efficient way.
The optimal number of clusters goes up, and the optimal sample size per cluster goes down, as the ICC goes up or as the cluster-to-person cost ratio goes down. The available budget, desired power, and effect size only affect the number of clusters and not the sample size per cluster, which is between 7 and 70 for a wide range of cost ratios and ICCs. Power loss because of cluster size variation is compensated by sampling 10% more clusters. The optimal design for the ICC halfway through the range of realistic ICC values is a good choice for the first stage of a two-stage design. The second stage is needed only if the first stage shows the ICC to be higher than assumed.
Efficient sample sizes for cluster randomized trials are easily computed, provided the cost per cluster and cost per person are specified.
The effects of interventions to improve health or change lifestyle are often evaluated with a cluster randomized trial, also known as a group randomized trial. In such a trial, organizations (clusters) are randomly assigned to one of the treatment conditions, and all persons sampled within a given cluster get the same treatment. Examples from primary care are patient-centered care of newly diagnosed diabetes and detection and treatment of depression in general practice. Examples from public health are smoking prevention and stress management in primary school. Cluster randomization is less efficient than individual randomization because outcome variation between clusters, reflected by the so-called intraclass correlation (ICC), increases the sampling error of the treatment effect estimate in a cluster randomized trial. However, such trials are needed for logistical reasons or to prevent treatment contamination. Individual assignment is impossible for treatments such as health promotion in classrooms. For other treatments, it may induce serious treatment contamination.
Because cluster randomized trials are less efficient but sometimes the only option, it is important to optimize their efficiency. This is the topic of this article. Sample sizes (number of clusters and number of persons per cluster) will be presented that minimize the sampling error of the treatment effect, thereby maximizing test power and precision of estimation, under the constraint of a given budget for sampling and measuring clusters and persons. Here, budget is in terms of money, but it can also be expressed as time or demands on participants. Equal sample sizes per cluster will first be assumed for simplicity and because they are the most efficient. Later, this assumption is relaxed. To prevent misunderstanding, we emphasize that this article is about sampling large clusters such as general practices and sampling persons from these clusters and not about sampling small clusters such as families and then including all their members. The outline of this article is as follows. First, the optimal sample size for a cluster randomized trial is given as a function of the ICC of the outcome and the costs per included cluster and per person. Second, an equation is given to calculate the sample size and the budget needed for a given power and effect size. Third, simple solutions to two problems are given: uncertainty about the ICC and varying cluster sizes. Finally, the theory is applied to a published trial.
Suppose we have a cluster randomized trial with K clusters of n persons per cluster and a quantitative outcome y like body mass index or a clinical questionnaire score. The data can be analyzed not only by a mixed (multilevel) regression but also by an unpaired t-test on the K cluster means, obtained by averaging individual outcomes within each cluster. The latter method is equivalent to mixed regression of the individual data if the sample size is the same for each cluster and there are no covariates. Varying cluster sizes and covariates are discussed later. In a trial, we want to estimate the treatment effect as precisely, and to test it with as much power, as possible. So a good criterion for the design efficiency is the variance (squared standard error [SE²]) of the estimated treatment effect, which, for two treatment arms of K/2 clusters each, is expressed as follows:
$$\mathrm{SE}^2 = \frac{4\sigma_y^2}{nK}\,\bigl[(n-1)\rho + 1\bigr] \qquad (1)$$
Here σy² is the total outcome variance, ρ is the ICC, and the factor in brackets is the design effect (DE).
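The variance criterion can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard form for two equally sized treatment arms, SE² = 4σy²[(n−1)ρ + 1]/(nK); the function names and the example values are hypothetical and not taken from the article.

```python
def design_effect(n: float, icc: float) -> float:
    """Design effect DE = (n - 1) * icc + 1 for clusters of n persons."""
    return (n - 1.0) * icc + 1.0


def se_squared(n: float, k: float, icc: float, var_y: float = 1.0) -> float:
    """Variance (SE^2) of the estimated treatment effect.

    Assumes k clusters in total, split equally over two arms, n persons
    per cluster, and total outcome variance var_y (assumed standard form).
    """
    return 4.0 * var_y * design_effect(n, icc) / (n * k)


# Hypothetical values: compare doubling the clusters with doubling cluster size.
base = se_squared(n=20, k=40, icc=0.05)
more_clusters = se_squared(n=20, k=80, icc=0.05)
bigger_clusters = se_squared(n=40, k=40, icc=0.05)
print(base, more_clusters, bigger_clusters)
```

In this sketch, doubling K halves SE², whereas doubling n gives a smaller reduction because the design effect grows with n.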
Increasing either the cluster size n or the number of clusters K decreases the SE and thus improves power and precision. Increasing n also increases the DE and is thus less effective than increasing K. On the other hand, increasing the number of clusters K may be very expensive. So the question is, “What is the best choice of n and K for a given trial?” This is addressed by optimal design theory: how to find the n and K that minimize the SE in Equation (1), thus maximizing power and precision, for a given total sampling cost? Or equivalently, which n and K minimize the total cost for a target SE, power, and precision? To find this optimal design, we need a function that relates sample size to costs. Assume that inclusion of a cluster into the study costs c units (of money, time, or demands), whereas inclusion of a person in an included cluster costs s units. The budget B needed for K clusters of n persons, ignoring those costs that do not depend on sample size, is then B = K(c + sn), where (c + sn) is the total sampling cost per cluster with sample size n. The optimal design minimizes the SE of the treatment effect as a function of n and K, given the budget constraint B = K(c + sn). Fig. 1 shows how the SE depends on the cluster size n for various ICC values. The cluster size that gives the smallest SE for a given ICC is the optimal design for that ICC. Note that the SE does not continue to decrease as n increases because, under the budget constraint, an increase of n implies a decrease of K. The SE is minimal for the following cluster size:
$$n = \sqrt{\frac{c}{s}\cdot\frac{1-\rho}{\rho}} \qquad (2)$$
where ρ is the ICC.
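The trade-off described above can be checked numerically. The sketch below assumes the budget function B = K(c + sn), the standard variance form SE² = 4σy²[(n−1)ρ + 1]/(nK), and the closed-form optimum n = √((c/s)(1−ρ)/ρ). It compares that optimum with a brute-force search over integer cluster sizes. All names and cost values are hypothetical.

```python
import math


def optimal_cluster_size(c: float, s: float, icc: float) -> float:
    """Closed-form optimal cluster size for cluster cost c and person cost s."""
    if not 0.0 < icc < 1.0:
        raise ValueError("icc must lie strictly between 0 and 1")
    return math.sqrt((c / s) * (1.0 - icc) / icc)


def se_squared_for_budget(n: float, budget: float, c: float, s: float,
                          icc: float, var_y: float = 1.0) -> float:
    """SE^2 when the whole budget is spent on clusters of size n."""
    k = budget / (c + s * n)  # number of clusters the budget allows
    return 4.0 * var_y * ((n - 1.0) * icc + 1.0) / (n * k)


# Hypothetical costs: a cluster costs 20 times as much as a person.
c, s, icc, budget = 20.0, 1.0, 0.05, 2000.0
n_opt = optimal_cluster_size(c, s, icc)
# Brute-force check over integer cluster sizes 1..200.
best_n = min(range(1, 201),
             key=lambda n: se_squared_for_budget(n, budget, c, s, icc))
print(round(n_opt, 2), best_n)
```

The budget itself does not enter the closed-form optimum; it only scales the number of clusters. Under this formula, a cost ratio of 5 with an ICC of 0.10 gives a cluster size of about 7, and a cost ratio of 50 with an ICC of 0.01 gives about 70.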
Inserting Equation (2) into Equation (1) gives the SE² for the optimal design and thus the smallest possible SE and largest possible power and precision, given the budget, sampling costs, outcome variance, and ICC. If that SE is still too large for sufficient power and precision, then the budget must be increased, resulting in more clusters rather than in more persons per cluster (Equation (2)). More specifically, given a cluster size of n persons, the number of clusters K needed for a power (1−γ), two-tailed type I error risk α, and effect size d = (μ1−μ2)/σy, where μ1−μ2 is the mean outcome difference between treatments and σy is the outcome standard deviation, is given by
$$K = 4\left(\frac{z_{1-\alpha/2} + z_{1-\gamma}}{d}\right)^2 \frac{(n-1)\rho + 1}{n} \qquad (3)$$
where ρ is the ICC and z with subscript p denotes the 100p-th percentile of the standard normal distribution.
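As a worked illustration of this calculation, the following sketch computes the number of clusters and the resulting budget. It assumes the normal-approximation form K = 4[(z₁₋α/₂ + z₁₋γ)/d]²[(n−1)ρ + 1]/n and the budget function B = K(c + sn). Rounding K up to an even number, so that both arms get the same number of clusters, is an added assumption, and all names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def clusters_needed(n: float, icc: float, effect_size: float,
                    alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded total number of clusters (normal approximation)."""
    z = NormalDist()  # standard normal distribution
    z_sum = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    design_effect = (n - 1.0) * icc + 1.0
    return 4.0 * (z_sum / effect_size) ** 2 * design_effect / n


def round_up_to_even(k: float) -> int:
    """Round up to the next even integer so both arms get k/2 clusters."""
    m = math.ceil(k)
    return m + (m % 2)


def budget_needed(k: int, n: float, c: float, s: float) -> float:
    """Sampling budget B = K * (c + s * n)."""
    return k * (c + s * n)


# Hypothetical inputs: n = 20, ICC = 0.05, d = 0.5, c = 20, s = 1.
k_exact = clusters_needed(n=20, icc=0.05, effect_size=0.5)
k = round_up_to_even(k_exact)
print(round(k_exact, 2), k, budget_needed(k, n=20, c=20.0, s=1.0))
```

Because the sketch uses normal quantiles, it may understate the number of clusters needed when K is small and the test is based on a t distribution.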
As Equation (1) and Fig. 1 show, the SE of the treatment effect increases with the ICC, and so a safe strategy is to use the optimal design for the largest realistic ICC based on published trials. However, this may require a large budget, as it follows from Equation (3) that K and thus B increase with the ICC. A less expensive and still safe choice is to assume an intermediate ICC like the midpoint of the assumed ICC range, for instance, 0.05 if the range is 0–0.10 as suggested by reviews of ICC values in primary care trials. This leads to a smaller B for a given power and effect size. This can then be used in a two-stage design as follows. First, we apply the optimal design and budget needed for the midpoint ICC scenario. Then, we estimate the ICC from the data to recalculate the number of clusters needed, leading to more clusters (which requires more budget) only if the ICC is higher than the midpoint. According to a recent review, the final analysis can usually be done on all data without correction for this interim look and sample size recalculation based on the ICC. This review concerned the unpaired t-test in a classic RCT, but it also applies to cluster randomized trials analyzed with a t-test on cluster means, which is equivalent to mixed regression of individual data. Furthermore, a simulation study by Lake et al. (Table 1) suggests that the present two-stage approach is safe in controlling the type I error risk and power.
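The two-stage logic can be sketched as follows. The sketch assumes the standard forms n = √((c/s)(1−ρ)/ρ) for the optimal cluster size and K = 4[(z₁₋α/₂ + z₁₋γ)/d]²[(n−1)ρ + 1]/n for the number of clusters. It also assumes that the cluster size chosen in stage 1 is kept in stage 2. The interim ICC estimate is treated as a given input, since how it is estimated is not covered here. All names and values are hypothetical.

```python
import math
from statistics import NormalDist


def optimal_cluster_size(c: float, s: float, icc: float) -> float:
    """Closed-form optimal cluster size (assumed standard form)."""
    return math.sqrt((c / s) * (1.0 - icc) / icc)


def clusters_needed(n: float, icc: float, effect_size: float,
                    alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded number of clusters (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    return 4.0 * (z_sum / effect_size) ** 2 * ((n - 1.0) * icc + 1.0) / n


def two_stage_plan(c: float, s: float, icc_mid: float, icc_estimated: float,
                   effect_size: float, alpha: float = 0.05,
                   power: float = 0.80) -> tuple[int, int, int]:
    """Return (cluster size, stage-1 clusters, extra stage-2 clusters)."""
    # Stage 1: optimal design for the midpoint ICC.
    n = max(2, round(optimal_cluster_size(c, s, icc_mid)))
    k_stage1 = math.ceil(clusters_needed(n, icc_mid, effect_size, alpha, power))
    # Stage 2: recalculate K with the interim ICC estimate, keeping n fixed.
    k_total = math.ceil(clusters_needed(n, icc_estimated, effect_size, alpha, power))
    return n, k_stage1, max(0, k_total - k_stage1)


# Hypothetical scenario: ICC range 0-0.10, midpoint 0.05, d = 0.3.
print(two_stage_plan(20.0, 1.0, 0.05, 0.03, 0.3))  # interim ICC below midpoint
print(two_stage_plan(20.0, 1.0, 0.05, 0.08, 0.3))  # interim ICC above midpoint
```

Extra clusters are requested only when the interim ICC estimate exceeds the midpoint, which mirrors the rule stated above.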
Abbreviation: ICC, intraclass correlation.
The last column shows the percentage extra budget needed for the actual design compared with the optimal design.
So instead of taking the number of clusters needed for the maximum possible ICC, we take the number needed for the midpoint ICC, thus saving costs if data analysis of the first stage confirms the midpoint ICC. To see how much can be saved, we computed the percentage extra budget needed for the one-stage design based on the maximum ICC relative to the first stage of the two-stage design based on the midpoint ICC. Fig. 3A shows that much can be saved especially if the ICC range or cost ratio is large. Of course, the actual savings depend on the ICC value obtained in the interim analysis. Fig. 3A shows the savings if the ICC turns out to be smaller than, or equal to, the midpoint so that no second stage is needed. If the ICC is larger, a second stage with extra clusters is needed, increasing the costs for the two-stage design. We therefore also computed the expected percentage budget increase for the one-stage maximum ICC design relative to the two-stage midpoint ICC design, assuming that all ICC values from 0 to the maximum are equally likely. The resulting percentages are plotted in Fig. 3B and are two-thirds of those in Fig. 3A.
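A percentage of this kind could be computed along the following lines. This is a sketch under stated assumptions, not a reproduction of Fig. 3A: it uses the standard forms of Equations (2) and (3), ignores rounding of n and K, and compares the budget of the optimal design for the maximum ICC with that of the optimal design for the midpoint ICC. The factor 4[(z₁₋α/₂ + z₁₋γ)/d]² is common to both budgets and cancels in the ratio. Function names and inputs are hypothetical.

```python
import math


def relative_budget(c: float, s: float, icc: float) -> float:
    """Budget of the optimal design for a given ICC, up to a factor
    (power, alpha, effect size) shared by all ICC values."""
    n = math.sqrt((c / s) * (1.0 - icc) / icc)  # optimal cluster size
    # K is proportional to DE / n, and the budget is K * (c + s * n).
    return (c + s * n) * ((n - 1.0) * icc + 1.0) / n


def percent_extra_budget(c: float, s: float,
                         icc_max: float, icc_mid: float) -> float:
    """Extra budget (%) of the maximum-ICC design over the midpoint-ICC design."""
    ratio = relative_budget(c, s, icc_max) / relative_budget(c, s, icc_mid)
    return 100.0 * (ratio - 1.0)


# Hypothetical inputs: cost ratio c/s = 20, ICC range 0-0.10, midpoint 0.05.
print(round(percent_extra_budget(20.0, 1.0, 0.10, 0.05), 1))
```

Under these assumptions the relative budget simplifies to (√(ρc) + √((1−ρ)s))², so the percentage depends only on the two costs and the two ICC values.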
Now, there is a chance that the maximum ICC is correct, and then we lose efficiency by choosing the midpoint ICC for the first stage because the optimal design (i.e., cluster size) for the first stage is then n