Decide how wide you want your confidence interval to be. As the term is used here, the confidence interval is the plus-or-minus amount (often called the margin of error) around your sample's result that lets it represent the larger population. For example, if 25 percent of your sample has diabetes and your confidence interval is 5, then you can estimate, at whatever level of certainty you choose, that the true percentage of people in the larger population who have diabetes is between 20 percent (25 percent - 5 percent) and 30 percent (25 percent + 5 percent). You need to determine ahead of time how large your confidence interval can be and still be meaningful for your particular situation.
Decide how certain you want your confidence interval to be. In other words, decide whether you want your final result to be "I am 95 percent sure that the percentage of people between the ages of 20 and 30 with diabetes is between 20 percent and 30 percent" or "I am 99 percent sure that... is between 20 percent and 30 percent." Most researchers use a 95 percent confidence level, but 90 percent and 99 percent are also common.
Determine the Z score that corresponds to your chosen confidence level:
For a confidence level of 90 percent, the Z score equals 1.65 (more precisely, 1.645).
For a confidence level of 95 percent, the Z score equals 1.96.
For a confidence level of 99 percent, the Z score equals 2.58.
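The Z scores listed above can also be computed rather than looked up. The following is a minimal Python sketch, assuming a two-sided interval and a standard normal distribution; the function name z_score is only illustrative and is not part of the original steps.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided Z score for a confidence level given as a decimal (e.g. 0.95)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    # A two-sided interval leaves half of the remaining probability in each tail.
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_score(level):.3f}")
```

The computed values round to 1.645, 1.960, and 2.576, which agree with the listed Z scores to within rounding.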
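Use the following formula to calculate sample size:
ss = Z^2 * (0.25) / C^2
where Z = Z score determined in Step 3
and C = confidence interval expressed as a decimal (0.04 = +/- 4 percent)
The 0.25 is p * (1 - p) with p = 0.5, the most conservative assumption about the proportion you are measuring. Round the result up to the next whole number.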
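As a worked illustration of the formula above, here is a short Python sketch. The function name sample_size and the optional parameter p are assumptions added for this example; with the default p = 0.5, the term p * (1 - p) equals the 0.25 used in the formula.

```python
import math


def sample_size(z: float, c: float, p: float = 0.5) -> int:
    """Sample size for a large population: ss = Z^2 * p * (1 - p) / C^2.

    z: Z score for the chosen confidence level (e.g. 1.96 for 95 percent)
    c: confidence interval as a decimal (e.g. 0.05 for +/- 5 percent)
    p: assumed proportion; 0.5 is the most conservative choice (hypothetical parameter)
    """
    if c <= 0:
        raise ValueError("c must be positive")
    # Round up, because a fraction of a respondent cannot be sampled.
    return math.ceil(z ** 2 * p * (1 - p) / c ** 2)


print(sample_size(1.96, 0.05))  # 385
print(sample_size(1.96, 0.04))  # 601
```

Narrowing the confidence interval from 5 to 4 percent raises the required sample from 385 to 601, because the sample size grows with the inverse square of C.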
Determine the size of the overall population you want to make conclusions about. It is more important to know the scale of the population than the exact number. As long as the overall population is large enough that your sample size is less than 5 percent of the total population, the formula in Step 4 holds true. If you determine that your overall population of interest is small and that your sample size will be larger than 5 percent of the population, then you need to adjust your sample size calculation with the following formula:
new ss = ss / (1 + ((ss - 1) / pop))
where ss = sample size calculated in Step 4
and pop = size of the overall population
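The small-population adjustment can be sketched in Python as follows, reading the denominator as 1 + (ss - 1) / pop. The function name adjust_for_small_population is hypothetical, and applying the 5 percent check inside the function is a design choice made for this example.

```python
import math


def adjust_for_small_population(ss: float, pop: int) -> int:
    """Adjust a sample size when it exceeds 5 percent of the population.

    ss:  sample size from the large-population formula
    pop: size of the overall population
    """
    if pop <= 0:
        raise ValueError("pop must be positive")
    # The unadjusted sample size holds when it is at most 5 percent of the population.
    if ss <= 0.05 * pop:
        return math.ceil(ss)
    # Otherwise apply: new ss = ss / (1 + (ss - 1) / pop), rounded up.
    return math.ceil(ss / (1 + (ss - 1) / pop))


print(adjust_for_small_population(385, 2000))    # 323
print(adjust_for_small_population(385, 100000))  # 385
```

For a population of 2,000, a sample of 385 would be about 19 percent of the total, so the adjustment applies and lowers the requirement to 323. For a population of 100,000, the sample is well under 5 percent and is left unchanged.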