The probability distribution of a variable describes how probability is spread across the possible outcomes in a sample space. If an event has two possible outcomes, as in a coin toss, and each outcome is equally likely, then the probability distribution assigns 0.5 to heads and 0.5 to tails. The probabilities of all possible outcomes in a sample space always sum to one. The frequency distribution of a sample describes how often each outcome actually occurred. If the event is truly random, the relative frequencies will closely resemble the probability distribution, provided the sample size is large enough.
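The relationship between frequency and probability can be illustrated with a small simulation. The following is a minimal sketch, assuming a fair coin simulated with Python's pseudo-random generator; the function name `toss_frequencies` and the fixed seed are illustrative choices, not part of the text above.

```python
import random
from collections import Counter


def toss_frequencies(n_tosses, seed=0):
    """Return the relative frequency of heads and tails in n_tosses simulated fair tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter(rng.choice(["heads", "tails"]) for _ in range(n_tosses))
    return {outcome: counts[outcome] / n_tosses for outcome in ("heads", "tails")}


# Larger samples should give frequencies closer to the 0.5 / 0.5 distribution.
for n in (10, 1_000, 100_000):
    print(n, toss_frequencies(n))
```

With only 10 tosses the frequencies can sit well away from 0.5, while with 100,000 tosses they should be very close to it. In every case the two relative frequencies sum to one.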
The concepts of mean and standard deviation are essential to probability theory. The mean of a variable is the average value for that variable within the sample space. The standard deviation is a measure of the spread of the values for the variable around the mean. A distribution in which most of the values are tightly gathered around the mean will have a small standard deviation. A distribution in which the values are widely spread from the mean will have a large standard deviation.
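As a rough illustration, the two made-up data sets below share the same mean but differ sharply in spread. This sketch uses Python's standard `statistics` module; the values in `tight` and `wide` are invented for the example, and `pstdev` treats each list as a complete population.

```python
import statistics

# Two hypothetical data sets with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population standard deviation
    print(f"{name}: mean={mean}, standard deviation={spread:.2f}")
```

Both lists average 50, but the values in `wide` lie much farther from that average, so its standard deviation is far larger.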
A normal distribution is a bell-shaped distribution in which the probabilities for the variable are symmetrical about the mean, with values near the mean the most likely and values far from it increasingly rare. A value 10 points above the mean is exactly as likely as a value 10 points below it. Many variables follow a distribution that is approximately normal. The heights of individuals of a specific age, student ability and achievement, and recorded temperatures for a specific day of the year all tend to have the bell-shaped normal distribution pattern.
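The symmetry can be checked numerically. The sketch below assumes a hypothetical variable with a mean of 100 and a standard deviation of 15 (both values are invented for illustration) and uses `statistics.NormalDist` from the Python standard library. Because the variable is continuous, the comparison is made on the probability density at each point and on the probability of falling in equal-width ranges on either side of the mean.

```python
from statistics import NormalDist

# Hypothetical variable: mean 100, standard deviation 15 (assumed values).
scores = NormalDist(mu=100, sigma=15)

# Density 10 points above and 10 points below the mean.
density_above = scores.pdf(110)
density_below = scores.pdf(90)

# Probability of landing in the 10-point range on each side of the mean.
prob_above = scores.cdf(110) - scores.cdf(100)
prob_below = scores.cdf(100) - scores.cdf(90)

print(density_above, density_below)
print(prob_above, prob_below)
```

Each pair of printed values should agree up to floating-point rounding, which is what symmetry about the mean implies.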
Researchers test hypotheses by observing whether results stray from the pattern that a normal distribution predicts. In a normal distribution, about 68 percent of the results fall within one standard deviation of the mean, about 95 percent fall within two standard deviations, and about 99.7 percent fall within three. If a sample result falls more than two or three standard deviations from the mean, that is usually taken as evidence that some other factor is affecting the variable.
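These percentages can be reproduced from the normal distribution itself. The following is a minimal sketch using `statistics.NormalDist`; the helper names `share_within` and `z_score`, and the example figures passed to `z_score`, are assumptions made for illustration.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1


def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)


def z_score(value, mean, sd):
    """Number of standard deviations by which value differs from the mean."""
    return (value - mean) / sd


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")

# Hypothetical observation: 130 on a variable with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))
```

A result with a z-score beyond 2 or 3 in either direction lies in the region that a normal distribution makes rare, which is why such a result prompts a closer look for other factors.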