Sort your data into a frequency table. If your data is qualitative, sort it by the qualitative value, such as color. If your data is quantitative, use the method in section two to create the equivalent of a frequency table, known as a frequency distribution. For example, take a handful of M&M's and sort them by color to find:
Blue 10
Red 8
Yellow 9
Green 12
This table indicates that in your data set 10 M&M's are blue, eight are red, nine are yellow, and 12 are green.
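The tally above can be sketched in Python with the standard library's collections.Counter. The list named handful is a made-up stand-in for the M&M's, built only to match the counts in the table.

```python
from collections import Counter

# Hypothetical handful, constructed to match the counts in the table above.
handful = ["Blue"] * 10 + ["Red"] * 8 + ["Yellow"] * 9 + ["Green"] * 12

# Counter tallies how many times each color appears.
frequency = Counter(handful)

for color, count in frequency.items():
    print(color, count)
```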
Find the relative frequency. This is found by dividing the frequency of each group by the total number of items in the data set. For instance:
Blue 10
Red 8
Yellow 9
Green 12
This data set has a total of 39 objects, found by adding 10 plus 8 plus 9 plus 12. The relative frequencies would then be:
Blue 10 / 39 = 0.256
Red 8 / 39 = 0.205
Yellow 9 / 39 = 0.231
Green 12 / 39 = 0.308
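As a rough sketch, the same division can be done in Python. The dictionary name frequency is just an illustrative choice, and the values are rounded to three places only when printed.

```python
# Frequencies from the M&M example.
frequency = {"Blue": 10, "Red": 8, "Yellow": 9, "Green": 12}

# Total number of items in the data set.
total = sum(frequency.values())

# Relative frequency = group frequency / total.
relative = {color: count / total for color, count in frequency.items()}

for color, value in relative.items():
    print(f"{color} {value:.3f}")
```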
Add each group's relative frequency to the previous group's cumulative relative frequency. Group one has no previous group, so its cumulative relative frequency is simply its relative frequency:
Blue 0.256
Group two does have a previous group, so you find its cumulative relative frequency by adding its relative frequency to group one's cumulative relative frequency:
Red 0.205 + 0.256 = 0.461
Continue this method with groups three and four to get:
Yellow 0.461 + 0.231 = 0.692
Green 0.692 + 0.308 = 1
If you did this step correctly, the last group will have a cumulative relative frequency of 1, or very close to 1 allowing for rounding error.
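The running total can be sketched with itertools.accumulate from the Python standard library. This is only one way to do it, and the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies from the M&M example, in table order.
frequency = {"Blue": 10, "Red": 8, "Yellow": 9, "Green": 12}
total = sum(frequency.values())

# Relative frequency of each group.
relative = [count / total for count in frequency.values()]

# Each entry is the sum of all relative frequencies up to and including that group.
cumulative = list(accumulate(relative))

for color, value in zip(frequency, cumulative):
    print(f"{color} {value:.3f}")
```

Because this sketch accumulates the unrounded values, Red prints as 0.462 rather than the 0.461 you get by adding the rounded figures by hand. That small difference is the rounding error this step allows for.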
Calculate how many groups of data you need. You do this by finding the smallest whole number k that satisfies the inequality:
2^k > N
Where:
k = number of groups
N = number of data values
So, if you were given the data set {2, 5, 9, 19, 23, 34, 65, 87}, then N = 8 because there are eight pieces of data in the set. You need 2^k > 8. Since 2^3 = 8 is not greater than 8 but 2^4 = 16 is, k = 4. It is important to use the first whole-number value of k for which the inequality is true. You can find it by trial and error, starting at k = 1 and incrementing by 1 each time.
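The trial-and-error search can be sketched as a short loop. The function name number_of_classes is hypothetical and chosen only for this example.

```python
def number_of_classes(n):
    """Return the smallest whole number k such that 2**k > n."""
    k = 1
    # Keep incrementing k until the inequality 2^k > N holds.
    while 2 ** k <= n:
        k += 1
    return k


data = [2, 5, 9, 19, 23, 34, 65, 87]
k = number_of_classes(len(data))
print(k)  # 4, because 2**3 = 8 is not greater than 8 but 2**4 = 16 is
```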
Calculate the interval. The interval of each group is found by taking:
I >= (H-L)/k
Where:
I = the interval
H = the highest value in the data set
L = the lowest value in the data set
k = the number of groups previously found
So, for the data set {2, 5, 9, 19, 23, 34, 65, 87} and k = 4, you find I >= (87 - 2) / 4 = 21.25. Because of the inequality, you can round up to some degree, so you can take I = 22. However, you can round up too far. If you do, the last group will contain no data when you sort the data into classes, and you will need to recalculate I.
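A minimal sketch of the interval calculation follows. Rounding up to the next whole number with math.ceil is an assumption made here for convenience; the step only requires that I be at least (H - L) / k.

```python
import math

data = [2, 5, 9, 19, 23, 34, 65, 87]
k = 4  # number of groups found in the previous step

# Smallest allowed interval: (H - L) / k.
lower_bound = (max(data) - min(data)) / k

# Assumption: round up to the next whole number to get a convenient width.
interval = math.ceil(lower_bound)

print(lower_bound)  # 21.25
print(interval)     # 22
```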
Create the intervals. Start at the lowest value and add I to it to find the first interval. The next interval starts where the first left off and also spans I. Continue until you have k classes. So, for the data set {2, 5, 9, 19, 23, 34, 65, 87} with k = 4 and I = 22, you would create the following classes:
class 1: 2 up to 24
class 2: 24 up to 46
class 3: 46 up to 68
class 4: 68 up to 90
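The class boundaries above can be generated with a short helper. The name make_classes is hypothetical; each class is represented as a (lower, upper) pair.

```python
def make_classes(low, interval, k):
    """Build k consecutive classes of equal width, starting at the lowest value."""
    return [(low + i * interval, low + (i + 1) * interval) for i in range(k)]


classes = make_classes(low=2, interval=22, k=4)

for number, (lower, upper) in enumerate(classes, start=1):
    print(f"class {number}: {lower} up to {upper}")
```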
Sort the data and find the frequency. Do this by putting each piece of data in the correct class and counting how many pieces land in each. For the data set {2, 5, 9, 19, 23, 34, 65, 87} you would find:
class 1: 5
class 2: 1
class 3: 1
class 4: 1
This indicates that five pieces of data fall between 2 and 24, one between 24 and 46, one between 46 and 68, and one between 68 and 90.
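The counting can be sketched as follows. This example assumes that "up to" means each class includes its lower boundary and excludes its upper boundary, so a value of exactly 24 would be counted in class 2.

```python
data = [2, 5, 9, 19, 23, 34, 65, 87]
classes = [(2, 24), (24, 46), (46, 68), (68, 90)]

# Assumption: a value belongs to a class when lower <= value < upper.
frequency = [
    sum(1 for value in data if lower <= value < upper)
    for lower, upper in classes
]

for number, count in enumerate(frequency, start=1):
    print(f"class {number}: {count}")
```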
Find the relative frequency. This is found by taking the frequency in each group and dividing it by the total number of data in the data set, referenced as N.
class 1: 5/8 = 0.625
class 2: 1/8 = 0.125
class 3: 1/8 = 0.125
class 4: 1/8 = 0.125
A quick check is to add all of the values. If the sum is one, you did the step correctly. A value very close to one, such as 0.99 or 1.01, could indicate a rounding error and is an acceptable answer as well.
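The division and the quick check can be sketched together. Using math.isclose for the comparison is a choice made here so that tiny floating-point differences do not cause a false alarm.

```python
import math

# Class frequencies from the previous step.
frequency = [5, 1, 1, 1]
n = sum(frequency)  # N, the total number of data values

# Relative frequency = class frequency / N.
relative = [count / n for count in frequency]

for number, value in enumerate(relative, start=1):
    print(f"class {number}: {value}")

# Quick check: the relative frequencies should add up to 1.
sums_to_one = math.isclose(sum(relative), 1.0)
print(sums_to_one)
```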