Write down the formula for determining the correlation coefficient:
r = (n*sum(xi*yi) - sum(xi)*sum(yi)) / (sqrt(n*sum(xi^2) - (sum(xi))^2) * sqrt(n*sum(yi^2) - (sum(yi))^2)).
Calculate xi*yi for each pair of numbers in turn, where i = 1, 2, ..., n. Also calculate each (xi)^2 and (yi)^2. Use the following data as an example:
x values: 0, 1, 2
y values: 2, 4, 6
xi*yi values: 0, 4, 12
(xi)^2 values: 0, 1, 4
(yi)^2 values: 4, 16, 36
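The per-pair calculations above can be sketched in Python. This is a minimal illustration using the example data; the variable names (x, y, xy, x_sq, y_sq) are chosen here for readability and are not part of the original steps.

```python
# Example data from this step
x = [0, 1, 2]
y = [2, 4, 6]

# Product of each (xi, yi) pair, taken in order
xy = [xi * yi for xi, yi in zip(x, y)]

# Square of each individual value
x_sq = [xi ** 2 for xi in x]
y_sq = [yi ** 2 for yi in y]

print(xy)    # [0, 4, 12]
print(x_sq)  # [0, 1, 4]
print(y_sq)  # [4, 16, 36]
```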
Add up each set of values to get the following sums: sum(xi), sum(yi), sum(xi*yi), sum(xi^2), sum(yi^2).
Data:
x values: 0, 1, 2
y values: 2, 4, 6
xi*yi values: 0, 4, 12
(xi)^2 values: 0, 1, 4
(yi)^2 values: 4, 16, 36
Sums:
sum(xi): 0 + 1 + 2 = 3
sum(yi): 2 + 4 + 6 = 12
sum(xi*yi): 0 + 4 + 12 = 16
sum(xi^2): 0 + 1 + 4 = 5
sum(yi^2): 4 + 16 + 36 = 56
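As a quick check, the same five sums can be computed in a few lines of Python. This is only a sketch for the example data; the names sum_x, sum_y, sum_xy, sum_x_sq and sum_y_sq are illustrative choices, not notation from the steps.

```python
# Example data
x = [0, 1, 2]
y = [2, 4, 6]

n = len(x)                                     # number of data points: 3
sum_x = sum(x)                                 # 3
sum_y = sum(y)                                 # 12
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # 16
sum_x_sq = sum(xi ** 2 for xi in x)            # 5
sum_y_sq = sum(yi ** 2 for yi in y)            # 56

print(n, sum_x, sum_y, sum_xy, sum_x_sq, sum_y_sq)  # 3 3 12 16 5 56
```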
Plug the numbers into the equation, including n, the number of data points. Solve the equation.
r = (n*sum(xi*yi) - sum(xi)*sum(yi)) / (sqrt(n*sum(xi^2) - (sum(xi))^2) * sqrt(n*sum(yi^2) - (sum(yi))^2))
r = (3*16 - 3*12) / (sqrt(3*5 - 3^2) * sqrt(3*56 - 12^2))
r = (48 - 36) / (sqrt(15 - 9) * sqrt(168 - 144))
r = 12 / (sqrt(6) * sqrt(24))
r = 12 / sqrt(6*24) = 12 / sqrt(144) = 12 / 12 = 1
Note that the points in this data set lie exactly on a straight line with a positive slope (y = 2x + 2), so you expect the correlation coefficient for this example to be exactly 1.
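The whole procedure can be collected into one small Python function. This is a minimal sketch, not a definitive implementation: the function name pearson_r and its error checks are assumptions added here for illustration. It combines the two square roots into a single sqrt of the product, as in the last line of the worked example, and divides by that entire denominator.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient from the sums formula (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")

    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x_sq = sum(xi ** 2 for xi in x)
    sum_y_sq = sum(yi ** 2 for yi in y)

    numerator = n * sum_xy - sum_x * sum_y
    # The two terms that appear under the square roots in the formula
    x_term = n * sum_x_sq - sum_x ** 2
    y_term = n * sum_y_sq - sum_y ** 2
    if x_term == 0 or y_term == 0:
        # All x values (or all y values) are equal, so the denominator is zero
        raise ValueError("r is undefined when all x or all y values are equal")

    # sqrt(a) * sqrt(b) == sqrt(a * b); the numerator is divided by the whole product
    return numerator / math.sqrt(x_term * y_term)


print(pearson_r([0, 1, 2], [2, 4, 6]))  # 1.0
```

Reversing the y values, as in pearson_r([0, 1, 2], [6, 4, 2]), gives -1.0, since those points lie exactly on a line with a negative slope.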