Data Interpretation Tricks

When collecting data, it is not always clear what correlations will exist or which variables will prove to be the most important. However, correctly interpreting the data to arrive at solid conclusions is the most important step. Fortunately, a combination of statistical and visual methods can make it easier to find meaningful patterns.
    Regression

    • Regression techniques use data points to determine the functional relationship between two variables. Linear regression, for example, finds the straight line that best fits the data. Although a line can be fitted to any data set, it is up to the researcher to determine when these techniques are appropriate. Calculating the correlation coefficient between two variables is often a good guide: a value close to +1 or -1 suggests that a straight line describes the data well, while a value near zero suggests that it does not.
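
      As a rough illustration of the idea, the sketch below fits a least-squares line and computes the correlation coefficient using only the Python standard library. The function name `linear_fit` and the data values are made up for this example; they are not taken from any particular study.

```python
from math import sqrt

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Made-up illustrative data lying roughly along a straight line
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept, r = linear_fit(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

      For this sample data the coefficient comes out very close to 1, which would suggest that a straight-line fit is reasonable. A coefficient near zero would be a hint to try a different model.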

    Residuals

    • If you think that your data lie along a particular curve, it can be helpful to make a residuals plot, showing the deviation between the expected curve and the actual data. If all the residuals are near zero, the prediction is probably valid. If the deviation is uniform, for instance if all the residuals hover around a value other than zero, then the prediction might need only a straightforward correction, such as a constant offset. More complicated patterns sometimes reveal nonlinear relationships or variables that have not been accounted for.
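
      A minimal sketch of the calculation behind a residuals plot is shown below. The expected curve `expected` and the data are hypothetical, chosen so that the residuals sit at a roughly constant value away from zero.

```python
def residuals(xs, ys, model):
    """Observed value minus the model's prediction at each x."""
    return [y - model(x) for x, y in zip(xs, ys)]

# Hypothetical expected curve, for illustration only
def expected(x):
    return 2.0 * x

# Made-up measurements
xs = [0, 1, 2, 3, 4]
ys = [0.6, 2.4, 4.5, 6.6, 8.4]

res = residuals(xs, ys, expected)
mean_res = sum(res) / len(res)

print([round(r, 2) for r in res])
print(f"mean residual = {mean_res:.2f}")
```

      Here the residuals cluster around a nonzero mean rather than around zero, which is the kind of uniform deviation that a constant correction to the expected curve might fix. Plotting `res` against `xs` with any charting tool gives the residuals plot itself.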

    Outliers

    • Sometimes every point in a data set will lie along a curve, except one. Outliers are points that are noticeably different from all the rest of the data. Simple mistakes are the first culprit: accidentally typing in an extra zero, for example, can make a data point far too large. Outliers are sometimes ignored during curve-fitting, but they shouldn't be discarded without investigation. The hole in the ozone layer was discovered by a scientist looking into a few outliers, which other scientists had been ignoring for years.
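
      One possible way to flag outliers for a closer look, rather than silently dropping them, is sketched below. It uses the median absolute deviation with the conventional 0.6745 scaling and a cutoff of 3.5. Both numbers are common rules of thumb and are assumptions here, as are the function name `flag_outliers` and the sample readings.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # No spread to compare against, so nothing can be flagged
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Made-up readings; the fourth looks like an extra digit was typed
readings = [10.2, 9.8, 10.1, 100.5, 9.9, 10.0]

suspects = flag_outliers(readings)
print("indices worth investigating:", suspects)
```

      The function only reports which points look unusual. Deciding whether a flagged point is a typing mistake or a real effect is still up to the researcher.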

    Dimensional Analysis

    • Most data interpretation tricks involve looking at a graph, but sometimes it isn't clear what data should be placed onto a graph in the first place. Dimensional analysis looks for dimension-free numbers and then uses them as independent variables.

      One of the most famous examples in physics is the Reynolds number, a dimensionless value involving velocity, density, viscosity and length, that predicts whether a fluid flow will be smooth or turbulent. Plotting any one of the other variables instead of the dimensionless Reynolds number produces graphs that seem to make no sense.
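
      As a small worked example, the Reynolds number is usually written as density times velocity times length, divided by dynamic viscosity. The sketch below computes it for two hypothetical pipe flows. The water properties are approximate room-temperature values, and the speeds and pipe diameter are invented for illustration.

```python
def reynolds_number(density, velocity, length, viscosity):
    """Re = density * velocity * length / dynamic viscosity (dimensionless).

    Units must be consistent, e.g. kg/m^3, m/s, m and Pa*s.
    """
    return density * velocity * length / viscosity

# Approximate properties of water at room temperature (assumed values)
WATER_DENSITY = 1000.0     # kg/m^3
WATER_VISCOSITY = 1.0e-3   # Pa*s

# Hypothetical flows through a pipe 2 cm in diameter
slow = reynolds_number(WATER_DENSITY, 0.01, 0.02, WATER_VISCOSITY)  # 1 cm/s
fast = reynolds_number(WATER_DENSITY, 1.0, 0.02, WATER_VISCOSITY)   # 1 m/s

print(f"slow flow: Re = {slow:.0f}")
print(f"fast flow: Re = {fast:.0f}")
```

      For pipe flow, the transition to turbulence is often quoted at a Reynolds number of a few thousand, so the slow case would be expected to stay smooth and the fast case to be turbulent. Changing the velocity, the pipe size, or the fluid shifts the result only through this single combined number, which is why it makes a useful axis for a graph.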
