Techniques of Statistical Analysis

Despite the old adage about "lies, damned lies and statistics," statistical methods are vital for the analysis of any quantitative data. Statistical techniques help to summarize and make sense of what might otherwise appear as a confusing mass of numbers and codes.

The analysis usually starts with descriptive statistics, which present a summary of the data. In the main data analysis, researchers employ inferential statistics to construct models that allow them to go beyond the data and make statements about the wider population.
    Univariate Descriptive Statistics

    • The most useful descriptive statistics for a single variable (univariate statistics) present its distribution and measures of central tendency and dispersion. The distribution is the frequency with which each value appears in the data set, and it can be expressed either by listing how many times a value appears (for example, how many people in the data set were aged 18, 19, 20, 21 and so on) or by grouping values into categories (how many people were under 25, 25 to 34, 35 to 44 and so on). The measures of central tendency include the mean (average), the median (the central value which splits the data set exactly in half) and the mode (the most common value). The measures of dispersion include the range (the highest minus the lowest value), the standard deviation and the variance (both measures of average deviation from the mean).
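
      As a rough illustration, the sketch below computes these summaries in Python using only the standard statistics module; the list of ages is invented purely for the example.

        import statistics
        from collections import Counter

        # Invented data: ages of respondents in a hypothetical sample
        ages = [18, 19, 19, 21, 24, 25, 25, 25, 31, 34, 42, 44]

        # Distribution: how many times each value appears
        frequency = Counter(ages)

        # Measures of central tendency
        mean_age = statistics.mean(ages)      # average
        median_age = statistics.median(ages)  # central value splitting the data in half
        mode_age = statistics.mode(ages)      # most common value

        # Measures of dispersion
        value_range = max(ages) - min(ages)   # highest minus lowest value
        std_dev = statistics.stdev(ages)      # standard deviation
        variance = statistics.variance(ages)  # variance

        print(frequency)
        print(mean_age, median_age, mode_age)
        print(value_range, std_dev, variance)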

    Correlations

    • Correlation is a descriptive measure of association, which shows how closely related two variables are. Correlation coefficients range from -1 to +1, where -1 and +1 mean that the two variables are perfectly related, i.e. if you know the value of one you can compute the value of the other without error. A negative correlation means that higher values of one variable correspond to lower values of the other; a positive correlation means that higher values of one variable correspond to higher values of the other. A correlation coefficient of 0 means there is no linear relationship between the variables.
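
      A minimal sketch of computing a correlation coefficient, again with invented values and the standard statistics module (its correlation function requires Python 3.10 or later):

        import statistics

        # Invented paired measurements, e.g. hours studied and exam scores
        hours = [1, 2, 3, 4, 5, 6, 7, 8]
        scores = [52, 55, 61, 60, 68, 71, 75, 80]

        # Pearson correlation coefficient, ranging from -1 to +1
        r = statistics.correlation(hours, scores)
        print(round(r, 3))  # a value close to +1 indicates a strong positive relationship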

    Estimating Error

    • Although descriptive statistics provide a summary picture of the data, most research projects don't gather data from the whole population that interests the researchers. Statistical techniques are therefore used to estimate the sampling error and, from it, the likely values of the variable in the whole population.
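
      A common way to express this is the standard error of the mean together with a confidence interval around it. The sketch below reuses the invented ages from above and applies a normal approximation (1.96 standard errors for a 95% interval), which is only a rough illustration for small samples.

        import math
        import statistics

        ages = [18, 19, 19, 21, 24, 25, 25, 25, 31, 34, 42, 44]  # invented sample

        mean_age = statistics.mean(ages)
        std_err = statistics.stdev(ages) / math.sqrt(len(ages))  # standard error of the mean

        # Approximate 95% confidence interval for the population mean
        # (a t-distribution would be more exact for a sample this small)
        lower = mean_age - 1.96 * std_err
        upper = mean_age + 1.96 * std_err
        print(f"The population mean is likely between {lower:.1f} and {upper:.1f}")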

    Inferential Statistics: Testing the Difference

    • One of the most common tasks of any data analysis is to test a difference between two or more measurements, either between groups or for the same group measured at different points in time. Testing here means establishing whether the observed difference is purely due to chance or whether it is likely to reflect an actual difference in the population. Statistically significant differences are differences that are unlikely to be caused purely by chance (incidentally, the term "statistically significant" says nothing about the size or importance of the difference).

      The specific techniques suitable for testing differences vary depending on the type of data. Differences between means are tested with a t-test or, in more complex situations, various versions of analysis of variance or covariance (ANOVA, ANCOVA, MANOVA). If it is not possible to calculate means for the variable (because it is not measured on an interval scale, for example sex, religion or nationality), non-parametric tests are suitable.
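
      As a sketch, assuming the SciPy library is available, an independent-samples t-test of the difference in means between two invented groups might look like this:

        from scipy import stats

        # Invented scores for two groups, e.g. a treatment and a control group
        group_a = [72, 75, 78, 74, 71, 77, 73]
        group_b = [68, 70, 66, 71, 69, 67, 72]

        # Independent-samples t-test: is the difference in means likely due to chance?
        t_statistic, p_value = stats.ttest_ind(group_a, group_b)

        # A small p-value (conventionally below 0.05) suggests the difference is
        # statistically significant, i.e. unlikely to be caused purely by chance.
        print(t_statistic, p_value)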

    Inferential Statistics: Models of Association

    • Often, researchers want to go beyond testing differences between groups and determine the precise relationships between variables. The simplest measure of such association is correlation, but it is possible to construct and statistically test more complex models that separate and individually quantify the influence of several variables. Multiple linear regression is the most common and a very powerful statistical modeling technique, often combined with analysis of variance in attempts to construct causal models. Other regression models also exist, including nonlinear and nonparametric ones.
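
      A minimal sketch of a multiple linear regression, assuming the statsmodels library is available; the predictors and outcome values are invented for illustration only.

        import numpy as np
        import statsmodels.api as sm

        # Invented data: income predicted from years of education and years of experience
        education = np.array([12, 16, 14, 18, 12, 16, 20, 14])
        experience = np.array([5, 3, 10, 2, 15, 8, 4, 12])
        income = np.array([35, 48, 44, 55, 42, 52, 63, 47])

        # Multiple linear regression: quantify the separate influence of each predictor
        X = sm.add_constant(np.column_stack([education, experience]))  # add intercept term
        model = sm.OLS(income, X).fit()

        print(model.params)   # intercept and one coefficient per predictor
        print(model.pvalues)  # statistical test of each coefficient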

    Inferential Statistics: Classification

    • Among the many other techniques of statistical analysis, an important group comprises techniques that can be used to classify and categorize variables and subjects. These classification techniques include factor analysis and correspondence analysis, which group variables into underlying dimensions, as well as various methods for classifying subjects (cases), such as cluster analysis and classification trees.
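
      A minimal sketch of one such method, k-means cluster analysis, assuming the scikit-learn library is available and using invented measurements for eight subjects:

        import numpy as np
        from sklearn.cluster import KMeans

        # Invented measurements for eight subjects on two variables
        data = np.array([
            [1.0, 2.1], [1.2, 1.9], [0.9, 2.3], [1.1, 2.0],   # one apparent group
            [5.1, 6.0], [5.3, 5.8], [4.9, 6.2], [5.0, 6.1],   # another apparent group
        ])

        # Cluster analysis: group similar subjects (cases) together
        kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
        labels = kmeans.fit_predict(data)
        print(labels)  # cluster label assigned to each subject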
