A favorite method of data analysis in social research, regression involves techniques for finding the statistical equation that best describes a set of data. Regression techniques attempt to measure the effect of one or more independent variables on an outcome, or dependent variable. The more rigorous regression techniques are multivariate, regressing a dependent variable on multiple independent variables, or predictors, so that the effect of each predictor is estimated while the others are held constant. For example, researchers can use regression analysis to gauge the impact of early-childhood development programs on various outcome measures of childhood development, such as cognitive ability.
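As a rough illustration of regressing one outcome on several predictors, the sketch below fits an ordinary least-squares equation by solving the normal equations. The data, the variable names (months in a program, an income index, a cognitive score) and the helper functions are hypothetical and are not drawn from any study mentioned here.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last row upward.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(predictors, outcome):
    """Return [intercept, b1, b2, ...] that minimize squared error."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1.0 is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (months in program, household income index) per child.
predictors = [(0, 1), (6, 2), (12, 1), (18, 3), (24, 2), (30, 3)]
# Scores are constructed as 50 + 0.5 * months + 2 * income, with no noise,
# so the fit should recover those three numbers.
outcome = [50 + 0.5 * months + 2.0 * income for months, income in predictors]

coefficients = fit_multiple_regression(predictors, outcome)
print(coefficients)
```

Each coefficient after the intercept estimates the change in the outcome for a one-unit change in that predictor while the other predictor is held fixed. Real data would include noise, and a statistical package would also report standard errors.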
Factor analysis is as much a method of data reduction as a tool of analysis. It enables researchers to examine and explain a large number of variables, such as multiple measures of early-childhood development, in terms of a smaller number of underlying patterns, or factors. Factor analytic techniques are commonly used to analyze survey data. Researchers from the U.S. Department of Education’s Office of Educational Research and Improvement (OERI) used data from the Early Childhood Longitudinal Study (ECLS), a multiyear survey data set, to study early-childhood assessment instruments. Factor analysis has also been applied to studies of young children’s behavior and home environment, using survey questions taken from the ECLS.
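The data-reduction idea can be sketched by extracting a single factor from a correlation matrix. The example below uses the principal-component method of extraction with power iteration, which is a simplification: common factor analysis as implemented in statistical packages also models each variable's unique variance. The correlation matrix and the four measure names are hypothetical, not ECLS values.

```python
import math


def leading_factor_loadings(corr, iterations=200):
    """Extract one factor from a correlation matrix by power iteration.

    Returns the leading eigenvalue and each variable's loading on the
    factor (eigenvector entry scaled by the square root of the eigenvalue).
    """
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n  # unit-length starting guess
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(p * p for p in product))
        vector = [p / eigenvalue for p in product]
    loadings = [math.sqrt(eigenvalue) * v for v in vector]
    return eigenvalue, loadings


# Hypothetical correlations among four assessment measures.
measures = ["vocabulary", "letter recognition", "counting", "motor skills"]
correlations = [
    [1.00, 0.60, 0.50, 0.40],
    [0.60, 1.00, 0.55, 0.45],
    [0.50, 0.55, 1.00, 0.35],
    [0.40, 0.45, 0.35, 1.00],
]

eigenvalue, loadings = leading_factor_loadings(correlations)
share_explained = eigenvalue / len(measures)  # proportion of total variance

for name, loading in zip(measures, loadings):
    print(f"{name}: loading {loading:.2f}")
print(f"share of variance explained: {share_explained:.2f}")
```

A loading close to 1 means the measure moves closely with the underlying factor. In this sketch four measures are summarized by one factor, which is the sense in which the technique reduces data.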
Probability unit, or probit, analysis is a specialized type of regression technique used when the dependent variable is a dichotomous, or dummy, variable. Such a variable places subjects into only one of two categories, such as male or female. Quantitative data sets usually code such a variable with a 1 to indicate the presence of a characteristic and a 0 to denote its absence. Researchers at the World Bank used this technique to study early-childhood development programs in Kenya and their effect on children’s schooling and mothers’ employment. The study used data from the Kenya Welfare Monitoring Survey and the Kenya Early Childhood Development Centers Survey (KECDCS).
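A probit model with one predictor can be written as P(y = 1 | x) = Φ(b0 + b1·x), where Φ is the standard normal cumulative distribution function. The sketch below fits that model to a 0/1 outcome by Fisher scoring, using only Python's standard library. The data and the variable names (program hours, mother employed) are hypothetical and do not come from the Kenyan surveys, and a statistical package would normally be used in practice.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def fit_probit(x_values, y_values, iterations=50):
    """Fit P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        i00 = i01 = i11 = 0.0  # entries of the expected information matrix
        for x, y in zip(x_values, y_values):
            eta = b0 + b1 * x
            # Clamp the probability away from 0 and 1 to keep divisions stable.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            density = STD_NORMAL.pdf(eta)
            variance = p * (1 - p)
            score_weight = density * (y - p) / variance
            info_weight = density * density / variance
            g0 += score_weight
            g1 += score_weight * x
            i00 += info_weight
            i01 += info_weight * x
            i11 += info_weight * x * x
        # Solve the 2x2 system (information @ step = gradient) in closed form.
        det = i00 * i11 - i01 * i01
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1


def predict_probability(b0, b1, x):
    """Predicted probability that the outcome equals 1 at predictor value x."""
    return STD_NORMAL.cdf(b0 + b1 * x)


# Hypothetical data: daily program hours available, and whether the mother
# is employed (1) or not (0). The two groups overlap, so a finite fit exists.
program_hours = [0, 1, 2, 3, 4, 5, 6, 7]
mother_employed = [0, 0, 1, 0, 1, 0, 1, 1]

intercept, slope = fit_probit(program_hours, mother_employed)
low = predict_probability(intercept, slope, 0)
high = predict_probability(intercept, slope, 7)
print(f"P(employed) at 0 hours: {low:.2f}; at 7 hours: {high:.2f}")
```

Unlike an ordinary regression line, the fitted probabilities always stay between 0 and 1, which is why this kind of model suits a dummy dependent variable.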
Hierarchical linear modeling, known as HLM, recognizes that many types of data in education and the social sciences are nested, meaning units of measurement are subsets of a larger unit and therefore are correlated. For example, individual students are nested within classrooms, which are in turn nested within schools. HLM analyzes such data by employing a two-stage regression approach, in which the first stage analyzes the micro units, or the subjects within larger units (such as children in classrooms). The second stage analyzes the larger units. Researchers from the National Center for Education Statistics used early-childhood development data from ECLS to examine the relationship among teacher characteristics, school practices and the educational achievement of young children. They employed HLM to determine which aspects of teacher training and experience were associated with the young children’s abilities in reading and mathematics.
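The two-stage logic described above can be sketched with two rounds of simple regression. The first stage fits a line within each classroom, and the second stage regresses the classroom intercepts on a teacher characteristic. This is only an illustration of the idea: HLM software typically estimates both levels jointly with random effects rather than in two separate passes. The data, the variable names and the `simple_ols` helper are hypothetical.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical nested data: each classroom is keyed by its teacher's years
# of experience and holds (weekly reading hours, reading score) per child.
# Scores are built as (40 + 2 * experience) + 1.5 * hours, with no noise,
# so both stages have known answers.
teacher_experience = [2, 5, 8, 12]
reading_hours = [1, 2, 3, 4, 5]
classrooms = {
    years: [(hours, (40 + 2 * years) + 1.5 * hours) for hours in reading_hours]
    for years in teacher_experience
}

# Stage 1: regress score on reading hours separately within each classroom.
stage_one = {}
for years, children in classrooms.items():
    hours = [h for h, _ in children]
    scores = [s for _, s in children]
    stage_one[years] = simple_ols(hours, scores)

# Stage 2: regress the classroom intercepts on teacher experience.
experience = list(stage_one)
class_intercepts = [stage_one[years][0] for years in experience]
base_score, experience_effect = simple_ols(experience, class_intercepts)

print(f"baseline score: {base_score:.1f}")
print(f"gain per year of teacher experience: {experience_effect:.1f}")
```

The second-stage slope answers the classroom-level question (is teacher experience associated with higher starting scores?), while the first-stage slopes describe children within a classroom.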