How to Run a Negative Binomial Regression

Negative binomial regression is a type of regression used to model and predict count data. Count data tally the number of times a specific event has occurred; for example, the number of times an individual borrows books from a library. Because fitting a negative binomial regression model is computationally complex, running negative binomial regression on a set of count data requires statistical software.

Things You'll Need

  • Statistical software (R, SPSS, or SAS)

Instructions

    • 1

      Organize your data. Record the data from the study in an organized fashion in either a text document or a spreadsheet, with one row per observation and one column per variable. Different statistical software programs use different data formats, but most accept either type. For example, R can read a text document (with the values in each row separated by commas) directly, and can read an Excel document with the help of an add-on package such as readxl.

    • 2

      Read the data into the statistical software program. Assign the data to a variable that can be easily passed to a regression function. In R, for example, you can use the command "dat1 <- read.csv("data file")" for a comma-separated file with a header row, where "data file" is the location of the data on your hard drive.
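
As a rough illustration of this step, the R sketch below writes a small comma-separated file and reads it back in. The file contents and the column names (books, age, member_years) are invented for the example and are not part of any real study; only read.csv(), str(), and head() are standard R functions.

```r
# Hypothetical example data: the column names and values are made up
# purely for illustration.
data_file <- tempfile(fileext = ".csv")
writeLines(c(
  "books,age,member_years",
  "0,34,1",
  "2,27,3",
  "1,45,2",
  "7,19,5",
  "0,52,1",
  "3,31,4"
), data_file)

# read.csv() assumes a header row and comma-separated values.
dat1 <- read.csv(data_file)

# Inspect the structure to confirm the columns were read as numbers.
str(dat1)
head(dat1)
```

With your own data, replace data_file with the path to your file.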

    • 3

      Examine the descriptive statistics of the data. This is a basic part of any data analysis procedure. Review statistics such as means and standard deviations to look for irregularities or outliers. Plot a histogram of the count variable, and confirm that its shape resembles a negative binomial distribution: most of the data should be on the left side of the histogram, with progressively fewer observations as you look to the right. Also compare the mean with the variance; negative binomial regression is suited to counts whose variance is larger than the mean. If there are problems with the data, or the data do not appear to follow a negative binomial distribution, consider a different form of analysis instead of negative binomial regression.
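
One possible way to carry out these checks in R is sketched below. The counts are simulated with rnbinom() only so that the example runs on its own; in practice you would use the count column from your own data. Comparing the variance with the mean is a common informal check, not a formal test.

```r
# Simulated counts for illustration only (not real study data).
set.seed(42)
counts <- rnbinom(n = 500, size = 1.5, mu = 3)

# Basic descriptive statistics.
summary(counts)
count_mean <- mean(counts)
count_var  <- var(counts)

# A variance well above the mean suggests overdispersion, the situation
# negative binomial regression is designed for.
dispersion_ratio <- count_var / count_mean
print(c(mean = count_mean, variance = count_var, ratio = dispersion_ratio))

# Histogram with one bar per count value; expect a right-skewed shape.
hist(counts, breaks = seq(-0.5, max(counts) + 0.5, by = 1),
     main = "Histogram of counts", xlab = "Count")
```

A ratio close to 1 would instead point toward a simpler model for counts, such as Poisson regression.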

    • 4

      Run the negative binomial regression function, specifying the dependent and independent variables. The call for such a function varies by software. In R, the function is glm.nb() from the MASS package; after loading the package with "library(MASS)", run "glm.nb(DV ~ IVs, data = dat1)", where "DV" is the dependent variable (the count data) and "IVs" is the formula for the independent variables, which depends on the relationship you assume among them (for example, "IV1 + IV2" for additive effects or "IV1 * IV2" to include an interaction).
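
The following is a minimal sketch of such a fit in R, assuming the MASS package is installed (it ships with standard R distributions). The data are simulated and the variable names (books, age, member_years) are hypothetical, chosen to echo the library example above; substitute your own data frame and variables.

```r
library(MASS)  # provides glm.nb()

# Simulated data for illustration only; variable names are hypothetical.
set.seed(1)
n <- 300
dat1 <- data.frame(
  age          = round(runif(n, 18, 70)),
  member_years = rpois(n, 3)
)
# Simulated relationship on the log scale: log(mu) = 0.2 + 0.3 * member_years
mu <- exp(0.2 + 0.3 * dat1$member_years)
dat1$books <- rnbinom(n, size = 2, mu = mu)

# Fit the negative binomial regression: count outcome ~ predictors.
fit <- glm.nb(books ~ age + member_years, data = dat1)

summary(fit)    # coefficients (log scale), standard errors, theta
exp(coef(fit))  # rate ratios: multiplicative change in the expected count
fit$theta       # estimated dispersion parameter
```

Because glm.nb() uses a log link by default, exponentiating a coefficient gives the multiplicative change in the expected count for a one-unit increase in that predictor.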
