How to Boost Naive Bayes

Naive Bayes is a machine learning algorithm that produces a classification function, which researchers use to categorize new pieces of data according to the variables included in the model. Naive Bayes is special in that it assumes all of the variables in the classification function are independent of one another given the class. The algorithm is first trained on a training data set and then applied to a separate testing data set. However, researchers sometimes want to “boost” the Naive Bayes algorithm so that the resulting function will be more accurate than the one produced by the standard procedure. In short, this way of running Naive Bayes requires you to adjust the data set before running the algorithm. The procedure should be performed in statistical software, as doing it by hand is impractical for all but the smallest data sets.
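To make the independence assumption concrete, here is a minimal sketch in base R. It assumes numeric variables modeled with a normal distribution per class, which is only one common choice. The toy data and the function names fit_nb and predict_nb are hypothetical and do not come from any package.

```r
# Hypothetical toy data: two numeric variables and a class label.
x <- data.frame(height = c(1.0, 1.2, 0.9, 3.0, 3.2, 2.9),
                weight = c(2.0, 2.1, 1.9, 5.0, 5.2, 4.8))
y <- factor(c("a", "a", "a", "b", "b", "b"))

# Training: for each class, store the prior plus each variable's mean and sd.
fit_nb <- function(x, y) {
  lapply(split(x, y), function(rows) {
    list(prior = nrow(rows) / nrow(x),
         mean  = sapply(rows, mean),
         sd    = sapply(rows, sd))
  })
}

# Classification: because the variables are treated as independent given the
# class, the class score is the log prior plus a simple sum of one
# log density per variable.
predict_nb <- function(model, newrow) {
  scores <- sapply(model, function(cl) {
    log(cl$prior) + sum(dnorm(as.numeric(newrow), cl$mean, cl$sd, log = TRUE))
  })
  names(which.max(scores))
}

model <- fit_nb(x, y)
predict_nb(model, c(height = 1.1, weight = 2.0))
```

The sum inside predict_nb is where the "naive" assumption shows up: each variable contributes its own term, with no interaction between variables.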

Instructions

    • 1

      Locate the training data and testing data used for the Naive Bayes algorithm. If Naive Bayes has already been run, this data will appear as two vectors, one for each set. The longer vector is usually the training data and the shorter vector is usually the testing data.

    • 2

      Concatenate the training and testing data vectors into a single, longer vector. Most statistical programs make this easy. For example, in R, use the command: new <- c(train, test). Here, "train" is the training vector, "test" is the testing vector and "new" is the combined vector.

    • 3

      Run the Naive Bayes algorithm as usual, but use the new combined vector as the input. Expect a longer running time, because the algorithm now processes more data.

    • 4

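      Observe the output. It will differ from the output of non-boosted Naive Bayes because the function was fitted to more data. Because the testing data was part of that fit, accuracy measured on the same testing data will look optimistic, so confirm any gain in accuracy on data the function has not seen.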
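The four steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not a definitive implementation: the data values are made up, each data vector is assumed to have a parallel vector of class labels (train_labels and test_labels are hypothetical names), and fit_nb and predict_nb are small stand-ins for whatever Naive Bayes routine you actually use.

```r
# Hypothetical data: one numeric variable with a parallel vector of class labels.
train        <- c(1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1)
train_labels <- c("a", "a", "a", "a", "b", "b", "b", "b")
test         <- c(1.3, 2.8)
test_labels  <- c("a", "b")

# Step 1: in this example the longer vector is the training data.
stopifnot(length(train) > length(test))

# Step 2: concatenate the vectors; the labels are combined the same way.
new        <- c(train, test)
new_labels <- c(train_labels, test_labels)

# Stand-in one-variable Gaussian Naive Bayes (assumption, not a package API).
fit_nb <- function(x, labels) {
  lapply(split(x, labels), function(v) {
    list(prior = length(v) / length(x), mean = mean(v), sd = sd(v))
  })
}
predict_nb <- function(model, value) {
  scores <- sapply(model, function(cl) {
    log(cl$prior) + dnorm(value, cl$mean, cl$sd, log = TRUE)
  })
  names(which.max(scores))
}

# Step 3: fit once as usual and once with the combined vector.
standard <- fit_nb(train, train_labels)
combined <- fit_nb(new, new_labels)

# Step 4: compare the fitted parameters of the two functions.
standard$a$mean
combined$a$mean
predict_nb(combined, 1.3)
```

Printing the two means shows how adding the testing values shifts the parameters of the fitted function, which is why its output differs from the standard run.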
