Load your data into your statistical software of choice. Label each variate clearly, because several of the steps below work with one variate at a time (i.e., you will need to break your data into multiple sets of univariate data). Arrange the data as a matrix or table with one row per observation and one column per variate. For example, in the statistical software R, you can save the data as a .csv file (which Excel can export) and then read it in with the command "data <- read.csv("data.csv")".
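The loading step can be sketched in R as follows. This is a minimal illustration, not a prescribed workflow: the column names "height" and "weight" and the four data rows are hypothetical, and the example writes its own temporary .csv file so that it runs without any external file.

```r
# Hypothetical example data: two labeled variates, written to a temporary
# .csv file so the example can be run without any external file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("height,weight",
             "170,65",
             "182,80",
             "165,58",
             "175,72"), csv_path)

# Read the file; the header row supplies the variate labels.
data <- read.csv(csv_path)

# Each labeled column is one variate and can be pulled out by name.
height <- data$height
weight <- data$weight

# Convert to a matrix (one row per observation, one column per variate)
# for functions that expect matrix input.
data_matrix <- as.matrix(data)
```

With your own file, replace csv_path with the path to that file, for example "data.csv".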
Decide which kernel you will apply to the data. The Gaussian kernel serves most practical purposes. However, most statistical software packages offer a variety of kernels for users with particular needs. For example, the "density" function in R offers seven kernels, including Gaussian, triangular, rectangular and cosine. It is also possible to program your own kernel, provided you know how to program in your software package of choice. If in doubt as to which kernel to use, choose the Gaussian kernel.
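To see how much the kernel choice matters for your data, you can fit the same variate with several kernels and compare the results. The sketch below assumes simulated data in place of your own and uses R's univariate "density" function; the variable names are illustrative only.

```r
# Simulated univariate data (an assumption for illustration only).
set.seed(1)
x <- rnorm(200)

# Four of the kernels built into R's density() function.
kernels <- c("gaussian", "triangular", "rectangular", "cosine")

# Fit one density estimate per kernel, leaving the bandwidth at its default
# so that only the kernel shape differs between the fits.
fits <- lapply(kernels, function(k) density(x, kernel = k))
names(fits) <- kernels

# Compare the peak height of each estimate as a quick summary.
peaks <- sapply(fits, function(f) max(f$y))
print(round(peaks, 3))
```

If the estimates look nearly the same, the kernel choice matters little for your data and the Gaussian default is a reasonable pick.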
Decide on the bandwidth for the density estimation. The bandwidth, in short, plays the role of the standard deviation of the smoothing kernel: it controls how far each observation's influence spreads. There is no single standard method of choosing a bandwidth for multivariate density estimation, although rules of thumb such as Silverman's, which the "density" function in R uses by default, are a common starting point. Keep in mind that smaller bandwidths give less biased estimates with higher variance, while larger bandwidths give lower variance but more bias. You may want to return to this step multiple times, experimenting with different bandwidths for your density estimation.
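The trade-off can be explored with a short experiment. The following sketch assumes simulated data and applies two of R's built-in bandwidth selectors to a single variate, then compares a deliberately narrow and a deliberately wide bandwidth; the factor of 4 is an arbitrary choice for illustration.

```r
# Simulated univariate data (an assumption for illustration only).
set.seed(1)
x <- rnorm(200)

# Two built-in bandwidth selectors for a single variate.
bw_default <- bw.nrd0(x)  # Silverman's rule of thumb, the default in density()
bw_sj <- bw.SJ(x)         # Sheather-Jones selector

# Fit with a deliberately narrow and a deliberately wide bandwidth.
narrow <- density(x, bw = bw_default / 4)
wide <- density(x, bw = bw_default * 4)

# Count local maxima of the estimated curve: a narrow bandwidth tends to
# produce a bumpier estimate (more variance), a wide one a flatter estimate
# (more bias).
count_modes <- function(d) sum(diff(sign(diff(d$y))) == -2)
modes_narrow <- count_modes(narrow)
modes_wide <- count_modes(wide)

# The wide estimate is also flatter, so its peak is lower.
peak_narrow <- max(narrow$y)
peak_wide <- max(wide$y)
```

Plotting the two fits with plot(narrow) and lines(wide) makes the difference visible at a glance.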
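Perform the multivariate density estimation. Use the data, bandwidth and kernel you selected earlier. Most statistical software packages use a one-line call for this task, asking only for the parameters needed (data, bandwidth and kernel). In R, the call "density(x, bw, kernel)" does this for a single variate; it does not accept multivariate data. For a joint estimate of two variates, you can use the "kde2d" function in the MASS package, which takes the two variates and a bandwidth for each and always uses a Gaussian kernel. The result (output) will be the estimated density evaluated on a grid of points.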
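To make the computation concrete, the sketch below builds a two-variate estimate by hand in base R using a product of Gaussian kernels, one bandwidth per variate. It is an illustration under stated assumptions rather than the only way to do it: the data are simulated, the variate names "height" and "weight" are hypothetical, and the 60-point grid is an arbitrary choice. The "kde2d" function in the MASS package offers a packaged alternative.

```r
# Simulated two-variate data (an assumption for illustration only).
set.seed(1)
height <- rnorm(200, mean = 170, sd = 8)
weight <- 0.9 * height - 85 + rnorm(200, sd = 6)

# One bandwidth per variate, here from Silverman's rule of thumb.
h1 <- bw.nrd0(height)
h2 <- bw.nrd0(weight)

# Evaluation grid, extended 3 bandwidths beyond the data in each direction.
gx <- seq(min(height) - 3 * h1, max(height) + 3 * h1, length.out = 60)
gy <- seq(min(weight) - 3 * h2, max(weight) + 3 * h2, length.out = 60)

# Gaussian kernel weights: one row per grid point, one column per observation.
kx <- outer(gx, height, function(g, xi) dnorm(g, mean = xi, sd = h1))
ky <- outer(gy, weight, function(g, yi) dnorm(g, mean = yi, sd = h2))

# Product-kernel estimate: at each grid point (gx[a], gy[b]), average over
# observations i the product kx[a, i] * ky[b, i].
joint <- kx %*% t(ky) / length(height)

# Sanity check: a density should integrate to roughly 1 over the grid.
cell_area <- diff(gx[1:2]) * diff(gy[1:2])
total_mass <- sum(joint) * cell_area
```

The matrix joint can be viewed with contour(gx, gy, joint) or image(gx, gy, joint).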