2D Histogram Plus Kernel Density

 

This macro creates histograms from a single column of data. Two types of histograms are available. The usual histogram, called ‘histogram’ in the macro, counts the number of occurrences of the data in pre-defined bins. The cumulative histogram, called ‘cumulative’, is the running sum of counts in the histogram bins. The kernel density may be added to the histogram or displayed separately. The macro provides various options as shown by the macro dialog.

The Source dropdown box identifies columns containing data. This allows selection of the histogram column from a list of column numbers or column titles.

Three Binning Methods are available – Automatic, Min/Max and Start/End.

Automatic creates bin boundaries that fall on tick marks or sub-multiples of tick marks. You specify the approximate number of bins, and this method finds the number of bins closest to that value that gives appropriate bin boundaries.

Min/Max finds the minimum and maximum of the data and sets the left edge of the first bin to the minimum value and the right edge of the last bin to the maximum value. You may enter or select the number of bins to produce a pleasing result.

Min/Max may not result in appropriate bin boundaries, so Start/End allows you to specify the left edge of the first bin and the right edge of the last bin. The following options result in bin boundaries every 0.5.
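The edge arithmetic behind Min/Max and Start/End can be sketched in Python. This is only an illustration under stated assumptions, not the macro's own code: the function names `start_end_edges` and `min_max_edges` and the sample data are hypothetical, and the Start/End values used (-3.5, 3.5 and 14 bins) are illustrative choices that happen to give boundaries every 0.5 rather than values taken from the macro dialog.

```python
def start_end_edges(start, end, n_bins):
    """Return n_bins + 1 equally spaced bin edges from start to end."""
    if n_bins < 1 or end <= start:
        raise ValueError("need n_bins >= 1 and end > start")
    width = (end - start) / n_bins
    edges = [start + i * width for i in range(n_bins + 1)]
    edges[-1] = end  # guard against floating-point drift on the last edge
    return edges


def min_max_edges(data, n_bins):
    """Min/Max style: first left edge = data minimum, last right edge = data maximum."""
    return start_end_edges(min(data), max(data), n_bins)


# Hypothetical sample data, chosen only for the example.
data = [-3.27, -1.4, 0.08, 0.93, 2.61, 3.14]

# Min/Max: the edges follow the data, so they are rarely round numbers.
mm_edges = min_max_edges(data, 4)

# Start/End: round limits and a suitable bin count give tidy boundaries.
se_edges = start_end_edges(-3.5, 3.5, 14)
print(se_edges[:4])  # [-3.5, -3.0, -2.5, -2.0]
```

With Min/Max the bin width here is (3.14 - (-3.27)) / 4 = 1.6025, which is why Start/End is the better choice when boundaries should land on round values.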


The two types of histograms are shown in Figure 1.

Figure 1. The two types of histograms – ‘histogram’ on the left and ‘cumulative’ on the right. Options used were Automatic, Normalization = None and Vertical Bar.
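The relationship between the two histogram types can be sketched in a few lines of Python. This is a minimal illustration, not the macro's own code: the function names `histogram_counts` and `cumulative_counts`, the sample data, and the convention that each bin includes its left edge (with the last bin also including its right edge) are assumptions made for the example.

```python
from bisect import bisect_right
from itertools import accumulate


def histogram_counts(data, edges):
    """Count how many values fall in each bin defined by sorted edges."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # value lies outside the binned range
        # Bins are [left, right); the last bin also includes its right edge.
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


def cumulative_counts(counts):
    """Running sum of the histogram counts."""
    return list(accumulate(counts))


data = [0.2, 0.7, 1.1, 1.4, 1.6, 2.9]   # hypothetical data column
edges = [0.0, 1.0, 2.0, 3.0]            # three pre-defined bins

counts = histogram_counts(data, edges)
cumulative = cumulative_counts(counts)
print(counts)      # [2, 3, 1]
print(cumulative)  # [2, 5, 6]
```

The last cumulative value always equals the total number of counted data points.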

The Normalization option allows results to be expressed as a fraction or a percentage. If Normalization = 1 is selected, the count in each bin is divided by the total count, so the histogram bar heights sum to 1.0. For the cumulative histogram, the last bar height equals 1.0. This is shown in Figure 2, where the bar outline has been deselected.

Figure 2. Normalization = 1. Min/Max, Bar Outline deselected. The histogram bar heights sum to 1.0.

Normalization = 100 results in bar heights summing to 100. This is shown in Figure 3 where the Vertical Step plot is used.

Figure 3. Normalization = 100. Min/Max, Vertical Step plot. The histogram bar heights sum to 100.
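The normalization arithmetic described above is simple enough to check by hand. The sketch below is an illustration only: the function name `normalize` and the bin counts are hypothetical, and `None` stands in for the Normalization = None setting.

```python
from itertools import accumulate


def normalize(counts, normalization=None):
    """Scale bin counts so they sum to `normalization` (1 or 100); None leaves raw counts."""
    if normalization is None:
        return list(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an empty histogram")
    return [normalization * c / total for c in counts]


counts = [2, 3, 1, 2]  # hypothetical bin counts, total = 8

fractions = normalize(counts, 1)
percents = normalize(counts, 100)
cumulative_fractions = list(accumulate(fractions))

print(fractions)             # [0.25, 0.375, 0.125, 0.25]
print(percents)              # [25.0, 37.5, 12.5, 25.0]
print(cumulative_fractions)  # [0.25, 0.625, 0.75, 1.0]
```

The normalized histogram sums to the chosen value, and the last bar of the normalized cumulative equals that same value.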

Four types of graphs are provided: Vertical Bar, Vertical Step, Horizontal Bar and Horizontal Step. The Vertical Bar and Vertical Step plots are shown in Figures 2 and 3. Examples of the Horizontal Bar and Horizontal Step are shown in Figure 4.

Figure 4. 2000 Gaussian random numbers. A. Histogram, Horizontal Bar, Start/End, Interval Start = -3.5, Interval End = 3.5, 70 bins. B. Cumulative, Horizontal Step, Start/End, Interval Start = -3.5, Interval End = 3.5, 70 bins.

Bar width may be specified as a percentage (Bar Width%). Using Bar Width% = 1 results in the needle plot shown in Figure 5B.

Figure 5. 2000 Gaussian random numbers. A. Histogram, Vertical Bar, Start/End, Interval Start = -3.5, Interval End = 3.5, 28 bins, Bar Width% = 50. B. Histogram, Vertical Bar, Start/End, Interval Start = -3.5, Interval End = 3.5, 28 bins, Bar Width% = 1.

Selecting the kernel density option creates the kernel density curve. The Bandwidth parameter determines the smoothness of the curve. An optimal value from Silverman's algorithm is displayed. Increasing the bandwidth increases the smoothness of the curve.
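To make the role of the bandwidth concrete, here is a rough Python sketch of a kernel density estimate. It rests on assumptions that the macro documentation does not confirm: a Gaussian kernel, and the common form of Silverman's rule of thumb, h = 0.9 · min(σ, IQR/1.34) · n^(-1/5). The function names `silverman_bandwidth` and `gaussian_kde` are hypothetical, and the macro may compute its displayed value differently.

```python
import math
import random
import statistics


def silverman_bandwidth(data):
    """Silverman's rule of thumb (assumed form): 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


# Hypothetical sample: 500 Gaussian random numbers with a fixed seed.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

h = silverman_bandwidth(sample)
kde = gaussian_kde(sample, h)
smoother_kde = gaussian_kde(sample, 3.0 * h)  # larger bandwidth, smoother curve

print(round(h, 3), round(kde(0.0), 3), round(smoother_kde(0.0), 3))
```

Evaluating `kde` and `smoother_kde` on the same grid of x values shows the effect described above: the larger bandwidth averages over more neighbouring points and so produces a smoother, flatter curve.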
