How Maximum Likelihood Classification works

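The algorithm used by the Maximum Likelihood Classification tool is based on two principles: the cells in each class sample are assumed to be normally distributed in the multidimensional space, and cells are assigned to classes using Bayes' theorem of decision making.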

The Maximum Likelihood Classification tool considers both the variances and covariances of the class signatures when assigning each cell to one of the classes represented in the signature file. With the assumption that the distribution of a class sample is normal, a class can be characterized by its mean vector and covariance matrix. Given these two characteristics for each class, the statistical probability of each cell value is computed for every class to determine the cell's class membership. When the default EQUAL a priori option is specified, each cell is assigned to the class for which it has the highest probability of being a member.
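The decision rule described above can be sketched in a few lines of NumPy. This is a minimal illustration under the normality assumption, not the tool's actual implementation; the function names and the two-band signature values are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, cov):
    """Log of the multivariate normal density at cell value x for one class."""
    d = x - mean
    k = mean.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis_sq = d @ np.linalg.solve(cov, d)
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + mahalanobis_sq)

def classify_cell(x, signatures):
    """Return the class with the highest likelihood (equal a priori weights)."""
    scores = {
        name: gaussian_log_likelihood(x, mean, cov)
        for name, (mean, cov) in signatures.items()
    }
    return max(scores, key=scores.get)

# Hypothetical two-band signatures: (mean vector, covariance matrix) per class.
signatures = {
    "lake": (np.array([20.0, 15.0]), np.array([[4.0, 1.0], [1.0, 3.0]])),
    "forest": (np.array([60.0, 80.0]), np.array([[25.0, 5.0], [5.0, 30.0]])),
}

print(classify_cell(np.array([22.0, 14.0]), signatures))
```

Working with log-likelihoods rather than raw densities avoids numerical underflow and does not change which class scores highest.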

If the likelihood of occurrence of some classes is higher (or lower) than the average, the FILE option should be used with an input a priori probability file. The weights for the classes with special probabilities are specified in the a priori file. In this situation, an a priori file assists in the allocation of cells that lie in the statistical overlap between two classes. These cells are more accurately assigned to the appropriate class, resulting in a better classification. This weighting approach to classification is referred to as the Bayesian classifier.
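To illustrate how a priori weights affect a cell in the overlap between two classes, the following sketch adds the logarithm of each class weight to the Gaussian log-likelihood. The single-band signatures, the weights, and the function names are assumptions made for the example; they are not taken from the tool or from any a priori file format.

```python
import numpy as np

def discriminant(x, mean, cov, prior):
    """Gaussian log-likelihood plus log prior (class-independent constants dropped)."""
    d = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    return np.log(prior) - 0.5 * (logdet + d @ np.linalg.solve(cov, d))

def classify(x, signatures, priors):
    """Pick the class with the largest weighted discriminant."""
    return max(
        signatures,
        key=lambda name: discriminant(x, *signatures[name], priors[name]),
    )

# Hypothetical single-band signatures with overlapping distributions.
signatures = {
    "rangeland": (np.array([0.0]), np.array([[1.0]])),
    "residential": (np.array([4.0]), np.array([[1.0]])),
}

cell = [2.3]  # lies in the statistical overlap, slightly closer to "residential"
equal = {"rangeland": 0.5, "residential": 0.5}
weighted = {"rangeland": 0.9, "residential": 0.1}

print(classify(cell, signatures, equal))     # residential
print(classify(cell, signatures, weighted))  # rangeland
```

With equal weights the cell goes to the nearer class; once rangeland is given a much larger weight, the same cell is assigned to rangeland instead.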

By choosing the SAMPLE a priori option, the a priori probabilities assigned to all classes sampled in the input signature file will be proportional to the number of cells captured in each signature. Consequently, classes that have fewer cells than the average in the sample will receive weights below the average and those with more cells will receive weights greater than the average. As a result, the respective classes will have more or fewer cells assigned to them.
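The proportional weighting can be illustrated with a short calculation. The cell counts and the function name below are hypothetical, and the sketch only shows the proportionality described above, not how the tool reads the signature file.

```python
def sample_priors(cell_counts):
    """A priori weights proportional to the number of sampled cells per class."""
    total = sum(cell_counts.values())
    return {name: count / total for name, count in cell_counts.items()}

# Hypothetical number of cells captured in each class signature.
counts = {"lake": 200, "forest": 700, "rangeland": 300}

priors = sample_priors(counts)
average = 1.0 / len(counts)

for name, weight in priors.items():
    relation = "above" if weight > average else "below"
    print(f"{name}: weight {weight:.3f} ({relation} the average of {average:.3f})")
```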

When a maximum likelihood classification is performed, an optional output confidence raster can also be produced. This raster shows the levels of classification confidence. There are 14 confidence levels, a number directly related to the number of valid reject fraction values. The first level of confidence, coded in the confidence raster as 1, consists of cells with the shortest distance to any mean vector stored in the input signature file; therefore, the classification of these cells has the highest certainty. The cells comprising the second level of confidence (cell value 2 on the confidence raster) would be classified only if the reject fraction is 0.99 or less. The lowest level of confidence has a value of 14 on the confidence raster and contains the cells that would most likely be misclassified. Cells of this level will not be classified when the reject fraction is 0.005 or greater.
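The relationship between confidence levels and reject fractions can be sketched as a lookup. The list of 14 valid reject fraction values below is an assumption (check the tool reference for the authoritative list), as is the mapping of the intermediate levels; only the behavior of levels 2 and 14 is stated in the text above. The function names are hypothetical.

```python
# Assumed set of valid reject fraction values (14 in total).
VALID_REJECT_FRACTIONS = [
    0.0, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25,
    0.5, 0.75, 0.9, 0.95, 0.975, 0.99, 0.995,
]

def rejection_threshold(level):
    """Smallest reject fraction at which cells of this level are left unclassified.

    Returns None for level 1, which this sketch treats as never rejected.
    """
    if not 1 <= level <= 14:
        raise ValueError("confidence level must be between 1 and 14")
    if level == 1:
        return None
    descending = sorted(VALID_REJECT_FRACTIONS, reverse=True)
    return descending[level - 2]

def is_classified(level, reject_fraction):
    """True if cells of the given confidence level keep their class."""
    threshold = rejection_threshold(level)
    return threshold is None or reject_fraction < threshold

print(is_classified(2, 0.99))    # True: level 2 is classified at 0.99 or less
print(is_classified(14, 0.005))  # False: level 14 is rejected at 0.005 or greater
```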


Example

The following example shows the classification of a multiband raster with three layers into five classes. The five classes are dry riverbed, forest, lake, residential/grove, and rangeland. An output confidence raster will also be produced. The input raster bands are displayed below.


Maximum Likelihood Classification illustration

The Maximum Likelihood Classification tool is used to classify the stack into five classes. The following settings were used:

Input raster bands = "redlands"

Input signature file = "wedit.gsg"

Output classified raster = "mlclass_1"

Reject fraction = "0.01"

A priori probability weighting = "EQUAL"

Input a priori probability file = "apriori_file_1"

Output confidence raster = "reject_ras"

The classified raster appears as:


Maximum Likelihood Classification illustration

Areas displayed in red are cells that have less than a 1 percent chance of being correctly classified. These cells are given the value NoData due to the 0.01 reject fraction used. The dry riverbed class is displayed as white, with the forest class as green, lake class as blue, residential/grove class as yellow, and rangeland as orange.


The value attribute table for the output confidence raster is listed below. It shows how many cells were classified at each level of confidence. Value 1 has a 100 percent chance of being correct; 3,033 cells were classified with that level of confidence. Value 5 has a 95 percent chance of being correct. Value 14 has a 0.5 percent chance of being correct (a fraction of 0.005); 10,701 cells fall in that level.

Record    VALUE    COUNT
1         1          3033
2         2          3061
3         3          9187
4         4         16717
5         5         37361
6         6        136420
7         7        269592
8         8        250863
9         9        105001
10        10        23598
11        11        11190
12        12        11546
13        13         3621
14        14        10701
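As a quick check on the table, the counts can be totaled and expressed as a share of all cells per confidence level. The snippet below only restates the numbers listed above; the variable names are illustrative.

```python
# VALUE -> COUNT, copied from the value attribute table above.
confidence_counts = {
    1: 3033, 2: 3061, 3: 9187, 4: 16717, 5: 37361, 6: 136420, 7: 269592,
    8: 250863, 9: 105001, 10: 23598, 11: 11190, 12: 11546, 13: 3621, 14: 10701,
}

total_cells = sum(confidence_counts.values())  # 891891
most_common_level = max(confidence_counts, key=confidence_counts.get)  # 7

print(f"Total cells: {total_cells}")
for level, count in confidence_counts.items():
    share = 100.0 * count / total_cells
    print(f"Level {level:2d}: {count:7d} cells ({share:5.2f}%)")
```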