Evaluating the stability of the classification of community data
Authors | |
---|---|
Year of publication | 2011 |
Type | Article in Periodical |
Magazine / Source | Ecography |
web | Fulltext on Wiley Online Library |
Doi | http://dx.doi.org/10.1111/j.1600-0587.2010.06599.x |
Field | Botany |
Keywords | Clustering methods; Vegetation classification strategies; Validation; Bootstrap; Algorithm; Fidelity |
Description | We propose a method for a posteriori evaluation of classification stability which compares the classification of sites in the original data set (a matrix of species by sites) with classifications of subsets of its sites created by without-replacement bootstrap resampling. Site assignments to clusters of the original classification and to clusters of the classification of each subset are compared using the Goodman-Kruskal lambda index. Many resampled subsets are classified, and the mean of the lambda values calculated for the classifications of these subsets is used as an estimate of classification stability. Furthermore, the mean of the lambda values based on different resampled subsets, calculated separately for each site of the data set, can be used as a measure of the influence of particular sites on classification stability. This method was tested on several artificial data sets classified by commonly used clustering methods and on a real data set of forest vegetation plots. |
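
The procedure in the description can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`gk_lambda`, `classification_stability`, `toy_classify`), the subset fraction of 0.8, and the number of subsets are hypothetical. The symmetric form of Goodman-Kruskal lambda is assumed, since the abstract does not say which variant is used. The per-site value is assumed to be the mean lambda over the subsets that contain the site.

```python
import random
from collections import Counter


def gk_lambda(labels_a, labels_b):
    """Symmetric Goodman-Kruskal lambda between two labelings of the same sites.

    Assumption: the symmetric variant is used; the abstract does not specify.
    """
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))   # contingency table cells
    row_tot = Counter(labels_a)                # cluster sizes in labeling A
    col_tot = Counter(labels_b)                # cluster sizes in labeling B
    row_max = Counter()                        # largest cell in each row
    col_max = Counter()                        # largest cell in each column
    for (a, b), count in cells.items():
        row_max[a] = max(row_max[a], count)
        col_max[b] = max(col_max[b], count)
    max_row = max(row_tot.values())
    max_col = max(col_tot.values())
    denom = 2 * n - max_row - max_col
    if denom == 0:
        # Both labelings put every site in one cluster; lambda is undefined.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    numer = sum(row_max.values()) + sum(col_max.values()) - max_row - max_col
    return numer / denom


def classification_stability(sites, classify, n_subsets=100, fraction=0.8, seed=0):
    """Mean lambda between the original classification and subset classifications.

    `classify` is any user-supplied function mapping a list of sites to a list
    of cluster labels (hypothetical interface for this sketch).
    """
    rng = random.Random(seed)
    n = len(sites)
    original = classify(sites)
    k = min(n, max(2, round(fraction * n)))
    lambdas = []
    per_site = [[] for _ in range(n)]
    for _ in range(n_subsets):
        # Resample sites without replacement, as described in the abstract.
        idx = sorted(rng.sample(range(n), k))
        subset_labels = classify([sites[i] for i in idx])
        lam = gk_lambda([original[i] for i in idx], subset_labels)
        lambdas.append(lam)
        for i in idx:
            per_site[i].append(lam)
    mean_lambda = sum(lambdas) / len(lambdas)
    # Assumption: a site's value is the mean lambda of the subsets containing it.
    site_means = [sum(v) / len(v) if v else None for v in per_site]
    return mean_lambda, site_means


def toy_classify(sites):
    """Toy classifier: label each site by its most abundant species."""
    return [max(range(len(s)), key=lambda j: s[j]) for s in sites]


# Six sites described by the abundances of two species.
sites = [[5, 0], [4, 1], [6, 1], [0, 5], [1, 4], [1, 6]]
stability, site_means = classification_stability(sites, toy_classify, n_subsets=50)
print(stability, site_means)
```

The toy classifier labels each site independently of the others, so every subset reproduces the original assignment and lambda equals 1 for each subset. A real clustering method, whose result depends on which sites are present, would typically give values below 1, and sites with low mean values would be the ones flagged as influential.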