Split and merge algorithm for Gaussian mixture model based on KS test
-
Abstract
Gaussian mixture model is a linear combination of finite numbers of independent Gaussian models. Estimating the number of components is an important research area. One class of algorithms based on the minimum description length determine the number of components by splitting and merging components during the iterations. Traditional algorithms use entropy ratio, KL divergence, model similarity as split and merge criteria. However, entropy ratio and KL divergence might result in excessive split because of their excessive sensitivity to sparse or concave models, and model similarity might result in excessive merge because of its inability to assess the merged models’ goodness of fitting Gaussian. In the iterations of algorithm, these excessive splitting and merging operations may cause oscillations. For these problems entropy ratio and KS test as split criteria, and models similarity and KS test were used as merge criteria, which be called problems, a split and merge algorithm for Gaussian mixture model based on KS test is proposed, with entropy ratio and KS test used as split criteria and model similarity and KS test as merge criteria. This algorithm is capable of preventing excessive split and merge, as validated by experiments conducted on seven datasets.
-
-