Title: Determine the number of clusters by data augmentation
Time: 2023-11-23 10:00-11:00
Speaker: Prof. Luo Wei (骆威), Zhejiang University
Venue: Lecture Hall (Room 110), 1st Floor, Northeast Building, School of Science
Abstract: Determining the number of clusters
is crucial for the successful application of clustering. In this paper, we
propose a new order-determination method, called the data augmentation estimator
(DAE), for general model-based clustering. The estimator is based on a
novel idea that augments data with an independently generated small cluster,
which enables us to characterize how the instability of clustering changes with the
number of clusters assumed in clustering. The pattern of instability provides
an alternative characterization of the true number of clusters to the commonly
used goodness-of-fit measure. By combining the two sources of information
appropriately, the proposed estimator is asymptotically consistent under
general conditions and is easy to implement. It is also more efficient than
the conventional BIC-type approaches that use the goodness-of-fit measure only.
These properties are illustrated by simulation studies and real data
examples at the end.