Clustering of functional data using fuzzy c-means based on divergences
University of Applied Sciences Mittweida and University of Groningen
In many scientific fields, such as biology, medicine, and geology, clustering of
data plays an important role. Sets of multi-dimensional data samples are
grouped to detect or visualize the underlying structure. One family of
algorithms comprises prototype-based methods, where each prototype represents
one cluster center. Well-known representatives are c-means and neural gas.
In this talk we focus on fuzzy c-means, a variant of the standard
c-means algorithm that determines fuzzy memberships to the cluster centers.
Usually the Euclidean distance is applied to calculate distances between
prototypes and data samples. However, if these samples represent functions,
i.e. high-dimensional data vectors with spatially correlated components,
generalized divergences could be more appropriate dissimilarity measures.
Furthermore, incorporation of relevance learning, i.e. weighting of function
intervals, might lead to improved clustering solutions. We modified
fuzzy c-means such that both concepts, divergences and relevance
learning, are integrated, and we present the theory together with some
examples. To compare the respective results, different cluster evaluation
measures for fuzzy clustering were applied and will be discussed briefly.
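
To make the idea concrete, the following is a minimal sketch of fuzzy c-means in which the squared Euclidean distance is replaced by a divergence. It is not the implementation presented in the talk. It assumes the generalized Kullback-Leibler divergence as one example of a divergence, strictly positive function values, a fuzzifier m = 2, and relevance weights that are fixed by the user rather than learned. The names generalized_kl and fuzzy_c_means_divergence are hypothetical.

```python
import numpy as np


def generalized_kl(x, w, relevance=None, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(x || w) for positive vectors.

    x: (n, d) samples, w: (c, d) prototypes. Returns an (n, c) array.
    relevance: optional (d,) non-negative weights on the components
    (fixed here, an assumption; the abstract refers to learned relevances).
    """
    x = x[:, None, :] + eps
    w = w[None, :, :] + eps
    terms = x * np.log(x / w) - x + w
    if relevance is not None:
        terms = terms * relevance  # broadcasts over the last axis
    return terms.sum(axis=2)


def fuzzy_c_means_divergence(data, n_clusters, m=2.0, n_iter=100,
                             relevance=None, seed=0):
    """Alternating optimisation of memberships and prototypes."""
    rng = np.random.default_rng(seed)
    # Initialise the prototypes with randomly chosen samples.
    idx = rng.choice(len(data), n_clusters, replace=False)
    prototypes = data[idx].copy()
    for _ in range(n_iter):
        # Small offset avoids division by zero when a sample equals a prototype.
        d = generalized_kl(data, prototypes, relevance) + 1e-12
        # Usual fuzzy c-means membership update, with the divergence
        # taking the place of the squared distance.
        inv = d ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # For D(x || w) with the prototype as second argument, the
        # membership-weighted mean minimises the cost with respect to w.
        um = u ** m
        prototypes = (um.T @ data) / um.sum(axis=0)[:, None]
    return prototypes, u
```

Called as fuzzy_c_means_divergence(data, n_clusters=2) on an (n, d) array of positive function samples, the sketch returns the prototypes and an (n, c) membership matrix whose rows sum to one. Other divergences, or a prototype placed in the first argument, would generally require a different prototype update.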