Clustering of functional data using fuzzy c-means based on divergences
Tina Geweniger
University of Applied Sciences Mittweida and University of Groningen

Abstract:

In many scientific fields, such as biology, medicine, and geology, clustering of data plays an important role. Sets of multi-dimensional data samples are grouped to detect or visualize the underlying structure. One family of algorithms is the prototype-based methods, in which each prototype represents one cluster center. Well-known representatives are c-means and neural gas. In this talk we focus on fuzzy c-means, a variant of the standard c-means algorithm that determines fuzzy memberships of the data samples to the cluster centers. Usually the Euclidean distance is used to calculate distances between prototypes and data samples. Yet if the samples represent functions, i.e. high-dimensional data vectors with spatially correlated components, generalized divergences may be more appropriate dissimilarity measures. Furthermore, incorporating relevance learning, i.e. weighting of function intervals, might lead to improved clustering solutions. We modified fuzzy c-means such that both concepts, divergences and relevance learning, are integrated, and we present the theory together with some examples. To compare the respective results, different cluster evaluation measures for fuzzy clustering were applied; these will be discussed briefly.
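
As a rough illustration of the idea described above, the following sketch replaces the Euclidean distance in fuzzy c-means with a divergence. It is not the method presented in the talk, only a minimal example under several assumptions: the generalized Kullback-Leibler divergence is picked as one possible divergence, the relevance weights are fixed by the user rather than learned, all data are strictly positive, and the names `gkl_divergence` and `fuzzy_c_means_divergence` are hypothetical. For this particular divergence, with the data sample as first argument, the membership-weighted mean is the exact prototype update; other divergences generally need a different update rule.

```python
import numpy as np


def gkl_divergence(X, W, relevance, eps=1e-12):
    """Relevance-weighted generalized Kullback-Leibler divergence.

    X: (n, d) positive samples, W: (c, d) positive prototypes,
    relevance: (d,) non-negative weights (assumed fixed, not learned).
    Returns an (n, c) matrix with entries D(x_i || w_k).
    """
    Xe = X[:, None, :] + eps  # shape (n, 1, d)
    We = W[None, :, :] + eps  # shape (1, c, d)
    terms = Xe * np.log(Xe / We) - Xe + We  # shape (n, c, d)
    return terms @ relevance  # weighted sum over the d components


def fuzzy_c_means_divergence(X, c, m=2.0, relevance=None, n_iter=100, seed=0):
    """Fuzzy c-means with a divergence in place of the squared distance.

    m > 1 is the usual fuzzifier. Returns the memberships U (n, c)
    from the last iteration and the updated prototypes W (c, d).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if relevance is None:
        relevance = np.full(d, 1.0 / d)  # uniform relevance (assumption)

    # Initialise the prototypes with c distinct random samples.
    W = X[rng.choice(n, size=c, replace=False)].copy()

    for _ in range(n_iter):
        # Clip to avoid division by zero when a sample equals a prototype.
        D = np.maximum(gkl_divergence(X, W, relevance), 1e-12)

        # Membership update: u_ik = 1 / sum_l (D_ik / D_il)^(1 / (m - 1)).
        inv = D ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)

        # Prototype update: membership-weighted mean of the samples.
        Um = U ** m
        W = (Um.T @ X) / Um.sum(axis=0)[:, None]

    return U, W
```

A call such as `U, W = fuzzy_c_means_divergence(X, c=2)` on an array `X` of sampled positive functions returns one membership row per sample, each summing to one. Passing a non-uniform `relevance` vector emphasises selected function intervals, which mimics the effect of relevance learning without actually adapting the weights.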