Adaptive Dissimilarity Measures, Dimension Reduction and Visualization (University of Groningen)

Bib
@article{Bunte_ki_2012,
author = {Kerstin Bunte},
title = {{Adaptive Dissimilarity Measures, Dimension Reduction and Visualization (University of Groningen)}},
journal = {K\"unstliche Intelligenz},
pages = {417--418},
volume = {26},
number = {4},
doi = {10.1007/s13218-012-0200-9},
year = {2012},
url = {http://dx.doi.org/10.1007/s13218-012-0200-9},
abstract = {My thesis presents several extensions of the Learning Vector Quantization (LVQ) algorithm based on the concept of adaptive dissimilarity measures. Metric learning gives rise to a variety of applications. The first part of this thesis discusses applications to Content Based Image Retrieval (CBIR) for dermatological images, supervised dimension reduction and advanced texture learning in image analysis. The second half addresses a detailed investigation of dimensionality reduction. We propose a general framework which facilitates the adaptation of a variety of dimension reduction methods to explicit mapping functions. This enables not only direct out-of-sample extensions, but also the theoretical investigation of the generalization ability of dimension reduction. The concept is illustrated on several unsupervised and supervised examples. Furthermore, a novel technique for efficient unsupervised non-linear dimension reduction is proposed, combining the concept of fast online learning with the optimization of divergences. In contrast to most non-linear techniques, whose computational effort grows at least quadratically with the number of points, the proposed method has linear complexity. Finally, three divergence-based algorithms are generalized and investigated for the use of arbitrary divergences.},
intro = {Due to advanced sensor technology, rapidly increasing digitization capabilities and the availability of ever cheaper storage, the amount of data has grown tremendously in recent decades. Between 1999 and 2002, the amount of stored information was estimated to grow by about 30% each year (Lyman and Varian 2003). This data usually consists of a variety of measured features, leading to very high-dimensional data sets. Manual inspection of the data becomes increasingly costly, and automatic methods that help humans quickly scan through massive amounts of data are desirable. This gave rise to many techniques in computer science for processing the available data, including data mining (Han and Kamber 2005), pattern recognition (Duda et al. 2000) and machine learning (Mitchell 1997, Ripley 1996, Bishop 2006), among others. Despite great progress in these fields, the optimization of existing methods and the development of novel schemes remain highly desirable for faster and more efficient data analysis. The field of machine learning concerns the design of algorithms that optimize adaptive systems on the basis of example data. A model is adapted to learn complex patterns and to process new data from the same domain better with respect to a specified objective. The analysis of patterns involves a number of tasks, including data representation, classification, clustering, density estimation, regression, feature extraction and dimension reduction, to name just a few. Many data visualization tools have been developed to exploit the cognitive capabilities of humans for structure detection in visual images. Humans can capture structural characteristics of the data almost instantly, regardless of the number of data points represented in the visualization. Hence, dimension reduction and visualization are commonly used modern data mining techniques (Lee and Verleysen 2007). Machine learning is broadly categorized into reinforcement, supervised and unsupervised learning. Reinforcement learning is inspired by behaviorist psychology and concerns finding suitable actions to maximize some notion of reward (Sutton and Barto 1998). Supervised techniques involve external supervision, which provides correct responses to the given inputs. The aim is usually to discriminate between categories and to maximize generalization to novel data. Unsupervised methods, on the other hand, do not need supervision; their goal is the discovery of underlying structures and regularities based on some basic properties of the data. Elaborate accounts of the history of machine learning can be found in, e.g., (Bishop 1995, Ripley 1996, Mitchell 1997, Duda et al. 2000, Bishop 2006). A very intuitive supervised technique, the k-Nearest Neighbor (k-NN) classifier, compares unknown data to all known examples with respect to some dissimilarity measure (Duda et al. 2000). Obviously, the computational effort and memory usage scale with the number of known samples. Therefore, prototype-based techniques were developed, which employ representatives of data subsets. The prototypes are vector locations in the feature space; they usually serve as typical representatives and reflect the characteristics of the data in their direct neighborhood. Some prominent unsupervised examples are the Self-organizing Map (SOM) (Kohonen et al. 2001) and Neural Gas (NG) (Martinetz and Schulten 1991). A popular supervised family of such prototype-based classification methods is Learning Vector Quantization (LVQ) (Kohonen et al. 2001). All these methods crucially depend on the distance measure, which is used to adapt the prototype positions and to perform the nearest-prototype classification. Therefore, the learning of adaptive metrics tailored to the given problem has been investigated (Xing et al. 2002, Chopra et al. 2005, Frome et al. 2007, Schneider et al. 2009a, 2009b). This thesis investigates adaptive dissimilarities and applications ranging from classification to supervised and unsupervised dimension reduction.},
}
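
To make the idea of an adaptive dissimilarity in prototype-based classification concrete, the following sketch implements a minimal LVQ1 variant with per-feature relevance learning, in the spirit of RLVQ-style schemes. It is a generic illustration under assumed settings, not the exact algorithms developed in the thesis; the function names and learning rates are illustrative choices.

    import numpy as np

    def weighted_sqdist(x, w, lam):
        # adaptive dissimilarity: squared Euclidean distance with
        # per-feature relevance weights lam (lam >= 0, sum(lam) = 1)
        d = x - w
        return float(np.sum(lam * d * d))

    def train_rlvq(X, y, W, proto_labels, eta_w=0.05, eta_l=0.01,
                   epochs=30, seed=0):
        # illustrative sketch, not the thesis's algorithm:
        # W holds initial prototype positions, proto_labels their classes
        rng = np.random.default_rng(seed)
        lam = np.full(X.shape[1], 1.0 / X.shape[1])  # uniform relevances
        W = W.copy()
        for _ in range(epochs):
            for i in rng.permutation(len(X)):
                x, c = X[i], y[i]
                k = int(np.argmin([weighted_sqdist(x, w, lam) for w in W]))
                sign = 1.0 if proto_labels[k] == c else -1.0
                diff = x - W[k]
                W[k] += sign * eta_w * diff        # LVQ1 attract/repel step
                lam -= sign * eta_l * diff * diff  # relevance update (RLVQ-style)
                lam = np.clip(lam, 0.0, None)
                lam /= max(lam.sum(), 1e-12)       # renormalize relevances
        return W, lam

    def predict(X, W, proto_labels, lam):
        # nearest-prototype classification under the learned metric
        return np.array([proto_labels[int(np.argmin(
            [weighted_sqdist(x, w, lam) for w in W]))] for x in X])

The relevance vector lam is the simplest form of an adaptive metric: features that consistently mislead the winning prototype lose weight, so the learned dissimilarity, not just the prototype positions, reflects the classification problem.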
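The abstract's point about explicit mapping functions can likewise be illustrated with a small sketch: a linear map A is fitted by gradient descent on a metric-MDS-style stress, so unseen points are embedded directly as X_new @ A, which is exactly the kind of direct out-of-sample extension an explicit mapping provides. This is a toy construction under assumed settings (linear map, distance-preservation cost), not the general framework proposed in the thesis.

    import numpy as np

    def pairwise_dists(Y, eps=1e-12):
        diff = Y[:, None, :] - Y[None, :, :]
        return np.sqrt((diff ** 2).sum(-1) + eps)

    def fit_linear_mapping(X, d=2, eta=1e-3, epochs=300, seed=0):
        # learn an explicit map x -> x @ A whose low-dimensional image
        # approximately preserves the pairwise distances of X
        rng = np.random.default_rng(seed)
        A = rng.normal(scale=0.1, size=(X.shape[1], d))
        target = pairwise_dists(X)           # distances to preserve
        for _ in range(epochs):
            Y = X @ A
            cur = pairwise_dists(Y)
            coef = (cur - target) / cur      # stress derivative, rescaled
            np.fill_diagonal(coef, 0.0)
            diff = Y[:, None, :] - Y[None, :, :]
            grad_Y = 2.0 * (coef[:, :, None] * diff).sum(axis=1)
            A -= eta * X.T @ grad_Y          # chain rule through Y = X @ A
        return A

    # out-of-sample extension: embed unseen points with the explicit map,
    # e.g. Y_new = X_new @ A, with no re-optimization over the new points.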