
A key issue in my scientific activities is the analysis, modelling, and application of learning in adaptive information processing systems. This includes, for instance, neural networks for classification and regression tasks as well as problems of unsupervised learning and data analysis.

Frequently, training can be formulated as an optimization problem which is guided by an appropriate cost function. One particularly successful approach is suitable for the investigation of systems with many adaptive parameters, the choice of which is based on high-dimensional randomized example data.

Key ingredients of the approach are discussed, for instance, in the following reviews:

M. Biehl and N. Caticha, in: The Handbook of Brain Theory and Neural Networks, M.A. Arbib (ed.), MIT Press (2003).

M. Biehl, in:

M. Biehl, M. Ahr, E. Schlösser, in: Advances in Solid State Physics 40, B. Kramer (ed.), Vieweg (2000).

T.L.H. Watkin, A. Rau, and M. Biehl, Reviews of Modern Physics 65 (1993) 499.

** Selected earlier projects and publications **

For a more complete overview of my activities, see also my related, electronically available articles and the complete list of publications.

A statistical physics analysis of perceptron training, due to M. Opper, led to the development of the AdaTron training prescription. It maintains the favorable convergence properties of the Adaline algorithm (Widrow and Hoff) and, at the same time, yields the perceptron of optimal stability in linearly separable cases.

Recently, the algorithm has re-gained popularity; related studies concern, for instance, the influence of sample size and prevalence.

A. Freking, M. Biehl, C. Braun, W. Kinzel, M. Meesmann, Phys. Rev. E 60 (1999) 5926.

C. Marangi, M. Biehl, and S.A. Solla, Europhysics Letters 30 (1995) 117.

M. Biehl, J.K. Anlauf, and W. Kinzel, in:

J.K. Anlauf and M. Biehl, Europhysics Letters 10 (1989) 687.
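As an illustration, the perceptron of optimal stability can be obtained by AdaTron-like updates of per-example embedding strengths. The following is a minimal sketch only; the problem sizes, the learning rate, and all variable names are illustrative choices, not taken from the original papers:

```python
import numpy as np

def adatron(xi, y, gamma=1.0, epochs=200):
    """AdaTron-style training: coordinate-wise updates of the embedding
    strengths x_mu >= 0, with weight vector w = (1/N) sum_mu x_mu y_mu xi_mu.
    For 0 < gamma < 2 and linearly separable data, the updates drive all
    stabilities towards kappa_mu >= 1 (perceptron of optimal stability)."""
    P, N = xi.shape
    x = np.zeros(P)                 # embedding strengths, kept non-negative
    w = np.zeros(N)
    for _ in range(epochs):
        for mu in range(P):
            kappa = y[mu] * (w @ xi[mu])              # stability of example mu
            dx = max(-x[mu], gamma * (1.0 - kappa))   # clip to keep x_mu >= 0
            x[mu] += dx
            w += dx * y[mu] * xi[mu] / N              # incremental weight update
    return w

# toy problem: labels provided by a random teacher perceptron (separable by construction)
rng = np.random.default_rng(0)
N, P = 20, 30
teacher = rng.standard_normal(N)
xi = rng.standard_normal((P, N))
y = np.sign(xi @ teacher)
w = adatron(xi, y)
print(np.min(y * (xi @ w)) > 0)   # all stabilities positive: zero training error
```

Note that, unlike the Rosenblatt rule, the update here is driven by the continuous stabilities of all examples, which is what yields the maximal-margin solution in the separable case.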

The single layer perceptron can be interpreted as the basic building block from which to construct more complex devices, including dynamical networks of the Little-Hopfield type and multilayered feed-forward architectures. Consequently, it has served as a prototype system for the investigation of learning processes in general. This applies in particular to the framework of on-line learning. Results are directly applicable to multilayer networks with non-overlapping receptive fields. The following are just a few major questions addressed in this context:

M. Biehl, P. Riegler, and M. Stechert, Physical Review E 52 Rapid Comm. (1995) R4624.

M. Biehl and P. Riegler, Europhysics Letters 28 (1994) 525.

M. Biehl and H. Schwarze, Europhysics Letters 20 (1992) 733.
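As a concrete example of on-line learning in this prototype system, Rosenblatt's perceptron rule processes one example at a time and gradually aligns the student weights with an unknown teacher. This is a generic sketch; the rule's learning rate and the problem sizes are illustrative assumptions:

```python
import numpy as np

def online_perceptron(stream, N, eta=1.0):
    """Rosenblatt's on-line rule: each example is seen once and discarded;
    the weights change only when the current example is misclassified."""
    w = np.zeros(N)
    for xi, y in stream:
        if y * (w @ xi) <= 0:                 # error on the current example
            w += eta * y * xi / np.sqrt(N)
    return w

rng = np.random.default_rng(1)
N = 50
teacher = rng.standard_normal(N)
teacher /= np.linalg.norm(teacher)
# a stream of high-dimensional random inputs, labelled by the teacher
examples = ((x, np.sign(x @ teacher)) for x in rng.standard_normal((2000, N)))
w = online_perceptron(examples, N)
overlap = (w @ teacher) / np.linalg.norm(w)   # cosine of the angle to the teacher
print(overlap)
```

The overlap between student and teacher is exactly the order parameter whose evolution the statistical physics analysis tracks in the limit of large N.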

As opposed to problems of classification and regression, example data are unlabelled in unsupervised learning. Here the aim is to extract relevant information from the input data alone. Potential applications of unsupervised learning include feature detection, low-dimensional representation, data compression, and pre-processing for supervised learning. Both on-line and off-line unsupervised learning can be modelled and analysed along the same lines as supervised training. Example topics include

E. Schlösser, D. Saad, M. Biehl, Journal of Physics A 32 (1999) 4061.

M. Biehl, A. Freking, G. Reents, Europhysics Letters 38 (1997) 427.

M. Biehl and A. Mietzner, Europhysics Letters 24 (1993) 421.
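A classic on-line rule of this kind, given here purely as an illustration (Oja's rule; all parameters are arbitrary choices), extracts the leading principal component of the input distribution from unlabelled data:

```python
import numpy as np

def oja(stream, N, eta=0.01):
    """Oja's on-line rule: a Hebbian term plus a decay that keeps |w| near 1.
    The weight vector converges (up to sign) to the leading eigenvector
    of the input covariance matrix."""
    rng = np.random.default_rng(2)
    w = rng.standard_normal(N)
    w /= np.linalg.norm(w)
    for xi in stream:
        y = w @ xi
        w += eta * y * (xi - y * w)   # Hebbian growth minus normalizing decay
    return w

# unlabelled Gaussian data with enhanced variance along one direction b
rng = np.random.default_rng(3)
N = 10
b = np.zeros(N)
b[0] = 1.0
cov = np.eye(N) + 8.0 * np.outer(b, b)    # variance 9 along b, 1 elsewhere
data = rng.multivariate_normal(np.zeros(N), cov, size=5000)
w = oja(iter(data), N)
alignment = abs(w @ b) / np.linalg.norm(w)
print(alignment)
```

The alignment of w with the distinguished direction b plays the same role as the student-teacher overlap in supervised scenarios, which is why both settings can be analysed along the same lines.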

In regression problems it is possible to base learning on a continuous error measure and use gradient-based methods to adapt, for instance, the weights in a neural network. Our studies of on-line gradient descent initiated the first analytic treatment of the learning dynamics in multilayered neural networks. A topic of major importance in this context is the discussion of quasistationary states in the learning dynamics. These so-called plateaus are observed in a variety of learning scenarios and are well known from practical applications of machine learning. Insights into the nature of the plateau problem were the basis for the later development of novel, more efficient training prescriptions for both regression and classification.

M. Biehl, P. Riegler, and C. Wöhler, Journal of Physics A 29 (1996) 4769.

M. Biehl and H. Schwarze, Journal of Physics A 28 (1995) 643.

P. Riegler and M. Biehl, Journal of Physics A 28 (Letters) (1995) L507.
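The scenario can be sketched as follows for a small soft committee machine; the architecture, sizes, and learning rate below are illustrative assumptions and not those of the cited studies. A student network is trained by on-line gradient descent on examples labelled by a teacher of the same architecture:

```python
import numpy as np

rng = np.random.default_rng(4)
N, K, eta = 100, 2, 0.5             # input dimension, hidden units, learning rate

# teacher: fixed soft committee machine, sigma(x) = sum_k tanh(B_k . x)
B = rng.standard_normal((K, N)) / np.sqrt(N)

# student: same architecture, small random initialization
W = rng.standard_normal((K, N)) * 0.01 / np.sqrt(N)

xtest = rng.standard_normal((500, N))
ytest = np.tanh(xtest @ B.T).sum(axis=1)
errs = []
for t in range(20000):
    xi = rng.standard_normal(N)                          # one fresh example per step
    h = W @ xi
    delta = np.tanh(h).sum() - np.tanh(B @ xi).sum()     # student minus teacher output
    # on-line gradient step on the quadratic error 0.5 * delta**2
    W -= (eta / N) * delta * np.outer(1.0 - np.tanh(h) ** 2, xi)
    if t % 5000 == 0:
        pred = np.tanh(xtest @ W.T).sum(axis=1)
        errs.append(0.5 * np.mean((pred - ytest) ** 2))  # estimated generalization error

print(errs)   # the error decreases; intermediate values may sit on a plateau
```

Plotting the recorded error against time in such a simulation typically shows an extended quasistationary phase before the hidden units specialize and the error drops further.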

The above-mentioned plateaus in on-line training are just one example of specialization processes, which reflect inevitable symmetries of the learning scenario. Often, elements of the trained device can be exchanged with no effect on the performance. A permutation symmetry holds, for instance, for the branches of a multilayered neural network or for the prototypes in unsupervised competitive learning. Successful learning requires the breaking of such symmetries or, in other words, the specialization of sub-systems, e.g. the network branches.
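This permutation symmetry is easy to verify directly. In the following sketch (a generic committee-machine output, chosen only for illustration), exchanging the hidden-unit weight vectors leaves the input-output function unchanged:

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 30, 3
W = rng.standard_normal((K, N))              # K hidden-unit weight vectors

def output(W, x):
    """Committee-machine output: sum of the hidden-unit activations."""
    return np.tanh(W @ x).sum()

x = rng.standard_normal(N)
perm = rng.permutation(K)                    # exchange the network branches
print(np.isclose(output(W, x), output(W[perm], x)))   # True: output is invariant
```

Because any permutation of the branches yields the same function, the error surface contains symmetric saddle points, and training must break this symmetry before the branches can specialize.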

The counterparts of on-line plateaus in off-line or batch training are discussed in:

C. Bunzmann, M. Biehl, and R. Urbanczik, Physical Review Letters 86 (2001) 2166.

M. Ahr, M. Biehl, R. Urbanczik, European Physical Journal B 10 (1999) 583.

M. Biehl, A. Freking, G. Reents, and E. Schlösser, Philosophical Magazine B 77 (1998) 1487.

last update February 2005