Description
Occam’s razor appears in several guises across statistics, information theory, and learning theory. In Bayesian model selection, it emerges through the marginal likelihood, which automatically trades off goodness of fit against the volume of parameter space that the data support. Closely related ideas arise in the Minimum Description Length (MDL) framework, where model selection is interpreted as data compression, and in PAC-Bayes theory, where generalization guarantees depend on how far a learned predictor deviates from a prior distribution.
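For orientation, the three quantities involved can be written in standard form (a conventional summary; the notation below is mine, not the speaker's): the Bayesian evidence for a model \(M\), the two-part MDL code length, and a McAllester-style PAC-Bayes bound with prior \(P\), posterior \(Q\), true risk \(R\), and empirical risk \(\hat{R}\) on an i.i.d. sample of size \(n\):

\[
p(\mathcal{D} \mid M) = \int p(\mathcal{D} \mid \theta, M)\, p(\theta \mid M)\, d\theta,
\qquad
L(\mathcal{D}, M) = L(M) + L(\mathcal{D} \mid M),
\]
\[
\mathbb{E}_{h \sim Q}\!\left[R(h)\right] \;\le\; \mathbb{E}_{h \sim Q}\!\left[\hat{R}(h)\right] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{n}/\delta\right)}{2n}} \quad \text{with probability} \ge 1 - \delta.
\]

In each expression a complexity term measures distance from the prior: the prior mass \(p(\theta \mid M)\) placed on well-fitting parameters, the model code length \(L(M)\), and the divergence \(\mathrm{KL}(Q \,\|\, P)\).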
In this talk, I present these three perspectives on Occam’s razor (Bayesian evidence, compression, and generalization) and highlight their common structure. In each case, model complexity can be understood as the information required to move from prior assumptions to a predictor that explains the data. I will discuss when the perspectives agree on model selection, when they differ, and what this reveals about the role of priors, representation, and information in modern Bayesian inference.
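As a minimal sketch of the Bayesian side of this story (my own illustration under assumed details, not an example from the talk), consider i.i.d. observations \(y_i \sim \mathcal{N}(\theta, \sigma^2)\) with prior \(\theta \sim \mathcal{N}(0, \tau^2)\). Integrating out \(\theta\) gives the evidence in closed form, and a more diffuse prior (larger \(\tau\), i.e. a more flexible model) earns lower evidence on data that a concentrated model already explains:

import numpy as np
from scipy.stats import multivariate_normal

def log_evidence(y, tau, sigma=1.0):
    # Log marginal likelihood of y under theta ~ N(0, tau^2), y_i | theta ~ N(theta, sigma^2).
    # Integrating out theta gives y ~ N(0, sigma^2 * I + tau^2 * 11^T).
    n = len(y)
    cov = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
    return multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(y)

rng = np.random.default_rng(0)
y = rng.normal(0.3, 1.0, size=20)   # sample from a source close to the prior mean

for tau in (0.5, 5.0, 50.0):        # increasingly diffuse priors, i.e. more flexible models
    print(f"tau = {tau:5.1f}   log evidence = {log_evidence(y, tau):8.2f}")

The diffuse prior spreads its probability over many possible datasets and so assigns less to this particular, easily explained sample; this is the automatic Occam penalty built into the marginal likelihood.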