[Featured image taken from a painting by Marlijn Bouwman under a CC-BY license (and available from the artwork library on this site)]
In the 1930s, the polymath Harold Jeffreys developed a general Bayesian philosophy on hypothesis testing. Essentially, Jeffreys wanted to formalize the idea of scientific caution in statistical reasoning. Jeffreys argued that “variation must be taken as random until there is positive evidence to the contrary” (Jeffreys, p. 414, who referred to this as “Ockham’s principle”; cf. p. 342). Concretely, this meant that (1) Jeffreys assumed that for the purpose of hypothesis testing, the point-null hypothesis deserves to be taken seriously; (2) if the point-null hypothesis H0 predicted the data better than the alternative hypothesis H1, it would be statistically reckless to adopt the maximum likelihood estimator from H1; instead, it would be prudent (and expected to yield better predictions for future data) to stick with the null value for the parameter. This argument does not really depend on the null hypothesis being true in some abstract sense. For instance, Jeffreys states:
“The test required, in fact, is not whether the null hypothesis is altogether satisfactory, but whether any suggested alternative is likely to give an improvement in representing future data.” (Jeffreys, 1961, p. 391)
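To make the logic concrete, here is a minimal sketch for a binomial toy problem. The example is not taken from Jeffreys: the test value of 0.5, the uniform prior on the parameter under H1, and the function names `bayes_factor_01` and `cautious_estimate` are all assumptions chosen for illustration. The Bayes factor compares how well H0 and H1 predicted the observed data; only when H1 predicted the data better does the sketch abandon the null value in favor of the maximum likelihood estimate.

```python
from math import comb


def bayes_factor_01(k, n, theta0=0.5):
    """Bayes factor for H0: theta = theta0 versus H1: theta ~ Uniform(0, 1).

    k successes in n binomial trials. The uniform prior under H1 is an
    illustrative assumption, not a recommendation from the source.
    """
    # Probability of the data under the point-null hypothesis.
    m0 = comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
    # Probability of the data under H1: averaging the binomial likelihood
    # over a uniform prior gives 1 / (n + 1) for every k.
    m1 = 1.0 / (n + 1)
    return m0 / m1


def cautious_estimate(k, n, theta0=0.5):
    """Keep the null value unless H1 predicted the data better than H0."""
    if bayes_factor_01(k, n, theta0) > 1:
        return theta0
    return k / n  # maximum likelihood estimate under H1


if __name__ == "__main__":
    for k in (52, 70):
        print(k, bayes_factor_01(k, 100), cautious_estimate(k, 100))
```

With 52 successes out of 100, the null hypothesis out-predicts the alternative, so the sketch sticks with 0.5 rather than adopting 0.52; with 70 successes out of 100, the evidence favors H1 and the estimate moves to 0.7. The hard switch at a Bayes factor of 1 is a deliberate simplification of the cautious reasoning described above.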
I might return to this issue in a later post, since the argument is crucial, central to scientific practice, and yet often forgotten in philosophical discussions on the pros and cons of (Bayesian) hypothesis testing. For now, I have tried to distill Jeffreys’s key insight in the following slogan:
Of course the razor on the tile belongs to William of Ockham (tile courtesy of Viktor Beekman). NB. Jeffreys was well aware of the idea of Bayesian model averaging, and may even have been the first to introduce the idea.
References
Jeffreys, H. (1939/1948/1961). Theory of Probability. Oxford: Oxford University Press.