
The Principle of Predictive Irrelevance, or Why Intervals Should Not be Used for Model Comparison Featuring a Point Null Hypothesis

This post summarizes Wagenmakers, E.-J., Lee, M. D., Rouder, J. N., & Morey, R. D. (2019). The principle of predictive irrelevance, or why intervals should not be used for model comparison featuring a point null hypothesis. Manuscript submitted for publication. Preprint available on PsyArXiv: https://psyarxiv.com/rqnu5


The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and, for that specific purpose, the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irrelevant in the sense that it carries no evidential value for discriminating the null hypothesis \theta = 1/2 from a broad class of alternative hypotheses that allow \theta to lie between 0 and 1. In contrast, the Bayesian credible interval suggests that a single binomial observation does provide some evidence against the null hypothesis. We then generalize this paradoxical result to infinitely long data sequences that are predictively irrelevant throughout. Examples feature a test of a binomial rate and a test of a normal mean. These maximally uninformative data (MUD) sequences yield credible intervals and confidence intervals that are certain to exclude the point of test as the sequence lengthens. The resolution of this paradox requires the insight that interval estimation methods, and consequently p values, may not be used for model comparison involving a point null hypothesis.
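The single-observation case can be checked directly. As an illustrative sketch (not taken from the paper itself): under the null hypothesis a success has probability 1/2, and under a uniform prior on \theta the marginal probability of a success is also 1/2, so the Bayes factor is exactly 1 — the observation is predictively irrelevant — even though the posterior distribution, and hence the credible interval, has moved.

```python
from math import comb, gamma

def beta_fn(a, b):
    # Beta function B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    return gamma(a) * gamma(b) / gamma(a + b)

def bf01_binomial(y, n):
    """Bayes factor BF01 for H0: theta = 1/2 versus
    H1: theta ~ Uniform(0, 1), given y successes in n trials."""
    m0 = comb(n, y) * 0.5 ** n                   # marginal likelihood under H0
    m1 = comb(n, y) * beta_fn(y + 1, n - y + 1)  # marginal likelihood under H1
    return m0 / m1

# A single observation is predictively irrelevant: BF01 = 1
# regardless of whether it is a success or a failure.
print(bf01_binomial(1, 1))  # 1.0
print(bf01_binomial(0, 1))  # 1.0

# Yet the posterior after one success is Beta(2, 1), whose central 95%
# credible interval [sqrt(0.025), sqrt(0.975)] is no longer the prior's.
print(0.025 ** 0.5, 0.975 ** 0.5)
```

The contrast between the two printouts is the paradox in miniature: the Bayes factor says the observation discriminates nothing, while the interval has already shifted.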

Preprint: Teaching Good Research Practices: Protocol of a Research Master Course

This post is an extended synopsis of Sarafoglou A., Hoogeveen S., Matzke D., & Wagenmakers, E.-J. (in press). Teaching Good Research Practices: Protocol of a Research Master Course. Preprint available on PsyArXiv: https://psyarxiv.com/gvesh/


The current crisis of confidence in psychological science has spurred field-wide reforms to enhance transparency, reproducibility, and replicability. To solidify these reforms within the scientific community, student courses on open science practices are essential. Here we describe the content of our Research Master course “Good Research Practices”, which we designed and taught at the University of Amsterdam. Supported by Chambers’ recent book The 7 Deadly Sins of Psychology, the course covered topics such as questionable research practices (QRPs), the importance of direct and conceptual replication studies, preregistration, and the public sharing of data, code, and analysis plans. We adopted a pedagogical approach that (1) reduced teacher-centered lectures to a minimum; (2) emphasized practical training in open science practices; and (3) encouraged students to engage in the ongoing discussions in the open science community on social media platforms. In this course, we alternated regular classes with classes organized by students. For each of these, an example is given below. In addition, Table 1 displays a selection of further topics discussed in the course.

An In-Class Demonstration of Bayesian Inference

This post is an extended synopsis of van Doorn, J. B., Matzke D., & Wagenmakers, E.-J. (in press). An In-Class Demonstration of Bayesian Inference. Psychology Learning and Teaching (https://doi.org/10.1177/14757). Preprint available on PsyArXiv: https://psyarxiv.com/d8bvn/


Over 80 years ago, Sir Ronald Fisher conducted the famous experiment “The Lady Tasting Tea” in order to test whether his colleague, Dr. Muriel Bristol, could taste whether the tea infusion or the milk had been added to the cup first. Dr. Bristol was presented with eight cups of tea and the knowledge that four of these had the milk poured in first. Dr. Bristol was then asked to identify these four cups. We revisit Fisher’s experimental paradigm and demonstrate how a similar tasting experiment, conducted in a classroom setting, can familiarize students with several key concepts of Bayesian inference, such as the prior distribution, the posterior distribution, the Bayes factor, and sequential analysis.
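As a quick classroom warm-up (our own illustration, not part of the article), the chance performance in Fisher’s design is easy to compute: under random guessing, each of the C(8, 4) = 70 ways of selecting four cups is equally likely, and the number of correctly identified milk-first cups follows a hypergeometric distribution.

```python
from math import comb

# Number of equally likely ways to choose 4 cups out of 8
n_selections = comb(8, 4)  # 70

# P(exactly k of the 4 milk-first cups are identified), for k = 0..4:
# choose k of the 4 milk-first cups and 4 - k of the 4 tea-first cups.
p_correct = {k: comb(4, k) * comb(4, 4 - k) / n_selections for k in range(5)}

print(n_selections)   # 70
print(p_correct[4])   # 1/70, roughly 0.0143: all four cups correct by luck
```

So even a perfect performance by Dr. Bristol corresponds to a guessing probability of only 1/70, which is what makes the design informative despite its small size.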

Informed Bayesian Inference for the A/B Test

This post is an extended synopsis of a preprint that is available on arXiv: http://arxiv.org/abs/1905.02068


Booming in business and a staple analysis in medical trials, the A/B test assesses the effect of an intervention or treatment by comparing its success rate with that of a control condition. Across many practical applications, it is desirable that (1) evidence can be obtained in favor of the null hypothesis that the treatment is ineffective; (2) evidence can be monitored as the data accumulate; and (3) expert prior knowledge can be taken into account. Most existing approaches do not fulfill these desiderata. Here we describe a Bayesian A/B procedure based on Kass and Vaidyanathan (1992) that allows one to monitor the evidence for the hypotheses that the treatment has either a positive effect, a negative effect, or, crucially, no effect. Furthermore, this approach enables one to incorporate expert knowledge about the relative prior plausibility of the rival hypotheses and about the expected size of the effect, given that it is non-zero. To facilitate the wider adoption of this Bayesian procedure, we developed the abtest package in R. We illustrate the package options and the associated statistical results with a synthetic example.
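To convey the flavor of such evidence monitoring, here is a minimal Python sketch that compares two binomial success rates under independent uniform priors against a single common rate. This conjugate shortcut is an illustration only; it is not the logistic parametrization of Kass and Vaidyanathan (1992) that the abtest package actually implements, and the counts below are synthetic.

```python
from math import lgamma, exp

def log_beta(a, b):
    # log of the Beta function, via log-gamma for numerical stability
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bf10_ab(y1, n1, y2, n2):
    """Bayes factor for H1: independent Uniform(0, 1) rates in the two
    arms, versus H0: one common Uniform(0, 1) rate. The binomial
    coefficients are identical under both hypotheses and cancel."""
    log_m1 = log_beta(y1 + 1, n1 - y1 + 1) + log_beta(y2 + 1, n2 - y2 + 1)
    log_m0 = log_beta(y1 + y2 + 1, n1 + n2 - y1 - y2 + 1)
    return exp(log_m1 - log_m0)

# Monitor the evidence as synthetic data accumulate
# (treatment arm converting at ~70%, control arm at ~50%):
for n in (10, 50, 100):
    print(n, round(bf10_ab(int(0.7 * n), n, int(0.5 * n), n), 2))
```

With equal observed rates the Bayes factor favors the common-rate hypothesis, and it grows in favor of a difference as a genuine effect accumulates — the monitoring behavior the post describes, here in its simplest conjugate form.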

A Fix for the Kubbel Study

WARNING: this post deals exclusively with a chess endgame study.

A previous post discussed the Bristol theme from chess endgame study composition. One of the featured studies was created by the great Leonid Kubbel. This is what I wrote:

“Since its inception, the Bristol theme has appealed to several composers. One of the most famous, Leonid Kubbel, created the following work of art:

After some foreplay that does not concern us here, the position in the diagram was reached. Black has a huge material advantage (queen and rook versus a lone bishop), but his pieces are boxed in and White has the terrible threat of transferring his bishop to h7 via e4, delivering checkmate. However, the Bristol theme comes to the rescue: 1… Ra1 tucks the rook into the corner, such that, after White initiates his intended manoeuvre 2. Be4 (threatening mate on h7), Black counters with 2…Qb1!! Black offers the queen in order to prevent mate, a gift that White cannot accept, for after 3. Bxb1?? Rxb1 Black is a full rook to the good. (I mention this line because it shows that the Bristolian rook actually fulfills a function in this study, namely to defend the queen once it has arrived on b1.) Suddenly it looks as if Black is completely winning. In dire straits, White comes up with a miraculous save: 3. Bf5!!, offering White’s only remaining piece. Black has no choice but to accept, yet after 3…Qxf5 the result is stalemate, and consequently a draw.”

The Future of the Earth


Most statisticians know Sir Harold Jeffreys as the conceptual father and tireless promoter of the Bayesian hypothesis test. However, Jeffreys was also a prominent geophysicist. For instance, Jeffreys is credited with the discovery that the Earth has a liquid core. Recently, I read Jeffreys’s 1929 book “The Future of the Earth”, which is a smaller and more accessible version of his major work “The Earth” (Jeffreys, 1924). Indeed, the book is literally “small”. Below is a photo meant to convey the book’s size:

The book’s content is organized into three short chapters covering topics of fundamental importance:

  1. The Future of the Sun
  2. The Cooling of the Earth
  3. The Future of the Moon


