# If Your First Baby is Early, Will Your Second Baby be Early?

This post was written the week before the birth of our daughter Leanne, who finally decided to make an appearance on Valentine’s Day this year.

Nataschja is lying on our sofa, watching a series on Netflix. It is Tuesday afternoon and I am working from home. In fact, I have been working from home for about a week now, awaiting the birth of our daughter. Both Nataschja and I are getting impatient, and the uncertainty surrounding the timing of the birth isn’t helping. Labor without medical induction can start anywhere from week 37 to week 42, and that is a pretty wide time window.
(more…)

# An Inconvenient Truth

On June 6th I gave a one-hour lecture for SIOS, the Student Initiative for Open Science in Amsterdam (you can follow them on Twitter @StudentIOS). The slides are at https://osf.io/5s9uq/, and a YouTube video of the entire lecture is at https://t.co/u7bkqaC6Ko.

## Abstract

This presentation consists of three parts. In the first, I will present a whirlwind tour of the p-value’s many statistical peculiarities. In the second, I will list reasons for the p-value’s continued dominance across the empirical sciences. One such reason is that the p-value can be used to silence your skeptics — that is, to discredit the null hypothesis that the experimental treatment was utterly ineffective. In the third part I will demonstrate how Bayesian hypothesis testing with JASP (jasp-stats.org) can provide a practical, principled, and easy-to-use alternative.
(more…)

# The Bayesian Methodology of Sir Harold Jeffreys as a Practical Alternative to the P-value Hypothesis Test

This post is an extended synopsis of Ly et al. (2019). The Bayesian methodology of Sir Harold Jeffreys as a practical alternative to the p-value hypothesis test. Preprint available on PsyArXiv: https://psyarxiv.com/dhb7x

## Abstract

Despite an ongoing stream of lamentations, many empirical disciplines still treat the p-value as the sole arbiter to separate the scientific wheat from the chaff. The continued reign of the p-value is arguably due in part to a perceived lack of workable alternatives. In order to be workable, any alternative methodology must be (1) relevant: it has to address the practitioners’ research question, which (for better or for worse) most often concerns the test of a hypothesis, and less often concerns the estimation of a parameter; (2) available: it must have a concrete implementation for practitioners’ statistical workhorses such as the t-test, regression, and ANOVA; and (3) easy to use: methods that demand practitioners switch to the theoreticians’ programming tools will face an uphill struggle for adoption. The above desiderata are fulfilled by Harold Jeffreys’s Bayes factor methodology as implemented in the open-source software JASP. We explain Jeffreys’s methodology and showcase its practical relevance with two examples.
(more…)

# The Principle of Predictive Irrelevance, or Why Intervals Should Not be Used for Model Comparison Featuring a Point Null Hypothesis

This post summarizes Wagenmakers, E.-J., Lee, M. D., Rouder, J. N., & Morey, R. D. (2019). The principle of predictive irrelevance, or why intervals should not be used for model comparison featuring a point null hypothesis. Manuscript submitted for publication. Preprint available on PsyArXiv: https://psyarxiv.com/rqnu5

## Abstract

The principle of predictive irrelevance states that when two competing models predict a data set equally well, that data set cannot be used to discriminate the models and (for that specific purpose) the data set is evidentially irrelevant. To highlight the ramifications of the principle, we first show how a single binomial observation can be irrelevant in the sense that it carries no evidential value for discriminating the null hypothesis $\theta = 1/2$ from a broad class of alternative hypotheses that allow $\theta$ to be between 0 and 1. In contrast, the Bayesian credible interval suggests that a single binomial observation does provide some evidence against the null hypothesis. We then generalize this paradoxical result to infinitely long data sequences that are predictively irrelevant throughout. Examples feature a test of a binomial rate and a test of a normal mean. These maximally uninformative data (MUD) sequences yield credible intervals and confidence intervals that are certain to exclude the point under test as the sequence lengthens. The resolution of this paradox requires the insight that interval estimation methods (and, consequently, p-values) may not be used for model comparison involving a point null hypothesis.
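To make the single-observation case concrete, here is a small sketch (our illustration, not code from the paper) for a lone Bernoulli success: under $H_0$ the probability of a success is 1/2, and under $H_1$ with a uniform Beta(1,1) prior on $\theta$ the marginal probability of a success is also 1/2, so the Bayes factor is exactly 1 — yet the posterior credible interval has shifted away from the prior’s:

```python
import math

def bernoulli_marginal_uniform(y):
    """p(y | H1) for one Bernoulli observation with theta ~ Beta(1, 1).

    Analytically this is the Beta function B(y + 1, 2 - y).
    """
    a, b = y + 1, (1 - y) + 1
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

y = 1                                  # a single success
p_h0 = 0.5                             # point null: theta = 1/2
p_h1 = bernoulli_marginal_uniform(y)   # also 1/2 under the uniform prior
bf01 = p_h0 / p_h1                     # = 1: the observation is predictively irrelevant

# The posterior after one success is Beta(2, 1), with CDF F(theta) = theta^2,
# so the central 95% credible interval is (sqrt(0.025), sqrt(0.975)).
lo, hi = math.sqrt(0.025), math.sqrt(0.975)
```

The interval moves from the prior’s (0.025, 0.975) to roughly (0.16, 0.99) even though the Bayes factor says the observation discriminates nothing: interval shift and evidential value are different quantities.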
(more…)

# Preprint: Teaching Good Research Practices: Protocol of a Research Master Course

This post is an extended synopsis of Sarafoglou A., Hoogeveen S., Matzke D., & Wagenmakers, E.-J. (in press). Teaching Good Research Practices: Protocol of a Research Master Course. Preprint available on PsyArXiv: https://psyarxiv.com/gvesh/

## Summary

The current crisis of confidence in psychological science has spurred field-wide reforms to enhance transparency, reproducibility, and replicability. To solidify these reforms within the scientific community, student courses on open science practices are essential. Here we describe the content of our Research Master course “Good Research Practices” which we have designed and taught at the University of Amsterdam. Supported by Chambers’ recent book The 7 Deadly Sins of Psychology, the course covered topics such as questionable research practices (QRPs), the importance of direct and conceptual replication studies, preregistration, and the public sharing of data, code, and analysis plans. We adopted a pedagogical approach that (1) reduced teacher-centered lectures to a minimum; (2) emphasized practical training on open science practices; and (3) encouraged students to engage in the ongoing discussions in the open science community on social media platforms. In this course, we alternated regular classes with classes organized by students. For each of these, an example is given below. In addition, Table 1 displays a selection of further topics discussed in the course.
(more…)

# An In-Class Demonstration of Bayesian Inference

This post is an extended synopsis of van Doorn, J. B., Matzke D., & Wagenmakers, E.-J. (in press). An In-Class Demonstration of Bayesian Inference. Psychology Learning and Teaching (https://doi.org/10.1177/1475725719848574). Preprint available on PsyArXiv: https://psyarxiv.com/d8bvn/

## Abstract

Over 80 years ago, Sir Ronald Fisher conducted the famous experiment “The Lady Tasting Tea” in order to test whether his colleague, Dr. Muriel Bristol, could taste if the tea infusion or the milk had been added to the cup first. Dr. Bristol was presented with eight cups of tea and the knowledge that four of these had the milk poured in first. Dr. Bristol was then asked to identify these four cups. We revisit Fisher’s experimental paradigm and demonstrate how a similar tasting experiment, conducted in a classroom setting, can familiarize students with several key concepts of Bayesian inference, such as the prior distribution, the posterior distribution, the Bayes factor, and sequential analysis.
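Under pure guessing, the number of milk-first cups that Dr. Bristol identifies correctly follows a hypergeometric distribution, since she must pick four cups out of eight. A short sketch (our illustration, not code from the paper):

```python
from math import comb

def p_correct(k, cups=8, milk_first=4):
    """Probability of identifying exactly k of the milk-first cups by guessing.

    The taster selects `milk_first` cups out of `cups`; the number of
    correct selections is hypergeometric.
    """
    return (comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
            / comb(cups, milk_first))

# Probability of getting all four cups right by pure chance: 1 / C(8, 4) = 1/70.
p_all_correct = p_correct(4)
```

Getting all four cups right by chance has probability 1/70, roughly .014, which is why a perfect performance lets Fisher reject random guessing at the conventional .05 level; the Bayesian classroom version instead tracks how the data update a prior distribution over the taster’s ability.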
(more…)