Preprint: A Cautionary Note on Estimating Effect Size

This post is a teaser for van den Bergh, D., Haaf, J. M., Ly, A., Rouder, J. N., & Wagenmakers, E.-J. (2019). A cautionary note on estimating effect size. Preprint available on PsyArXiv: https://psyarxiv.com/h6pr8/

Abstract

“An increasingly popular approach to statistical inference is to focus on the estimation of effect size while ignoring the null hypothesis that the effect is absent. We demonstrate how this common “null hypothesis neglect” may result in effect size estimates that are overly optimistic. The overestimation can be avoided by incorporating the plausibility of the null hypothesis into the estimation process through a “spike-and-slab model”.”

A Concrete Example

“Consider the following hypothetical scenario: a colleague from the biology department has just conducted an experiment and the analysis yields p < 0.05. Your colleague believes that this is grounds to reject the null hypothesis and reports the result as follows: “Cohen’s d = 0.30, CI = [0.02, 0.58]”. Based on these results, what would be a reasonable point estimate of effect size? A straightforward and intuitive answer is “0.30”. However, your colleague now informs you of the hypothesis that the experiment was designed to assess: “plants grow faster when you talk to them”. Suddenly, a population effect size of 0 appears eminently plausible.”

When Are Effect Sizes Overestimated?

“Standard point estimates and confidence intervals ignore the possibility that the effect is spurious (i.e., the null hypothesis H0). This is not problematic when H0 is deeply implausible, either because H0 was highly unlikely a priori or because the data decisively undercut H0. But when the data fail to undercut H0, or when H0 is highly likely a priori (i.e., “plants do not grow faster when you talk to them”), then H0 is not ruled out as a plausible account of the data. Effect size estimates that ignore a plausible H0 are generally overconfident: the fact that H0 provides an acceptable account of the data should shrink effect size estimates towards zero.”
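
To see how much the prior plausibility of the null hypothesis (written H0 here) matters, here is a rough sketch in Python. Only d = 0.30 and CI = [0.02, 0.58] come from the example above. Everything else is an assumption made for illustration and is not taken from the preprint: that the interval is a 95% interval, a normal approximation to the likelihood, a standard normal prior on the effect size under the alternative, and the helper names normal_pdf and posterior_prob_null.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_prob_null(d, se, prior_null, slab_sd=1.0):
    """Posterior probability of H0 (effect size exactly 0).

    Assumes d ~ Normal(delta, se) and, under the alternative,
    delta ~ Normal(0, slab_sd). Both are illustrative assumptions.
    """
    # Marginal likelihood of the observed d when delta = 0.
    m0 = normal_pdf(d, 0.0, se)
    # Marginal likelihood when delta is drawn from the normal prior.
    m1 = normal_pdf(d, 0.0, math.sqrt(se ** 2 + slab_sd ** 2))
    return prior_null * m0 / (prior_null * m0 + (1.0 - prior_null) * m1)


d = 0.30
# Standard error implied by the reported interval, assuming it is a 95% CI.
se = (0.58 - 0.02) / (2 * 1.96)

for prior_null in (0.10, 0.50, 0.90):
    post = posterior_prob_null(d, se, prior_null)
    print(f"prior P(H0) = {prior_null:.2f} -> posterior P(H0 | data) = {post:.2f}")
```

Under these assumptions the data barely move the probability of H0 away from its prior value, so a null hypothesis that was plausible beforehand remains plausible afterwards, even though p < 0.05.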

A Spike-and-Slab Perspective

“The Spike-and-Slab model consists of two components. The first component, the spike, corresponds to the position that talking to plants does not affect their growth (i.e., δ = 0), whereas the second component, the slab, corresponds to the position that speaking to plants does affect their growth (i.e., δ ≠ 0). Both components are deemed a priori equally likely.” The main results of the Spike-and-Slab model are shown in Figure 1 of the preprint. Compared to inference based on the slab alone, both the point estimate of effect size and the corresponding credible interval are shrunk towards zero. For details, see the preprint: https://psyarxiv.com/h6pr8/.
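
The shrinkage of the point estimate can be illustrated with a minimal sketch. It is not the analysis from the preprint. It assumes that the reported interval is a 95% interval, uses a normal approximation to the likelihood, and puts a standard normal prior on δ for the slab; the function name spike_and_slab_mean is made up for this post. The spike and the slab each get prior probability 0.5, as in the quoted passage.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def spike_and_slab_mean(d, se, prior_slab=0.5, slab_sd=1.0):
    """Return (posterior P(slab), slab-only mean, model-averaged mean).

    Assumes d ~ Normal(delta, se); spike: delta = 0;
    slab: delta ~ Normal(0, slab_sd). Illustrative assumptions only.
    """
    # Marginal likelihood of the observed d under each component.
    m_spike = normal_pdf(d, 0.0, se)
    m_slab = normal_pdf(d, 0.0, math.sqrt(se ** 2 + slab_sd ** 2))

    # Posterior probability that the effect is non-zero (the slab).
    post_slab = prior_slab * m_slab / (
        prior_slab * m_slab + (1.0 - prior_slab) * m_spike
    )

    # Posterior mean of delta given the slab (conjugate normal update).
    slab_mean = d * slab_sd ** 2 / (slab_sd ** 2 + se ** 2)

    # The spike contributes an estimate of exactly 0, so the
    # model-averaged mean is the slab mean weighted by P(slab | data).
    averaged_mean = post_slab * slab_mean
    return post_slab, slab_mean, averaged_mean


d = 0.30
se = (0.58 - 0.02) / (2 * 1.96)  # implied by the CI, assuming 95% coverage

post_slab, slab_mean, averaged_mean = spike_and_slab_mean(d, se)
print(f"P(slab | data)      = {post_slab:.2f}")
print(f"slab-only estimate  = {slab_mean:.2f}")
print(f"spike-and-slab mean = {averaged_mean:.2f}")
```

With these assumed inputs the slab-only estimate stays close to the observed 0.30, while the model-averaged estimate is pulled roughly halfway towards zero, because the spike retains close to half of the posterior probability. The exact numbers depend on the assumed slab prior and will differ from those in the preprint.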

References

Van den Bergh, D., Haaf, J. M., Ly, A., Rouder, J. N., & Wagenmakers, E.-J. (2019). A cautionary note on estimating effect size. Preprint available on PsyArXiv: https://psyarxiv.com/h6pr8/