Posted on Oct 26th, 2017

*“By seeking and blundering we learn.”*

– Johann Wolfgang von Goethe

Bayesian methods have never been more popular than they are today. In the field of statistics, Bayesian procedures are mainstream, and have been so for at least two decades. Applied fields such as psychology, medicine, economics, and biology are slow to catch up, but in general researchers now view Bayesian methods with sympathy rather than with suspicion (e.g., McGrayne 2011).

The ebb and flow of appreciation for Bayesian procedures can be explained by a single dominant factor: *pragmatism*. In the early days of statistics, the only Bayesian models that could be applied to data were necessarily simple – the more complex, more interesting, and more appropriate models escaped the mathematically demanding derivations that Bayes’ rule required. This meant that unwary researchers who accepted the Bayesian theoretical outlook effectively painted themselves into a corner as far as practical application was concerned. How convenient then that the Bayesian paradigm was “absolutely disproved” (Peirce 1901, as reprinted in Eisele 1985, p. 748); how reassuring that it would “break down at every point” (Venn 1888, p. 121); and how comforting that it was deemed “utterly unacceptable” (Popper 1959, p. 150).

Posted on Oct 21st, 2017

I (Alex Etz) recently attended the American Statistical Association’s “Symposium on Statistical Inference” (SSI) in Bethesda, Maryland. In this post I will give you a summary of its contents and some of my personal highlights from the SSI.

The purpose of the SSI was to follow up on the historic ASA statement on p-values and statistical significance. The ASA statement on p-values was written by a relatively small group of influential statisticians and lays out a series of principles regarding what they see as the current consensus about p-values. Notably, there were mainly “don’ts” in the ASA statement. For instance: “P-values **do not** measure the probability that the studied hypothesis is true, nor the probability that the data were produced by random chance alone”; “Scientific conclusions and business or policy decisions **should not** be based only on whether a p-value passes a specific threshold”; “A p-value, or statistical significance, **does not** measure the size of an effect or the importance of a result” (emphasis mine).

Posted on Oct 11th, 2017

Bayesian inference offers the pragmatic researcher a series of perks (Wagenmakers, Morey, & Lee, 2016). For instance, Bayesian hypothesis tests can quantify support in favor of a null hypothesis, and they allow researchers to track evidence as data accumulate (e.g., Rouder, 2014).
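To make these two perks concrete, here is a minimal sketch using a deliberately simple model that is our own illustrative assumption and is not taken from the cited papers: a binomial test of H0: θ = 0.5 against H1: θ ~ Uniform(0, 1). The function names `bf01_binomial` and `track_evidence` are hypothetical. Under the uniform prior the marginal likelihood of k successes in n trials is 1 / (n + 1), which gives the Bayes factor in closed form.

```python
import math


def bf01_binomial(k: int, n: int) -> float:
    """Bayes factor in favor of H0: theta = 0.5 over H1: theta ~ Uniform(0, 1).

    Illustrative model only. P(k | H0) = C(n, k) * 0.5**n and
    P(k | H1) = 1 / (n + 1), so BF01 = C(n, k) * 0.5**n * (n + 1).
    """
    return math.comb(n, k) * 0.5 ** n * (n + 1)


def track_evidence(outcomes):
    """Return BF01 after each observation (1 = success, 0 = failure)."""
    successes = 0
    trajectory = []
    for n, outcome in enumerate(outcomes, start=1):
        successes += outcome
        trajectory.append(bf01_binomial(successes, n))
    return trajectory


if __name__ == "__main__":
    # 50 successes in 100 trials: the data favor the null hypothesis.
    print(f"BF01 for 50/100: {bf01_binomial(50, 100):.2f}")
    # Evidence can be inspected after every new observation.
    print(track_evidence([1, 0, 1, 0, 0, 1]))
```

A value of BF01 above 1 indicates that the data are better predicted by the null hypothesis, and a value below 1 indicates the reverse. Because the Bayes factor depends only on the data observed so far, the trajectory can be inspected after every observation.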

However, Bayesian inference also confronts researchers with new challenges, for instance concerning the planning of experiments. Within the Bayesian paradigm, is there a procedure that resembles a frequentist power analysis? (Yes, there is!)
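One simulation-based way to approach such planning can be sketched as follows. This is a rough illustration under our own assumptions, not necessarily the procedure the post goes on to describe: the data are summarized by a sample mean with known unit variance, H0 states that the mean is 0, and H1 places a Normal(0, τ²) prior on the mean. The function names, the prior scale `tau`, and the evidence threshold of 10 are all hypothetical choices.

```python
import math
import random


def bf10_normal_mean(xbar: float, n: int, tau: float = 1.0) -> float:
    """Bayes factor for H1: mu ~ Normal(0, tau^2) over H0: mu = 0.

    Assumes observations with known variance 1, so that the sample mean
    is Normal(0, 1/n) under H0 and Normal(0, tau^2 + 1/n) under H1.
    """
    var0 = 1.0 / n
    var1 = tau ** 2 + var0
    return math.sqrt(var0 / var1) * math.exp(0.5 * xbar ** 2 * (1 / var0 - 1 / var1))


def prob_compelling_evidence(true_effect: float, n: int, threshold: float = 10.0,
                             n_sims: int = 2000, seed: int = 1) -> float:
    """Estimate how often BF10 exceeds `threshold` if `true_effect` is correct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Simulate the sample mean of n observations around the assumed effect.
        xbar = rng.gauss(true_effect, 1 / math.sqrt(n))
        if bf10_normal_mean(xbar, n, tau=1.0) > threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for n in (20, 50, 200):
        share = prob_compelling_evidence(true_effect=0.5, n=n)
        print(f"n = {n:3d}: share of simulations with BF10 > 10 is {share:.2f}")
```

The returned proportion plays a role similar to power: it estimates how likely a planned sample size is to yield compelling evidence if the assumed effect is real. Running the same simulation with `true_effect=0.0` shows how often the design produces misleading evidence.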

Posted on Oct 5th, 2017

In our previous post, we discussed the paper “Abandon Statistical Significance”, which is a response to the paper “Redefine Statistical Significance” that has dominated the contents of this blog so far. The *Abandoners* include Andrew Gelman and Christian Robert, and on their own blogs they’ve each posted a reaction to our Bayesian Spectacles post. Below is a short response to their reaction to the discussion of the reply to the original paper. 🙂

Posted on Sep 29th, 2017

Andrew Gelman and Christian Robert are two of the most opinionated and influential statisticians in the world today. Fear and anguish strike into the heart of the luckless researchers who find the fruits of their labor discussed on the pages of the duo’s blogs: how many fatal mistakes will be uncovered, how many flawed arguments will be exposed? Personally, we celebrate every time our work is put through the Gelman-grinder or meets the Robert-razor and, after a thorough evisceration, receives the label “not completely wrong”, or –thank the heavens– “Meh”. Whenever this occurs, friends send us enthusiastic emails along the lines of “Did you see that? Your work is discussed on the Gelman/Robert blog and he did not hate it!” (true story).

Posted on Sep 19th, 2017

The key point of the paper “Redefine Statistical Significance” is that p-just-below-.05 results should be approached with care. They should perhaps evoke curiosity, but they should *not* receive the blanket endorsement that is implicit in the bold claim “we reject the null hypothesis”. The statistical argument is straightforward and has been known for over half a century: for p-just-below-.05 results, the alternative hypothesis does not convincingly outpredict the null hypothesis, not even when we *cheat* and cherry-pick the alternative hypothesis that is inspired by the data.

The claim that p-just-below-.05 results are evidentially weak was recently echoed by the *American Statistical Association* when they stated that “a p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis” (Wasserstein and Lazar, 2016, p. 132). Extensive mathematical arguments are provided in Berger and Delampady, 1987; Berger and Sellke, 1987; Edwards, Lindman, and Savage, 1963; Johnson, 2013; and Sellke, Bayarri, and Berger, 2001. These papers are relevant and influential; in our opinion, anybody who critiques or praises the p-value ought to be intimately aware of their contents.
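The size of the evidence at stake can be illustrated with two well-known upper bounds. The sketch below assumes a two-sided z-test with known variance, and the function names are hypothetical. The first bound is the likelihood ratio obtained when the alternative is cherry-picked to sit exactly on the observed effect, exp(z²/2). The second is the calibration 1 / (−e · p · ln p), valid for p < 1/e and associated with Sellke, Bayarri, and Berger (2001).

```python
import math
from statistics import NormalDist


def max_likelihood_ratio(p_two_sided: float) -> float:
    """Largest likelihood ratio against H0 for a two-sided z-test.

    The alternative is a point hypothesis placed at the observed effect,
    which gives exp(z^2 / 2) for the z-value matching the p-value.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return math.exp(z ** 2 / 2)


def calibration_bound(p: float) -> float:
    """Upper bound 1 / (-e * p * ln p) on the Bayes factor against H0."""
    if not 0 < p < 1 / math.e:
        raise ValueError("the bound is defined only for 0 < p < 1/e")
    return 1 / (-math.e * p * math.log(p))


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.005):
        print(f"p = {p}: cherry-picked LR at most {max_likelihood_ratio(p):.1f}, "
              f"calibration bound {calibration_bound(p):.1f}")
```

For p = .05 the cherry-picked likelihood ratio comes out just under 7, and the calibration bound is roughly 2.5. Both are upper limits, so the evidence under any realistic alternative hypothesis is weaker still.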