Powered by JASP

Posted on Apr 9th, 2020

I was exhausted and expecting my newborn to wake up any moment, but I wanted to look at the data. I had stopped data collection a month prior, and wasn’t due back at work for weeks, so it could have waited, but my academic brain was beginning to stir after what seemed like eons of pregnancy leave. Sneaking a peek at my still-sleeping daughter, I downloaded the .csv from Qualtrics. I made columns for the independent variables, splitting the 6 conditions in half, and then fed the data into JASP. I had run the Bayesian ANOVA in JASP before, for the pilot study, and had used the program for years before that, so I knew the interface by heart. I had my results, complete with a plot, within seconds.

The output wasn’t what I had expected or hoped for. It certainly wasn’t what our pilot had predicted. The inclusion Bayes factors were hovering around 1, and the plot, with its huge error bars and strangely oriented lines, was all wrong. Maybe I’d made a mistake. I had been in a rush, after all, I reasoned, and could easily have mixed up the conditions. Several checks later, I was looking at the same wrong results through tear-filled eyes.

From the beginning, I had believed so completely in the effect we were attempting to capture. I thought it was a given that people would find the results of a registered report (RR) more trustworthy than those of a preregistration (PR), and that the PR results would be yet more trustworthy than those published ‘traditionally’ with no registration at all. Adding a layer of complexity to the design, we had considered familiarity for each level of registration. We expected that results reported by a familiar colleague would be more trustworthy than those of an unfamiliar person. Logical hypotheses, right? To me they were.

(more…)

Posted on Apr 2nd, 2020

Below is a summary of a preprint that features a Bayesian reanalysis of the famous/infamous Gautret et al. data. What I like about this preprint is (a) the multiverse analysis; (b) the Bayesian conclusions — they are so easy to obtain with JASP, and provide much more information than just “p < .05” or “p > .05”; but what I like most of all is (c) the emphasis on the fact that in the end, *design always beats analysis* — the Gautret et al. case strikes me as a schoolbook example of this principle. The preprint is hosted on the Open Science Framework with materials. The work is explained in a series of tweets.

(more…)

Posted on Mar 27th, 2020

As the corona-crisis engulfs the world, politicians left and right are accused of “politicizing” the pandemic. In order to follow suit I will try to weaponize the pandemic to argue in favor of Bayesian inference over frequentist inference.

In recent months it has become clear that the corona pandemic is not just fought by doctors, nurses, and entire populations as they implement social distancing; it is also fought by statistical modelers, armed with data. As the disease spreads, it becomes crucial to study it statistically: how contagious it is, how it may respond to particular policy measures, how many people will get infected, and how many hospital beds will be needed. Fundamentally, one of the key goals is *prediction*. Good predictions come with a measure of uncertainty, or at least present different scenarios ranging from pessimistic to optimistic.

So how do statistical models for corona make their predictions? I am not an epidemiologist, but the current corona modeling effort is clearly a process that unfolds as more data become available. Good models will continually consume new data (i.e., new corona cases, information from other countries, covariates, etc.) in order to update their predictions. In other words, the models learn from incoming data in order to make increasingly accurate predictions about the future. This process of continual learning, without post-hoc and ad-hoc corrections for “data snooping”, is entirely natural — to the best of my knowledge, nobody has yet proposed that predictions be corrected for the fact that the models were estimated on a growing body of data.
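As a toy illustration of this kind of continual learning — my own sketch with made-up numbers, not any epidemiologist’s actual model — consider a conjugate Beta-Binomial update in which each new batch of test results simply refines the current posterior for an infection rate, with no snooping correction anywhere:

```python
# Sequential Beta-Binomial updating: each batch of test results updates
# the posterior for an infection rate, and predictions always use the
# current posterior -- no correction for having "peeked" at earlier data.
# All numbers below are invented for illustration.

alpha, beta = 1.0, 1.0  # uniform Beta(1, 1) prior on the infection rate

batches = [(4, 50), (9, 80), (21, 120)]  # (positives, tests) per week

for positives, tests in batches:
    alpha += positives          # conjugate update: posterior is
    beta += tests - positives   # Beta(alpha + k, beta + n - k)
    mean = alpha / (alpha + beta)
    print(f"posterior mean infection rate so far: {mean:.3f}")
```

Note that the posterior after the last batch is identical to the posterior one would obtain from analyzing all the pooled data at once — which is exactly why sequential Bayesian updating needs no post-hoc correction.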

(more…)

Posted on Mar 19th, 2020

*This post summarizes Dablander, F.^{⭑}, van den Bergh, D.^{⭑}, Ly, A., & Wagenmakers, E.-J. (2020). Default Bayes Factors for Testing the (In)equality of Several Population Variances. Preprint available on arXiv: https://arxiv.org/abs/2003.06278.*

“Testing the (in)equality of variances is an important problem in many statistical applications. We develop default Bayes factor tests to assess the (in)equality of two or more population variances, as well as a test for whether the population variance equals a specific value. The resulting test can be used to check assumptions for commonly used procedures such as the *t*-test or ANOVA, or test substantive hypotheses concerning variances directly. We further extend the Bayes factor to allow H_{0} to have a null-region. Researchers may have directed hypotheses such as σ_{1} > σ_{2}, or want to combine hypotheses about equality with hypotheses about inequality, for example σ_{1} = σ_{2} > (σ_{3}, σ_{4}). We generalize our Bayes factor to accommodate such hypotheses for K > 2 groups. We show that our Bayes factor fulfills a number of desiderata, provide practical examples illustrating the method, and compare it to a recently proposed fractional Bayes factor procedure by Böing-Messing and Mulder (2018). Our procedure is implemented in the R package *bfvartest*.”
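To give a rough flavour of what a Bayes factor test on variances does — this is *not* the default Bayes factor from the paper (for that, see the *bfvartest* R package), just a crude large-sample BIC approximation in Python with toy data:

```python
import math

def bic_bf01(x, y):
    """Rough BIC-approximation Bayes factor for H0: equal variances
    vs H1: unequal variances (two normal samples, separate means).
    BF01 > 1 favours equal variances. A crude stand-in, not the
    default Bayes factor developed in the paper."""
    n1, n2 = len(x), len(y)
    N = n1 + n2
    m1 = sum(x) / n1
    m2 = sum(y) / n2
    v1 = sum((xi - m1) ** 2 for xi in x) / n1  # MLE variances
    v2 = sum((yi - m2) ** 2 for yi in y) / n2
    v0 = (n1 * v1 + n2 * v2) / N               # pooled MLE under H0
    ll1 = -0.5 * (n1 * (math.log(2 * math.pi * v1) + 1)
                  + n2 * (math.log(2 * math.pi * v2) + 1))
    ll0 = -0.5 * N * (math.log(2 * math.pi * v0) + 1)
    bic1 = 4 * math.log(N) - 2 * ll1   # two means, two variances
    bic0 = 3 * math.log(N) - 2 * ll0   # two means, one variance
    return math.exp((bic1 - bic0) / 2)

a = [-1.0, 1.0] * 50   # stylized sample, spread 1
b = [-1.0, 1.0] * 50   # same spread as a
c = [-3.0, 3.0] * 50   # three times the spread
print(bic_bf01(a, b))  # > 1: evidence for equal variances
print(bic_bf01(a, c))  # << 1: evidence against equal variances
```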

(more…)

Posted on Mar 12th, 2020

In a recent blog post, Bayesian icon David Spiegelhalter proposes a new analysis of the results from the ANDROMEDA-SHOCK randomized clinical trial. This trial was published in JAMA under the informative title “Effect of a Resuscitation Strategy Targeting Peripheral Perfusion Status vs Serum Lactate Levels on 28-Day Mortality Among Patients With Septic Shock”.

In JAMA, the authors summarize their findings as follows: “In this randomized clinical trial of 424 patients with early septic shock, 28-day mortality was 34.9% [74/212 patients] in the peripheral perfusion–targeted resuscitation [henceforth PPTR] group compared with 43.4% [92/212] in the lactate level–targeted resuscitation group, a difference that did not reach statistical significance.” The authors conclude that “These findings do not support the use of a peripheral perfusion–targeted resuscitation strategy in patients with septic shock.”
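For readers who want to poke at these counts themselves, here is a minimal Bayesian sketch in Python — independent uniform priors on the two mortality rates and a Monte Carlo comparison of the posteriors. To be clear, this is my own toy illustration, not Spiegelhalter’s analysis:

```python
import random

# Mortality counts from the JAMA report: 74/212 deaths under
# peripheral perfusion-targeted resuscitation (PPTR) vs 92/212 under
# lactate-targeted resuscitation. Uniform Beta(1, 1) priors on each rate.
random.seed(0)
deaths_pptr, n_pptr = 74, 212
deaths_lact, n_lact = 92, 212

draws = 100_000
better = 0
for _ in range(draws):
    # conjugate posterior for each arm: Beta(1 + deaths, 1 + survivors)
    p_pptr = random.betavariate(1 + deaths_pptr, 1 + n_pptr - deaths_pptr)
    p_lact = random.betavariate(1 + deaths_lact, 1 + n_lact - deaths_lact)
    if p_pptr < p_lact:
        better += 1

print(f"P(PPTR mortality < lactate mortality) ≈ {better / draws:.3f}")
```

Under these (deliberately naive) priors the posterior probability that PPTR has the lower mortality rate comes out well above one half but short of certainty — which already hints at why “did not reach statistical significance” is not the whole story.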

(more…)

Posted on Mar 5th, 2020

The relative belief ratio (e.g., Evans 2015, Horwich 1982/2016) equals the marginal likelihood.

Not quite: the relative belief ratio is *proportional to* the marginal likelihood. Dividing two marginal likelihoods (i.e., computing a Bayes factor) cancels the constant of proportionality, such that the Bayes factor equals the ratio of two complementary relative belief ratios (Evans 2015, p. 109, proposition 4.3.1).
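The identity follows in one line from Bayes’ rule (a sketch in standard notation: H_{0} and H_{1} are the competing hypotheses, x the data, and RB the relative belief ratio):

```latex
\mathrm{BF}_{01}
  \;=\; \frac{p(x \mid H_0)}{p(x \mid H_1)}
  \;=\; \frac{P(H_0 \mid x)\,/\,P(H_0)}{P(H_1 \mid x)\,/\,P(H_1)}
  \;=\; \frac{\mathrm{RB}(H_0 \mid x)}{\mathrm{RB}(H_1 \mid x)}
```

since p(x | H_{i}) = P(H_{i} | x) p(x) / P(H_{i}), and the marginal p(x) — the constant of proportionality — cancels in the ratio.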

In the highly recommended book *Measuring statistical evidence using relative belief*, Evans (2015) defines evidence as follows (see also Carnap 1950, pp. 326-333; Horwich 1982/2016, p. 48; Keynes 1921, p. 170):

RB(θ | x) = π(θ | x) / π(θ) > 1,

where θ represents a parameter (or, more generally, a model, a hypothesis, a claim, or a proposition), and π(θ) and π(θ | x) are its prior and posterior probabilities. In other words, data provide evidence for a claim to the extent that they make it more likely than it was before. This is a sensible axiom; who would be willing to argue that data provide evidence for a claim when they make that claim *less* plausible than it was before?

(more…)