
“Bayesian Inference Without Tears” at CIRM

Today I am presenting a lecture at the “Masterclass in Bayesian Statistics” that takes place from October 22 to October 26, 2018, at CIRM (Centre International de Rencontres Mathématiques) in Marseille, France. The slides of my talk, “Bayesian Inference Without Tears”, are here. Unfortunately the slides cannot convey the JASP demo, but the presentations are being taped, so I hope to be able to provide a video link at a later point in time.



A Bayesian Perspective on the Proposed FDA Guidelines for Adaptive Clinical Trials

The frequentist Food and Drug Administration (FDA) has circulated a draft version of new guidelines for adaptive designs, with the explicit purpose of soliciting comments. The draft is titled “Adaptive Designs for Clinical Trials of Drugs and Biologics: Guidance for Industry” and you can find it here. As summarized on the FDA webpage, this draft document

“(…) addresses principles for designing, conducting and reporting the results from an adaptive clinical trial. An adaptive design is a type of clinical trial design that allows for planned modifications to one or more aspects of the design based on data collected from the study’s subjects while the trial is ongoing. The advantage of an adaptive design is the ability to use information that was not available at the start of the trial to improve efficiency. An adaptive design can provide a greater chance to detect the true effect of a product, often with a smaller sample size or in a shorter timeframe. Additionally, an adaptive design can reduce the number of patients exposed to an unnecessary risk of an ineffective investigational treatment. Patients may even be more willing to enroll in these types of trials, as they can increase the probability that subjects will be assigned to the more effective treatment.”
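
To make this concrete, here is a minimal simulation sketch of one adaptive element: interim analyses with a Bayesian early-stopping rule. This is my own illustration, not an example from the FDA draft, and the response rates, batch size, and stopping threshold below are all assumptions chosen for the demonstration. The rule halts enrollment once the posterior probability that the treatment arm outperforms control exceeds a preset threshold.

```python
# A minimal sketch of a Bayesian interim-monitoring rule (illustrative only;
# all rates and thresholds are invented for this example).
import numpy as np

rng = np.random.default_rng(2018)

true_rate_treatment = 0.65   # hypothetical response rate, treatment arm
true_rate_control = 0.50     # hypothetical response rate, control arm
max_n_per_arm = 200          # planned maximum sample size per arm
look_every = 20              # interim analysis after every 20 patients per arm
stop_threshold = 0.98        # stop once P(treatment > control | data) exceeds this

successes_t, successes_c, n = 0, 0, 0

while n < max_n_per_arm:
    # Enroll the next batch in both arms and record the responses.
    batch = min(look_every, max_n_per_arm - n)
    successes_t += rng.binomial(batch, true_rate_treatment)
    successes_c += rng.binomial(batch, true_rate_control)
    n += batch

    # Posterior draws for each arm's response rate under uniform Beta(1, 1) priors.
    post_t = rng.beta(1 + successes_t, 1 + n - successes_t, 50_000)
    post_c = rng.beta(1 + successes_c, 1 + n - successes_c, 50_000)
    prob_better = float(np.mean(post_t > post_c))

    print(f"n = {n:3d} per arm: P(treatment > control | data) = {prob_better:.3f}")
    if prob_better > stop_threshold:
        print("Interim criterion met: the trial stops early.")
        break
```

The appeal from a Bayesian standpoint is that the posterior can be inspected at every interim look; the stopping rule above simply acts on the current posterior probability.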



Bayesian Advantages for the Pragmatic Researcher: Slides from a Talk in Frankfurt

This Monday in Frankfurt I presented a keynote lecture at the 51st Kongress der Deutschen Gesellschaft fuer Psychologie. I resisted the temptation to impress upon the audience the notion that they were all Statistical Sinners for not yet having renounced the p-value. Instead I outlined five concrete Bayesian data-analysis projects that my lab had conducted in recent years. So no p-bashing, only Bayes-praising, mostly by directly demonstrating the practical benefits in concrete applications.

The talk itself went well, although at the beginning I believe the audience feared that I would just drone on and on about the theory underlying Bayes’ rule. Perhaps I’m just too much in love with the concept. Anyway, the audience seemed thankful when I switched to the concrete examples. I was able to show a new cartoon by Viktor Beekman (“The Two Faces of Bayes’ Rule”, also in our Library; concept by myself and Quentin Gronau), and I showed two pictures of my son Theo (I am not sure whether the audience realized that, but it was not important anyway).



Redefine Statistical Significance XVII: William Rozeboom Destroys the “Justify Your Own Alpha” Argument…Back in 1960

Background: the recent paper “Redefine Statistical Significance” suggested that it is prudent to treat p-values just below .05 with a grain of salt, as such p-values provide only weak evidence against the null. The counterarguments to this proposal were varied, but in most cases the central claim (that p-just-below-.05 findings are evidentially weak) was not disputed. Instead, one group of researchers (the Abandoners) argued that p-values should simply be abandoned or replaced entirely, whereas another group (the Justifiers) argued that instead of employing a pre-defined threshold α for significance (such as .05, .01, or .005), researchers should justify the α they use.
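
To appreciate just how weak the evidence from a p-value near .05 can be, consider the Vovk-Sellke maximum p-ratio: an upper bound, across all possible alternatives, on the odds against the null that a given p-value can provide (Sellke, Bayarri, & Berger, 2001); JASP reports this bound as the VS-MPR. The snippet below is my own illustration:

```python
# Vovk-Sellke maximum p-ratio: for 0 < p < 1/e, the bound 1 / (-e * p * ln p)
# is the largest possible odds in favor of H1 over H0 that the p-value allows
# (Sellke, Bayarri, & Berger, 2001).
import math

def vs_mpr(p: float) -> float:
    """Upper bound on the evidential odds against H0 provided by p."""
    if not 0 < p < 1 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return 1 / (-math.e * p * math.log(p))

for p in (0.05, 0.01, 0.005):
    print(f"p = {p}: odds against H0 of at most {vs_mpr(p):.2f} : 1")
```

Even under the alternative most favorable to H1, p = .05 yields odds of at most about 2.5 : 1 against the null, whereas the proposed .005 threshold raises that ceiling to roughly 14 : 1.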

The argument from the Justifiers sounds appealing, but it has two immediate flaws (see also the recent paper by JP de Ruiter). First, it is somewhat unclear how exactly a researcher should go about “justifying” an α (but see this blog post). The second flaw, however, is more fundamental. Interestingly, this flaw was already pointed out by William Rozeboom in 1960 (the reference is below). In his paper, Rozeboom discusses the trials and tribulations of “Igor Hopewell”, a fictional psychology grad student whose dissertation work concerns the predictions of two theories, T_0 and T_1. Rozeboom then proceeds to demolish the Justifiers’ position, almost 60 years in advance:

“In somewhat similar vein, it also occurs to Hopewell that had he opted for a somewhat riskier confidence level, say a Type I error of 10% rather than 5%, d/s would have fallen outside the region of acceptance and T_0 would have been rejected. Now surely the degree to which a datum corroborates or impugns a proposition should be independent of the datum-assessor’s personal temerity. [italics ours] Yet according to orthodox significance-test procedure, whether or not a given experimental outcome supports or disconfirms the hypothesis in question depends crucially upon the assessor’s tolerance for Type I risk.” (Rozeboom, 1960, pp. 419-420)
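
Rozeboom’s point is easy to demonstrate numerically. In the sketch below (my own illustration; the numbers are invented and do not come from Rozeboom’s paper), the very same observed t-value leads to opposite decisions at α = .05 and α = .10, even though the data, and hence the evidence, are identical:

```python
# Same datum, opposite decisions: the verdict flips with the assessor's
# "tolerance for Type I risk" while the evidence stays fixed.
from scipy import stats

t_observed = 1.88   # hypothetical observed t-statistic
df = 19             # hypothetical degrees of freedom (n = 20)

p_value = 2 * stats.t.sf(t_observed, df)   # two-sided p-value, lands between .05 and .10
print(f"two-sided p = {p_value:.3f}")

for alpha in (0.05, 0.10):
    decision = "reject H0" if p_value < alpha else "retain H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```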



Redefine Statistical Significance Part XVI: The Commentary by JP de Ruiter

Across virtually all of the empirical disciplines, the single most dominant procedure for drawing conclusions from data is “compare-your-p-value-to-.05-and-declare-victory-if-it-is-lower”. Remarkably, this common strategy appears to create about as much enthusiasm as forcefully stepping in a fresh pile of dog poo.

For instance, in a recent critique of the “compare-your-p-value-to-.05-and-declare-victory-if-it-is-lower” procedure, 72 researchers argued that p-just-below-.05 results are evidentially weak and therefore ought to be interpreted with caution; in order to make strong claims, a threshold of .005 is more appropriate. Their approach is called “Redefine Statistical Significance” (RSS). In response, 88 other authors argued that statistical thresholds ought to be chosen not by default but by judicious argument: one should justify one’s alpha. Finally, another group of authors, the Abandoners, argued that p-values should never be used to declare victory, regardless of the threshold. In sum, several large groups of researchers have argued, each with considerable conviction, that the popular “compare-your-p-value-to-.05-and-declare-victory-if-it-is-lower” procedure is fundamentally flawed.



Replaying the Tape of Life

In his highly influential book ‘Wonderful Life’, Harvard paleontologist Stephen Jay Gould proposed that evolution is an unpredictable process that can be characterized as

“a staggeringly improbable series of events, sensible enough in retrospect and subject to rigorous explanation, but utterly unpredictable and quite unrepeatable. Wind back the tape of life to the early days of the Burgess Shale; let it play again from an identical starting point, and the chance becomes vanishingly small that anything like human intelligence would grace the replay.” (Gould, 1989, p. 45)

According to Gould himself, the Gedankenexperiment of ‘replaying life’s tape’ addresses “the most important question we can ask about the history of life” (p. 48):

“You press the rewind button and, making sure you thoroughly erase everything that actually happened, go back to any time and place in the past–say, to the seas of the Burgess Shale. Then let the tape run again and see if the repetition looks at all like the original. If each replay strongly resembles life’s actual pathway, then we must conclude that what really happened pretty much had to occur. But suppose that the experimental versions all yield sensible results strikingly different from the actual history of life? What could we then say about the predictability of self-conscious intelligence? or of mammals?” (Gould, 1989, p. 48)


