Typically the sample size of a study is fixed, ideally based on a power analysis. We collect all our data then analyse it once with the appropriate statistical model. This is an ideal, which people often fall short of in practice. It’s common for scientists and analysts alike to check in on an experiment as the data is coming in (Albers, 2019; Miller, 2010). There’s a lot of potential for error here if experiments are stopped or changed based on ad hoc peeks at the data.

One way to address this issue is to go down the pre-registration route, where the full experimental protocol is specified in advance and published for others to see. This makes it harder to deviate from your plan without people noticing.

Another option is to allow interim analyses as the data accumulates. Sequential analysis describes a range of techniques that allow researchers to carry out interim analyses. An experiment may then be stopped at one of these interim analyses, meaning the sample size is variable rather than fixed. The point of sequential analysis techniques is that they allow you to test sequentially in a principled way, rather than randomly peeking at the data (see Albers, 2019 for an introduction to sequential testing).

Sequential analysis is a broad topic covering a wide range of techniques (Whitehead, 2005 provides a nice historical overview). This post will focus on group sequential designs as a method for sequential analysis. An outline of these methods is provided below, after introducing the problem sequential analysis addresses.

Previously I outlined a set of functions to simulate A/B tests with a binary outcome. These functions are used here, so if you’re interested in how they work, go read the previous post.

To begin with, let’s simulate a set of experiments. We’ll simulate 1,000 experiments each lasting 4 weeks, with 500 new people each week. The base rate in the control group is set to 50%, and there is no difference between treatment and control (treatment_effect = 0).

Now that there’s a set of tests, some naive early stopping can be added. Here a flag is added to indicate weeks where the \(p\)-value for a prop test is less than 0.05.

    # flag for whether p-value is less than 0.05
    # first p-value stopped at, if stop = 0 for
    # all weeks this will be the p-value for the first week
    mutate(stop_p_value = first(p_value, desc(stop))) %>%

Overall, across all the weeks and tests, 5% of \(p\)-values are less than 0.05. However, once we add early stopping into the mix the error rate goes up. In this case, 12% of tests have at least 1 week where \(p\) < 0.05.

By running 4 significance tests on each experiment, the chance of a false positive increases. This is just another flavour of the multiple comparison problem. If you make lots of comparisons but don’t correct for them, your error rates are inflated.

A simple way to control error rates is Bonferroni correction. Here the \(p\)-values are adjusted by multiplying them by the number of comparisons and keeping the 0.05 significance threshold. Equivalently, we can think of this correction as reducing the significance threshold from 0.05 to \(0.05 / 4 = 0.0125\). For example, a \(p\)-value of 0.02 is not significant after correction: multiplied by 4 it is 0.08, which is greater than 0.05, and equivalently 0.02 is greater than the adjusted threshold of 0.0125.

    p_value_bonferroni = pmin(p_value * 4, 1),
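
The Bonferroni arithmetic used in this post can be written out step by step. This is only a sketch: the symbols \(\alpha\) (nominal threshold), \(m\) (number of looks) and \(p_{\text{adj}}\) (adjusted \(p\)-value) are notation introduced here for illustration, with \(\alpha = 0.05\) and \(m = 4\) weekly looks as in the simulation.

\[
\alpha_{\text{adj}} = \frac{\alpha}{m} = \frac{0.05}{4} = 0.0125,
\qquad
p_{\text{adj}} = \min(m \, p, 1) = \min(4 \times 0.02, 1) = 0.08 > 0.05 .
\]

The correction works because the chance of at least one false positive across \(m\) looks can never exceed the sum of the per-look chances, whatever the dependence between looks:

\[
P(\text{at least one false positive}) \le m \, \alpha_{\text{adj}} = 4 \times 0.0125 = 0.05 .
\]

For comparison, if the 4 uncorrected looks were fully independent, the error rate would be

\[
1 - (1 - \alpha)^{m} = 1 - 0.95^{4} \approx 0.185 .
\]

Weekly looks at accumulating data overlap and so are positively correlated, which is consistent with an inflated rate that sits below this independent-looks figure.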