Notes on using Bayesian statistics in outbreak investigation

  • recently had the opportunity to try out Bayesian methods in a large gastrointestinal disease outbreak
  • case-case study, comparing cases of outbreak infection with cases of another gastrointestinal infection
  • data collected from different regions/countries, in different ways
    • no sampling; just all the data we could get (standard approach)
  • Bayesian analysis conducted in parallel with conventional frequentist analysis
    • supposedly blinded, though I ended up being involved in both analyses, and the Bayesian analysis was presented before the frequentist analysis was complete
    • agreed in advance to use models with random intercepts for geographies
    • agreed in advance to do certain separate models based on subsets of the data (e.g. adults, children)
    • used the same cleaned data set for analysis
  • The aim was to assess the value of non-expert Bayesian analysis in this scenario (while I know how to use Bayesian methods, I would call myself an enthusiast rather than than an expert)
    • so in principle we would use default options for (e.g. priors, samples, burn-in) unless adjustments were required to fit models, but then do some sensitivity analysis and checks
  • ended up with data from cases in the low hundreds in each group
  • first impressions of Bayesian methods were positive
    • though models took longer to fit (several minutes on an average laptop), results were available fairly quickly, and quicker than the frequentist analysis
    • no problems of note with convergence (see below re initial values) but weirdly RStudio would occasionally crash (seemingly irrespective of what you were doing - e.g. sometimes while running a ggplot2 command)
    • the data had some issues of quasi-complete separation (where one level of an exposure variable perfectly predicts the outcome) and probably a degree of collinearity, which bedevilled the frequentist analysis (requiring abstruse tweaks to lme4 options), but was easier to handle with a Bayesian approach
      • there were also a predominance of zeroes in most of the exposure variables
    • all but one of the models seemed to fit well (see below)
    • there seem to be time advantages to using shrinkage rather than stepwise methods (where if you make one change to the data you may have to re-run everything)
    • epidemiologists seemed to find the results intuitive (and there were no complaints about the absence of p values)
      • however there was some tendency to use inappropriate interpretations (e.g. describing an effect as “significant” if the credible interval did not cross the null)
      • credible intervals were understood (they are what they thought confidence intervals were after all)
      • Bayes factors seemed to be easily understood (though perhaps in a p value-like way, with >10 as the cut-off - this was probably my fault)
      • weakly informative priors seemed to be understood
      • the difference between point and ROPE null hypotheses was possibly understood
    • the various packages give you some nice graphics (perhaps they all do this these days)
  • Practical points
    • as we had data on I think 100s of possible exposures, we did an initial filtering
      • we dropped variables with <5 (or was it <10?) positive responses, arguing that those could not explain more than a tiny fraction of cases
      • we also dropped any duplicative/overlapping variables, retaining the more specific variable (e.g. dropping “exposure to pets” when we also had “exposure to dogs” and “exposure to cats”)
      • we also looked for identical exposure variables which could cause fitting problems
        • caret::findCorrelation was also useful as a check
    • we used the rstanarm package throughout for the regression analysis
      # Example code
      library(rstanarm)
      options(mc.cores = parallel::detectCores(),
      				scipen = 99999)
      model1 <- 
      	case ~ ns(age, df = 4) +
      	sex +
      	exposure1 +
      	exposure2 +
      	(1 | geography)
      fit1 <- stan_glmer( 
        model1,
      	data = thedata,
      	family = binomial('logit'),
      	QR = TRUE,
      	init = 0,
      	seed = 51186)
    • we used default weakly informative priors initially (though these were perhaps too diffuse - see below)
    • we found that we had to use init=0 (setting the chains to start at the null value), as the default random values refused to work
    • we initially used QR = TRUE (QR decomposition) as we had so many exposure variables and were concerned about multicollinearity, but it gave similar (perhaps slightly larger) estimates to QR = FALSE
    • most of our variables were binary (0/1 coded); the only continuous variable was age
      • we discussed centring/scaling age, but weren’t convinced we needed to
      • we used a cubic spline to fit age (not Bayesian; just something I have always wanted to do): ns(age, df = 4) (cubic spline with 5 knots based on quantiles)
    • did the model checks with the bayesplot package:
      • rhat to check convergence
      • mcmc_trace for trace plots
      • pp_check for posterior predictive checks (good fit apparent for all the models except one (non-mixed) logistic regression model we did for one particular geography - still pondering that)
      • used check_collinearity (performance package) to, er, check for collinearity
      • there are a lot more checks in ShinyStan
    • once you have fitted a model you can get a nice regression table with parameters from the parameters package
      • there are other functions for looking at estimates and credible intervals (but don’t assume they give 95% intervals without checking - some give e.g. 90%; we stuck with 95% for familiarity)
      • you can plot your model (use pars argument to specify which estimates you want to show)
      • mcmc_areas (bayesplot package) gives you some nice visualisations of the posterior distributions
    • you can get Bayes factors with the bayestestR package:
      • bf_rope for ROPE Bayes factors (where the null hypothesis is that the parameter is zero or trivially different from zero)
    • for sensitivity analysis we tried different priors
      • a less weakly informative prior: prior=normal(0, 1) - the results of this were more in the ballpark expected (we were not expecting to see large effects)
      • a “half-Cauchy” prior: prior=student_t(1, 0, 2.5) - no major differences observed
      • we didn’t try different priors for the random intercepts, or the overall intercept, but we could have
    • note that rstanarm does “autoscaling” where you haven’t specified the scale on your priors (i.e. it guesses based on the data - some see this as empirical Bayesian, but that didn’t bother us)

Next steps are to compare the Bayesian with the frequentist results…