Bayesian Data Analysis

What Does My Model Say?

Joshua Wilson Black

Te Kāhui Roro Reo | New Zealand Institute of Language, Brain and Behaviour

Te Whare Wānanga o Waitaha | University of Canterbury

Overview

Overview

  1. Extracting and plotting estimates
  2. Evaluating model fit

Extracting and plotting estimates

The posterior

  • Bayesian models produce probability distributions for every ingredient of the model.
  • The probability distribution after the model has been updated by the data is called the posterior distribution.
  • We use the posterior distribution to make predictions, resulting in a posterior predictive distribution.
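With brms, both distributions are directly accessible; a minimal sketch, assuming the fitted brms model `f1_mod` used later in these slides:

```r
library(brms)

# The posterior: one row per draw, one column per parameter.
draws <- as_draws_df(f1_mod)

# The posterior predictive distribution: simulated response values,
# carrying forward all parameter uncertainty (one row per draw).
preds <- posterior_predict(f1_mod, ndraws = 100)
```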

🎶 glorious things of thee are spoken… 🎶

  • Front vowels of Southern Standard British English (SSBE) lowered over the 20th century.
  • Assumption: British choral singing is closely related to SSBE.
  • Question: Do front vowels lower in choral singing in both SSBE (Cambridge) and non-SSBE (Glasgow) areas? (Marshall et al. 2024)
  • Method: Bayesian linear mixed-effects models

Model

  • Glasgow and King’s College choirs modelled separately:
Vowel Formant ∼ Vowel + Time + Duration + FollowingSegment + 
  Genre + Vowel:Time + Vowel:Duration + Time:Duration + 
  Genre:Duration + (1|Album) + (1|Song) + (1|Word) + 
  (1|Album:Song) + (1|Song:Word)
  • Vowel: fleece, kit, dress, trap
  • Time: Identified with periods of musical directors
  • (I’ll tell you about the priors next week)

Drawing from the Posterior Distribution

  • We can draw samples from any probability distribution and summarise the sample.
  • This is how we learn about model parameters.
  • There is no meaningful distinction between model parameters and ‘post-hoc’ comparisons.
    • What is ‘post’ here? Post the statistical test, a notion that doesn’t apply in the Bayesian setting. (cf. Marshall et al.)

Parameters

  • Let’s look at where kit, dress, trap sit by sampling from the posterior.
  • NB: they are ‘sum coded’.
    • i.e., they represent differences from the grand mean.
library(tidybayes)
# Collect posterior draws for every coefficient matching the regex.
vowel_post <- f1_mod |> 
  gather_draws(`b_vowel[A-Z]+`, regex = TRUE)
  
vowel_post |> 
  ggplot(
    aes(
      y = .variable,
      x = .value
    )
  ) +
  stat_halfeye(.width = c(0.5, 0.7, 0.95))

Parameters

I mean everything has a distribution

  • Where does the F1 of ‘lowing’ in Away in a Manger sit after taking into account the main effects?
# Random intercept for the word ‘lowing’ in Away in a Manger.
a_word_post <- f1_mod |> 
  gather_draws(`r_song:text[away.in.a.manger_lowing,Intercept]`) |> 
  mutate(.variable = "lowing")
  
a_word_post |> 
  ggplot(
    aes(
      y = .variable,
      x = .value
    )
  ) +
  stat_halfeye(.width = c(0.5, 0.7, 0.95))

I mean everything has a distribution

Point and interval summaries

  • Pick a point (mean, median, mode) and an interval.
  • Highest density interval: the shortest interval containing a given probability mass (e.g. 87%)
a_word_post |> median_hdi(.width = 0.87)
# A tibble: 1 × 7
  .variable .value   .lower .upper .width .point .interval
  <chr>      <dbl>    <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 lowing     0.362 -0.00986  0.707   0.87 median hdi      
  • Quantile interval: interval centred on the median, containing a given probability mass.
a_word_post |> median_qi(.width = 0.95)
# A tibble: 1 × 7
  .variable .value .lower .upper .width .point .interval
  <chr>      <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 lowing     0.362 -0.105  0.826   0.95 median qi       
  • i.e.: from the 0.025 to the 0.975 quantile (with .width = 0.95)
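The quantile interval can be verified by hand from the draws; a sketch using the `a_word_post` draws above:

```r
library(dplyr)

# Manual version of median_qi(.width = 0.95):
# empirical quantiles of the posterior draws.
a_word_post |>
  summarise(
    med   = median(.value),
    lower = quantile(.value, 0.025),
    upper = quantile(.value, 0.975)
  )
```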

Exploring an interaction

  • Marshall et al. use the following plot (my code):
library(modelbased)
f1_means <- estimate_means(
  f1_mod, by = c("vowel", "director"),
  backend = "emmeans",
  ci = 0.9
)

f1_means |> 
  mutate(
    region = if_else(
      str_detect(director, "BO|DW|PL|SC"), "Cambridge", "Glasgow"
    )
  ) |> 
  ggplot(
    aes(
      x = vowel,
      y = Mean,
      ymin = CI_low,
      ymax = CI_high,
      colour = director,
      shape = region
    )
  ) +
  geom_errorbar(position=position_dodge(width=0.6)) +
  geom_point(position=position_dodge(width=0.6)) +
  scale_y_reverse() +
  labs(
    x = NULL,
    y = "F1",
    colour = "Director",
    shape = "Region"
  )

Exploring an interaction

Plotting predictions

  • The posterior predictive distribution is what happens when we draw predictions from the posterior.
  • This includes all of the uncertainty in the parameters.
    • e.g., some draws will place trap ~1.05 above the grand mean, while more will place it ~0.95 above.
  • The posterior distribution describes the model parameters which best match the data.
  • The posterior predictive distribution describes the predictions we expect from the model given the data.
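In brms the two distributions map onto two prediction functions; a sketch, again assuming `f1_mod`:

```r
# Expected values of the response: parameter uncertainty only.
epred <- posterior_epred(f1_mod, ndraws = 100)

# Posterior predictive draws: parameter uncertainty plus residual
# noise, so these are more dispersed than the epred draws.
ppred <- posterior_predict(f1_mod, ndraws = 100)
```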

Get draws from PPD

mod_preds <- estimate_prediction(f1_mod, data="grid", iterations=100)
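A tidybayes alternative to `estimate_prediction()` is `add_predicted_draws()`; a sketch in which `new_grid` is a hypothetical data frame of predictor combinations you build yourself (e.g. with `tidyr::expand_grid()`):

```r
library(tidybayes)

# 100 posterior predictive draws per row of the grid, in long format.
grid_preds <- new_grid |>
  add_predicted_draws(f1_mod, ndraws = 100)
```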

Plot predicted F1s

mod_preds |> 
  filter(
    director %in% c("HSR (1925-1957)", "MJS (1987-2019)")
  ) |> 
  ggplot(
    aes(
      y = Predicted,
      x = vowel,
      colour = director
    )
  ) +
  geom_jitter(alpha = 0.2) +
  scale_y_reverse()

Plot predicted F1s

Summarise predicted F1

mod_preds |> 
  filter(
    director %in% c("HSR (1925-1957)", "MJS (1987-2019)")
  ) |> 
  ggplot(
    aes(
      y = Predicted,
      x = vowel,
      colour = director
    )
  ) +
  geom_boxplot(position="dodge") +
  scale_y_reverse()

Summarise predicted F1

Evaluating model fit

Posterior predictive checks

  • Simple idea: compare your predictions to the actual data.
  • If your predictions don’t look anything like the data, then your model is in trouble.
  • This is a signal to either ditch the model or improve it.
  • These model checks are visual and informal.

pp_check()

  • One simple check is to look at the distribution of predicted values against the actual response variable.
  • The pp_check() function comes from brms.
pp_check(
  f1_mod, type="dens_overlay", ndraws = 100
) 
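pp_check() offers many other diagnostics via its `type` argument; two sketches (the grouped check assumes the model data contain a `vowel` column):

```r
# Does the model recover the observed mean and SD of the response?
pp_check(f1_mod, type = "stat_2d", stat = c("mean", "sd"))

# One density overlay per vowel.
pp_check(
  f1_mod, type = "dens_overlay_grouped", group = "vowel", ndraws = 50
)
```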

pp_check()

Whatever you can think of…

  • What is it most important for your model to capture?
  • Perhaps I wonder if the effect of duration is non-linear…
front_data <- read_rds(here('osf', 'mod_data.rds'))
front_data |> 
  ggplot(
    aes(
      x = log_vdur,
      y = f1_lob
    )
  ) +
  geom_point(alpha=0.1, colour="grey") +
  geom_smooth() +
  geom_smooth(
    aes(
      x = log_vdur,
      y = Predicted
    ),
    data = mod_preds,
    method = "lm",
    colour = "black"
  ) +
  facet_wrap(vars(vowel))
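If the check suggests curvature, one option is to refit with a smooth term for duration and compare models on out-of-sample predictive fit; a sketch, where `f1_mod_s` is a hypothetical spline refit:

```r
# Hypothetical refit with a thin-plate spline on log vowel duration:
# f1_mod_s <- update(f1_mod, formula. = ~ . + s(log_vdur))

# Compare expected predictive performance with PSIS-LOO:
loo_compare(loo(f1_mod), loo(f1_mod_s))
```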

Whatever you can think of…

Now and next time

References

Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2025. rmarkdown: Dynamic Documents for r. https://github.com/rstudio/rmarkdown.
Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
———. 2018. “Advanced Bayesian Multilevel Modeling with the R Package brms.” The R Journal 10 (1): 395–411. https://doi.org/10.32614/RJ-2018-017.
———. 2021. “Bayesian Item Response Modeling in R with brms and Stan.” Journal of Statistical Software 100 (5): 1–54. https://doi.org/10.18637/jss.v100.i05.
Dahl, David B., David Scott, Charles Roosen, Arni Magnusson, and Jonathan Swinton. 2019. xtable: Export Tables to LaTeX or HTML. https://doi.org/10.32614/CRAN.package.xtable.
Karreth, Johannes, Shana Scogin, Rob Williams, and Andreas Beger. 2025. BayesPostEst: Generate Postestimation Quantities for Bayesian MCMC Estimation. https://doi.org/10.32614/CRAN.package.BayesPostEst.
Kay, Matthew. 2024. tidybayes: Tidy Data and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Lenth, Russell V., and Julia Piaskowski. 2025. emmeans: Estimated Marginal Means, Aka Least-Squares Means. https://doi.org/10.32614/CRAN.package.emmeans.
Makowski, Dominique, Mattan S. Ben-Shachar, Brenton M. Wiernik, Indrajeet Patil, Rémi Thériault, and Daniel Lüdecke. 2025. “modelbased: An R Package to Make the Most Out of Your Statistical Models Through Marginal Means, Marginal Effects, and Model Predictions.” Journal of Open Source Software 10 (109): 7969. https://doi.org/10.21105/joss.07969.
Marshall, Edward J., Jane Stuart-Smith, John Butt, and Timothy Dean. 2024. “Variation and Change over Time in British Choral Singing (1925–2019).” Laboratory Phonology 15 (1). https://doi.org/10.16995/labphon.10125.
Müller, Kirill. 2025. here: A Simpler Way to Find Your Files. https://doi.org/10.32614/CRAN.package.here.
R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.