Building Statistical Applications

Mark Andrews

From toy examples to real analyses

The value of Shiny for researchers is wrapping actual statistical computations in interactive interfaces. This session: how to structure applications that do real work.

The problem with repeated computation

output$plot  <- renderPlot({ x <- rnorm(input$n); hist(x) })
output$stats <- renderPrint({ x <- rnorm(input$n); summary(x) })
  • rnorm(input$n) runs twice when input$n changes
  • Each call draws a different sample, so the plot and the summary describe different data
  • For expensive operations this is a serious waste

Reactive expressions

samples <- reactive({ rnorm(input$n) })

output$plot  <- renderPlot({ hist(samples()) })
output$stats <- renderPrint({ summary(samples()) })
  • reactive({...}) creates a reactive expression whose result is cached
  • Called with () like a function
  • Re-runs lazily: only when an input it reads has changed and something asks for its value; otherwise the cached result is returned
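
Put together as a complete app, the pattern might look like the minimal sketch below. The UI layout, the slider range, and the names ui, server, and app are assumptions made for illustration; only the server body comes from the example above.

```r
library(shiny)

# Hypothetical UI: one slider for the sample size and two outputs.
ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("plot"),
  verbatimTextOutput("stats")
)

server <- function(input, output, session) {
  # Drawn once per change of input$n; both outputs reuse the cached draw.
  samples <- reactive({ rnorm(input$n) })

  output$plot  <- renderPlot({ hist(samples()) })
  output$stats <- renderPrint({ summary(samples()) })
}

app <- shinyApp(ui, server)

# Launch only in an interactive session, so sourcing the file does not block.
if (interactive()) runApp(app)
```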

The reactive graph with a shared expression

input$n  ──►  samples()  ──►  renderPlot   ──►  output$plot
                         └──►  renderPrint  ──►  output$stats
  • Fork shape: one reactive expression feeds two outputs
  • samples() re-runs once; both outputs get the same sample
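
The caching behaviour can be checked outside a running app. The sketch below is an illustration under assumptions: a reactiveVal named n stands in for input$n, isolate() stands in for the two render blocks, and the counter n_runs is a hypothetical device for counting executions.

```r
library(shiny)

n_runs <- 0              # counts how often the reactive body executes
n <- reactiveVal(100)    # stands in for input$n outside a running app

samples <- reactive({
  n_runs <<- n_runs + 1
  rnorm(n())
})

# Two reads, as if from renderPlot and renderPrint.
first  <- isolate(samples())
second <- isolate(samples())
same_sample <- identical(first, second)   # compares the two reads
runs_before <- n_runs                     # executions so far

# Changing the input invalidates the cache; the next read re-runs the body.
n(200)
third <- isolate(samples())
runs_after <- n_runs
```

Comparing runs_before with runs_after shows how many times the body actually ran, which is the property the graph above relies on.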

Exploring a likelihood function

ll <- reactive({
  sum(dnorm(observed, mean = input$mu, sd = input$sig, log = TRUE))
})
output$llvalue <- renderPrint({ cat(sprintf("log-lik: %.3f\n", ll())) })
output$llplot  <- renderPlot({
  mu_grid <- seq(-2, 6, length.out = 100)
  ll_grid <- sapply(mu_grid, function(m)
    sum(dnorm(observed, mean = m, sd = input$sig, log = TRUE)))
  plot(mu_grid, ll_grid, type = "l",
       xlab = expression(mu), ylab = "Log-likelihood")
  abline(v = input$mu, lty = 2, col = "red")
})
  • Slider for mu tracks where you are on the likelihood curve
  • Red line shows current parameter value
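
In the plot code above, the whole grid is recomputed on every move of the mu slider, because the render block reads input$mu for the red line. One possible refinement, sketched here under assumptions, moves the grid into a reactive expression that depends only on input$sig. The simulated observed data, the helper log_lik, and the name ll_grid are hypothetical, and the sig slider is assumed to be restricted to positive values.

```r
library(shiny)

# Placeholder data: the original assumes `observed` already exists.
set.seed(1)
observed <- rnorm(50, mean = 2, sd = 1.5)

# Normal log-likelihood, written once and reused by both outputs.
log_lik <- function(mu, sigma, y) {
  sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

mu_grid <- seq(-2, 6, length.out = 100)

server <- function(input, output, session) {
  # Depends only on input$sig, so moving the mu slider does not recompute it.
  ll_grid <- reactive({
    vapply(mu_grid, log_lik, numeric(1), sigma = input$sig, y = observed)
  })

  output$llvalue <- renderPrint({
    cat(sprintf("log-lik: %.3f\n", log_lik(input$mu, input$sig, observed)))
  })
  output$llplot <- renderPlot({
    plot(mu_grid, ll_grid(), type = "l",
         xlab = expression(mu), ylab = "Log-likelihood")
    abline(v = input$mu, lty = 2, col = "red")
  })
}
```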

ggplot2 inside renderPlot

output$hist <- renderPlot({
  df <- data.frame(x = rnorm(input$n))
  ggplot(df, aes(x = x)) +
    geom_histogram(bins = input$bins, fill = "steelblue", colour = "white") +
    theme_minimal()
})
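  • input$bins is read inside the ggplot call, so changing it redraws the histogram; rnorm() also sits inside renderPlot, so every redraw uses a fresh sample
  • No special treatment needed: ggplot is just R code inside a reactive context
  • renderPlot draws the ggplot object returned as the block's last value; no explicit print() is needed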
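
If the sample should stay fixed while the number of bins changes, the ggplot2 code can read from a shared reactive expression instead of calling rnorm() itself. The following is a sketch under that assumption; hist_plot is a hypothetical helper name.

```r
library(shiny)
library(ggplot2)

# Hypothetical helper: builds the histogram from data that is passed in.
hist_plot <- function(x, bins) {
  ggplot(data.frame(x = x), aes(x = x)) +
    geom_histogram(bins = bins, fill = "steelblue", colour = "white") +
    theme_minimal()
}

server <- function(input, output, session) {
  # Re-drawn only when input$n changes.
  samples <- reactive({ rnorm(input$n) })

  # Reads samples() and input$bins; a new bin count reuses the cached sample.
  output$hist <- renderPlot({ hist_plot(samples(), input$bins) })
}
```

Keeping the plotting code in a plain function also makes it possible to try the plot at the console without starting the app.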

Sampling distributions

\[ \bar{X} \sim \mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right) \]

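  • Exact when the population is normal; approximate for large \(n\) otherwise (CLT, finite variance)
  • Shiny makes it easy to show this empirically
  • Vary \(n\) and watch the distribution of sample means narrow at rate \(\sigma/\sqrt{n}\)
  • Vary the population shape and watch the CLT take effect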
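
The computation behind such a demonstration can be sketched in base R before any interface is added. The exponential population, the sample size, and the number of replications below are arbitrary choices for illustration; in an app they would come from inputs.

```r
set.seed(42)

n    <- 30      # sample size
reps <- 5000    # number of simulated samples

# Exponential(rate = 1) population: skewed, with mean 1 and sd 1.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

empirical_sd   <- sd(sample_means)   # spread of the simulated sample means
theoretical_sd <- 1 / sqrt(n)        # sigma / sqrt(n) from the formula above

# Histogram of the simulated means with the normal approximation overlaid.
hist(sample_means, breaks = 40, freq = FALSE,
     main = "Sampling distribution of the mean", xlab = "Sample mean")
curve(dnorm(x, mean = 1, sd = theoretical_sd), add = TRUE, lty = 2)
```

Wrapping the simulation in a reactive expression keyed on the sample size would let a histogram and a numerical summary share the same set of simulated means.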

Reactive linear regression

model <- reactive({
  fmla <- as.formula(paste("mpg ~", input$xvar))
  lm(fmla, data = mtcars)
})
output$regplot    <- renderPlot({ pred <- predict(model(), ...) ... })
# rownames = TRUE keeps the coefficient names, which renderTable drops by default
output$coeftable  <- renderTable({ coef(summary(model())) }, rownames = TRUE)
  • model() is fitted once; both outputs read from it
  • Selecting a new predictor refits the model once and updates both outputs in the same update cycle
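
One way the elided plotting code could be filled in is sketched below; it is not the original implementation. It assumes input$xvar holds the name of a numeric column of mtcars (for example from a selectInput), and fit_mpg is a hypothetical helper that uses reformulate() in place of pasting the formula together.

```r
library(shiny)

# Hypothetical helper: regress mpg on one predictor chosen by name.
fit_mpg <- function(xvar) {
  lm(reformulate(xvar, response = "mpg"), data = mtcars)
}

server <- function(input, output, session) {
  # Fitted once per change of input$xvar; both outputs read from it.
  model <- reactive({ fit_mpg(input$xvar) })

  output$regplot <- renderPlot({
    x <- mtcars[[input$xvar]]
    # Predict over the observed range of the chosen predictor.
    new_x <- setNames(data.frame(seq(min(x), max(x), length.out = 50)),
                      input$xvar)
    plot(x, mtcars$mpg, xlab = input$xvar, ylab = "mpg")
    lines(new_x[[1]], predict(model(), newdata = new_x))
  })

  # rownames = TRUE keeps the coefficient names in the rendered table.
  output$coeftable <- renderTable({
    coef(summary(model()))
  }, rownames = TRUE)
}
```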

Structuring server code

For applications beyond a few outputs, keep the server organised:

  1. Define reactive expressions first (data, models)
  2. Then define render blocks (plots, tables, text)
  3. Last, keep observeEvent blocks for side effects in their own section

Clear names like data_clean, model_fitted, plot_main make the flow obvious.
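
A server laid out in this order might look like the sketch below. The input ids xvar and reset, the output ids, and the reset behaviour are assumptions for illustration; only the ordering and the naming style come from the list above.

```r
library(shiny)

server <- function(input, output, session) {
  # 1. Reactive expressions: data first, then models built on the data.
  data_clean <- reactive({
    na.omit(mtcars[, c("mpg", input$xvar)])
  })
  model_fitted <- reactive({
    lm(reformulate(input$xvar, response = "mpg"), data = data_clean())
  })

  # 2. Render blocks: each one only reads from the reactives above.
  output$plot_main <- renderPlot({
    d <- data_clean()
    plot(d[[input$xvar]], d$mpg, xlab = input$xvar, ylab = "mpg")
    abline(model_fitted())
  })
  output$coef_table <- renderTable({
    coef(summary(model_fitted()))
  }, rownames = TRUE)

  # 3. Side effects: run in response to an event and produce no output value.
  observeEvent(input$reset, {
    updateSelectInput(session, "xvar", selected = "wt")
  })
}
```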

When to use reactive expressions

Use a reactive expression when:

  • The same computation is needed by more than one output
  • The computation is expensive (fitting a model, reading a file, running a simulation)
  • You want to isolate and name an intermediate result for clarity

For cheap single-use computations, inline code inside renderPlot is fine.

Summary

  • reactive({...}) caches a computation and shares it across outputs
  • Call reactive expressions with ()
  • Structure the server: reactive expressions first, render blocks second
  • ggplot2 works inside renderPlot without any special adaptation
  • The reactive graph makes the data flow explicit and efficient