Practical #3
Advanced Statistical Programming using R - Debugging
Quiz
Before starting, work through this QUIZ to check your understanding of the concepts covered in this week’s lecture on debugging and on using LLMs (large language models) in a statistical programming workflow.
General Remarks
Two practicals in, you now have a small but growing code base of things that can — and will — break. This session is about what to do when they do, plus your first hands-on session with an LLM assistant as part of the workflow.
We’ll be using the rAI learning space by aihorizon R&D in this course — a web-based platform that gives you access to several state-of-the-art LLMs through one interface, including OpenAI’s GPT family (GPT-5.2, GPT-4o, o3-mini), Microsoft’s Phi and MAI models, and locally hosted models. You can pick the model that fits each task.
- We have arranged a free premium subscription for the class at least until the end of the semester. That’s free access to state-of-the-art models that would otherwise cost around €20/month each. Use it for this course, for other courses, or for personal projects — it’s yours to use however you like.
- Using it is optional, but strongly recommended. We’ll use it in the lectures and practicals going forward.
- There is a consent form and an intro survey on the platform when you first log in. Please read the form before ticking through it, and complete the survey before using the platform.
- If you’ve never used one of these tools before, don’t worry: we’ll walk you through the setup step by step, and most of the practical today is about learning to use it well for statistical programming.
Exercise 0: Set up the rAI learning space by aihorizon R&D.
This is your first time using the rAI learning space, and the habits you form here (system prompt, how you prompt, how you read output) will shape how useful the tool is for the rest of the course.
0.1 Log in, consent, and survey
Navigate to the rAI learning space platform (link on the course Moodle page) and create an account as instructed. Your account comes with a premium subscription for the duration of the course.
On first login you will see a consent form. Read it carefully and tick your choice. Either answer is fine — it has no effect on your grade, your access to the platform, or anything you do in this class.
Complete the intro survey on the platform. It asks about your prior experience with LLMs and your expectations for the semester.
0.2 Configure your system prompt
In the platform’s Model Settings (on the left), find the System prompt field. A system prompt is a standing instruction that the model sees before every message you send — think of it as persistent preferences, not a one-off request.
Start from the template below and edit it to reflect how you actually work. At minimum, change the code style section to match your own preferences (e.g. `%>%` vs `|>`, tidyverse vs base vs `data.table`). If you have package preferences from Practical #2 (e.g. `here` for paths, `readr` over base `read.csv`), put them here.
You are helping me with an advanced statistical programming course in R.
Code style:
- Use the tidyverse (dplyr, tidyr, ggplot2, purrr) where possible.
- Use the native pipe |> rather than %>%.
- Follow the tidyverse style guide: snake_case, <- for assignment, two-space indent.
- Prefer small, pure functions over long scripts.
- For file paths use the `here` package.
When I ask for help debugging:
- First explain what the error message means, in plain English.
- Then suggest where to look in my code, BEFORE proposing a fix.
- Only write a full rewrite if I ask for one.
General:
- Do not apologise. Do not pad with filler. If you are uncertain, say so.
- If I paste in code, assume it is real code I care about — do not silently
rename variables or change the pipe style.
- Save the system prompt by clicking Model Settings again. Be aware that the System prompt (and the other model settings) are not saved when you reload or close the page, so copy and paste your prompt somewhere safe.
- BONUS: Find out what changing the temperature value (or any of the other model settings) does. Where would you set the temperature when e.g. writing code, proofreading an email, or creating a game for your next meeting with friends?
0.3 Calibrate: does it follow your prompt?
Start a new conversation. Ask the model to generate a small function — something like:
Write an R function that takes a numeric vector and returns a tibble with the mean, median, and standard deviation. Include one worked example.
Read the reply carefully. Does it follow your style preferences? (Pipe style, naming, assignment operator.) If not, tweak your system prompt and try again. This calibration loop is important — a system prompt that the model ignores is worse than no system prompt, because it gives you false confidence.
When you’re satisfied with the output, save the session.
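For reference, a reply that follows the template’s style preferences might look roughly like the sketch below (a hypothetical answer, not the unique correct one; it uses a plain `data.frame` instead of a tibble so it runs with no extra packages):

```r
# Summarise a numeric vector: mean, median, and standard deviation.
summarise_vector <- function(x) {
  data.frame(
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE)
  )
}

# Worked example
summarise_vector(c(1, 2, 3, 4, 100))
```

Check whether details like these — the snake_case name, `<-` for assignment, `na.rm` handling, the worked example — match what your system prompt asked for.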
0.4 Revisit a previous conversation
Reload your page and navigate to the History tab (on the left).
Locate your previous session and click the Duplicate icon (next to the trash bin). This restores your conversation so you can continue writing. Note that the Model settings are reset to their defaults, so they need to be updated manually.
0.5 A ground rule for the rest of the session
For every LLM interaction you have today, follow this protocol:
- Read the code yourself first (at least 30 seconds).
- Write down what you think is wrong before you prompt.
- Ask for an explanation, not a rewrite on the first turn.
This is not busywork — it’s the difference between using the LLM as a tool and being used by it.
Exercise 1: Using the LLM deliberately
The goal here is to practice how to prompt, not just whether to prompt. Work with the system prompt you configured in Exercise 0 and follow the protocol from 0.5.
- Copy the following buggy snippet into a new R script. Do not run it yet.
library(palmerpenguins)
library(dplyr)
mean_mass_by <- function(data, group_var) {
data |>
group_by(group_var) |>
summarise(mean_mass = mean(body_mass_g, na.rm = TRUE))
}
mean_mass_by(penguins, species)

Before asking the LLM anything, read the code carefully and write down (on paper or in a comment) what you expect it to do and what you think might go wrong. This 30-second pause is the single most effective debugging habit you can build.
Now run the code. Copy the full error message.
Go back to the platform, start a new conversation and check that your model settings are set as you want them to be. Ask the LLM to explain the error message, not to fix it. A good prompt template:
Here is my code: [code]. Here is the error: [error]. Please explain what R is complaining about, and point me to the line where the problem originates. Do not rewrite my code yet.

Once you understand why it fails, fix the function yourself. Then ask the LLM to review your fix.
This is a non-standard evaluation (NSE) bug. Inside the function, `group_by(group_var)` does not substitute the column you passed in — dplyr’s data masking treats `group_var` itself as the expression to evaluate, so the `species` you supplied never reaches `group_by()`. Fix with the embrace operator `{{ }}`, which tells dplyr to use the column stored in the argument:

mean_mass_by <- function(data, group_var) {
  data |>
    group_by({{ group_var }}) |>
    summarise(mean_mass = mean(body_mass_g, na.rm = TRUE))
}
mean_mass_by(penguins, species)
- Reflect for a minute with your neighbour: what did the LLM get right? Did it hallucinate anything? Would you have found the bug faster without it?
Exercise 2: Locating errors in R code
When R throws an error, the message alone often isn’t enough — you need to know where in the call stack it came from.
- Copy the following into a new R script and run it. You should get an error.
library(palmerpenguins)
library(dplyr)
summarise_species <- function(data) {
data |>
group_by(species) |>
summarise(mean_mass = mean_body_mass(body_mass_g))
}
mean_body_mass <- function(x) {
mean(x, na.rm = TREU) # typo!
}
summarise_species(penguins)

Read the error message first. dplyr errors are short but structured — each line tells you something different. Before you touch anything else, work through it line by line: where is the error, and what is it?
Now run `rlang::last_trace()` (or `traceback()` if you prefer the older format). You will see a long stack — most of it is dplyr internals you can ignore. Scan bottom-up for frames that name your own functions. Which is the innermost frame that belongs to code you wrote?
Read the error message line by line:
Error in `summarise()`:
ℹ In argument: `mean_mass = mean_body_mass(body_mass_g)`.
ℹ In group 1: `species = Adelie`.
Caused by error in `mean_body_mass()`:
! object 'TREU' not found
Run `rlang::last_trace()` to see where the error occurred.
Each line adds one piece of information:
- `Error in summarise()` — the dplyr verb where it surfaced.
- `In argument: ...` — which argument of `summarise()` was being computed.
- `In group 1: species = Adelie` — which group was being processed.
- `Caused by error in mean_body_mass()` — your helper function.
- `object 'TREU' not found` — the actual R-level problem.
- `Run rlang::last_trace() ...` — dplyr tells you what to do next.
You already know the bug is in mean_body_mass() before running last_trace(). The backtrace just confirms the exact line.
The backtrace. `rlang::last_trace()` shows ~14 frames. Most are dplyr internals (`summarise.grouped_df`, `summarise_cols`, `map`, `lapply`, `mask$eval_all_summarise`, …) — safe to skip. The frames that matter are:

- Frame 1: `summarise_species(penguins)` — your top-level call.
- Frame 11: `mean_body_mass(body_mass_g)` — your function, innermost. This is where the bug lives.
- Frames 12–14: `mean()` → `mean.default()` → `isTRUE(na.rm)` — base R tries to evaluate `TREU` and can’t find it.
Fix: replace TREU with TRUE on the na.rm line of mean_body_mass().
Lesson: the error message usually gets you close; the backtrace pins down the line. Scan for your own function names and stop at the innermost one.
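If you want to poke at this class of error outside an interactive session, you can capture it with base `tryCatch()` — a small sketch reproducing the typo in a minimal function (hypothetical, not part of the exercise):

```r
# The same typo, in a minimal function outside dplyr.
mean_body_mass <- function(x) {
  mean(x, na.rm = TREU)  # typo on purpose
}

# Capture the condition object instead of letting it abort the script.
err <- tryCatch(mean_body_mass(c(1, 2, NA)), error = function(e) e)
conditionMessage(err)  # "object 'TREU' not found"
```

Capturing the condition object lets you inspect the message programmatically — handy when you want to test that bad input fails the way you expect.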
Replace `TREU` with a value that does exist but makes no sense, e.g. `na.rm = "banana"`. Re-run. How does the error message change? This is a classic “the error is not where you think” situation — the traceback still points to `mean_body_mass`, but the underlying message is now about argument types, coming from deeper inside `mean.default()`. Fix both bugs.
Exercise 3: browser()
browser() drops you into an interactive session inside a running function, so you can inspect variables at the moment things go wrong. The six commands you need are:
| Command | Effect |
|---|---|
| `n` | Run the next line |
| `s` | Step into a function call on the current line |
| `f` | Finish the current loop / function |
| `c` | Continue until the next `browser()` or the end |
| `Q` | Quit the debugger |
| `where` | Print the call stack |
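As a minimal sketch of where the call goes (an illustrative function, not part of the exercise): `browser()` only pauses in an interactive session; in a non-interactive script it prints a message and execution continues.

```r
inspect_me <- function(x) {
  browser()  # interactive session: pause here; non-interactive: print and continue
  x + 1
}

inspect_me(1)
```

In the RStudio console this drops you to the `Browse[1]>` prompt, where the commands in the table above apply.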
- Consider this buggy recursive factorial:
my_factorial <- function(n) {
if (n == 1) return(1)
return(n * my_factorial(n - 1))
}
my_factorial(5) # returns 120 — correct
my_factorial(0) # hangs / errors
my_factorial(3.5) # also wrong

- Insert `browser()` as the first line of the function body and call `my_factorial(0)`. Use `n` to step through. At each step, check the value of `n`. What is happening?
The base case is n == 1, but with n = 0 we never hit it — we recurse to -1, -2, … and either blow the stack or (with 3.5) never reach an integer. Fix by broadening the base case:
my_factorial <- function(n) {
stopifnot(n >= 0, n == as.integer(n))
if (n <= 1) return(1)
return(n * my_factorial(n - 1))
}

Note that inserting a `stopifnot()` upfront is a bug-prevention tool as much as a debugging tool: it fails loudly on bad input instead of hanging silently.
- Remove the `browser()` call once you have fixed the function.
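You can confirm non-interactively that the guard fails loudly — a quick sketch, where the `tryCatch()` wrapper just captures the error message instead of aborting:

```r
my_factorial <- function(n) {
  stopifnot(n >= 0, n == as.integer(n))
  if (n <= 1) return(1)
  n * my_factorial(n - 1)
}

my_factorial(5)  # 120
tryCatch(my_factorial(3.5), error = function(e) conditionMessage(e))
# "n == as.integer(n) is not TRUE"
```

The error names the exact assertion that failed, which is far more useful than a stack overflow several thousand frames deep.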
Exercise 4: debug() and debugonce() (~10 min)
browser() requires editing the function body, which is inconvenient when the function lives in a package. debug() and debugonce() attach a debugger to a function from the outside.
- `debug(f)` — enter the debugger on every subsequent call to `f()`, until you run `undebug(f)`.
- `debugonce(f)` — enter the debugger on the next call to `f()`, then detach automatically.
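You can check whether a debugger is currently attached with base R’s `isdebugged()` — a small sketch with a toy function (note that `isdebugged()` reports the persistent `debug()` flag, not the one-shot `debugonce()` flag):

```r
square <- function(x) x^2

debug(square)
isdebugged(square)  # TRUE  — every call to square() will now enter the debugger
undebug(square)
isdebugged(square)  # FALSE — detached again
```

This is handy when you have lost track of which functions you flagged during a long debugging session.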
- Take the following (silently) buggy function:
standardise <- function(x) {
(x - mean(x)) / sd(x)
}
standardise(c(1, 2, 3, 4, 5)) # fine
standardise(c(1, 2, NA, 4, 5)) # returns all NAs — why?

- Run `debugonce(standardise)` and then `standardise(c(1, 2, NA, 4, 5))`. Step through with `n`. At each line, print `x`, `mean(x)`, and `sd(x)`. Which value is the NA coming from?
mean(x) and sd(x) both return NA by default when x contains NA, which propagates to the result. Fix with na.rm = TRUE:
standardise <- function(x) {
(x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

Note that `debugonce()` detaches itself after one call — try running `standardise()` again and you will not be dropped into the debugger. Compare with `debug(standardise)`, which would keep re-entering until you call `undebug(standardise)`.
- When would you prefer `debug()` over `debugonce()`? Jot down one scenario.
Exercise 5: Debugging Quarto (~10 min)
Not every error comes from R. When quarto render fails, the first job is to work out which tool is complaining: R (your code), knitr (the engine that runs your code), Pandoc (the renderer), or LaTeX (only for PDF output). The console output usually tells you, but you have to read it carefully.
- Create a new file `broken.qmd` in your course folder and paste in:
---
title: "A broken document"
format: html
execute:
  echo: true
---

## First chunk

```{r}
library(palmerpenguins)
library(dplyr)
head(penguins)
```

## Second chunk

```{r}
penguins |>
  filter(species = "Adelie") |>
  ggplot(aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()
```

## Third chunk

```{r, echo=TREU}
1 + 1
```

- From the command line, run `quarto render broken.qmd`. You should see it fail. Before fixing anything, answer: does the error come from R, from knitr, or from Quarto/Pandoc? How can you tell?
There are three separate problems, each surfacing from a different layer of the stack. Chunks knit top-to-bottom, so you will encounter them in this order:
- R code error in the second chunk: `filter(species = "Adelie")` uses `=` (argument assignment) instead of `==` (equality). dplyr raises an error like “Problem while computing `..1 = species = "Adelie"`”. Fix: `filter(species == "Adelie")`.
- Missing library (still second chunk, revealed after you fix bug 1): `ggplot2` is not loaded, so `ggplot()` is not found. Add `library(ggplot2)` (or `library(tidyverse)`).
- Knitr chunk-option error in the third chunk: `echo=TREU` is evaluated by knitr before the chunk body runs. `TREU` is not a defined object, so knitr aborts with `object 'TREU' not found`. Fix: `echo=TRUE`.
The pedagogical point: fix one error, re-render, read the next error. The error message changes layer each time — a dplyr runtime error, then a namespace lookup error, then a knitr option-parse error. Learning to read which tool is complaining is half the skill.
Fix the errors one at a time, re-rendering after each fix. Notice how the error message changes as you peel back the layers.
Change `format: html` to `format: pdf` and try to render again. If you do not have a LaTeX distribution installed you will get a different class of error entirely, from LaTeX. Install one with `tinytex::install_tinytex()` only if you want to explore this; otherwise revert to `html`.
Exercise 6 (main activity): Create a debugging challenge (~10 min to start)
You will now create a small debugging challenge that a classmate will solve next week. This mirrors how bugs arrive in real life: someone else’s code, some context, and a vague sense that it doesn’t work.
Starting material: use the data-loading function and the ggplot2 wrapper function you wrote in Practical #2. They are realistic, non-trivial, and yours — exactly the kind of code that develops interesting bugs in the real world. You will break them on purpose.
Deliverable: a folder containing
- A `.qmd` or `.R` file with exactly three bugs planted in working code.
- A `README.md` describing what the code is supposed to do (but not revealing the bugs).
- A `solutions.md` file, committed in a separate branch called `solutions`, which lists each bug, the line it is on, and a one-sentence explanation of the fix.
Constraints on the bugs (one from each category):
| Category | Examples |
|---|---|
| A syntax / parse error | typo in a keyword, unmatched brace, bad chunk option |
| A runtime error | wrong argument type, missing `na.rm`, NSE mistake (`{{ }}` forgotten in a wrapper), off-by-one loop bound |
| A silent / logical error | code runs, but returns the wrong answer (wrong variable, wrong grouping, mean instead of median, = vs ==) |
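As a sketch of what a “silent” bug looks like in practice (a hypothetical example, not one you must plant), the code below runs without any error or warning yet answers a different question than intended:

```r
x <- c(2, 4, 6, 100)  # one extreme outlier

typical_buggy <- mean(x)    # runs fine, but the outlier drags it up to 28
typical_fixed <- median(x)  # the robust summary that was intended: 5
```

Silent bugs are the hardest category for your classmate, because nothing in the console will point at them — only the README’s description of intent can.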
Constraints on the code:
- Must use the `palmerpenguins` dataset (for continuity with Practicals #1–#2).
- Must include at least one user-defined function — ideally one of yours from Practical #2 (the data-loader or the ggplot2 wrapper).
- Must render (once fixed) as a Quarto HTML document.
- Keep it short: ≤ 40 lines of code.
Sketch your three bugs on paper first. Then start from your working code from Practical #2 and introduce the bugs — don’t write buggy code from scratch; it’s too easy to accidentally make it unfixable.
Initialise a git repo in the folder, commit the buggy version on `main`, and commit the fixed version + `solutions.md` on a `solutions` branch. Push to GitHub.

You will only have time in class today to plan your bugs and get the repo set up. Finish the challenge before next week’s practical — upload the link to Moodle by Thursday evening.
In next week’s practical you will be assigned a classmate’s repo. You will have 30 minutes to find and fix all three bugs using the tools from this session (error messages, `traceback()`, `browser()`, `debug()`, and — deliberately — the LLM). The author will then walk you through the intended solution.
The best challenges are realistic — bugs you might actually write — not obscure gotchas designed to humiliate your classmate. Aim for “oh, of course” moments, not “how was I supposed to know that” moments.
my-debugging-challenge/
├── README.md # context + what the code should do
├── analysis.qmd # buggy code, on main branch
└── solutions.md # on `solutions` branch only
A minimal README.md:
# Penguin bill-ratio analysis
This analysis computes the bill-length-to-bill-depth ratio for each penguin
species and plots the distribution. It is intended to render as an HTML
Quarto document.
There are exactly **three** bugs: one parse error, one runtime error, and
one silent logical error. Find and fix all three.
Wrap-up discussion
With ~10 minutes to go, we will reconvene briefly to discuss:
- Which debugging tool did you reach for first? Which did you reach for last? Why?
- Where in the workflow did the LLM help you? Where did it mislead you?
- Did anything about the rAI learning space change how you prompted, compared to how you’d normally use e.g. ChatGPT or Claude?
References
- Shannon Pileggi, Debugging in R (NHS-R 2023 workshop) — slides this lecture drew on.
- Jenny Bryan & Jim Hester, What They Forgot to Teach You About R, Chapter 11: Debugging R code.
- `rstats-wtf/wtf-debugging` — worked examples, many adapted in this practical.
- Hadley Wickham, R for Data Science (2e), Chapter 5: Workflow — getting help.