Practical #6

Advanced Statistical Programming using R — R Packages

Author

Leonhard Kestel, Lisa Bondo Andersen, Cynthia Huang

Published

May 22, 2026

Quiz

Before starting, work through this QUIZ to check your understanding of the concepts covered in this week’s lecture on R packages. Find a full solution R file HERE.


Overview

This is an online, drop-in consultation practical session. Lisa will be available between 2 and 4 pm on this Zoom link. Please wait in the waiting room until you are invited in to ask your questions.

In this practical you will build a complete R data package from scratch: munichvisitors. By the end you will have a package that ships a cleaned dataset, a documented plot function, and — if you tackle the extension — a vignette.

The practical is structured into six parts:

  1. Create the package skeleton — scaffold, DESCRIPTION, README
  2. Add data — download, clean, and export the museums dataset
  3. Add a function — write, document, and test plot_museums()
  4. Add functions for a second dataset — extend the package with monthly statistics from a separate open data source
  5. Extension: vignette — write a how-to guide for your package
  6. Reflection log — record what you learned this week

At the end you will also find a keyboard shortcuts reference for the package development loop, and notes on using LLMs to scaffold and document packages.

Solutions are available for parts 1–3. Parts 4, 5, and 6 are left for you to complete. Solution suggestions for Part 4 will be available next week.


Part 1: Create the Package Skeleton

Exercise 1.1: Initialise the package

Use usethis::create_package() to scaffold a new package called munichvisitors in a location of your choice. This will open a new RStudio project.

Solution

Run in the R console:

usethis::create_package("munichvisitors")

RStudio should open the new project. The folder structure will look like:

munichvisitors/
├── DESCRIPTION
├── NAMESPACE
├── R/
├── munichvisitors.Rproj
├── .gitignore
└── .Rbuildignore

Exercise 1.2: Edit the DESCRIPTION file

Open DESCRIPTION and fill in the fields below. Use your own name in Authors@R.

Package: munichvisitors
Title: Monthly Visitor Counts for Munich Museums
Version: 0.0.0.9000
Authors@R:
    person("First", "Last", , "you@example.com", role = c("aut", "cre"))
Description: Provides tidy monthly visitor statistics for Munich's museums
    from Munich Open Data (Statistisches Amt München), with a helper
    plot function for exploring trends over time.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
LazyData: true
Imports:
    dplyr,
    ggplot2,
    scales
Suggests:
    janitor,
    readr,
    testthat (>= 3.0.0),
    knitr,
    rmarkdown
Config/testthat/edition: 3
VignetteBuilder: knitr

Tip

RoxygenNote should match your installed version of roxygen2. You can check with packageVersion("roxygen2") in the console. devtools::document() will update it automatically, so don’t worry if it changes.

Add the MIT licence by running in the R console (inside the new munichvisitors project):

usethis::use_mit_license()
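
To double-check which dependencies a DESCRIPTION file declares, the file can be parsed with base R's `read.dcf()`. The following is a minimal sketch: it writes a shortened copy of the fields above to a temporary file instead of touching your real package. Inside the project you would pass `"DESCRIPTION"` as the path.

```r
# Shortened copy of the DESCRIPTION fields shown above (illustration only).
desc_lines <- c(
  "Package: munichvisitors",
  "Version: 0.0.0.9000",
  "Imports:",
  "    dplyr,",
  "    ggplot2,",
  "    scales",
  "Suggests:",
  "    janitor,",
  "    readr"
)

desc_path <- tempfile("DESCRIPTION")
writeLines(desc_lines, desc_path)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_path)

# Split the comma-separated Imports field into package names.
imports <- trimws(strsplit(desc[1, "Imports"], ",")[[1]])
imports
```

Every package that appears after a `pkg::` prefix in your `R/` files should show up in this list.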

Exercise 1.3: Write a README

Create a README template by running in the R console:

usethis::use_readme_rmd()

Open README.Rmd and write a short README following this structure (from R Packages (2e)):

  1. One paragraph describing what the package does
  2. Installation instructions: devtools::install_github("your-username/munichvisitors")
  3. A brief overview of what is included
  4. A short example showing how to use it — come back and fill this in after Exercise 3.1, once you have a working function to show

Tip

Knit README.Rmd to produce README.md — GitHub renders the .md version on your repository page. Re-knit whenever you update it.

Solution: example README

---
output: github_document
---

# munichvisitors

The `munichvisitors` package provides tidy monthly visitor counts for
Munich's public museums, sourced from Munich Open Data (Statistisches
Amt München). It includes a ready-to-use plot function so you can
explore trends with a single line of code.

## Installation

``` r
# install.packages("devtools")
devtools::install_github("your-username/munichvisitors")
```

## What's included

- `museum_visitors` — a data frame of monthly visitor counts per museum,
  with year-on-year comparison figures
- `plot_museums()` — a line chart of visitor trends over time

## Example

``` r
library(munichvisitors)
plot_museums()
```

## Data source

Landeshauptstadt München (2017). Monatszahlen Museen. Statistisches Amt München.
Lizenz: [Datenlizenz Deutschland Namensnennung 2.0](https://www.govdata.de/dl-de/by-2-0).
<https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/>

Part 2: Add Data

The dataset is monthly visitor counts for Munich’s museums from Munich Open Data. You can download the CSV directly from:

👉 https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/

Exercise 2.1: Set up the data-raw folder

Run in the R console:

usethis::use_data_raw("museum-visitors")

This creates data-raw/museum-visitors.R and opens it automatically. Write your download-and-clean script in that file.

Exercise 2.2: Write the data-raw script

Fill in data-raw/museum-visitors.R to:

  1. Download the CSV from the URL below
  2. Clean the column names with janitor::clean_names()
  3. Save the cleaned object with usethis::use_data()

The direct CSV download URL is:

url <- paste0(
  "https://opendata.muenchen.de/dataset/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/",
  "resource/6c6a809e-91ee-4f3e-9268-a8b7bc38311c/download/",
  "monatszahlen2603_museen_16_03_26.csv"
)

Tip

You will need readr and janitor to write the data-raw script. Since they are only used to build the data (not in any package function), declare them as suggested packages rather than imports:

usethis::use_package("readr", type = "Suggests")
usethis::use_package("janitor", type = "Suggests")

Solution

Write the following in data-raw/museum-visitors.R (an R script, not the console):

## code to prepare `museum_visitors` dataset goes here

url <- paste0(
  "https://opendata.muenchen.de/dataset/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/",
  "resource/6c6a809e-91ee-4f3e-9268-a8b7bc38311c/download/",
  "monatszahlen2603_museen_16_03_26.csv"
)

museum_visitors_raw <- readr::read_csv(url)

museum_visitors <- museum_visitors_raw |>
  janitor::clean_names()   # WERT → wert, AUSPRAEGUNG → auspraegung, …

usethis::use_data(museum_visitors, overwrite = TRUE)

Run the entire script (not just the last line) to download and save the data. Use Ctrl+Shift+Enter to run all, or click Source in the top right of the RStudio editor pane.
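
If you want to see the mechanics without downloading anything, the same read, clean, and save sequence can be rehearsed on a toy file. The sketch below uses base R stand-ins (`read.csv()`, `tolower()`, `save()`) for `readr::read_csv()`, `janitor::clean_names()`, and `usethis::use_data()`. The rows are invented, and the comma-separated layout is an assumption about the real file.

```r
# Toy CSV with upper-case headers (invented values).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "MONATSZAHL,AUSPRAEGUNG,JAHR,MONAT,WERT",
  "Besucher*innen,Museum A,2024,202401,1200",
  "Besucher*innen,Museum A,2024,Summe,15000"
), csv_path)

# Stand-in for readr::read_csv(): read everything as text first.
raw <- read.csv(csv_path, colClasses = "character")

# Rough stand-in for janitor::clean_names(): lower-case the column names.
toy_visitors <- raw
names(toy_visitors) <- tolower(names(raw))
toy_visitors$wert <- as.numeric(toy_visitors$wert)

# Stand-in for usethis::use_data(): save to an .rda file and load it back.
rda_path <- file.path(tempdir(), "toy_visitors.rda")
save(toy_visitors, file = rda_path)
reloaded <- new.env()
load(rda_path, envir = reloaded)
identical(reloaded$toy_visitors, toy_visitors)
```

In the real package, `usethis::use_data()` writes the `.rda` file into `data/`, which is where `devtools::load_all()` picks it up.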

Exercise 2.3: Check the data loaded correctly

Reload the package and inspect the dataset in the R console:

devtools::load_all()   # Ctrl+Shift+L
museum_visitors
dplyr::glimpse(museum_visitors)

What columns does the cleaned dataset have? Note down the column names — you will need them for the next exercise.
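
A few quick checks can make this inspection more systematic. The sketch below runs on an invented toy data frame with the same column names. It assumes, as the Exercise 3.1 solution does, that yearly totals are stored in rows where `monat` is `"Summe"` and that the other rows carry a YYYYMM code.

```r
# Toy stand-in for museum_visitors (invented values).
toy <- data.frame(
  auspraegung = c("Museum A", "Museum A", "Museum A"),
  jahr        = c(2024, 2024, 2024),
  monat       = c("202401", "202402", "Summe"),
  wert        = c(1200, 1300, 15000)
)

# 1. Are the columns we rely on present?
expected_cols <- c("auspraegung", "jahr", "monat", "wert")
stopifnot(all(expected_cols %in% names(toy)))

# 2. Separate yearly summary rows from monthly rows.
is_summary <- toy$monat == "Summe"
monthly <- toy[!is_summary, ]

# 3. Turn YYYYMM codes into dates (first day of each month).
monthly$date <- as.Date(paste0(monthly$monat, "01"), format = "%Y%m%d")
monthly$date
```

Running the same three steps on `museum_visitors` tells you whether the real data follows the layout you expect before you document it.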

Exercise 2.4: Document the dataset

Data objects need their own documentation. By convention, create R/data.R by running in the R console:

usethis::use_r("data")

Write a roxygen2 documentation block for museum_visitors. Include:

  • A title and description
  • @format describing the data frame and each column
  • @source with the open data URL
  • The dataset name as a string at the end (not a function call)
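
Typing one `\item{}` line per column by hand is error-prone, so it can help to generate the skeleton from the column names. This is only a convenience sketch on a toy data frame; replace `toy` with `museum_visitors` after `devtools::load_all()`, and fill in the descriptions yourself.

```r
# Toy stand-in for the real dataset; only the column names matter here.
toy <- data.frame(
  auspraegung = character(),
  jahr        = integer(),
  wert        = numeric()
)

# One roxygen2 \item line per column, ready to paste into R/data.R.
item_lines <- sprintf("#'   \\item{%s}{TODO: describe}", names(toy))
cat(item_lines, sep = "\n")
```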

Solution

Write the following in R/data.R (an R script):

#' Monthly visitor counts for Munich museums
#'
#' Monthly visitor statistics for Munich's public museums, sourced from
#' Munich Open Data (Statistisches Amt München). Data covers all major
#' municipal museums with year-on-year comparison figures.
#'
#' @format A data frame with one row per museum per month, plus yearly
#'   summary rows:
#' \describe{
#'   \item{monatszahl}{Category label (always "Besucher*innen")}
#'   \item{auspraegung}{Museum name}
#'   \item{jahr}{Year}
#'   \item{monat}{Year-month code (YYYYMM format), or "Summe" on yearly summary rows}
#'   \item{wert}{Visitor count for that month, or the annual total on summary rows}
#'   \item{vorjahreswert}{Visitor count in the same month of the prior year}
#'   \item{veraend_vormonat_prozent}{Percentage change vs. previous month}
#'   \item{veraend_vorjahresmonat_prozent}{Percentage change vs. same month prior year}
#'   \item{zwoelf_monate_mittelwert}{12-month rolling average}
#' }
#' @source Landeshauptstadt München (2017). Monatszahlen Museen.
#'   Statistisches Amt München. Lizenz: Datenlizenz Deutschland
#'   Namensnennung 2.0 (dl-by-de).
#'   <https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/>
"museum_visitors"

Then run in the R console:

devtools::document()   # Ctrl+Shift+D
?museum_visitors       # check the rendered help page

Part 3: Add a Function

Exercise 3.1: Create the plot function

Create a new R file for the function by running in the R console:

usethis::use_r("plot_museums")

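Write a function plot_museums() that produces a line chart of visitor counts over time, with one line per museum (auspraegung), using the museum_visitors dataset that is bundled with the package. The solution plots annual totals, taken from the yearly summary rows where monat is "Summe"; a monthly chart would use the remaining rows instead.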

Tip

Use ggplot2:: prefixes throughout — never library(ggplot2) inside a package. Declare all dependencies in the R console:

usethis::use_package("ggplot2")
usethis::use_package("dplyr")
usethis::use_package("scales")

Solution

Write the following in R/plot_museums.R (an R script):

#' Annual visitor counts per Munich museum
#'
#' Plots annual visitor totals for each museum in the bundled
#' `museum_visitors` dataset, using the yearly summary rows
#' (where `monat == "Summe"`).
#'
#' @return A `ggplot2` plot object.
#' @export
#' @examples
#' plot_museums()
plot_museums <- function() {
  museum_visitors |>
    dplyr::filter(
      monat == "Summe"
    ) |>
    ggplot2::ggplot(ggplot2::aes(x = jahr, y = wert, colour = auspraegung)) +
    ggplot2::geom_line() +
    ggplot2::labs(x = "Year", y = "Visitors", colour = "Museum") +
    ggplot2::ggtitle("Annual Visitors to Museums in Munich") +
    ggplot2::scale_y_continuous(labels = scales::label_number())
}

If you have not already declared dplyr and scales as dependencies (see the tip above), run in the R console:

usethis::use_package("dplyr")
usethis::use_package("scales")

Then add the names that the function uses without a prefix to R/utils.R (create the file with usethis::use_r("utils") if it does not exist yet):

utils::globalVariables(c("museum_visitors", "monat", "wert", "auspraegung", "jahr"))

Exercise 3.2: Document and export

Add a roxygen2 block above plot_museums() (see solution above). Then run in the R console to generate the help files and check the function is accessible:

devtools::document()   # Ctrl+Shift+D
devtools::load_all()   # Ctrl+Shift+L
plot_museums()
?plot_museums

Important

For a function to be accessible after installing the package, it must have @export in its roxygen2 block. Without it, devtools::document() will not add it to NAMESPACE, and users will not be able to call it after loading the package with library().
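
The difference between exported and internal objects can be inspected for any installed package. The sketch below uses the base `stats` package as a stand-in, because `munichvisitors` is not installed yet at this point.

```r
# Names listed in the package's NAMESPACE exports.
exported <- getNamespaceExports("stats")

# Everything defined inside the package, exported or not.
all_objects <- ls(asNamespace("stats"), all.names = TRUE)

# sd() is exported, so stats::sd() and a plain sd() both work.
"sd" %in% exported

# Internal objects exist in the namespace but are not exported;
# they are reachable only through stats::: and are not meant for users.
internal <- setdiff(all_objects, exported)
length(internal) > 0
```

Once your own package is installed, `"plot_museums" %in% getNamespaceExports("munichvisitors")` should return `TRUE` if the `@export` tag was picked up.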

Exercise 3.3: Run devtools::check()

Run in the R console:

devtools::check()   # Ctrl+Shift+E

Read through any ERRORs, WARNINGs, or NOTEs. A clean check shows 0 errors, 0 warnings, 0 notes. Common issues at this stage:

  • Missing @export tag
  • Functions called without the package:: prefix
  • Undeclared imports in DESCRIPTION

Solution: common fixes

If you see no visible binding for global variable 'monat' (or similar), create R/utils.R (run usethis::use_r("utils") in the R console) and add:

utils::globalVariables(c("museum_visitors", "monat", "wert", "auspraegung", "jahr"))

This tells R CMD check that these names are used intentionally: museum_visitors is the bundled dataset, and the others are unquoted column names (tidy evaluation).
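
A common alternative for the column names in tidyverse-based packages is the `.data` pronoun, which makes the column reference explicit. The sketch below assumes dplyr is installed; `annual_rows()` is a hypothetical helper, not part of the practical's solution. Inside a package you would also need to import `.data` from rlang (for example with `usethis::use_import_from("rlang", ".data")`) and list rlang under Imports.

```r
# Hypothetical helper: keep only the yearly summary rows.
annual_rows <- function(data) {
  # .data$monat refers to the column explicitly, so R CMD check
  # does not treat `monat` as an undefined global variable.
  dplyr::filter(data, .data$monat == "Summe")
}

# Toy data with invented values.
toy <- data.frame(monat = c("202401", "Summe"), wert = c(1200, 15000))
annual_rows(toy)
```

The dataset name `museum_visitors` is not a column, so it would still need `utils::globalVariables()` (or be passed in as an argument, as here).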

Exercise 3.4: Add a test

Set up testing and create a test file by running in the R console:

usethis::use_testthat()
usethis::use_test("plot_museums")

Write at least two expect_* assertions in tests/testthat/test-plot_museums.R (an R script):

Solution

test_that("plot_museums returns a ggplot object", {
  result <- plot_museums()
  expect_s3_class(result, "gg")
})

test_that("museum_visitors has expected columns", {
  expect_true(all(c("auspraegung", "jahr", "monat", "wert") %in%
                    names(museum_visitors)))
})

Run the tests in the R console:

devtools::test()   # Ctrl+Shift+T
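
Tests can also guard the assumptions that `plot_museums()` makes about the data. The following sketch assumes testthat is installed and runs on an invented toy data frame so that it works outside the package; in `tests/testthat/` you would drop the `library()` call and test `museum_visitors` itself.

```r
library(testthat)

# Toy stand-in for museum_visitors (invented values).
toy <- data.frame(
  auspraegung = c("Museum A", "Museum A"),
  jahr        = c(2024, 2024),
  monat       = c("202401", "Summe"),
  wert        = c(1200, 15000)
)

test_that("yearly summary rows are present and counts are numeric", {
  # plot_museums() filters on monat == "Summe", so such rows must exist.
  expect_true(any(toy$monat == "Summe"))
  expect_type(toy$wert, "double")
})
```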

Part 4: Add Functions for a Second Dataset

Tip: Dataset

Munich Open Data publishes monthly statistics across many domains. Browse the catalogue here:

👉 https://opendata.muenchen.de/dataset/?tags=Monatszahlen

Choose a dataset that interests you — for example, monthly library loans, traffic counts, or another cultural indicator. You do not need to include the data in the package; a function that downloads and returns it on demand is fine.

Design and implement at least one new function for your chosen dataset. Your function should:

  • Have a clear, descriptive name (use a verb: get_, plot_, summarise_)
  • Accept at least one argument (e.g. a year range, a category filter, or a plot type)
  • Be documented with a full roxygen2 block (@param, @return, @export, @examples)
  • Be declared in DESCRIPTION if it uses external packages

Tip: Interface design

Before writing the function body, write the call you wish existed. Ask: what does the user want to do? What do they need to specify? What should be returned? Once the interface feels right, filling in the body is easier (outside-in design from the lecture).
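
As one illustration of this outside-in approach, suppose the call you wish existed is `filter_years(data, years = c(2021, 2023))`. The sketch below fills in that interface with base R. The function name, its arguments, and the toy data are all hypothetical; your own function will depend on the dataset you choose.

```r
#' Filter monthly statistics to a range of years
#'
#' @param data A data frame with a numeric `jahr` column.
#' @param years A numeric vector of length 2: first and last year to keep.
#' @return The rows of `data` whose `jahr` lies within `years`.
#' @export
#' @examples
#' toy <- data.frame(jahr = 2019:2024, wert = 1:6)
#' filter_years(toy, years = c(2021, 2023))
filter_years <- function(data, years) {
  # Fail early with a clear error if the inputs are not what we expect.
  stopifnot(
    is.data.frame(data),
    "jahr" %in% names(data),
    is.numeric(years),
    length(years) == 2,
    years[1] <= years[2]
  )
  data[data$jahr >= years[1] & data$jahr <= years[2], , drop = FALSE]
}

# Toy data with invented values.
toy <- data.frame(jahr = 2019:2024, wert = 1:6)
filter_years(toy, years = c(2021, 2023))
```

Validating the arguments at the top of the function keeps error messages close to the user's mistake, which also gives you something concrete to test.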

After writing your function:

  1. Document it — run devtools::document() in the R console
  2. Reload and try it — run devtools::load_all() in the R console
  3. Write at least one test — run usethis::use_test("your-function-name") in the R console, then write assertions in the created R script
  4. Run the full check — run devtools::check() in the R console

Part 5 (Bonus Task): Vignette

A vignette is a long-form how-to guide that walks a new user through a complete workflow using your package. Unlike a help page (?function), a vignette tells a story.

Exercise 5.1: Create a vignette

Run in the R console:

usethis::use_vignette("munichvisitors")

This creates vignettes/munichvisitors.Rmd. Write your vignette taking the perspective of someone who has never used the package before.

A good vignette covers:

  • Why this package exists and what problem it solves
  • How to install the package
  • The main dataset: what it contains and how to access it
  • The main function(s): what they do and how to call them
  • At least one worked example with rendered output

Tip

Check out the dplyr vignettes for examples of well-written package vignettes. Notice how they lead with a motivating problem, not with function documentation.

Exercise 5.2: Build and preview

Run in the R console:

devtools::build_vignettes()
devtools::install(build_vignettes = TRUE)   # vignette() only finds vignettes of installed packages
vignette("munichvisitors", package = "munichvisitors")

Part 6: Reflection Log

Take a few minutes to add this week’s entry to your reflection log. Then commit and push.


Keyboard Shortcuts

These shortcuts cover the most common steps of the package development loop. Learn them — they will save you a lot of time.

Action                                         Windows / Linux                  Mac
devtools::load_all() (reload the package)      Ctrl+Shift+L                     Cmd+Shift+L
devtools::document() (regenerate help files)   Ctrl+Shift+D                     Cmd+Shift+D
devtools::test() (run tests)                   Ctrl+Shift+T                     Cmd+Shift+T
devtools::check() (full R CMD check)           Ctrl+Shift+E                     Cmd+Shift+E
Insert a pipe |>                               Ctrl+Shift+M                     Cmd+Shift+M
Insert a roxygen2 skeleton                     Code → Insert Roxygen Skeleton   Code → Insert Roxygen Skeleton

Using LLMs for Package Development

LLMs are genuinely useful at several steps of package development — but they require careful review. From the lecture:

Warning: What to check when using LLM-generated package code
  • Be specific about the interface — tell the LLM exactly what functions you want, what arguments they take, and what they return. Vague prompts produce vague APIs.
  • Check DESCRIPTION — LLMs often add unnecessary Imports. Every dependency is a liability your users inherit.
  • Check NAMESPACE — missing @export tags mean functions are silently unavailable; extra exports expose internal helpers.
  • Review code style — LLMs mix styles. Enforce consistency with styler::style_pkg() after generation.
  • Run devtools::check() — treat any NOTE or WARNING as a bug, not a suggestion.

Suggested workflow: write the interface yourself (function names, arguments, return values), then ask the LLM to fill in the body and generate the roxygen2 block. Review both carefully before committing.


Resources

R Package Development

  • R Packages (2e) — Hadley Wickham & Jenny Bryan; the definitive reference
  • Introduction to R Packages — N.J. Tierney — a step-by-step walkthrough used in preparing this practical
  • usethis documentation
  • devtools documentation
  • roxygen2 documentation

Data sources used in this practical

  • Munich Museums open data — the museums CSV
  • Munich Open Data — Monatszahlen catalogue — further monthly statistics for Part 4

Testing

  • testthat documentation
  • Testing chapter — R Packages (2e)

LLMs and package development

  • JOSS guidance on AI-assisted code — how to acknowledge LLM usage in published software
Source Code
---
title: "Practical #6"
subtitle: "Advanced Statistical Programming using R — R Packages"
author: "Leonhard Kestel, Lisa Bondo Andersen, Cynthia Huang"
date: "May 22, 2026"
format: 
  html:
    theme: default
    toc: true
    toc-depth: 2
    code-tools: true
    highlight-style: github
execute:
  eval: false
  message: false
  warning: false
draft: false
---

## Quiz

Before starting, work through this [QUIZ](quiz.qmd){target="_blank"} to check your understanding of the concepts covered in this week's lecture on R packages. Find a full solution R file [HERE](solution_practical.R){target="_blank"}.

------------------------------------------------------------------------

# Overview

This is an **online, drop-in consultation practical session**. Lisa will be available between 2-4pm on [this zoom link](https://lmu-munich.zoom-x.de/j/65274156117?pwd=heRGyyRHdtCcnzyZcyKjv69HchWxKW.1). Please wait in the waiting room until you're invited in to ask your questions.

In this practical you will build a complete R data package from scratch: `munichvisitors`. By the end you will have a package that ships a cleaned dataset, a documented plot function, and — if you tackle the extension — a vignette.

The practical is structured into five parts:

1.  **Create the package skeleton** — scaffold, DESCRIPTION, README
2.  **Add data** — download, clean, and export the museums dataset
3.  **Add a function** — write, document, and test `plot_museums()`
4.  **Add functions for a second dataset** — extend the package with monthly statistics from a separate open data source
5.  **Extension: vignette** — write a how-to guide for your package
6.  **Reflection log** — record what you learned this week

At the end you will also find a [keyboard shortcuts reference](#keyboard-shortcuts) for the package development loop, and [notes on using LLMs](#using-llms-for-package-development) to scaffold and document packages.

Solutions are available for parts 1–3. Parts 4, 5, and 6 are left for you to complete. Solution suggestions for Part 4 will be available next week.

------------------------------------------------------------------------

# Part 1: Create the Package Skeleton

## Exercise 1.1: Initialise the package

Use `usethis::create_package()` to scaffold a new package called `munichvisitors` in a location of your choice. This will open a new RStudio project.

::: {.callout-note title="Solution" collapse="true"}
Run in the **R console**:

``` r
usethis::create_package("munichvisitors")
```

RStudio should open the new project. The folder structure will look like:

```         
munichvisitors/
├── DESCRIPTION
├── NAMESPACE
├── R/
├── munichvisitors.Rproj
├── .gitignore
└── .Rbuildignore
```
:::

## Exercise 1.2: Edit the DESCRIPTION file

Open `DESCRIPTION` and fill in the fields below. Use your own name in `Authors@R`.

```         
Package: munichvisitors
Title: Monthly Visitor Counts for Munich Museums
Version: 0.0.0.9000
Authors@R:
    person("First", "Last", , "you@example.com", role = c("aut", "cre"))
Description: Provides tidy monthly visitor statistics for Munich's museums
    from Munich Open Data (Statistisches Amt München), with a helper
    plot function for exploring trends over time.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
LazyData: true
Imports:
    dplyr,
    ggplot2,
    scales
Suggests:
    janitor,
    readr,
    testthat (>= 3.0.0),
    knitr,
    rmarkdown
Config/testthat/edition: 3
VignetteBuilder: knitr
```

::: callout-tip
`RoxygenNote` should match your installed version of `roxygen2`. You can check with `packageVersion("roxygen2")` in the console. `devtools::document()` will update it automatically, so don't worry if it changes.
:::

Add the MIT licence by running in the **R console** (inside the new `munichvisitors` project):

``` r
usethis::use_mit_license()
```

## Exercise 1.3: Write a README

Create a README template by running in the **R console**:

``` r
usethis::use_readme_rmd()
```

Open `README.Rmd` and write a short README following this structure (from [R Packages (2e)](https://r-pkgs.org/other-markdown.html#sec-readme)):

1.  One paragraph describing what the package does
2.  Installation instructions: `devtools::install_github("your-username/munichvisitors")`
3.  A brief overview of what is included
4.  A short example showing how to use it — **come back and fill this in after Exercise 3.1**, once you have a working function to show

::: callout-tip
Knit `README.Rmd` to produce `README.md` — GitHub renders the `.md` version on your repository page. Re-knit whenever you update it.
:::

::: {.callout-note title="Solution: example README" collapse="true"}
```` markdown
---
output: github_document
---

# munichvisitors

The `munichvisitors` package provides tidy monthly visitor counts for
Munich's public museums, sourced from Munich Open Data (Statistisches
Amt München). It includes a ready-to-use plot function so you can
explore trends with a single line of code.

## Installation

``` r
# install.packages("devtools")
devtools::install_github("your-username/munichvisitors")
```

## What's included

- `museum_visitors` — a data frame of monthly visitor counts per museum,
  with year-on-year comparison figures
- `plot_museums()` — a line chart of visitor trends over time

## Example

``` r
library(munichvisitors)
plot_museums()
```

## Data source

Landeshauptstadt München (2017). Monatszahlen Museen. Statistisches Amt München.
Lizenz: [Datenlizenz Deutschland Namensnennung 2.0](https://www.govdata.de/dl-de/by-2-0).
<https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/>
````
:::

------------------------------------------------------------------------

# Part 2: Add Data

The dataset is monthly visitor counts for Munich's museums from [Munich Open Data](https://opendata.muenchen.de/dataset/monatszahlen-museen). You can download the CSV directly from:

👉 <https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/>

## Exercise 2.1: Set up the data-raw folder

Run in the **R console**:

``` r
usethis::use_data_raw("museum-visitors")
```

This creates `data-raw/museum-visitors.R` and opens it automatically. Write your download-and-clean script in that file.

## Exercise 2.2: Write the data-raw script

Fill in `data-raw/museum-visitors.R` to:

1.  Download the CSV from the URL below
2.  Clean the column names with `janitor::clean_names()`
3.  Save the cleaned object with `usethis::use_data()`

The direct CSV download URL is:

``` r
url <- paste0(
  "https://opendata.muenchen.de/dataset/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/",
  "resource/6c6a809e-91ee-4f3e-9268-a8b7bc38311c/download/",
  "monatszahlen2603_museen_16_03_26.csv"
)
```

::: callout-tip
You will need `readr` and `janitor` to write the data-raw script. Since they are only used to build the data (not in any package function), declare them as suggested packages rather than imports:

``` r
usethis::use_package("readr", type = "Suggests")
usethis::use_package("janitor", type = "Suggests")
```
:::

::: {.callout-note title="Solution" collapse="true"}
Write the following in `data-raw/museum-visitors.R` (an **R script**, not the console):

``` r
## code to prepare `museum_visitors` dataset goes here

url <- paste0(
  "https://opendata.muenchen.de/dataset/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/",
  "resource/6c6a809e-91ee-4f3e-9268-a8b7bc38311c/download/",
  "monatszahlen2603_museen_16_03_26.csv"
)

museum_visitors_raw <- readr::read_csv(url)

museum_visitors <- museum_visitors_raw |>
  janitor::clean_names()   # WERT → wert, AUSPRAEGUNG → auspraegung, …

usethis::use_data(museum_visitors, overwrite = TRUE)
```

Run the entire script (not just the last line) to download and save the data. Use **Ctrl+Shift+Enter** to run all, or click **Source** in the top right of the RStudio editor pane.
:::

## Exercise 2.3: Check the data loaded correctly

Reload the package and inspect the dataset in the **R console**:

``` r
devtools::load_all()   # Ctrl+Shift+L
museum_visitors
dplyr::glimpse(museum_visitors)
```

What columns does the cleaned dataset have? Note down the column names — you will need them for the next exercise.

## Exercise 2.4: Document the dataset

Data objects need their own documentation. By convention, create `R/data.R` by running in the **R console**:

``` r
usethis::use_r("data")
```

Write a roxygen2 documentation block for `museum_visitors`. Include:

- A title and description
- `@format` describing the data frame and each column
- `@source` with the open data URL
- The dataset name as a string at the end (not a function call)

::: {.callout-note title="Solution" collapse="true"}
Write the following in `R/data.R` (an **R script**):

``` r
#' Monthly visitor counts for Munich museums
#'
#' Monthly visitor statistics for Munich's public museums, sourced from
#' Munich Open Data (Statistisches Amt München). Data covers all major
#' municipal museums with year-on-year comparison figures.
#'
#' @format A data frame with one row per museum per month:
#' \describe{
#'   \item{monatszahl}{Category label (always "Besucher*innen")}
#'   \item{auspraegung}{Museum name}
#'   \item{jahr}{Year}
#'   \item{monat}{Year-month code (YYYYMM format)}
#'   \item{wert}{Visitor count for that month}
#'   \item{vorjahreswert}{Visitor count in the same month of the prior year}
#'   \item{veraend_vormonat_prozent}{Percentage change vs. previous month}
#'   \item{veraend_vorjahresmonat_prozent}{Percentage change vs. same month prior year}
#'   \item{zwoelf_monate_mittelwert}{12-month rolling average}
#' }
#' @source Landeshauptstadt München (2017). Monatszahlen Museen.
#'   Statistisches Amt München. Lizenz: Datenlizenz Deutschland
#'   Namensnennung 2.0 (dl-by-de).
#'   <https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/>
"museum_visitors"
```

Then run in the **R console**:

``` r
devtools::document()   # Ctrl+Shift+D
?museum_visitors       # check the rendered help page
```
:::

------------------------------------------------------------------------

# Part 3: Add a Function

## Exercise 3.1: Create the plot function

Create a new R file for the function by running in the **R console**:

``` r
usethis::use_r("plot_museums")
```

Write a function `plot_museums()` that produces a line chart of monthly visitor counts, with one line per museum (`auspraegung`), using the `museum_visitors` dataset that is bundled with the package.

::: callout-tip
Use `ggplot2::` prefixes throughout — never `library(ggplot2)` inside a package. Declare all dependencies in the **R console**:

``` r
usethis::use_package("ggplot2")
usethis::use_package("dplyr")
usethis::use_package("scales")
```
:::

::: {.callout-note title="Solution" collapse="true"}
Write the following in `R/plot_museums.R` (an **R script**):

``` r
#' Annual visitor counts per Munich museum
#'
#' Plots annual visitor totals for each museum in the bundled
#' `museum_visitors` dataset, using the yearly summary rows
#' (where `monat == "Summe"`).
#'
#' @return A `ggplot2` plot object.
#' @export
#' @examples
#' plot_museums()
plot_museums <- function() {
  museum_visitors |>
    dplyr::filter(
      monat == "Summe"
    ) |>
    ggplot2::ggplot(ggplot2::aes(x = jahr, y = wert, colour = auspraegung)) +
    ggplot2::geom_line() +
    ggplot2::labs(x = "Year", y = "Visitors", colour = "Museum") +
    ggplot2::ggtitle("Annual Visitors to Museums in Munich") +
    ggplot2::scale_y_continuous(labels = scales::label_number())
}
```

You will also need to declare `dplyr` and `scales` as dependencies. Run in the **R console**:

``` r
usethis::use_package("dplyr")
usethis::use_package("scales")
```

And add the variable names used in the function to `R/utils.R`:

``` r
utils::globalVariables(c("museum_visitors", "monat", "wert", "auspraegung", "jahr"))
```
:::

## Exercise 3.2: Document and export

Add a roxygen2 block above `plot_museums()` (see solution above). Then run in the **R console** to generate the help files and check the function is accessible:

``` r
devtools::document()   # Ctrl+Shift+D
devtools::load_all()   # Ctrl+Shift+L
plot_museums()
?plot_museums
```

::: callout-important
For a function to be accessible after installing the package, it **must** have `@export` in its roxygen2 block. Without it, `devtools::document()` will not add it to `NAMESPACE` and users will not be able to call it.
:::

## Exercise 3.3: Run `devtools::check()`

Run in the **R console**:

``` r
devtools::check()   # Ctrl+Shift+E
```

Read through any WARNINGs or NOTEs. A clean check shows `0 errors, 0 warnings, 0 notes`. Common issues at this stage:

- Missing `@export` tag
- Functions called without the `package::` prefix
- Undeclared imports in `DESCRIPTION`

::: {.callout-note title="Solution: common fixes" collapse="true"}
If you see `no visible binding for global variable 'monat'` (or similar), create `R/utils.R` (run `usethis::use_r("utils")` in the **R console**) and add:

``` r
utils::globalVariables(c("museum_visitors", "monat", "wert", "auspraegung", "jahr"))
```

This tells R CMD check that these names are intentionally used as unquoted column names (tidy evaluation).
:::

## Exercise 3.4: Add a test

Set up testing and create a test file by running in the **R console**:

``` r
usethis::use_testthat()
usethis::use_test("plot_museums")
```

Write at least two `expect_*` assertions in `tests/testthat/test-plot_museums.R` (an **R script**):

::: {.callout-note title="Solution" collapse="true"}
``` r
test_that("plot_museums returns a ggplot object", {
  result <- plot_museums()
  expect_s3_class(result, "gg")
})

test_that("museum_visitors has expected columns", {
  expect_true(all(c("auspraegung", "jahr", "monat", "wert") %in%
                    names(museum_visitors)))
})
```

Run the tests in the **R console**:

``` r
devtools::test()   # Ctrl+Shift+T
```
:::

------------------------------------------------------------------------

# Part 4: Add Functions for a Second Dataset

::: {.callout-tip title="Dataset"}
Munich Open Data publishes monthly statistics across many domains. Browse the catalogue here:

👉 <https://opendata.muenchen.de/dataset/?tags=Monatszahlen>

Choose a dataset that interests you — for example, monthly library loans, traffic counts, or another cultural indicator. You do not need to include the data in the package; a function that downloads and returns it on demand is fine.
:::

Design and implement **at least one new function** for your chosen dataset. Your function should:

- Have a clear, descriptive name (use a verb: `get_`, `plot_`, `summarise_`)
- Accept at least one argument (e.g. a year range, a category filter, or a plot type)
- Be documented with a full roxygen2 block (`@param`, `@return`, `@export`, `@examples`)
- Have every external package it uses declared in `DESCRIPTION` (for example with `usethis::use_package()`)

::: {.callout-tip title="Interface design"}
Before writing the function body, write the call you *wish* existed. Ask: what does the user want to do? What do they need to specify? What should be returned? Once the interface feels right, filling in the body is easier (outside-in design from the lecture).
:::
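
One way this could look, as a sketch under assumptions: the function name `get_monthly_stats()`, its arguments, and the `jahr` column are all hypothetical, so adapt them to the dataset you chose. The call you wish existed comes first, as a comment, and the body follows from it. A temporary CSV file with made-up values stands in for the real download.

``` r
# The call we wish existed:
#   get_monthly_stats(source, years = c(2020, 2021))

#' Read monthly statistics and keep a range of years
#'
#' @param source Path or URL of a CSV file with a year column named `jahr`
#'   (an assumption; check your dataset's column names).
#' @param years Numeric vector of length 2: first and last year to keep.
#' @return A data frame containing only the rows within `years`.
#' @export
#' @examples
#' tmp <- tempfile(fileext = ".csv")
#' write.csv(data.frame(jahr = 2019:2021, wert = c(5, 7, 9)), tmp, row.names = FALSE)
#' get_monthly_stats(tmp, years = c(2020, 2021))
get_monthly_stats <- function(source, years = c(-Inf, Inf)) {
  stopifnot(is.numeric(years), length(years) == 2, years[1] <= years[2])
  raw <- utils::read.csv(source, stringsAsFactors = FALSE)
  names(raw) <- tolower(names(raw))  # tolerate upper-case headers such as JAHR
  if (!"jahr" %in% names(raw)) {
    stop("Expected a `jahr` column in the data.", call. = FALSE)
  }
  # Keep rows whose year lies inside the requested range (inclusive)
  raw[raw$jahr >= years[1] & raw$jahr <= years[2], , drop = FALSE]
}

# Try it on a small temporary file (made-up values) instead of a real download
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(JAHR = 2019:2021, WERT = c(5, 7, 9)), tmp, row.names = FALSE)
recent <- get_monthly_stats(tmp, years = c(2020, 2021))
```

Because `source` is an argument rather than a hard-coded link, the same function works with a URL from the catalogue and with a local file in tests.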

After writing your function:

1. Document it — run `devtools::document()` in the **R console**
2. Reload and try it — run `devtools::load_all()` in the **R console**
3. Write at least one test — run `usethis::use_test("your-function-name")` in the **R console**, then write assertions in the created **R script**
4. Run the full check — run `devtools::check()` in the **R console**
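
For step 3, a test does not need to touch the network. The sketch below tests a hypothetical stand-in, `filter_years()` (an assumption; substitute your own function), on a tiny hand-made data frame so that the expected result is known in advance.

``` r
library(testthat)

# Hypothetical stand-in for your own function: keeps rows within a year range
filter_years <- function(data, years) {
  data[data$jahr >= years[1] & data$jahr <= years[2], , drop = FALSE]
}

test_that("filter_years keeps only the requested years", {
  toy <- data.frame(jahr = 2018:2022, wert = 1:5)
  out <- filter_years(toy, years = c(2020, 2021))
  expect_s3_class(out, "data.frame")
  expect_equal(out$jahr, c(2020L, 2021L))
})
```

In your package the function lives in `R/`, so the test file contains only the `test_that()` block.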

------------------------------------------------------------------------

# Part 5 (Bonus Task): Vignette

A **vignette** is a long-form how-to guide that walks a new user through a complete workflow using your package. Unlike a help page (`?function`), a vignette tells a story.

## Exercise 5.1: Create a vignette

Run in the **R console**:

``` r
usethis::use_vignette("munichvisitors")
```

This creates `vignettes/munichvisitors.Rmd`. Write your vignette taking the perspective of someone who has never used the package before.

A good vignette covers:

- Why this package exists and what problem it solves
- How to install the package
- The main dataset: what it contains and how to access it
- The main function(s): what they do and how to call them
- At least one worked example with rendered output

::: callout-tip
Check out the [dplyr vignettes](https://dplyr.tidyverse.org/articles/) for examples of well-written package vignettes. Notice how they lead with a motivating problem, not with function documentation.
:::

## Exercise 5.2: Build and preview

Run in the **R console**:

``` r
devtools::install(build_vignettes = TRUE)   # vignette() only finds vignettes of installed packages
vignette("munichvisitors", package = "munichvisitors")
```

------------------------------------------------------------------------

# Part 6: Reflection Log

Take a few minutes to add this week's entry to your [reflection log](_reflection-prompts.qmd). Then commit and push.

------------------------------------------------------------------------

# Keyboard Shortcuts

These shortcuts cover the most common steps of the package development loop. Learn them — they will save you a lot of time.

| Action | Windows / Linux | Mac |
|---|---|---|
| `devtools::load_all()` — reload the package | `Ctrl+Shift+L` | `Cmd+Shift+L` |
| `devtools::document()` — regenerate help files | `Ctrl+Shift+D` | `Cmd+Shift+D` |
| `devtools::test()` — run tests | `Ctrl+Shift+T` | `Cmd+Shift+T` |
| `devtools::check()` — full R CMD check | `Ctrl+Shift+E` | `Cmd+Shift+E` |
| Insert a pipe (`\|>` if RStudio's native pipe option is enabled, otherwise `%>%`) | `Ctrl+Shift+M` | `Cmd+Shift+M` |
| Insert a roxygen2 skeleton | `Ctrl+Alt+Shift+R` (or **Code → Insert Roxygen Skeleton**) | `Cmd+Option+Shift+R` (or **Code → Insert Roxygen Skeleton**) |

------------------------------------------------------------------------

# Using LLMs for Package Development

LLMs are genuinely useful at several steps of package development — but they require careful review. From the lecture:

::: {.callout-warning title="What to check when using LLM-generated package code"}
- **Be specific about the interface** — tell the LLM exactly what functions you want, what arguments they take, and what they return. Vague prompts produce vague APIs.
- **Check `DESCRIPTION`** — LLMs often add unnecessary `Imports`. Every dependency is a liability your users inherit.
- **Check `NAMESPACE`** — missing `@export` tags mean functions are silently unavailable; extra exports expose internal helpers.
- **Review code style** — LLMs mix styles. Enforce consistency with `styler::style_pkg()` after generation.
- **Run `devtools::check()`** — treat any NOTE or WARNING as a bug, not a suggestion.
:::
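
To review what a `DESCRIPTION` file actually declares, base R's `read.dcf()` can list the `Imports` field. The sketch below writes a made-up `DESCRIPTION` to a temporary file so that it runs anywhere (the package list is an assumption for illustration); in your package you would point `read.dcf()` at `"DESCRIPTION"` instead.

``` r
# Write a small, made-up DESCRIPTION to a temporary file for illustration
desc_path <- tempfile()
writeLines(c(
  "Package: munichvisitors",
  "Version: 0.0.1",
  "Imports: ggplot2, dplyr",
  "Suggests: testthat"
), desc_path)

# read.dcf() returns a character matrix with one column per requested field
fields <- read.dcf(desc_path, fields = c("Imports", "Suggests"))

# Split the comma-separated Imports field into one package name per element
imports <- trimws(strsplit(fields[1, "Imports"], ",")[[1]])
imports
```

Any package in this list that your code never calls with `package::` is a candidate for removal.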

Suggested workflow: write the *interface* yourself (function names, arguments, return values), then ask the LLM to fill in the body and generate the roxygen2 block. Review both carefully before committing.

------------------------------------------------------------------------

# Resources

**R Package Development**

- [R Packages (2e)](https://r-pkgs.org/) — Hadley Wickham & Jenny Bryan; the definitive reference
- [Introduction to R Packages — N.J. Tierney](https://intro2rpkgs.njtierney.com/) — a step-by-step walkthrough used in preparing this practical
- [usethis documentation](https://usethis.r-lib.org/)
- [devtools documentation](https://devtools.r-lib.org/)
- [roxygen2 documentation](https://roxygen2.r-lib.org/)

**Data sources used in this practical**

- [Munich Museums open data](https://datengartln.de/datasets/detail/bfb4a286-bea5-4bfe-82ce-b9bd354284a5/) — the museums CSV
- [Munich Open Data — Monatszahlen catalogue](https://opendata.muenchen.de/dataset/?tags=Monatszahlen) — further monthly statistics for Part 4

**Testing**

- [testthat documentation](https://testthat.r-lib.org/)
- [Testing chapter — R Packages (2e)](https://r-pkgs.org/testing-basics.html)

**LLMs and package development**

- [JOSS guidance on AI-assisted code](https://joss.theoj.org/about#generative_ai) — how to acknowledge LLM usage in published software