Practical #4
Advanced Statistical Programming using R — Version Control & Remotes
Quiz
Before starting, work through this QUIZ to check your understanding of the concepts covered in this week’s lecture on debugging and on using LLMs (large language models) in a statistical programming workflow.
General Remarks
Last week you practised finding and fixing bugs. This week the focus shifts to version control — keeping track of your work, understanding your project’s history, and connecting your local repository to GitHub.
The session builds from your local setup outward: first making sure your repository and .gitignore are in good shape, then connecting to GitHub and practising the pull/fetch/push workflow you will use every day on the group project.
Check you have all three of these before the practical begins:
- Git installed: run git --version in a terminal. You should see a version number.
- A GitHub account: if you don’t have one yet, sign up at github.com now.
- Your debugging exercise from last week: the .qmd or .R file(s) you worked on in Practical #3, ideally already in a git-tracked folder.
If any of these is missing, ask your instructor at the start and we will sort it out before you move on.
All exercises below are written CLI-first. If you prefer a graphical interface, two options exist:
GitHub Desktop (desktop.github.com) is the recommended GUI. It handles authentication automatically, covers all the workflows in this practical, and is generally more reliable than the RStudio Git pane. Use File → Add Local Repository to connect an existing folder.
RStudio Git pane — RStudio has a built-in Git tab (top-right panel) that covers staging, committing, pushing, and pulling. It is only available when working inside an R Project (a folder with a .Rproj file), and can be unreliable for anything beyond basic commits. Where it applies, it is noted as an optional alternative in the exercises below.
If you have used Google Drive, you already understand the core problem Git solves: backing up files, going back to older versions, and sharing work with others. Git does all of this — but more deliberately. Instead of syncing automatically in the background, you decide exactly when to save a version and what to call it. That extra control is what makes it so useful for code and research.
The key difference: Drive saves continuously and silently. Git saves only when you run git commit, and you write a short message explaining what changed and why. Your history stays readable — not a pile of auto-saves labelled “version 47”.
Resource for the whole session: Happy Git with R (Jenny Bryan). Every task below has a corresponding section there if you want more detail.
Exercise 0: Quick review and setup check
In Practical #1 you set up a local git repository and learned init, add, and commit. GitHub was optional. This exercise gets everyone to the same starting point: a local repo with your Practical #3 work committed, ready to push to GitHub in Exercise 1.
0.1 Check your global git config
Open a terminal and run:
git config --global user.name
git config --global user.email
You should see your name and email. These appear in every commit you make. If they are blank or wrong, set them now. Use the same email address as your GitHub account, so that GitHub can link your commits to your profile:
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
0.2 Get your practical from last week into a local git repo
Navigate to your Practical #3 folder in the terminal:
cd path/to/your/practical3-folder
Then run these two diagnostic commands:
git log --oneline # do you have any commits?
git remote -v # is a GitHub remote configured?
If git log opens a scrollable view and your prompt doesn’t return, press q to exit.
Find your situation in the table below and follow the corresponding steps.
| git log | git remote -v | Situation | What to do |
|---|---|---|---|
| error / no commits | error / no output | No repo yet | See A below |
| shows commits | no output | Local only (most likely) | Nothing — move on to Exercise 1 |
| shows commits | shows a GitHub URL | Already on GitHub | Commit any unsaved changes, push, then skip to Exercise 2 |
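The table's logic can be sketched as a small shell function. This is only an illustration: the function name repo_situation and its three labels are made up for this example, and it treats any configured remote as if it were a GitHub remote.

```shell
# Hypothetical helper: classify a folder along the lines of the table above.
repo_situation() {
  dir=${1:-.}
  if ! git -C "$dir" rev-parse --verify --quiet HEAD >/dev/null 2>&1; then
    # Not a repository at all, or a repository without any commit yet.
    echo "no repo yet"
  elif [ -z "$(git -C "$dir" remote)" ]; then
    # At least one commit, but no remote registered.
    echo "local only"
  else
    # At least one commit and at least one remote.
    echo "has remote"
  fi
}

repo_situation .   # classify the current folder
```

Calling repo_situation path/to/folder checks another folder without changing into it.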
A — No repo yet: initialise one now:
git init
git add .
git commit -m "add debugging exercise"
If git status or git log produces errors, or you see mentions of merge conflicts or detached HEAD, the fastest fix is to start clean:
- Copy your .qmd / .R files somewhere safe outside the folder.
- Delete the .git folder: rm -rf .git
- Re-initialise: git init, git add ., git commit -m "initial commit"
This loses history, which is fine at this stage — the goal is to have something clean to push.
Exercise 1: Push to GitHub
1.1 Create a new repository on GitHub
- Go to github.com/new.
- Name the repository statprog-debugging (or similar).
- Set it to Public.
- Do not tick “Add a README”: your local folder already has content, and a README created on GitHub would add a commit that your local history lacks, so your first push would be rejected.
- Click Create repository.
GitHub will show you a page of instructions. You want the block labelled “…or push an existing repository from the command line”.
1.2 Set up SSH authentication
Before you can push, GitHub needs to verify your identity. We recommend SSH keys — once set up, you never need to paste a password again.
Follow the step-by-step instructions at:
lmu-osc.github.io/Introduction-RStudio-Git-GitHub/SSH.html
The process has three steps:
- Generate an SSH key pair on your machine.
- Add the public key to your GitHub account.
- Tell Git to use SSH when connecting to GitHub.
If SSH does not work for you today, HTTPS with a PAT is a valid fallback:
- GitHub → Settings → Developer Settings → Personal access tokens → Tokens (classic) → Generate new token.
- Tick the repo scope. Set expiry to at least end of semester.
- Copy the token immediately (it won’t be shown again) and paste it when Git asks for a password.
SSH is still preferred for the rest of the course — set it up before Exercise 2 if you can.
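If you start with HTTPS and set up SSH later, an existing remote can be pointed at the SSH form of the same URL. The sketch below does this in a throwaway repository; YOUR-USERNAME is a placeholder, and nothing is contacted over the network because these commands only edit the repository's local configuration.

```shell
# Throwaway repository so the demo does not touch your real project.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Start with an HTTPS remote (placeholder URL).
git remote add origin https://github.com/YOUR-USERNAME/statprog-debugging.git

# Switch the same remote to the SSH form of the URL.
git remote set-url origin git@github.com:YOUR-USERNAME/statprog-debugging.git

# Print the URL Git will now use for this remote.
git remote get-url origin
```

In your real project you would run only the set-url line, with your own username.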
1.3 Verify your SSH connection
Once the SSH key is added to GitHub, run this in your terminal to confirm everything works:
ssh -T git@github.com
A successful response looks like:
Hi YOUR-USERNAME! You've successfully authenticated, but GitHub does not provide shell access.
If you see Permission denied (publickey), go back through the LMU OSC instructions — the most common cause is that the public key was not pasted correctly into GitHub, or a passphrase was set and needs to be entered.
ssh -T git@github.com is read-only — it does not change anything, it only checks whether authentication succeeds. Safe to run at any time.
1.4 Connect your local repo to GitHub and push
Copy the commands GitHub shows you — they will look something like:
git remote add origin git@github.com:YOUR-USERNAME/statprog-debugging.git
git branch -M main
git push -u origin main
Run them in your terminal from inside your project folder.
git remote add origin <url> registers GitHub as the remote named origin. The -u flag in git push -u origin main sets origin/main as the default upstream, so from now on plain git push and git pull work without any extra arguments.
You can confirm the connection at any time:
git remote -v
You should see origin listed twice (fetch and push) pointing to your GitHub URL.
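To try these commands without touching GitHub, the sketch below uses a bare repository in a temporary folder as a stand-in for the remote. The stand-in, the demo identity, and the file content are assumptions made for illustration; with a real GitHub URL the three commands are the same as in 1.4.

```shell
# A bare repository in a temp folder stands in for GitHub (demo assumption).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# A local project with one commit.
git init -q "$work/project"
cd "$work/project"
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"
echo "demo content" > practical3.qmd
git add practical3.qmd
git commit -q -m "add debugging exercise"

# The same three commands as in 1.4, with a local path instead of a GitHub URL.
git remote add origin "$work/remote.git"
git branch -M main
git push -q -u origin main

# Show the upstream that plain git push and git pull will now use.
git rev-parse --abbrev-ref "@{u}"
```

The last command should print origin/main, which is the upstream recorded by the -u flag.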
1.5 Verify
Refresh your GitHub page — your files should appear. Check that the commit message is what you expect.
Exercise 2: Working with .gitignore
Not every file in your project folder should be tracked by Git. Generated output, R session artefacts, and sensitive files like API keys should be excluded. The .gitignore file tells Git which files and patterns to ignore.
2.1 See what Git currently sees
git status
Look at the untracked files list. Are there any files you would not want on GitHub, for example .RData, .Rhistory, or a _files/ folder?
2.2 Create or edit your .gitignore
From inside your project folder, create the file if it does not exist yet and open it for editing:
touch .gitignore # creates an empty file if it doesn't exist yet
nano .gitignore # opens it for editing in the terminal
In nano: type or paste your content, then save with Ctrl+O → Enter, and exit with Ctrl+X.
You can also open .gitignore in RStudio via File → Open File. Note that RStudio’s file browser hides dotfiles (files starting with .) by default — if you can’t see .gitignore there, use the terminal to edit it instead.
Add the following as a starting point for an R / Quarto project. Data files are commonly listed here too — they are often large, change infrequently, and can usually be re-downloaded or regenerated, so there is little value in tracking them with Git:
# R session artefacts
.Rhistory
.RData
.Rproj.user/
# Quarto build output
/_site/
/.quarto/
*_files/
# OS noise
.DS_Store
Thumbs.db
Add any other files or patterns that appeared in git status and should not be committed.
| Pattern | What it matches |
|---|---|
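| .RData | A specific file by name |
| *.csv | All files with that extension |
| data/ | The entire data/ folder |
| data/* followed by !data/README.md | Exception: track this one file. The exception only works when the folder's contents are ignored with data/* rather than data/, because Git does not look inside an ignored folder |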
A leading / anchors to the repo root: /_site/ only matches a top-level _site folder, not docs/_site/.
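The anchoring and exception rules can be checked in a throwaway repository. This is a minimal sketch with made-up file names; it lists ignored files with git ls-files, which is one of several ways to inspect what Git ignores.

```shell
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
mkdir -p data _site docs/_site
touch data/raw.csv data/README.md _site/index.html docs/_site/index.html

# Variant 1: data/* ignores the folder's contents, so the ! exception can apply.
printf '%s\n' '/_site/' 'data/*' '!data/README.md' > .gitignore
echo "ignored with data/*:"
git ls-files --others --ignored --exclude-standard

# Variant 2: data/ ignores the folder itself, so the ! exception has no effect.
printf '%s\n' '/_site/' 'data/' '!data/README.md' > .gitignore
echo "ignored with data/:"
git ls-files --others --ignored --exclude-standard
```

In both listings docs/_site/index.html should be absent, because /_site/ only matches at the repo root. data/README.md should be absent from the first listing and present in the second.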
2.3 Verify that ignoring works
git status
Files matching your patterns should no longer appear in the untracked list.
2.4 Commit your .gitignore
git add .gitignore
git commit -m "add gitignore for R and Quarto"
git push
Adding a pattern to .gitignore only affects untracked files: a file that has already been committed stays tracked. To stop tracking a committed file:
git rm --cached path/to/file
git commit -m "stop tracking sensitive file"
--cached removes the file from the index (and from the latest version on GitHub after you push) without deleting it from your local disk. Earlier commits still contain it. For sensitive files like API keys: once a secret is in git history, treat it as compromised and regenerate it.
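The effect of --cached can be seen in a throwaway repository. In this sketch the file name secrets.env and its content are invented for illustration.

```shell
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"

# Commit a file that should never have been tracked (hypothetical name).
echo "API_KEY=not-a-real-key" > secrets.env
git add secrets.env
git commit -q -m "accidentally commit a secret"

# Ignore it from now on and remove it from the index only.
echo "secrets.env" > .gitignore
git rm -q --cached secrets.env
git add .gitignore
git commit -q -m "stop tracking sensitive file"

git ls-files                        # tracked files: secrets.env is no longer listed
ls -a                               # the file is still on disk
git log --oneline -- secrets.env    # both commits that touched it remain in history
```

The last command is the reason a leaked key must be regenerated: the old commit still contains the file.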
Exercise 3: Exploring git history and restoring a past version
3.1 Browse the log
git log --oneline # one line per commit
git log --oneline --graph # with branch graph
git log -- myfile.qmd # commits that touched one file
To inspect a specific commit:
git show <hash> # full diff for that commit
git show <hash>:practical3.qmd # the file as it was at that commit
If you are working inside an R Project, open the Git tab → click the History button (clock icon). Click any commit to see its diff. Click a file in the lower pane to see what changed in that file specifically: additions in green, deletions in red.
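A minimal sketch of reading an old version without changing your working file, using a throwaway repository with two invented versions of practical3.qmd:

```shell
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"

echo "version 1" > practical3.qmd
git add practical3.qmd
git commit -q -m "first version"
echo "version 2" > practical3.qmd
git commit -q -am "second version"

git log --oneline -- practical3.qmd         # two commits touched this file

# Look up the short hash of the older commit instead of copying it by hand.
old=$(git rev-parse --short HEAD~1)
git show "$old:practical3.qmd"              # prints: version 1
cat practical3.qmd                          # prints: version 2
```

git show only prints the old content; the file on disk stays at the latest version.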
3.2 Make a change you will want to undo
Edit your .qmd — delete a section or change a heading. Stage and commit:
git add .
git commit -m "deliberately break something for exercise 3"
Run git log --oneline and note the short hash of this commit and the one before it.
3.3 Restore a past version of the file
git checkout <hash> -- practical3.qmd # use the hash of the commit BEFORE the break
This stages the restored file automatically, as git status will show. Commit it:
git commit -m "restore practical3.qmd to pre-break version"
In the History panel, click the commit before the break → select the file in the lower pane → Save As → overwrite the current file. This gives you the same file content as git checkout <hash> -- <file>, but leaves the file unstaged, so you still need to git add and git commit afterwards.
git revert <hash> — creates a new commit that exactly undoes a specific past commit. Safe for shared repos because it adds to history rather than rewriting it.
git diff <hash-B> <hash-A> | git apply — applies the diff between two commits as unstaged changes, useful when you want to review before committing.
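The restore from 3.3 and the git revert alternative can be compared side by side in a throwaway repository. This is a sketch with invented file content; the second break exists only so that git revert has something to undo.

```shell
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"

echo "good section" > practical3.qmd
git add practical3.qmd
git commit -q -m "good version"
good=$(git rev-parse --short HEAD)          # hash of the commit before the break

echo "broken" > practical3.qmd
git commit -q -am "deliberately break something"

# Way 1: restore the file from the good commit, then commit the result.
git checkout "$good" -- practical3.qmd
git status --short                          # the restored file is already staged
git commit -q -m "restore practical3.qmd to pre-break version"

# Way 2: break it again, then undo that commit with git revert.
echo "broken again" > practical3.qmd
git commit -q -am "break it again"
git revert --no-edit HEAD >/dev/null        # adds a new commit that undoes HEAD

cat practical3.qmd                          # prints: good section
git log --oneline                           # five commits, nothing rewritten
```

Both ways add commits on top of the history, which is why they are safe on a repository you have already pushed.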
3.4 Push
git push
Exercise 4: Clone, edit, and push
This exercise gets you comfortable with the clone → edit → commit → push cycle — the daily workflow once a repo already exists on GitHub.
Get the SSH URL of your repository: green Code button on GitHub → SSH tab → copy.
Navigate to a folder outside your existing project:
cd ..
cd .. moves you one level up in the folder structure. If you are not sure where you ended up, run pwd (Mac/Linux) or cd (Windows) to print your current location. Then clone:
git clone git@github.com:YOUR-USERNAME/statprog-debugging.git statprog-copy
cd statprog-copy
Add a short comment to the top of your .qmd, something like # cloned copy, practical 4. Save.
Stage, commit, and push:
git add .
git commit -m "add comment from practical 4"
git push
Optional, RStudio UI: Tick the file checkbox in the Git pane → Commit → type your message → Commit → Push (upward arrow). Requires an R Project in the cloned folder.
Navigate back to your original folder and pull:
cd ..
cd 03-debugging # or whatever your original folder is called
git pull
Your comment should now appear there too.
Optional, RStudio UI: Click the Pull button (downward arrow) in the Git pane.
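The whole clone → edit → commit → push → pull cycle can be rehearsed offline. In this sketch a bare repository in a temporary folder stands in for GitHub, and the folder names original and statprog-copy, the demo identity, and the file content are assumptions for illustration.

```shell
work=$(mktemp -d)
cd "$work"

# Stand-in for GitHub: a bare repository whose default branch is main.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# The "original" folder: one commit, pushed to the stand-in remote.
git init -q original
cd original
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"
echo "original content" > practical3.qmd
git add practical3.qmd
git commit -q -m "add debugging exercise"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# Clone into a second folder, edit, commit, and push.
cd "$work"
git clone -q remote.git statprog-copy
cd statprog-copy
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "<!-- cloned copy, practical 4 -->" >> practical3.qmd
git commit -q -am "add comment from practical 4"
git push -q

# Back in the original folder, pull the new commit.
cd "$work/original"
git pull -q
cat practical3.qmd                          # now contains the comment line
```

The clone already knows its upstream, so plain git push works there without -u.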
Exercise 5: fetch, inspect, then pull
git pull is convenient but it does two things at once: it fetches new commits from the remote and immediately merges them into your local branch. Sometimes you want to see what has changed before merging — especially on a shared repository. That is what git fetch is for.
Make a change on GitHub via the web interface: go to your repository, click your .qmd, click the pencil icon, add a comment line at the top, and commit.
Back in your terminal, fetch without merging:
git fetch origin
Your local files are unchanged. Git has downloaded the new commits but not applied them.
Inspect what came in:
git log HEAD..origin/main --oneline # commits on remote that you don't have yet
git diff HEAD origin/main # line-by-line diff between your local and remote
Now merge the fetched changes:
git merge origin/main
Or equivalently, git pull does fetch + merge in one step. Use fetch + merge when you want to review first; use pull when you trust the remote and just want to sync.
With the upstream set in Exercise 1 and Git's default settings, git pull is shorthand for git fetch origin followed by git merge origin/main. On a solo project the difference rarely matters. On a shared repository, fetching first gives you a chance to see what your collaborators did before those changes land in your working files.
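The fetch, inspect, merge sequence can also be rehearsed offline. In this sketch a bare repository stands in for GitHub and a second clone named collaborator plays the role of the edit made in the web interface; both, along with the folder name mine and the file content, are assumptions for illustration.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# Your local repository, pushed once.
git init -q mine
cd mine
git config user.name "Demo User"            # repo-local identity, demo only
git config user.email "demo@example.com"
echo "line 1" > practical3.qmd
git add practical3.qmd
git commit -q -m "first version"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# A change arrives on the remote from somewhere else.
cd "$work"
git clone -q remote.git collaborator
cd collaborator
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "line 2 from a collaborator" >> practical3.qmd
git commit -q -am "add a line remotely"
git push -q

# Fetch, inspect, then merge.
cd "$work/mine"
git fetch -q origin
cat practical3.qmd                          # still only line 1
incoming=$(git rev-list --count HEAD..origin/main)
echo "commits not merged yet: $incoming"
git log HEAD..origin/main --oneline         # the commit you do not have yet
git diff HEAD origin/main                   # what merging would change
git merge -q origin/main                    # fast-forward to the fetched commit
cat practical3.qmd                          # now both lines
```

Between the fetch and the merge, the new commit exists only as origin/main, so you can review it before it reaches your working files.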
Exercise 6: Reflection Log
- Take a few minutes to write this week’s reflection log.
- Commit and push your reflection log to GitHub.
Some prompts: Did anything not work as expected? What was the most confusing part of connecting your local repo to GitHub? When do you think you will use git fetch instead of git pull?