R · Quarto setup

Getting R, RStudio/Posit Cloud, and Quarto running for the simulation labs

This page gets your software working so you can run the simulation code that appears in the labs. Read it once near the start of the term, get a “hello, probability” report to render, and then come back to it whenever something breaks.

One framing first, because it shapes everything below. In this course, simulation supports probability reasoning — it is not the center. The mathematics is the point: a probability is a number you reason about, and a simulation is a way to check your reasoning by letting the computer play out the random process many times and counting. When you write that the shuttle is late with probability \(0.19\), a short simulation that comes back near \(0.19\) is reassurance that you set the problem up correctly. It is never a substitute for understanding why the number is \(0.19\). So the software here stays deliberately small: a few base-R functions, one file per task, and a habit of writing down what you expected before you run anything.

A second framing about cost. Everything you need is free and open. This course does not use Cengage, WebAssign, MyLab, or any paid statistics platform. R is free, the editors below are free (one of them runs entirely in a browser with nothing to install), and Quarto is free. If a site ever asks you to pay to do the coursework, that is not part of this class — check Blackboard and ask.

What to install

You need three pieces. They stack: R does the computing, the editor is where you write and run code, and Quarto turns a plain-text document with code in it into a clean report.

1. R — the language that does the probability. R is the engine. Download it from CRAN, the Comprehensive R Archive Network, at https://cran.r-project.org. Pick the build for your operating system (Windows, macOS, or Linux), install it with the defaults, and you are done — you will rarely open R directly, because the editor below talks to it for you. We use only base R in this course: the functions that ship with R itself, no add-on packages to install. That keeps setup short and keeps the focus on the probability rather than on tooling.

2. An editor — where you actually work. Choose one of two. You do not need both.

RStudio Desktop (recommended if you want everything on your own machine). It is a free, friendly front end for R: a place to type code, a console that runs it, a panel that shows your plots, and the one-button rendering you will use for reports. Download it from https://posit.co/download/rstudio-desktop/. Install R first (step 1), then RStudio; RStudio finds your R installation automatically.
Posit Cloud (recommended if you would rather not install anything). It is RStudio running in your web browser, hosted by Posit. Create a free account at https://posit.cloud, start a new project, and you have R, the editor, and Quarto already set up and connected — nothing to download. This is the easiest path if you are on a managed or borrowed computer, and the free tier is enough for this course’s small simulations.

Either choice gives you the same experience for our purposes. Pick the one that fits your machine and move on.

3. Quarto — the document system that makes the report. Quarto takes a plain-text file (a .qmd file) that mixes your writing and your code, runs the code, and produces a tidy HTML or PDF report with the results and figures dropped in. Get it from https://quarto.org/docs/get-started/. If you installed a current RStudio Desktop, Quarto came bundled and you can skip the separate download; install it on its own only if quarto check (below) says it is missing. On Posit Cloud, Quarto is already there.

Make R discoverable by Quarto

“Discoverable” just means Quarto can find your R installation so it can run the code in your document. In the common setups this is automatic, but here is how to confirm it and what to do if it is not.

Working in RStudio Desktop or Posit Cloud (the usual case). Open your .qmd file and click the Render button at the top of the editor. RStudio hands the file to Quarto, Quarto runs the R code through the R that RStudio already knows about, and a preview appears. If a report renders, R is discoverable — you are set, and you can stop here.
Confirming from a terminal. Open a terminal (in RStudio: the Terminal tab next to the Console) and run quarto check. It reports the Quarto version, whether it found R, and whether the support packages Quarto needs to run R code (knitr and rmarkdown) are in place. Green checks mean you are ready. If it reports a missing piece, the message names it, and in RStudio the fix is usually to let it install the missing package when prompted.
Posit Cloud is pre-connected. A new Posit Cloud project already has R, the editor, and Quarto wired together, so quarto check there is mostly a sanity confirmation rather than a fix-it step.
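
The same checks can be approximated from the R console. This is a minimal sketch in base R, not a replacement for quarto check. It assumes that an installed Quarto puts a quarto executable on your PATH, and that the support packages in question are knitr and rmarkdown.

```r
# Which R is running? Quarto should end up using this same installation.
r_version <- R.version.string
print(r_version)

# Is a quarto executable visible on the PATH? An empty string means "not found".
quarto_path <- unname(Sys.which("quarto"))
print(quarto_path)

# Are the packages Quarto relies on to run R chunks installed?
# (Assumption: knitr and rmarkdown.)
support   <- c("knitr", "rmarkdown")
installed <- vapply(support, requireNamespace, logical(1), quietly = TRUE)
print(installed)
```

An empty path or a FALSE entry tells you which piece to bring to office hours; it does not by itself tell you how to fix it.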

If you are stuck on this step, do not spend an evening on it alone — bring it to office hours or the class help channel. Getting one report to render is the only setup milestone that matters.

Your first reproducible report

Here is a minimal Quarto document. Create a new file called hello-probability.qmd, paste this in, and render it. It estimates the chance the shuttle is late, which in our running “commuter’s morning” example is \(P(\text{late}) = 0.19\) (a synthetic number chosen for the example, not real data), and checks that a seeded simulation lands nearby.

The block below is shown as text so you can read it as a whole; copy it into your own .qmd and render there.

---
title: "Hello, probability"
format: html
---

We model each weekday as a single trial that is "late" with probability 0.19,
then simulate a term's worth of mornings and compare the simulated rate to 0.19.


```{r}
#| label: simulate-late-shuttle
#| eval: false
set.seed(35003)
days <- 75                                       # mornings in a term
late <- rbinom(n = days, size = 1, prob = 0.19)  # 1 = late, 0 = on time
c(simulated_late_rate = mean(late), expected = 0.19)
```

A few things to understand, because you will see this pattern in every lab.

#| label: simulate-late-shuttle gives the code chunk a name. Labels make error messages point at the right place and let you refer to a result later; get in the habit of naming every chunk something descriptive.
#| eval: false tells Quarto not to run this chunk when this site is built — that is why the code on the course pages is shown but never executed here. In your own copy you set it to #| eval: true (or simply delete the line, since chunks run by default) so the code actually runs and you see the result. Reading code is not the same as running it; you learn the probability by running it.
set.seed(35003) fixes R’s random-number generator to a known starting point. rbinom produces “random” draws from a deterministic stream, and seeding it means you and anyone else who runs the same file get the same draws — so the report is reproducible. rbinom(n = days, size = 1, prob = 0.19) asks for days independent yes/no trials, each late with probability \(0.19\); mean() of those zeros and ones is the simulated late rate, which should sit near \(0.19\).
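
To see what seeding buys you, here is a small sketch you can run in the console. It repeats the same draw twice with the course seed and once with a different seed. The second seed (2024) is an arbitrary choice for illustration, not a course value.

```r
# Two runs with the same seed: the draws are identical.
set.seed(35003)
run_a <- rbinom(n = 75, size = 1, prob = 0.19)

set.seed(35003)
run_b <- rbinom(n = 75, size = 1, prob = 0.19)

same_draws <- identical(run_a, run_b)
print(same_draws)   # TRUE: same seed, same stream, same draws

# A different seed (arbitrary, for illustration only) generally gives a
# different sequence of mornings, and so a different simulated rate.
set.seed(2024)
run_c <- rbinom(n = 75, size = 1, prob = 0.19)

print(c(course_seed = mean(run_a), other_seed = mean(run_c)))
```

Both rates are legitimate estimates of the same probability; only the first matches what a classmate running the course seed will see.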

When you render your copy with #| eval: true, the simulated rate will be close to \(0.19\) but not exactly \(0.19\) — that gap is the lesson. A simulation estimates a probability; it does not replace the exact calculation. The exact value comes from the model, and we keep the simulation as a check.
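
How large a gap should you expect? For \(n\) independent trials with success probability \(p\), the simulated rate has standard deviation \(\sqrt{p(1-p)/n}\), which is about \(0.045\) for \(p = 0.19\) and \(n = 75\). The sketch below computes that typical gap and shows it shrinking as the number of simulated mornings grows. Only the size 75 comes from the report above; the larger sizes are illustrative assumptions.

```r
set.seed(35003)
p <- 0.19

# Standard deviation of a simulated rate based on n independent trials.
se <- function(n) sqrt(p * (1 - p) / n)

# Illustrative sizes; only 75 comes from the hello-probability report.
sizes <- c(75, 750, 7500, 75000)

# One simulated late rate per size.
rates <- vapply(
  sizes,
  function(n) mean(rbinom(n = n, size = 1, prob = p)),
  numeric(1)
)

gap_table <- data.frame(
  n           = sizes,
  simulated   = rates,
  typical_gap = round(se(sizes), 4)
)
print(gap_table)
```

A simulated rate within two or three typical gaps of \(0.19\) is consistent with the model; a rate far outside that range is a cue to recheck the setup.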

Reproducibility habits we keep all term

These three habits make your work trustworthy — to your future self, to a classmate, and to anyone reading your report. They cost almost nothing and they pay off the first time something looks surprising.

Always set.seed(35003) before any simulation. We use the same seed across the whole course so that the numbers in your session match the numbers discussed in class and in the notes. If you change the seed, your draws change, and your “0.19-ish” result will be a different 0.19-ish result — fine in principle, confusing when you are comparing notes. Seed first, simulate second, every time.
End each report with sessionInfo(). This prints your R version, platform, and attached packages, so if a result cannot be reproduced later, the record of what you ran it on is right there in the report. Include it as a final chunk (shown here, as everywhere on this site, with #| eval: false; in your copy let it run):

sessionInfo()

One file per task. Keep each lab or exploration in its own .qmd with a clear name (hello-probability.qmd, lab-02-monte-carlo.qmd, and so on). One file, one question, one set of results. It is far easier to debug, to re-run, and to hand in than a single sprawling file that does five things. When a task is done, its file stands on its own and renders from top to bottom without you having to remember a hidden step.

Together these habits mean any report on your machine can be re-run by you next month, or by a classmate on a different computer, and produce the same numbers. That reproducibility is the whole reason we bother with code in a probability course.

AI Use Note (when you use an assistant)

You may use an AI assistant (a chatbot or a coding helper) to get unstuck — to explain an error message, suggest why a chunk will not render, or remind you what an R function does. The expectation is simple: you remain responsible for everything in your file, and you verify what the tool gives you. An assistant can produce code that runs cleanly and is still wrong about the probability — a simulation that estimates the wrong quantity renders perfectly. So treat assistant output the way you treat your own first draft: check it against the model and against a simulation before you trust it.

When you lean on an assistant, jot down a one-line record of how. Capturing it keeps you honest and makes it easy to explain your process if asked.

| Field | What it captures |
|---|---|
| Tool | Which assistant you used (name and, if relevant, the version). |
| Purpose | What you asked it for, e.g. “explain a Quarto render error” or “remind me of `rbinom`’s arguments.” |
| Verification | How you confirmed the result is correct. This is the load-bearing part: which exact calculation or simulation you checked it against, and whether it agreed. Output you did not verify is output you cannot rely on. |

The Verification row is the one that matters. If you cannot say how you checked an assistant’s answer, you do not yet know whether it is right, and you should not put it in your report.
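
As an illustration of what a verification can look like, suppose an assistant gave you code for the chance of at least one late morning in a five-day week. That question is hypothetical and not taken from a lab. A minimal check, assuming mornings are independent with \(P(\text{late}) = 0.19\), compares the exact value \(1 - (1 - 0.19)^5 \approx 0.651\) with a seeded simulation:

```r
set.seed(35003)
p <- 0.19
days_per_week <- 5   # hypothetical question, not one of the labs

# Exact value from the model: complement of "no late morning all week".
exact <- 1 - (1 - p)^days_per_week

# Simulation check: count late mornings in each of many simulated weeks,
# then take the share of weeks with at least one late morning.
weeks       <- 10000
late_counts <- rbinom(n = weeks, size = days_per_week, prob = p)
simulated   <- mean(late_counts >= 1)

print(c(exact = exact, simulated = simulated))
```

If the two numbers agree to within sampling noise, that agreement is what goes in the Verification field. If they do not, the code or the model is wrong, and it is your job to find out which.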

Public vs. graded

These notes, the examples, and the practice here are public and ungraded — study material only. No graded prompts, answer keys, rubrics, point values, or due dates appear on this site. Graded checkpoints, quizzes, homework, labs, the midterm, the project, and the final live in Blackboard (the LMS), which is authoritative for due dates, submissions, and grades. If this page and Blackboard ever disagree, follow Blackboard.