Week 15 — Final review & synthesis

How the whole course fits together

The week question

After fifteen weeks, here is the question that should have a clean answer: when you meet a new problem about uncertainty, what do you actually do? Not “which formula,” but “which moves” — how do you set up a model, decide what to condition on, summarize what you expect, and reason about what happens when things repeat. This is a one-day synthesis week (our last class meeting is Mon Dec 7), so the goal is not to add new machinery. It is to step back and see that the whole semester has been one connected chain of ideas, not a pile of separate tricks. We will walk that chain end to end using the case we have carried since Week 1 — Maya’s commuter morning — recalling each tool by the number it produced. By the end you should be able to look at the arc and say, in your own words, why each link follows from the one before it.

Why this matters

A cumulative final is not a memory test for thirty disconnected facts. It rewards students who can recognize what kind of problem they are looking at and reach for the right move. Someone who has memorized the binomial pmf but cannot tell when a situation is binomial will stall; someone who understands that “fixed number of independent yes/no trials with the same success chance” is the binomial will reconstruct the formula if they have to. Synthesis week is where that recognition gets built. It is also where you turn a semester of notes into a study plan that actually works — using retrieval and spacing instead of rereading — so the last stretch before the exam is efficient rather than frantic.

There is a second, longer payoff. The habits this course drills — name the sample space, state the assumptions, condition carefully, summarize with expectation and variance, reason about averages — are the habits you will use whenever you face data, risk, or a decision under incomplete information, long after STAT 35003 is behind you. The final is a checkpoint; the reasoning is the keepsake.

Learning goals

By the end of this week you should be able to:

  • Trace the arc of the course from uncertainty through the limit theorems, naming each tool and saying what problem it solves and what it assumes.
  • Pick the right tool for a described situation — recognize when to count, when to condition, when a random variable and its expectation are the right summary, which distribution family fits, and when a limit theorem applies.
  • State the four big ideas in plain words: a probability needs a model plus assumptions; conditioning is updating; expectation and variance summarize a distribution; limits explain why averages are stable.
  • Study for a cumulative exam well, using active retrieval and spaced practice rather than passive rereading, and self-test against the course’s worked cases.
  • Reconstruct the commuter’s-morning thread from memory — the chain \(0.81 \to 0.60 \to 0.162 \to E[X]=5 \to \text{Var}=2.5 \to 0.0547 \to 15\text{ min} \to \rho\approx0.35 \to \text{LLN/CLT}\) — and explain how each number followed from the previous tool.

Core vocabulary

This week introduces no new terms; it re-collects the ones the course already built. Treat the list as a checklist — if any entry feels unfamiliar, that is your signal for where to spend review time.

  • Probability model — the package of a sample space \(\Omega\), an assignment \(P(\cdot)\), and the assumptions that justify it (Week 1).
  • Conditional probability \(P(A\mid B)\) — the probability of \(A\) once we know \(B\) happened; the formal name for updating on information (Weeks 3, 5).
  • Independence \(A \perp B\) — knowing one event tells you nothing about the other, \(P(A\cap B)=P(A)P(B)\) (Week 4).
  • Random variable \(X\) — a number attached to each outcome; described by a pmf \(p(x)\) (discrete) or density \(f(x)\) (continuous) and a cdf \(F(x)=P(X\le x)\) (Weeks 7, 10).
  • Expectation and variance \(E[X]\), \(\text{Var}(X)=\sigma^2\) — the center and the spread that summarize a distribution (Week 8).
  • Distribution family — Binomial, Geometric, Poisson, Exponential, Normal: named models with parameters chosen to fit a situation (Weeks 9, 11).
  • Covariance and correlation \(\text{Cov}(X,Y)\), \(\rho\in[-1,1]\) — how two random variables move together (Week 12).
  • Law of large numbers / central limit theorem — why a sample mean settles toward \(E[X]\), and why the average of many independent pieces is approximately Normal (Week 13).

Concept development

The four big ideas

If you remember only four sentences from the whole term, make them these. Every formula on the exam is a special case of one of them.

  • A probability needs a model plus assumptions. The number \(0.81\) means nothing until you say probability of what, in what sample space, assuming what. Two careful people can get different numbers because they built different, defensible models. Always ask “model of what, assuming what?”
  • Conditioning is updating. \(P(A\mid B)\) is what you believe about \(A\) once you learn \(B\). Independence is just the case where the update does nothing; Bayes’ rule is the case where you run the update backwards. Most of the course is variations on this single move.
  • Expectation and variance summarize a distribution. \(E[X]\) is the long-run average, the balance point; \(\text{Var}(X)\) measures how widely values scatter around it, with \(\sqrt{\text{Var}(X)}\) giving the typical distance. Together these two numbers compress an entire pmf or density into “where” and “how spread.”
  • Limits explain why averages are stable. Individual outcomes are unpredictable, but their average is not: the LLN pins it to \(E[X]\) and the CLT says its distribution is approximately Normal once enough independent pieces are averaged. This is the bridge from probability to the statistics you will meet next.

How to study for a cumulative final

The research-backed moves are simple and most students under-use them.

  • Retrieval practice beats rereading. Close the notes and try to produce the answer — state the addition rule, write the binomial pmf, explain why rain and lateness are dependent. The effort of pulling it from memory is what builds durable recall; rereading feels productive but mostly builds false confidence. Reread only to check after you have tried.
  • Space your sessions. Three sessions of forty minutes across three days beat one two-hour cram. Spacing forces repeated retrieval and lets the harder links resurface, which is where the learning happens. Revisit a topic after you have started to forget it, not before.
  • Interleave problem types. Mix a conditioning problem, a counting problem, and a distribution problem in one sitting rather than doing ten of each kind in a block. The hard skill on a cumulative exam is recognizing which tool applies, and you only practice recognition when types are mixed.
  • Self-test against the course’s worked cases. Cover the commuter’s-morning numbers and try to regenerate each one — \(0.81\), \(0.162\), \(E[X]=5\), \(0.0547\), \(\rho\approx0.35\) — explaining the move, not just the arithmetic. If you can rebuild the thread, you understand the course.
  • Diagnose, don’t just grind. When you miss a self-check, ask which link broke — was it the model, the conditioning, the summary, or the limit? Fix the link, not the symptom.

Worked examples

Worked example — the commuter’s morning, recalled end to end (the recurring slice)

Synthetic data; seed set. Here is the whole thread in one place, each number tagged with the move that produces it. This is the single most useful object to study: if you can walk it from memory, you have walked the course.

Setup. Maya is a commuter student. It rains with probability \(P(\text{rain})=0.30\). The shuttle is on time on \(60\%\) of rainy mornings and \(90\%\) of dry mornings.

Wk 1–2 — a model and the rules. The on-time probability is not given; it is built by the law of total probability from the rain-conditioned rates: \[ P(\text{on time}) = P(\text{on time}\mid \text{rain})P(\text{rain}) + P(\text{on time}\mid \text{no rain})P(\text{no rain}) = (0.60)(0.30) + (0.90)(0.70) = 0.81 . \] The complement rule then gives \(P(\text{late}) = 1 - 0.81 = 0.19\).
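If you want to check the Week 1 and 2 arithmetic by machine, a minimal sketch follows. The variable names are illustrative choices, not notation from the course; the three rates are the ones in the setup.

```python
# Law of total probability for the shuttle model.
# Names are illustrative; the rates come from the setup above.
p_rain = 0.30
p_on_time_given_rain = 0.60
p_on_time_given_dry = 0.90

# Weight each conditional rate by the chance of its weather condition.
p_on_time = (p_on_time_given_rain * p_rain
             + p_on_time_given_dry * (1 - p_rain))

# Complement rule: late is "not on time".
p_late = 1 - p_on_time

print(round(p_on_time, 2), round(p_late, 2))
```

The two printed values should match the \(0.81\) and \(0.19\) above.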

Wk 3–4 — conditioning and (non-)independence. The conditional rate \(P(\text{on time}\mid \text{rain}) = 0.60\) is not the marginal \(0.81\), so \[ P(\text{on time}\mid \text{rain}) = 0.60 \ne 0.81 = P(\text{on time}) \quad\Longrightarrow\quad \text{on time and rain are dependent.} \] Rain carries information about lateness — knowing it rained should change your guess.

Wk 5 — Bayes, running the update backwards. Switch to the lateness rates \(P(\text{late}\mid \text{rain})=0.40\) and \(P(\text{late}\mid \text{no rain})=0.10\), so \(P(\text{late})=0.19\) as before. Given that Maya was late, how likely is it that it rained? \[ P(\text{rain}\mid \text{late}) = \frac{P(\text{late}\mid \text{rain})P(\text{rain})}{P(\text{late})} = \frac{(0.40)(0.30)}{0.19} = \frac{0.12}{0.19} \approx 0.632 . \] The headline Bayes number of the course is the screening posterior: with prevalence \(0.02\), sensitivity \(0.95\), and a \(0.10\) false-positive rate, a positive test gives \(P(D\mid +) = 0.019/0.117 \approx 0.162\) — a vivid reminder that a “\(95\%\) accurate” test on a rare condition still leaves a positive mostly a false alarm.
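Both Bayes numbers come from the same move, so one small function can produce them. This is a sketch under the assumption of a two-hypothesis model (rain or no rain, condition or no condition). The function name `posterior` and its argument names are hypothetical, chosen only for this illustration.

```python
def posterior(prior, likelihood, false_alarm):
    """Return P(H | E) for a two-hypothesis model.

    prior       = P(H)
    likelihood  = P(E | H)
    false_alarm = P(E | not H)
    """
    # Denominator: P(E) by the law of total probability.
    evidence = likelihood * prior + false_alarm * (1 - prior)
    return likelihood * prior / evidence

# Rain given late: prior 0.30, P(late | rain) 0.40, P(late | no rain) 0.10.
p_rain_given_late = posterior(prior=0.30, likelihood=0.40, false_alarm=0.10)

# Screening: prevalence 0.02, sensitivity 0.95, false-positive rate 0.10.
p_condition_given_positive = posterior(prior=0.02, likelihood=0.95,
                                       false_alarm=0.10)

print(round(p_rain_given_late, 3), round(p_condition_given_positive, 3))
```

The same function handles both cases because only the prior and the two conditional rates change. The structure of the update does not.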

Wk 6–9 — counting, random variables, discrete models. Maya guesses a \(10\)-question true/false quiz, \(p=0.5\). Let \(X\) be the number correct. There are \(\binom{10}{k}\) ways to get \(k\) right out of \(2^{10}=1024\) equally likely answer patterns, so \(X \sim \text{Binomial}(10, 0.5)\) with \(p(x) = \binom{10}{x}(0.5)^{10}\). Its summaries are \[ E[X] = np = (10)(0.5) = 5, \qquad \text{Var}(X) = np(1-p) = (10)(0.5)(0.5) = 2.5 . \] The chance of acing it (eight or more correct) is \[ P(X \ge 8) = \frac{\binom{10}{8}+\binom{10}{9}+\binom{10}{10}}{1024} = \frac{45 + 10 + 1}{1024} = \frac{56}{1024} \approx 0.0547 . \]
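One way to rebuild the quiz numbers without trusting the shortcut formulas is to compute the pmf directly and take the mean, variance, and tail from their definitions. The sketch below uses only the standard library; `binom_pmf` is a hypothetical helper name.

```python
from math import comb

n, p = 10, 0.5

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance straight from the definitions, not from np and np(1-p).
mean_x = sum(x * px for x, px in enumerate(pmf))
var_x = sum((x - mean_x) ** 2 * px for x, px in enumerate(pmf))

# Upper tail: eight or more correct.
p_at_least_8 = sum(pmf[8:])

print(mean_x, var_x, round(p_at_least_8, 4))
```

Getting the same values as \(np\) and \(np(1-p)\) from the raw definitions is a quick check that the shortcuts are consequences of the model, not separate facts to memorize.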

Wk 10–11 — continuous models. Shuttles arrive as a Poisson process at rate \(\lambda = 4\) per hour (one every \(15\) minutes). The wait \(T\) until the next shuttle is continuous, with probability read as area under a density: \(T \sim \text{Exponential}(\text{rate } \lambda = 4/\text{hr})\), mean \(1/\lambda = 15\) minutes, and \[ P(T \le 15\text{ min}) = 1 - e^{-\lambda t} = 1 - e^{-(4)(0.25)} = 1 - e^{-1} \approx 0.632 . \] Maya’s commute time is \(C \sim \text{Normal}(\mu = 22, \sigma = 5)\) minutes, and \(P(C \le 30) = \Phi(1.6) \approx 0.945\).
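The two continuous probabilities can be checked with the standard library. In this sketch the Exponential cdf is written out by hand and the Normal cdf comes from `statistics.NormalDist`. The variable names are illustrative, and the only care needed is keeping the time unit (hours) consistent with the rate.

```python
from math import exp
from statistics import NormalDist

# Exponential wait: rate is per hour, so 15 minutes must be 0.25 hours.
rate_per_hour = 4.0
t_hours = 15 / 60
mean_wait_minutes = 60 / rate_per_hour
p_wait_within_15 = 1 - exp(-rate_per_hour * t_hours)

# Normal commute: mean 22 minutes, standard deviation 5 minutes.
commute = NormalDist(mu=22, sigma=5)
z = (30 - 22) / 5
p_commute_within_30 = commute.cdf(30)

print(mean_wait_minutes, round(p_wait_within_15, 3))
print(z, round(p_commute_within_30, 3))
```

Note that about \(63\%\) of waits are shorter than the mean wait, which is a reminder that the Exponential is skewed and its mean is not its median.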

Wk 12 — joint dependence, made numeric. With \(X=\mathbb{1}\{\text{rain}\}\) and \(Y=\mathbb{1}\{\text{late}\}\) and the joint probabilities \(P(1,1)=0.12\), \(P(1,0)=0.18\), \(P(0,1)=0.07\), \(P(0,0)=0.63\), \[ \text{Cov}(X,Y) = E[XY] - E[X]E[Y] = 0.12 - (0.30)(0.19) = 0.063, \qquad \rho = \frac{0.063}{\sqrt{(0.21)(0.1539)}} \approx 0.35 . \] The positive correlation is the Week 3–4 “not independent” observation turned into a number: rain goes with lateness.
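The covariance and correlation can be recomputed from the joint table alone, which also recovers the two marginals. A minimal sketch, with the helper name `expect` invented for this illustration:

```python
from math import sqrt

# Joint pmf of (X, Y) = (rain indicator, late indicator), from the text.
joint = {(1, 1): 0.12, (1, 0): 0.18, (0, 1): 0.07, (0, 0): 0.63}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)          # P(rain)
e_y = expect(lambda x, y: y)          # P(late)
cov_xy = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
rho = cov_xy / sqrt(var_x * var_y)

print(round(e_x, 2), round(e_y, 2), round(cov_xy, 3), round(rho, 2))
```

If rain and lateness were independent, \(E[XY]\) would equal \(E[X]E[Y]\) and both the covariance and \(\rho\) would be zero.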

Wk 13 — limits. Average \(n\) independent commute times \(C \sim \text{Normal}(22, 5)\). The LLN says the sample mean \(\bar{C}_n \to 22\) as \(n\) grows; the CLT sharpens this to \[ \bar{C}_n \;\approx\; \text{Normal}\!\left(22, \frac{5}{\sqrt{n}}\right), \] so the spread of the average shrinks like \(1/\sqrt{n}\). That is why a semester of averaged commutes is far more predictable than any single morning — the capstone idea the whole arc was building toward.
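The \(1/\sqrt{n}\) shrinkage is easy to see in a small simulation. The sketch below is not the course lab code: the seed value, the \(2000\) replicates, the sample sizes, and the name `sample_mean` are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

random.seed(15)          # arbitrary seed, chosen only for reproducibility
MU, SIGMA = 22, 5        # commute model from the text, in minutes
REPLICATES = 2000        # assumed number of simulated averages per n

def sample_mean(n):
    """Average of n independent simulated commute times."""
    return mean([random.gauss(MU, SIGMA) for _ in range(n)])

results = {}
for n in (1, 25, 100):
    means = [sample_mean(n) for _ in range(REPLICATES)]
    # Center of the averages, observed spread, and the spread theory predicts.
    results[n] = (mean(means), stdev(means), SIGMA / n ** 0.5)
    print(n, [round(v, 2) for v in results[n]])
```

For each \(n\), the observed spread of the simulated averages should land close to \(5/\sqrt{n}\), while the center stays near \(22\). Exact digits depend on the seed and the Python version, so treat the printed values as approximate.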

Worked example — a new clinic-callback problem (transfer)

Synthetic context; no real data; seed set. The point of synthesis is that you can drop the whole toolkit onto a situation you have never seen. A small clinic calls patients to confirm appointments. On any given call, the patient picks up with probability \(0.40\), independently across calls. Walk the same links.

Model and conditioning. The sample space for one call is \(\{\text{pickup}, \text{no pickup}\}\) with \(P(\text{pickup}) = 0.40\), a model with its assumption (calls behave alike and independently) stated up front. If the clinic places \(8\) calls in a block, the natural random variable is \(W\), the number of pickups.

Recognize the family. Fixed number of trials (\(n=8\)), each a yes/no with the same success chance (\(p=0.40\)), independent — that is a binomial: \(W \sim \text{Binomial}(8, 0.40)\), with pmf \(p(w) = \binom{8}{w}(0.40)^w(0.60)^{8-w}\). We did not memorize a new formula; we recognized the shape.

Summarize. Center and spread come straight from the Week 8 results: \[ E[W] = np = (8)(0.40) = 3.2, \qquad \text{Var}(W) = np(1-p) = (8)(0.40)(0.60) = 1.92 . \] So the clinic should plan around roughly three pickups per block of eight, give or take about \(\sqrt{1.92}\approx 1.39\).

Apply a tail and a limit. The chance that no one picks up in a block is \[ P(W = 0) = (0.60)^{8} \approx 0.0168 , \] small but not negligible. And if the clinic runs many such blocks over a week, the average number of pickups per block is, by the LLN, going to settle near \(3.2\) — and by the CLT, that weekly average is approximately Normal — so staffing can be planned around the average even though any single block is unpredictable. Same four moves as the commuter’s morning: build the model, recognize the family, summarize, then reason about the average. That portability is the whole point of the course.
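The clinic numbers follow the same pattern as the quiz, so the check looks almost identical. This sketch assumes the model stated above (independent calls, constant pickup chance) and uses illustrative variable names.

```python
from math import comb, sqrt

n, p = 8, 0.40   # calls per block, pickup probability per call

# Binomial pmf for W = number of pickups in one block.
pmf_w = [comb(n, w) * p**w * (1 - p) ** (n - w) for w in range(n + 1)]

mean_w = n * p
var_w = n * p * (1 - p)
sd_w = sqrt(var_w)
p_no_pickups = pmf_w[0]          # same as (1 - p) ** n

print(round(mean_w, 2), round(var_w, 2), round(sd_w, 2),
      round(p_no_pickups, 4))
```

Only `n` and `p` changed from the quiz example, which is the point of recognizing the family rather than memorizing a formula for each story.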

A common mistake

The signature synthesis-week error is reaching for a formula before identifying the situation. Students who studied by memorizing pmfs and integrals often pattern-match on surface words — “test” \(\to\) Bayes, “how many” \(\to\) binomial — and apply a formula whose assumptions do not hold. The clinic problem is binomial only because the calls are independent with a constant success chance; change either assumption and the formula is wrong even though the words look the same.

The repair is to lead with recognition, not recall. Before writing anything, ask the diagnostic questions in order: What is the sample space and what are the assumptions? Am I being asked to condition or update? Is the unknown a number — a random variable — and if so does it fit a named family? Do I need a summary (\(E\), \(\text{Var}\)) or a probability, and is a limit theorem in play? Two narrower slips ride along with this one: confusing a density \(f(x)\) with a probability (it is not — only its area over an interval is), and treating dependent events as independent (multiplying \(P(A)P(B)\) when rain and lateness clearly travel together). On a cumulative exam, naming the situation correctly is most of the work; the arithmetic is the easy part once the model is right.
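The two narrower slips can be made concrete with numbers already in these notes. The sketch below reuses the shuttle wait, \(T \sim \text{Exponential}(4/\text{hr})\), and the rain and lateness probabilities. The function names are hypothetical.

```python
from math import exp

# Slip 1: a density value is not a probability.
rate = 4.0  # shuttles per hour

def density(t):
    """Exponential density f(t), with t in hours."""
    return rate * exp(-rate * t)

def cdf(t):
    """Exponential cdf F(t) = P(T <= t), with t in hours."""
    return 1 - exp(-rate * t)

density_at_zero = density(0.0)          # exceeds 1, so cannot be a probability
prob_first_15_min = cdf(0.25) - cdf(0)  # area under f over [0, 0.25]

# Slip 2: multiplying marginals when the events are dependent.
p_rain, p_late = 0.30, 0.19
p_rain_and_late = 0.12                  # true joint probability from the text
wrong_product = p_rain * p_late         # only valid under independence

print(density_at_zero, round(prob_first_15_min, 3))
print(round(wrong_product, 3), p_rain_and_late)
```

The density at zero is \(4\), which no probability can be, and the independence shortcut gives \(0.057\) where the model says \(0.12\), less than half the true value.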

Low-stakes self-checks (ungraded)

These are for your own thinking — ungraded, self-check, no submission. Cover the answers in your notes and try to produce each one before you look; that retrieval effort is the study method this week is recommending.

  1. From memory, write the chain of the commuter’s-morning numbers in order — \(0.81\), \(0.632\), \(0.162\), \(E[X]=5\), \(\text{Var}=2.5\), \(0.0547\), \(15\) min, \(\rho\approx0.35\) — and name the move that produces each. Where does conditioning first show up?
  2. A streaming service finds that \(5\%\) of accounts are fraudulent. Its detector flags \(90\%\) of fraudulent accounts and \(8\%\) of legitimate ones. Given that an account is flagged, how likely is it to be fraudulent? Which Week 5 tool is this, and why is the answer smaller than your gut expects?
  3. You are told \(X\) counts the number of defective items in a shipment of \(20\), each defective with probability \(0.03\), independently. Name the family, write \(E[X]\) and \(\text{Var}(X)\), and say which assumption would break the binomial if it failed.
  4. In one sentence each, state the four big ideas of the course, then point to one commuter’s-morning number that illustrates each.
  5. Explain, without formulas, why averaging many independent commute times is more predictable than a single commute, and name the two theorems that make that precise.

Reading and source pointer

This synthesis week ranges across Grinstead & Snell, Chapters 1–9 rather than tracking a single chapter: Ch 1 for sample spaces and the basic rules, Ch 4 for conditional probability, independence, and Bayes, Ch 3 for counting, Ch 5–6 for random variables, expectation, variance, and the named families, Ch 7 for joint distributions and dependence, and Ch 8–9 for the law of large numbers and the central limit theorem. For the shape of an effective cumulative review — organizing around big ideas and recognition rather than isolated formulas — the MIT OCW 18.05 review and orientation material is a useful companion (used for orientation only; nothing is reproduced).

These notes are the course’s own synthesis, grounded in but not copied from the sources. All example data are synthetic, with seeds set.

Public vs. graded

These notes, the examples, and the practice here are public and ungraded — study material only. No graded prompts, answer keys, rubrics, point values, or due dates appear on this site. Graded checkpoints, quizzes, homework, labs, the midterm, the project, and the final live in Blackboard (the LMS), which is authoritative for due dates, submissions, and grades. If this page and Blackboard ever disagree, follow Blackboard.

Looking ahead

This is the last set of notes for the term. Our final class meeting is Mon Dec 7, with a consultation day Dec 8 for individual questions. The final-exam window is Dec 9–15, and the exact block is announced through Blackboard — treat this note as orientation only; it contains no exam items, no schedule beyond that window, and no graded content of any kind. Beyond this course, the same arc keeps going: the central limit theorem you met in Week 13 is the doorway to inferential statistics — confidence intervals, hypothesis tests, and the Bayesian updating you have been practicing all term, now applied to data rather than to a single morning’s shuttle. You already own the reasoning; the next course supplies the data.

See also

  • Notation glossary — every fixed symbol used across the course, collected in one place for final review.
  • Distribution reference — the parameterizations of the Binomial, Geometric, Poisson, Exponential, and Normal families to check before the exam.
  • Course syllabus — schedule, policies, the final-exam window, and where graded work actually lives.

(Week 15 has no companion lab; the simulation labs pair with Weeks 2, 5, 9, and 13 — revisit the law-of-large-numbers and CLT lab when you review the limit theorems.)