Notation glossary

The symbols and conventions this course commits to

Probability has more than one perfectly respectable way to write almost everything: complements, distributions, and even the humble Normal all come in competing notations. Different textbooks make different choices, and the disagreements are quiet — a wrong reading of a symbol renders just as cleanly as a right one. This page collects the choices this course commits to, so that when you meet a symbol in the week notes you can look it up here and know exactly what it means and what it does not mean.

This mirrors the instructor’s private notation ledger; it is an original reference, not reproduced from any source. The conventions are the course’s own synthesis, grounded in but not copied from Grinstead & Snell or any other text.

Core symbols

Read this table as the course’s working dictionary. The middle column says what the symbol means; the right column flags the choice we made when more than one was on the table.

| Symbol | Meaning | Convention chosen |
| --- | --- | --- |
| \(P(\cdot)\) | the probability of an event | always \(P(\cdot)\), never \(\Pr\) |
| \(A^{c}\) | the complement of event \(A\): everything in \(\Omega\) that is not in \(A\) | \(A^{c}\), never \(A'\) or \(\bar{A}\) |
| \(A\cup B,\; A\cap B\) | union (“\(A\) or \(B\)”) and intersection (“\(A\) and \(B\)”) | “or” is inclusive: \(A\cup B\) allows both |
| \(P(A\mid B)\) | the conditional probability of \(A\) given that \(B\) happened | defined only when \(P(B)>0\) |
| \(A \perp B\) | \(A\) and \(B\) are independent | independence, not mutual exclusivity |
| \(X,\, Y\) | random variables | capitals for the variable; lowercase \(x,y\) for values |
| \(p(x)\) | probability mass function of a discrete \(X\) | \(p(x)=P(X=x)\) |
| \(f(x)\) | probability density function of a continuous \(X\) | a density, not a probability |
| \(F(x)\) | cumulative distribution function | \(F(x)=P(X\le x)\) |
| \(E[X]\) | the expectation (long-run average) of \(X\) | call the value \(\mu\) when you name it |
| \(\operatorname{Var}(X),\; \sigma^{2}\) | the variance of \(X\) | the two notations mean the same thing |
| \(\sigma\) | the standard deviation | \(\sigma=\sqrt{\operatorname{Var}(X)}\) |
| \(\operatorname{Cov}(X,Y),\; \rho\) | covariance and correlation | \(\rho\in[-1,1]\); covariance is unbounded |
| \(X\sim \text{Dist}(\cdot)\) | “\(X\) is distributed as” the named distribution | parameterizations fixed below |

A few of these are worth saying out loud. The bar in \(P(A\mid B)\) is not division — it is the word “given,” and the whole expression only makes sense when the conditioning event has positive probability. The symbol \(p(x)\) and the phrase \(P(X=x)\) are interchangeable for a discrete variable, but \(f(x)\) and a probability are not interchangeable: a density can exceed \(1\), and only the area under it is a probability. And \(A\perp B\) (independent) and “\(A\) and \(B\) can’t both happen” (mutually exclusive) are genuinely different ideas — the course keeps them apart deliberately, and so should you.
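The claim that a density can exceed \(1\) while areas under it stay valid probabilities can be checked numerically. The sketch below is only an illustration: it uses Python's standard library rather than the course's own software, borrows the rate \(4\) from the course's Exponential case, and the helper names `exp_density` and `area_under` are hypothetical.

```python
import math

RATE = 4.0  # illustrative rate, borrowed from the course's Exponential(lambda = 4/hr) case


def exp_density(x: float, rate: float = RATE) -> float:
    """Exponential density f(x) = rate * exp(-rate * x) for x >= 0, else 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def area_under(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# A density value: this is a height, and it is allowed to exceed 1.
height_at_zero = exp_density(0.0)

# An area under the density: this is a probability, so it lies in [0, 1].
prob_quarter_hour = area_under(exp_density, 0.0, 0.25)

# Closed form for comparison: P(0 <= X <= 0.25) = 1 - exp(-rate * 0.25).
exact = 1 - math.exp(-RATE * 0.25)

print(f"f(0) = {height_at_zero}")
print(f"P(0 <= X <= 0.25) ~= {prob_quarter_hour:.4f} (exact {exact:.4f})")
```

The height at zero equals the rate, which is well above \(1\), while the area over the first quarter hour stays below \(1\), as any probability must.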

The parameterizations we fix

Several standard distributions are parameterized in more than one way, and the conventions conflict across textbooks and software. To avoid silent mistakes, the course fixes one parameterization for each and restates it at the point of use.

  • \(\text{Binomial}(n,p)\): \(n\) independent trials, each a success with probability \(p\). Mean \(np\), variance \(np(1-p)\). In the recurring case, the \(10\)-question guessing quiz is \(X\sim\text{Binomial}(10,0.5)\), so the mean is \(5\) and the variance is \(2.5\).

\[ p(x)=\binom{n}{x}\,p^{x}\,(1-p)^{\,n-x}, \qquad x=0,1,\dots,n. \]

  • \(\text{Geometric}(p)\) — the number of trials up to and including the first success, so the support is \(\{1,2,3,\dots\}\) (it cannot be \(0\) under our convention) and the mean is \(1/p\). This is the “trials-to-first-success” reading, not the “failures-before-the-first-success” reading.

\[ p(x)=(1-p)^{\,x-1}\,p, \qquad x=1,2,3,\dots, \qquad E[X]=\frac{1}{p}. \]

  • \(\text{Poisson}(\lambda)\): \(\lambda\) is the mean count, and a defining feature is that the mean and the variance are equal: mean \(=\) variance \(=\lambda\). In the recurring case, shuttle arrivals over an hour are \(\text{Poisson}(\lambda=4)\).

\[ p(x)=\frac{e^{-\lambda}\,\lambda^{\,x}}{x!}, \qquad x=0,1,2,\dots, \qquad E[X]=\operatorname{Var}(X)=\lambda. \]

  • \(\text{Exponential}(\lambda)\): \(\lambda\) is a rate, not a mean. The mean is its reciprocal, \(1/\lambda\). In the recurring case, the wait for the next shuttle is \(\text{Exponential}(\lambda=4/\text{hr})\), so the mean wait is \(1/\lambda = 15\) minutes.

\[ f(x)=\lambda\,e^{-\lambda x}, \qquad x\ge 0, \qquad E[X]=\frac{1}{\lambda}. \]

  • \(\text{Normal}(\mu,\sigma)\): the two parameters are the mean \(\mu\) and the standard deviation \(\sigma\), not the variance. This matches R’s rnorm(n, mean, sd). When a textbook writes \(\text{Normal}(\mu,\sigma^{2})\) it means the same distribution but lists the variance second; this course always lists the standard deviation second and says so. In the recurring case, commute time is \(C\sim\text{Normal}(\mu=22\text{ min},\,\sigma=5\text{ min})\).

\[ f(x)=\frac{1}{\sigma\sqrt{2\pi}}\;\exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right). \]
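The means and variances quoted in the list above can be recomputed directly from the formulas as written. The following is a minimal sketch in Python's standard library, not the course's own code: the function names are hypothetical, the value \(p=0.25\) for the Geometric is an assumption made only for illustration (the notes give no recurring Geometric case), and the infinite supports are truncated where the remaining tail is negligible.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """p(x) = C(n, x) p^x (1 - p)^(n - x) for x = 0, 1, ..., n."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def geometric_pmf(x: int, p: float) -> float:
    """Trials-to-first-success convention: the support starts at 1, never 0."""
    return (1 - p) ** (x - 1) * p if x >= 1 else 0.0


def poisson_pmf(x: int, lam: float) -> float:
    """p(x) = exp(-lam) lam^x / x! for x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def mean_and_variance(pmf, support):
    """Return (E[X], Var(X)) by summing over the given support."""
    mean = sum(x * pmf(x) for x in support)
    second_moment = sum(x * x * pmf(x) for x in support)
    return mean, second_moment - mean**2


# Binomial(10, 0.5): the guessing quiz from the notes.
binom_mean, binom_var = mean_and_variance(lambda x: binomial_pmf(x, 10, 0.5), range(0, 11))

# Geometric(p): p = 0.25 is a hypothetical value chosen only for illustration.
geom_mean, _ = mean_and_variance(lambda x: geometric_pmf(x, 0.25), range(1, 2000))

# Poisson(4): shuttle arrivals per hour, support truncated at 100.
pois_mean, pois_var = mean_and_variance(lambda x: poisson_pmf(x, 4.0), range(0, 100))

# Exponential(rate = 4 per hour): the mean is the reciprocal of the rate.
exp_mean_minutes = 60 / 4.0

print(f"Binomial(10, 0.5): mean {binom_mean:.3f}, variance {binom_var:.3f}")
print(f"Geometric(0.25):   mean {geom_mean:.3f} (compare 1/p = {1 / 0.25})")
print(f"Poisson(4):        mean {pois_mean:.3f}, variance {pois_var:.3f}")
print(f"Exponential(4/hr): mean wait {exp_mean_minutes} minutes")
```

Under the failures-before-first-success reading, the Geometric mean would instead be \((1-p)/p\), one less than the value computed here, which is why the convention matters.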

The single most common parameterization mistake is reading the second Normal slot as a variance when this course intends a standard deviation (or vice versa). When in doubt, the rule here is: the second number is the SD. All numbers in the examples above are synthetic; seeds are set wherever simulation appears.
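To see how much the second-slot mistake can matter, here is a small sketch using Python's `statistics.NormalDist`, which, like R's rnorm, takes the standard deviation as its second argument. The threshold of \(30\) minutes is a hypothetical choice made for illustration and does not come from the course.

```python
from statistics import NormalDist

MU, SIGMA = 22.0, 5.0  # minutes; the course's recurring commute case

# Course convention: the second slot is the standard deviation.
commute = NormalDist(mu=MU, sigma=SIGMA)

# Textbook convention Normal(mu, sigma^2) would write the same distribution as
# Normal(22, 25). To hand a variance to code that expects an SD, take the root.
variance = 25.0
same_commute = NormalDist(mu=MU, sigma=variance**0.5)

# The mistake: reading the 5 as a variance, which silently shrinks the SD to sqrt(5).
misread = NormalDist(mu=MU, sigma=SIGMA**0.5)

THRESHOLD = 30.0  # hypothetical cutoff in minutes, for illustration only
for label, dist in [("SD = 5 (intended)", commute), ("variance = 5 (misread)", misread)]:
    tail = 1 - dist.cdf(THRESHOLD)
    print(f"{label}: P(C > {THRESHOLD:g} min) = {tail:.5f}")
```

With the intended reading the cutoff sits \(1.6\) standard deviations above the mean; with the misreading it sits more than \(3.5\) above, so the tail probability shrinks by orders of magnitude even though both calls run without error.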

Words we keep distinct

Some pairs of terms get blurred in casual speech but mean different things in probability. The course keeps each pair separate on purpose, and most graded misunderstandings trace back to collapsing one of them.

  • Probability vs. conditional probability. \(P(A)\) is the probability of \(A\) with no extra information; \(P(A\mid B)\) is the probability of \(A\) once you know \(B\) happened. They can be wildly different. In the recurring case \(P(\text{on time})=0.81\), but \(P(\text{on time}\mid\text{rain})=0.60\). Conditioning on rain changes the answer, which is exactly what it means for the two events not to be independent.

  • Independent vs. mutually exclusive. \(A\perp B\) (independent) means knowing one tells you nothing about the other: \(P(A\cap B)=P(A)P(B)\). Mutually exclusive means the two cannot both occur, so \(P(A\cap B)=0\). These are not the same — in fact two events with positive probability that are mutually exclusive are necessarily dependent, because one happening rules the other out.

  • pmf vs. density. A probability mass function \(p(x)=P(X=x)\) gives an actual probability at each value of a discrete variable, and these probabilities sum to \(1\). A density \(f(x)\) for a continuous variable is not a probability — it can be larger than \(1\), and a single point carries probability \(0\). Only the area under \(f\) over an interval is a probability: \(P(a\le X\le b)=\int_a^b f(x)\,dx\).

  • Variance vs. standard deviation. The variance \(\operatorname{Var}(X)=\sigma^{2}\) is measured in squared units; the standard deviation \(\sigma=\sqrt{\operatorname{Var}(X)}\) is in the original units and is the one you can compare directly to the values of \(X\). For the guessing quiz, \(\operatorname{Var}(X)=2.5\) while \(\sigma\approx 1.58\).

  • Covariance vs. correlation. Covariance \(\operatorname{Cov}(X,Y)=E[XY]-E[X]E[Y]\) measures whether two variables move together, but its scale depends on the units, so its size alone says little. Correlation \(\rho=\operatorname{Cov}(X,Y)/(\sigma_X\sigma_Y)\) rescales it to the fixed range \([-1,1]\), where the sign is the direction and the magnitude is the strength. For the (rain, late) pair in the recurring case the covariance is \(0.063\) and the correlation is \(\rho\approx 0.35\) — same story, but correlation is the one you can interpret on a universal scale.
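The recurring-case figures quoted in this list fit together, and a short calculation makes the links explicit. This sketch rests on two assumptions not stated in the notes: that “late” is the complement of “on time,” and that rain and lateness are encoded as 0/1 indicator variables. The rain probability is not given in the notes; it is back-solved here from the stated covariance.

```python
import math

# Figures stated in the notes for the recurring case.
p_on_time = 0.81
p_on_time_given_rain = 0.60
cov_rain_late = 0.063

# Assumption: "late" is the complement of "on time".
p_late = 1 - p_on_time
p_late_given_rain = 1 - p_on_time_given_rain

# For 0/1 indicators R (rain) and L (late):
#   Cov(R, L) = P(R and L) - P(R) P(L) = P(R) * (P(L | R) - P(L)).
# Solving for P(R) gives the rain probability implied by the stated covariance.
p_rain = cov_rain_late / (p_late_given_rain - p_late)

# Independence would require P(R and L) = P(R) P(L); compare the two sides.
p_rain_and_late = p_rain * p_late_given_rain
independent = math.isclose(p_rain_and_late, p_rain * p_late)

# An indicator with success probability q has variance q (1 - q).
sd_rain = math.sqrt(p_rain * (1 - p_rain))
sd_late = math.sqrt(p_late * (1 - p_late))
rho = cov_rain_late / (sd_rain * sd_late)

print(f"implied P(rain)        = {p_rain:.3f}")
print(f"P(rain and late)       = {p_rain_and_late:.3f}")
print(f"P(rain) * P(late)      = {p_rain * p_late:.3f}")
print(f"independent?           = {independent}")
print(f"correlation rho        = {rho:.3f}")
```

The joint probability is positive, so rain and lateness are not mutually exclusive, and it differs from the product of the marginals, so they are not independent either. The two ideas fail here for separate reasons.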

Public vs. graded

These notes, the examples, and the practice here are public and ungraded — study material only. No graded prompts, answer keys, rubrics, point values, or due dates appear on this site. Graded checkpoints, quizzes, homework, labs, the midterm, the project, and the final live in Blackboard (the LMS), which is authoritative for due dates, submissions, and grades. If this page and Blackboard ever disagree, follow Blackboard.