Probability
Probability Rundown
Here we provide a brief summary of probability rules we will be using.
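A random variable represents an event whose outcome is uncertain. A probability distribution is an assignment of weights to outcomes. Probability distributions must satisfy two properties: every weight lies between zero and one, \(0 \leq P(\omega) \leq 1\) for each outcome \(\omega\), and the weights sum to one, \(\sum_{\omega} P(\omega) = 1\).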
We use the notation \(P(A, B, C)\) to denote the joint distribution of the variables \(A\), \(B\), \(C\). In joint distributions ordering does not matter, i.e. \(P(A, B, C) = P(C, B, A)\).
We can expand a joint distribution using the chain rule: \(P(A, B, C) = P(A) \, P(B \mid A) \, P(C \mid A, B)\).
The marginal distribution of \(A, B\) can be obtained by summing out all possible values that variable \(C\) can take as \(P(A, B) = \sum_c P(A, B, C = c)\). The marginal distribution of \(A\) can also be obtained as \(P(A) = \sum_b \sum_c P(A, B = b, C = c)\).
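Marginalization can be sketched in a few lines of Python. The joint table below is hypothetical (the numbers are made up for illustration), and the helper name `marginalize` is an assumption of this sketch, not something defined in these notes.

```python
from collections import defaultdict

# Hypothetical joint distribution P(A, B, C) over three binary variables,
# stored as {(a, b, c): probability}. The numbers are illustrative only.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.25, (1, 1, 1): 0.10,
}


def marginalize(table, keep):
    """Sum out every variable whose position is not listed in `keep`."""
    result = defaultdict(float)
    for assignment, p in table.items():
        # Rows that agree on the kept variables are added together.
        key = tuple(assignment[i] for i in keep)
        result[key] += p
    return dict(result)


p_ab = marginalize(joint, keep=(0, 1))  # P(A, B) = sum_c P(A, B, C = c)
p_a = marginalize(joint, keep=(0,))     # P(A) = sum_b sum_c P(A, B = b, C = c)
```

With this table, `p_ab` has four entries and `p_a` has two, and each result still sums to 1 up to floating-point rounding.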
When we do operations on probability distributions, sometimes we get distributions that do not necessarily sum to 1. To fix this, we normalize: take the sum of all entries in the distribution and divide each entry by this sum.
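As a minimal sketch of normalization, assuming a table stored as a Python dictionary from outcomes to nonnegative weights (the outcome names and weights are hypothetical):

```python
def normalize(table):
    """Divide every entry by the total so that the entries sum to 1."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("cannot normalize a table whose entries sum to 0")
    return {outcome: p / total for outcome, p in table.items()}


# Hypothetical unnormalized table: the entries sum to 0.4, not 1.
unnormalized = {"sun": 0.3, "rain": 0.1}
normalized = normalize(unnormalized)
```

Here `normalized` assigns roughly 0.75 to `"sun"` and 0.25 to `"rain"`. The relative weights are unchanged; only the scale is fixed.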
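Conditional probabilities assign probabilities to events conditioned on some known facts: \(P(A \mid B = b)\) is the distribution of \(A\) given that \(B\) is known to equal \(b\). Conditional probabilities are defined as \(P(A \mid B) = \frac{P(A, B)}{P(B)}\) whenever \(P(B) > 0\).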
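The sketch below computes a conditional distribution from a joint table, using the standard definition of a conditional as the joint divided by the marginal of the conditioning variable. The table values and the helper name `conditional_on_b` are assumptions made for illustration.

```python
# Hypothetical joint P(A, B) stored as {(a, b): probability}.
joint = {
    ("a0", "b0"): 0.3, ("a0", "b1"): 0.1,
    ("a1", "b0"): 0.2, ("a1", "b1"): 0.4,
}


def conditional_on_b(table, b):
    """Return P(A | B = b) = P(A, B = b) / P(B = b) as {a: probability}."""
    # P(B = b) is the sum of all joint entries whose B value matches.
    p_b = sum(p for (_, b_val), p in table.items() if b_val == b)
    if p_b == 0:
        raise ValueError("the conditioning event has probability 0")
    return {a: p / p_b for (a, b_val), p in table.items() if b_val == b}


p_a_given_b1 = conditional_on_b(joint, "b1")
```

For this table \(P(B = b1) = 0.5\), so the result assigns roughly 0.2 to `"a0"` and 0.8 to `"a1"`.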
Bayes' Rule: \(P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}\).
To write that random variables \(A\) and \(B\) are independent, we write \(\newcommand{\indep}{\perp \!\!\! \perp} A \indep B\), which is equivalent to \(B \indep A\). Independence means that the joint distribution factors as \(P(A, B) = P(A) \, P(B)\).
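One way to test independence numerically is to compare every joint entry with the product of the two marginals, which is the standard definition \(P(A, B) = P(A) P(B)\). The function name `is_independent`, the tolerance, and both tables below are assumptions of this sketch.

```python
import math


def is_independent(table, tol=1e-9):
    """Check P(a, b) == P(a) * P(b) for every pair of values, within `tol`."""
    p_a, p_b = {}, {}
    for (a, b), p in table.items():
        p_a[a] = p_a.get(a, 0.0) + p  # marginal P(A = a)
        p_b[b] = p_b.get(b, 0.0) + p  # marginal P(B = b)
    return all(
        math.isclose(table.get((a, b), 0.0), p_a[a] * p_b[b], abs_tol=tol)
        for a in p_a
        for b in p_b
    )


# Hypothetical tables: the first factors into its marginals, the second does not.
factored = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}
coupled = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.40}

factored_is_independent = is_independent(factored)
coupled_is_independent = is_independent(coupled)
```

The first table was built from the marginals (0.4, 0.6) and (0.3, 0.7), so the check passes. In the second table both marginals are uniform, yet \(P(0, 0) = 0.4\) differs from \(0.5 \times 0.5\), so the check fails.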
Probabilistic Inference
For the next several weeks, we will use a new model where each possible state of the world has its own probability. More precisely, our model is a joint distribution, i.e. a table of probabilities that captures the likelihood of each possible outcome, where an outcome is an assignment of values to all the variables.
Inference by Enumeration
Given a joint distribution, we can compute any desired probability distribution \(P\left(Q_1 \ldots Q_m \mid e_1 \ldots e_n\right)\) using inference by enumeration, for which we define three types of variables:
- Query variables \(Q_i\), which are unknown and appear on the left side of the conditional (\(\mid\)) in the desired probability distribution.
- Evidence variables \(e_i\), which are observed variables whose values are known and appear on the right side of the conditional (\(\mid\)) in the desired probability distribution.
- Hidden variables, which are variables present in the overall joint distribution but not in the desired distribution.
Inference by enumeration proceeds in three steps:
- Collect all the rows consistent with the observed evidence variables.
- Sum out (marginalize) all the hidden variables.
- Normalize the table so that it is a probability distribution (i.e. its values sum to 1).
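The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the variable names, the joint table, and the function `infer_by_enumeration` are all hypothetical, and the joint is assumed to be small enough to store as a dictionary with one entry per full assignment.

```python
from collections import defaultdict

# Order of the variables in every row of the hypothetical joint table.
VARIABLES = ("Season", "Temp", "Weather")

# Hypothetical joint distribution P(Season, Temp, Weather); entries sum to 1.
joint = {
    ("summer", "hot", "sun"): 0.30,
    ("summer", "hot", "rain"): 0.05,
    ("summer", "cold", "sun"): 0.10,
    ("summer", "cold", "rain"): 0.05,
    ("winter", "hot", "sun"): 0.10,
    ("winter", "hot", "rain"): 0.05,
    ("winter", "cold", "sun"): 0.15,
    ("winter", "cold", "rain"): 0.20,
}


def infer_by_enumeration(table, variables, query, evidence):
    """Return P(query | evidence) as {tuple_of_query_values: probability}."""
    query_idx = [variables.index(q) for q in query]
    evidence_idx = {variables.index(var): val for var, val in evidence.items()}

    # Step 1: collect the rows consistent with the observed evidence.
    consistent = {
        row: p
        for row, p in table.items()
        if all(row[i] == val for i, val in evidence_idx.items())
    }

    # Step 2: sum out the hidden variables by grouping rows on the query values.
    summed = defaultdict(float)
    for row, p in consistent.items():
        summed[tuple(row[i] for i in query_idx)] += p

    # Step 3: normalize so that the remaining entries sum to 1.
    total = sum(summed.values())
    if total == 0:
        raise ValueError("the evidence has probability 0 under this table")
    return {values: p / total for values, p in summed.items()}


# Query: Weather. Evidence: Season = winter. Hidden: Temp.
posterior = infer_by_enumeration(
    joint, VARIABLES, query=("Weather",), evidence={"Season": "winter"}
)
```

In this example the winter rows sum to 0.25 for sun and 0.25 for rain, so after normalization each gets probability of about 0.5. Note that the table has one row per full assignment, so its size grows exponentially with the number of variables.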