Statistics & Data Analysis
Experimental evidence makes justifications clearer and more accessible.
Methodology
Suppose we have some algorithm that we suspect runs efficiently most of the time. We can test this using the following experimental method (sketched in code below):
- Run the algorithm a number of times on different data.
- Determine the average time taken.
This raises the following questions:
- How do we select the data?
- What data sizes should we use?
- How many tests should we perform?
- How should we present the results?
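As a concrete illustration of this method, here is a minimal sketch in Python. The algorithm under test (a simple sort), the way the data is generated, the data sizes, and the number of tests are all assumptions chosen for illustration.

```python
import random
import time

def algorithm_under_test(data):
    # Hypothetical algorithm whose running time we want to measure;
    # here it simply sorts a copy of the input list.
    return sorted(data)

def average_runtime(size, trials):
    """Run the algorithm `trials` times on fresh random data of the
    given size and return the average time taken in seconds."""
    total = 0.0
    for _ in range(trials):
        data = [random.random() for _ in range(size)]  # how we select the data
        start = time.perf_counter()
        algorithm_under_test(data)
        total += time.perf_counter() - start
    return total / trials

# Illustrative choices of data sizes and number of tests.
for size in (1_000, 10_000, 100_000):
    print(f"n = {size:>7}: {average_runtime(size, trials=20):.6f} s on average")
```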
Population
This is the set of items from which the objects to test are chosen; it is usually denoted by $\Omega$.
The population should be of finite size.
- The set forming a population may vary over time:
  - For example, the students taking a year 1 module.
- The population may be fixed:
  - For example, the permutations of a set of numbers (see the sketch after this list).
- You can focus on a subset of the population:
  - This lets us find how the characteristics of a specific subset affect the population as a whole.
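As a small illustration of a fixed population, the sketch below builds $\Omega$ as the permutations of a few numbers and then focuses on a subset of it; the particular numbers and the subset condition are assumptions chosen for illustration.

```python
from itertools import permutations

# A fixed population: all permutations of a small set of numbers.
omega = list(permutations([1, 2, 3]))
print(len(omega))  # |Ω| = 3! = 6

# A subset of the population: the permutations that start with 1.
subset = [p for p in omega if p[0] == 1]
print(subset)      # [(1, 2, 3), (1, 3, 2)]
```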
Sampling
You can’t consider every member of a population, for the following reasons:
- The population is too large.
- There may be an incomplete set of answers.
- The responses may not be reliable.
This means that you need to sample a population with a random selection method.
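A minimal sketch of such a random selection method, assuming the population is held in a Python list: `random.sample` draws members uniformly at random without replacement.

```python
import random

population = list(range(100))           # a population of 100 items (assumption)
sample = random.sample(population, 10)  # 10 members chosen uniformly at random
print(sample)
```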
Unbiased vs Biased
Each element $X\in\Omega$ has a probability $P[X]$ of being chosen:
\[\sum_{X\in\Omega}P[X]=1,\qquad 0\leq P[X]\leq 1\]
- Summed over all members of the population, the probabilities total 1.
- Each member's probability must be between 0 and 1 inclusive.
In an unbiased / uniform selection method:
\[P[X]=\frac{1}{\vert\Omega\vert}\]
where $\vert\Omega\vert$ is the size of the population.
Every member has the same chance of being selected.
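The sketch below checks these properties for a uniform selection method over a small example population (the population itself is an assumption): every member gets probability $1/\vert\Omega\vert$, each probability lies in $[0,1]$, and they sum to 1.

```python
omega = ["a", "b", "c", "d"]

# Uniform / unbiased selection: P[X] = 1 / |Ω| for every X.
p = {x: 1 / len(omega) for x in omega}

assert all(0 <= p[x] <= 1 for x in omega)  # each probability is in [0, 1]
assert abs(sum(p.values()) - 1.0) < 1e-12  # the probabilities sum to 1
print(p)  # {'a': 0.25, 'b': 0.25, 'c': 0.25, 'd': 0.25}
```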
Why Use Biased?
Giving every member an equal chance can be misleading in practice: if some members occur far more often than others in real use, a uniform sample misrepresents the population and introduces errors into our analysis.
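By contrast, a biased selection method assigns unequal probabilities. The sketch below uses `random.choices` with weights; the population and the weights are assumptions chosen for illustration.

```python
import random

omega = ["common case", "rare case"]
weights = [0.9, 0.1]  # biased probabilities reflecting how often each case
                      # is assumed to occur in practice

# Draw 10 members using the biased probabilities.
picks = random.choices(omega, weights=weights, k=10)
print(picks)
```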
Random Variables
A random variable is a way of associating a numerical value with each member of a population:
For example, each student from the 1$^{\text{st}}$ year has a particular grade.
\[r:\Omega\rightarrow\mathbb{R}\]
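A random variable can be modelled as an ordinary function from population members to real numbers; the students and grades below are hypothetical.

```python
# Hypothetical population of 1st-year students and their grades.
grades = {"Alice": 72.0, "Bob": 58.5, "Carol": 64.0}

def r(student: str) -> float:
    """Random variable r: Ω → ℝ mapping each student to their grade."""
    return grades[student]

print(r("Alice"))  # 72.0
```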