Index

Selection can be performed faster if we know where to find the rows that are being selected. One way of doing this is by using indexes.

Given the values for one or more attributes of a relation $R$, provides quick access to the tuples with these values.

Consider that we have the following data that we want to index:

Students

We can use the following index table, sorted by value:

value	pointers to rows
G401	1234, 3456, 4567
G701	2435

You should have an index for each attribute that you want to select.

There are two different types of index:

Secondary - Only points to the location of records on disk.

Always dense.
Primary - In addition, defines how data is sorted on disk.

Good when attributes involve primary keys.

To complete a selection using an index:

This has a running time of:

\[O(\log d+\text{size of output})\]

You can use the following datatypes as indexes:

You can create an index in SQL using the following command:

CREATE INDEX
	ON Students USING btree -- default, can use hash
	(programme, year); -- the columns that you want to index