Skip to content
UoL CS Notes

Index

COMP207 Lectures

Selection can be performed faster if we know where to find the rows that are being selected. One way of doing this is by using indexes.

Index

Given the values for one or more attributes of a relation $R$, provides quick access to the tuples with these values.

Consider that we have the following data that we want to index:

  • Students

    id name programme
    1234 Anna G401
    2345 Ben G701
    3456 Chloe G401
    4567 Dave G401

We can use the following index table, sorted by value:

value pointers to rows
G401 1234, 3456, 4567
G701 2435

You should have an index for each attribute that you want to select.

There are two different types of index:

  • Secondary - Only points to the location of records on disk.

    Always dense.

  • Primary - In addition, defines how data is sorted on disk.

    Good when attributes involve primary keys.

Using indexes

To complete a selection using an index:

  1. Find the entry that you are selecting in the index (G401).
  2. Visit all rows in Students whose ids occur in the index entry for G401.

This has a running time of:

\[O(\log d+\text{size of output})\]

Forms of Indexes

You can use the following datatypes as indexes:

  • Hash-Index - Good for equality.
  • B+- Tree - Good for ranges.

You can create an index in SQL using the following command:

CREATE INDEX
	ON Students USING btree -- default, can use hash
	(programme, year); -- the columns that you want to index