UoL CS Notes

Sorting - Counting Inversions & Quick Sort

COMP202 Lectures 2022-02-17

Counting Inversions - Divide & Conquer

The standard approach for counting inversions takes:

\[\Omega(n^2)\]

we can speed this up by using the following method.

Divide the permutation into two fairly equal parts.
Sort each sub-list and merge them into a single sorted list. Then count the inversions again.

This is a modified merge sort.

Divide & Conquer Pseudo-code

countInversions(L)

if L has one element in it then
	there are no inversions, so return(0, L)
else
	divide the list into two halves
		A contains the first n/2 elements
		B contains the last n/2 elements
	(k_A, A) = countInversions(A)
	(k_B, B) = countInversions(B)
	(k, L) = mergeAndCount(A, B)
	return (k_A + k_B + k, L)

mergeAndCount(A, B)

CurrentA = 0
CurrentB = 0
Count = 0
L = []
while both lists (A and B) are non-empty	
	let a_i and b_j denote the elements pointed to by CurrentA and CurrentB
	append the smaller of a_i and b_j to L
	if b_j is the smaller element then
		increase Count by the number of elements remaining in A
	advance the Current pointer of the appropriate list
Once one of A and B is empty, append the remaining elements to L
return(Count, L)

This method has a time of:

\[O(n\log n)\]

Quick Sort

Divide if $\vert S\vert>1$, select a pivot element $x$ in $S$ and create three sequences:
- $L$ stores elements in $S$ that are less than $x$.
- $E$ stores elements in $S$ that are equal to $x$.
- $G$ stores elements in $S$ that are greater than $x$.
Recursively sort the sequences $L$ and $G$.
Put the sorted elements from $L$, $E$ and $G$ together to form a sorted list.

The worst case running time is:

\[O(n^2)\]

however we can improve this, by using a random pivot to:

\[O(n\log n)\]

We can use this to select a good pivot that balances the binary tree that is created.

A good pivot has at least $\frac 1 4$ of the whole list on both sides of the pivot.
The probability of choosing a good pivot is $\frac 1 2$ as there are $\frac n 2$ good pivots.
- As a result we can expect to get a good pivot in two repeats.
As the list will have at most $\frac 3 4$ of the list on either side of the pivot, there are at most $\log_{\frac 4 3} n$ levels when using good pivots.
Hence, the expected length of each path will be $2\log_{\frac4 3}n$