Number Theory & Cryptography

COMP202 Lectures 2022-05-14

Symmetric Key Cryptography

This topic has been covered before in:

COMP211 - Symmetric Key Cryptography

One-Time Pad

This is one of the most secure ciphers known.

For this encryption scheme Alice and Bob share a random bit string $K$ that is as long as any message that are going to send.

This key $K$ is the symmetric key that is used for the encryption and decryption process.

Although is is very secure and fast to compute, Alice and Bob must share very long keys $K$ which are used only once. This gives a lot of overhead.

One-Time Pad Encryption

To encrypt the message $M$, Alice computes:
\[C=M\oplus K\]
$\oplus$ is the exclusive or (XOR) operation.
Alice then sends $C$ to Bob on any reliable communication channel. The communication is secure because $C$ is computationally indistinguishable from a random bit string.

This relies highly on the fact that $K$ was selected randomly.

One-Time Pad Decryption

Bob can easily decrypt the cipher-text $C$ to recover $M$ using the following:

\[\begin{aligned} C\oplus K &=(M\oplus K)\oplus K\\ &= M\oplus (K\oplus K)\\ &= M\oplus 0\\ &= M \end{aligned}\]

where 0 represents the all-zero string with the same length as $M$.

Number Theory

The following methods allow us to devise a solution for public key cryptography.

Divisibility

Given integers $a$ and $b$, we use the notation $a\vert b$ to denote that $b$ is a multiple of $a$.

If $a\vert b$ then there is another integer $k$ such that $b=a\times k$. We note the following properties about this divisibility relationship:

Theorem: Let $a,b,c$ be integers:

If $a\vert b$ and $b\vert c$ then $a\vert c$ (transitivity).
If $a\vert b$ and $a\vert c$ then $a\vert(i\times b +j\times c)$ for all integers $i$ and $j$.
If $a\vert b$ and $b\vert a$ then either $a=b$ or $a=-b$.

Prime Numbers and Composite Numbers

Integers that are not prime are composite.

Theorem: Let $n>1$ be an integer. Then there is a unique set of prime numbers ${p_1, \ldots,p_k}$ and positive integers ${e_1,\ldots,e_k}$ such that:

\[n=p_1^{e_1}\times\ldots\times p_k^{e_k}\]

The product $p_1^{e_1}\times\ldots\times p_k^{e_k}$ is known as the prime decomposition of $n$. It is unique, up to the ordering of the primes in the factorisation.

Greatest Common Divisor

Let $a$ and $b$ denote positive integers. The greatest common dividor of $a$ and $b$, denoted $\text{gcd}(a,b)$, is the largest integers that divides both $a$ and $b$.

If $\text{gcd}(a,b)=1$, then we say that $a$ and $b$ are relatively prime.

The definition of the GCD can be extended like so:

If $a>0$, then $\text{gcd}(a,0)=\text{gcd}(0,a)=a$.
$\text{gcd}(a,b)=\text{gcd}(\lvert a\rvert,\lvert b\rvert)$ if $a$ and/or $b$ is negative.
$\text{gcd}(0,0)$ is undefined as there is no largest integer that divides 0.

Theorem: If $d=\text{gcd}(a,b)$, then there exist integers $j$ and $k$ such that:

\[d=j\times a +k\times b\]

In other words, the greatest common divisor of $a$ and $b$ is expressible as a linear combination of $a$ and $b$:

Function	Expression
$\text{gcd}(56,24)=8$	$8=1\times 56 +(-2)\times(24)$
$\text{gcd}(25,32)=1$	$1=5\times 25 +(-4)\times 31$
$\text{gcd}(45,25)=5$	$5=(-1)\times 45 +2\times 25$
$\text{gcd}(57,363)=3$	$3=57\times 51 +(-8)\times 363$

Modulo Operator

For $b>0$, the modulo operator, denoted by $a\mod b$, defines the remainder of $a$ when divided by $b$. That is $r=a\mod b$ means that:

\[r=a-\left\lfloor\frac a b\right\rfloor b\]

This means that $r$ is always an integer in the set ${0,1,\ldots,b-1}$ (even when $a$ is negative), and there is an integer $q$ such that:

\[a=q\times b +r\]

Congruences

$a$ is congruent to $b$ modulo $n$ if the following is true:

\[a \mod n = b\mod n\]

We can write this as:

\[a\equiv b\pmod n\]

In other words, if $a\equiv b \pmod n$, then $a-b=kn$ for some integer $k$. We can see this in the following example:

\[16\equiv 8\pmod2\]

as $16-8$ is a multiple of 2.

Euclid’s Algorithm

Euclid’s algorithm is a method to fine the greatest common divisor of two integers $a$ and $b$. This algorithm relies on the following property:

Theorem: Suppose $a\geq b>0$, then:

\[\text{gcd}(a,b) = \text{gcd}(b,a\mod b)\]

Euclid’s Algorithm Pseudo-Code

EuclidGCD(a, b)

Input: $a$ and $b$ are non-negative and not both zero.
Output: $\text{gcd}(a,b)$

while b != 0 do
	(a, b) = (b, a mod b)
return a

If $b=0$, this routine will return the value of $a$. This is correct under our assumption that $a$ and $b$ are not both zero.

Euclid’s Algorithm Example

We can display this algorithm in a table:

	1	2	3	4	5	6	7
$a$	412	260	152	108	44	20	4
$b$	260	152	108	44	20	4	0

hence $\text{gcd}(412, 260)=4$.

Euclid’s Algorithm Time Complexity

For $i>0$, let $a_i$ denote the first element of the ordered pair during the $i^\text{th}$ step in the while loop. The second argument is equal to $a_{a+1}$, so:

\[a_{i+2} = a_i\mod a_{i+1}\]

This implies that, after the first time through the loop, the sequence $a_i$ is strictly deceasing.

We can show that $a_{i+2}<\frac 1 2 \times a_i$. This lead the following result:

Theorem: Let $a,b$ be two positive integers. Euclid’s algorithm computes $\text{gcd}(a,b)$ by executing:

\[O(\log(\max(a,b)))\]

arithmetic operations.

Extended Euclidean Algorithm

If $d=\text{gcd}(a,b)$, there are integers $j$ and $k$ such that $d=j\times a + k\times b$.

We can modify Euclid’s algorithm to fine these numbers $j$ and $k$ wehile we compute $\text{gcd}(a,b). This is called the extended Euclidean algorithm.

Extended Euclidean Algorithm Pseudo-Code

ExtendedEuclidGCD(a, b)

Input: Non-negative integers $a$ and $b$ (not both zero).
Output: $d=\text{gcd}(a,b)$, integers $j,k$ where $d=j\times a+k\times b$.

if b = o then
	return (a, 1, 0)
r = a mod b
q = floor(a / b)	// Let q be the integers such that a = q * b + r
(d, j, k) = ExtendedEuclidGCD(b,r)
return (d, k, j - kq)

Extended Euclidean Algorithm Example

We can represent $\text{gcd}(412, 260)$ in the following table:

	1	2	3	4	5	6	7
$a$	412	260	152	108	44	20	4
$b$	260	152	108	44	20	4	0
$q=\left\lfloor\frac a b\right\rfloor$	1	1	1	2	2	5	*
$r = a\mod b $	152	108	44	20	4	0	*
$j$	12	-7	5	-2	1	0	1
$k$	-19	12	-7	5	-2	1	0

Note that $a,b,q$ and $r$ are filled from left to right, while $j$ and $k$ are filled from right to left. This means that the final answer for $j$ and $k$ are in column 1.

This produces the result of:

\[12\times 412 + (-19)\times260=4\]

Public-Key Cryptography

A public-key cryptosystem consists of an encryption function $E$ and a decryption function $D$. For any message $M$, the following properties must hold:

Both E and D are easy to compute for the following:
\[D(E(M))=M\\ E(D(M))=M\]
It is computationally infeasible to derive D from E.
The knowledge of the encryption method gives no information about the decryption scheme. Anybody can send a private message to the hold of the function $D$, but only that person knows how to decrypt it.

For this reason $E$ is referred to as a one-way, or trapdoor, function.

In this kind of encryption method $E$ is made public and $D$ is kept private.
The recipient of a message should be able to verify that it came from the sender, assuming that the sender is the only person who has a the private key.

This is referred to as digital signatures.

RSA

RSA Encryption

The main idea of the RSA method is as follows:

Let $p$ and $q$ denotes two large distinct prime numbers.
Let $n=p\times q$ and define $\phi(n) = (p-1)(q-1)$.
We can then choose two numbers $e$ and $d$ such that:
1. $1<e<\phi(n)$
2. $e$ and $\phi(n)$ are relatively prime ($\text{gcd}(e,\phi(n))$)
3. $ed\equiv1(\mod\phi(n))$
  
  We can use the Extended Eculidean algorithm to find $d$, given $e$.

Assume that we have a message to send. We can split this message up into block of at most $n$ characters. We can then encrypt these messages $M$ and concatenate them back together later.

\[0<M<n\]

The plain-text $M$ is encrypted using the public keys $e$ and $n$ by the following operation:

\[C = M^e\mod n\]

RSA Decryption

Decryption of the received cipher-text $C$ is handled by modular exponentiation:

\[M = C^d \mod n\]

This is correct as for every integer $0<x<n$ we have:

\[x^{ed}\equiv x(\mod n)\]

RSA Encryption Strength

Even knowing $e$ doesn’t allow us to figure out $d$, unless we know $\phi(n)$.

Most cryptographers believe that breaking RSA required the computation of $\phi(n)$, which in turn requires factoring $n$. Factoring has not been proven to be difficult but it can take a long time to solve.

As the ability to factor large numbers increases, we simple have to choose larger primes $p$ and $q$ so that $n=p\times q$ is outside of the current factoring capabilities.

RSA Time Complexity

Using fast exponentiation, the size of the operands is never more than $O(\log n )$ bits, and it takes $O(\log k)$ arithmetic operations to find $x^k\mod n$.

This leads to the following result:

Theorem: Let $n$ be the modulus of the RSA algorithm. Then RSA encryption, decryption, signature and verification each take $O(\log n)$ arithmetic operations per block.

Digital Signatures

RSA supports digital signatures. Suppose that Bob sends a message $M$ to Alice and that Alice wants to verify that it was Bob who sent it. Bob can create a signature using the decryption function applied to $M$:

\[S = M^d\mod n\]

Alice verifies the digital signature using the encryption function, by checking that:

\[M\equiv S^e(\mod n)\]

Since only Bob knows the decryption function, this will verify that it was indeed Bob who sent the message.

Any person can use the encryption function as well to reconstruct the message $M$, so this is not a method to secretly pass information from Bob to Alice.

Fast Exponentiation

A possible bottleneck in the RSA algorithm is computing expressions of the form:

\[x^k \mod n\]

The naive approach is to calculate $x^2\mod n$, then use that to get $x^3\mod n$ and so on.

We can do much better with an algorithm based on repeated squaring. For example, if we wanted to compute $x^{16}$, we could first find $x^2$ and keep squaring to get $x^{16}$

If we are performing modular exponentiation, like in RSA, after each step we can find $x^i\mod n$ to keep the results small (between 0 and $n-1$).

Fast Exponentiation Pseudo-Code

FastExponentation(x, k, n)

Input: Integers $x, k\geq 0$, and $n>0$
Output: $r=x^k\mod n$

r = 1
t = x
while k != 0 do
	if k is odd then
		r = r * t mod n
	t = t^2 mod n
	k = floor(k / 2)
return r