This post has the goal to collect and organize what I have learned about cryptography, in particular RSA algorithm.

I am not a professional in this sector by any means, but after stumbling on some posts on BlueSky discussing cryptography I got curious. I read something about elliptic curve, but I figured out it was like learning to hike starting from Everest.

So I went to the basics: the RSA algorithm.

But first why would you need cryptography? Basically to communicate with another person in such a way that just the two of you could understand the messages. If someone steals the message, he should not be able to understand the original message.

Symmetric, Asymmetric Encryption and RSA algorithm

There are two big branches of encryption: symmetric and asymmetric.

With symmetric encryption the key to encrypt and decrypt the message is the same. Therefore the sender and the receiver should agree on this one before starting exchanging messages. This requirement can be easily satistified if the two people can agree on it in a safe way, for instance meeting each other. But you can see how this is highly unfeasable for today’s internet. That’s why we need asymmetric encryption.

In asymmetric encryption one key is public for everybody to know to encrypt the message (public key) and the key to decrypt the message is known just by the receiver (private key). The public key is available to anyone that wants to send a message to you, while you are the only one knowing the private key.

For defining these two keys we cannot just randomly choose them, an algorithm is needed.

One of the most famous algorithm is RSA. It is an elegant solution to a hard problem that uses modular algebra and prime numbers.

Prime numbers are numbers that have no divisors except for themselves and 1. I think modular algebra deserves a section for itself.

Modular Algebra

Here I want to provide just the basis of this interesting field. For a more detailed explanation check out Wikibooks.

Citing Wikipedia, modular arithmetic is a system of arithmetic for integers, where numbers “wrap around” when reaching a certain value, called the modulus. A bunch of stuff related to time measurement are connected to modulus arithmetic. On a 12-hours clock if you are at 10 (a.m.) and add 4 hours, you arrive to twelve, wrap around it, and then start counting from one, reaching 2. On a calendar, if you are on the 5th day of the week (Friday) and want to add 10 days you will not get to the 15th day of the week, but reset every time you reach the seventh one. In this case you arrive to the $(4 + 10) \bmod 7$, i.e. Monday.

Using a formula: $(4+10) \bmod 7 = 14 \bmod 7 = 0$.

In the end the result is the rest of the division.

We have also subtraction $(13 - 4) \bmod 7 = 9 \bmod 7 = 2$

and multiplication $(5 \cdot 3) \bmod 7 = 15 \bmod 7 = 1$

Division can be considered as multiplication by the inverse (like in standard arithmetic) and for RSA Algorithm inverse is probably the most important thing. So let’s take a deeper look.

Inverse

The inverse of a number $\mathit{m}$ is defined as the integer $\mathit{x}$ such that $(\mathit{m} \cdot \mathit{x})\bmod \mathit{n} = 1$.

For being concrete $\mathit{m}=5$, $\mathit{n}=12$.

With numbers these small you can try for $\mathit{x}$ going from $0$ to $11$ and find the one satisfying the constraint. But for larger modulus, that’s pretty hard. Thanks to Euclide there is a nice algorithm to compute this value.

$$ \begin{aligned} 5 \cdot \mathit{x} &= 1 \bmod 12 \\
5 \mathit{a} &= 1 + 12\mathit{b}\\
5 \mathit{c} &= 1 + 2\mathit{b} \\
\mathit{c} &= 1 + 2\mathit{d} \\
\end{aligned} $$

The second passage is because any integer $\mathit{b}$ multiplied by $\mathit{n}$ is equal to $0$ in modulo $\mathit{n}$.

We want the smallest $x$ possible, so we take $d=0 \rightarrow c = 1 \rightarrow b = (5 \cdot 1 - 1) / 2 = 4 / 2 = 2\rightarrow a = (1 + 12 \cdot 2) / 5 = (1 + 24) / 5 = 5$.

This should render without any issues.

Indeed $ 5 \cdot 5 \bmod 12 = 25 \bmod 12 = (2 \cdot 12 + 1) \bmod 12 = 1$.

RSA Algorithm

RSA is a public-key cryptosystem, one of the oldest that is widely used for secure data transmission. The initialism “RSA” comes from the surnames of Ron Rivest, Adi Shamir and Leonard Adleman, who publicly described the algorithm in 1977.

It uses modular arithmetic and prime numbers. Indeed it leverages the fact that is hard to factor out the product of two large prime numbers.

Algorithm

Being an algorithm, let’s see what are the steps:

  • Take $p, q$ large prime numbers
  • Compute $n = p \cdot q$
  • Compute the totient of $n$ defined as $\phi(n) = (p-1)(q-1)$
  • Choose the public exponent $d$ such that the maximum common divisor between $\phi(n)$ and $d$ is 1, i.e. $d$ and $\phi(n)$ are coprimes.
  • Define the private exponent $e$ as the inverse of the public exponent in modulo $\phi(n)$, namely $e$ s.t. $d \cdot e = 1 \bmod(\phi(n))$
  • Define public key as $(d, n)$ and private key as $(e, n)$

Example

Let’s replace the variables with some real values. Since it’s just an illustrative example let’s consider large $11$ and $5$. Therefore:

  • $p=11$ and $q=5$
  • $n = p \cdot q = 55$
  • $\phi(n) = (p-1)(q-1) = 10 \cdot 4 = 40$
  • $d$ s.t. $d$ and $\phi(n)$ are coprimes: Let take $d = 7$.
  • e s.t. $d \cdot e = 1 \bmod(\phi(n))$, $7 \cdot e = 1 \bmod(40) \rightarrow e = 23$ indeed $23 \cdot 7 = 161$ and $40 \cdot 4 = 160 \rightarrow 161 \bmod 40 = 1$
  • Public key $(7, 55)$, Private key $(23, 55)$

Encryption and Decryption

Now we have the algorithm to define the public and private key, but how are those used to encrypt and decrypt a message?

The encrypted message $C$ is obtained by elevating message $M$ to the power of the public exponent $d$ and applying modulo operation with base $n$.

$$ \begin{aligned} C = M^d \bmod(n) \end{aligned} $$

For decrypting, for how the inverse is defined with the modulo operation, we have to elevate $C$ to the power of the private exponent $e$ and apply modulo operation with base $n$.

$$ \begin{aligned} M = C^e \bmod(n) \end{aligned} $$

For encoding the message the characters are mapped to their ASCII values and then stringed together using big-endian or little-endian. I would have liked to provide an example of the whole encoding-encryption-decryption-decoding of a message but for any reasonable message, the calculations are not trivial by hand.

Conclusions

Writing this post I learned about a simple but effective cryptographic algorithm. Even if RSA is now outdated is nice to see how a relative easy to understand algorithm can be used in an important field of computer science. Its power lies in the fact that is hard to discover the private exponent by brute force (thanks product of large prime numbers!).

As a by-product of this post I integrated Mathjax support for the blog. Considering the time spent to make it work, I am better using it again in one of the next posts!