Encryption 101, RSA 001 (The maths behind it)

Zup folks, for the picoCTF we (dtm, pysec, kowalski and me), had to solve several crypto challenges.
RSA being the most common subject.

So here I thought it’s time to share what I know about RSA, because it’s one of the most common algorithms seen in public key crypto.

First let’s start with a little story:
Suppose you work for the CIA or NSA. The year is 1800 and computers weren’t yet invented. Some journalist is being threatened by Erdogan, but he has valuable info and you want to make sure he can send it to your office, without having to worry about some KGB trained Putin getting his hands on it. You could send a safe with a padlock and a key, but what if someone intercepts the key and makes a copy of it?, that’d be terrible.
Luckily you meet 3 guys called Ron Rivest, Adi Shamir, and Leonard Adleman. They have an awesome invention. They have invented a padlock that uses two keys: one for unlocking and one for locking. Genius right?
So now you send the unbreakable safe with this unbreakable padlock and the key for locking to this journalist and he can write his message (‘Send Nudes’) on a paper, put it in the box, lock the box with the padlock using the special key for locking and put it somewhere on a Ali Express delivery camel. Mission Succeeded!

The two key system is what make public key crypto so succesful, we would refer to the key for locking as the public key and the key for unlocking as the private key.

Here the idea is that we use a mathematical system / formula that is easy to calculate one way, but very hard to reverse. RSA uses such a formula, the idea is as follows:

Multiplying two very big numbers is easy, finding the original two big number with just the product of it is VERY hard (tedious).

Before we continue we have to cover some mathematics, it’s probably very unlike anything you’ve ever seen in school, but it’s not that hard.

##Muh Maths

So let’s start off with looking at a very simple python program:

import sys
a = int(sys.argv[1])
b = int(sys.argv[2])
n = int(sys.argv[3])
c = (a * b) % n

That’s it, this the mathematics behind RSA, it’s called modular multiplication

In RSA you’ll usually work with really big numbers of several hundred digits
Something you might see is the following calculation:
61447^28918293809555798793287493827982374 mod 9830849029834038402948
(note that mod is the same as % earlier in the python program).

Square and multiply

So how could the computer calculate such large number?
A cool trick for this is called square and multiply
For example:
2^684343 mod 102
This would be a way too big number if you’d try to calculate
2^684343 first and then do mod 102.
Luckily there are a few cool tricks in modular multiplication:
a^b mod c is the same as (a mod c)^b mod c in MM (modular multiplication)
So for example 103^999999999999999 mod 102
is the same as (103 mod 102)^9999999999999999 mod 102 which is the same as:
1^99999999999999999 mod 102 which is just 1.

So how does this property help us solve 2^684343 mod 102?
Well:

2^684343 mod 102 ==
2 * 2^684342 mod 102 ==
2 * (2^2)^342171 mod 102 ==
2 * 4^342171 mod 102 ==
2 * 4 * 4^342170 mod 102 ==
2 * 4 * (4^2)^171085 mod 102 ==
2 * 4 * 16^171085 mod 102 ==
2 * 4 * 16 * 16^171084 mod 102 ==
2 * 4 * 16 * (16^2)^85542 mod 102 ==
2 * 4 * 16 * 256^85542 mod 102 ==
2 * 4 * 16 * (256 mod 102)^85542 mod 102 ==
2 * 4 * 16 * 52^85542 mod 102 ==
2 * 4 * 16 * (52^2)^42771 mod 102 ==
2 * 4 * 16 * 2704^42771 mod 102 ==
2 * 4 * 16 * (2704 mod 102)^42771 mod 102 ==
2 * 4 * 16 * 52^42771 mod 102 ==
2 * 4 * 16 * 52 * 52^42770 mod 102 ==
2 * 4 * 16 * 52 * (52^2)^21385 mod 102 ==
2 * 4 * 16 * 52 * 2704^21385 mod 102 ==
2 * 4 * 16 * 52 * (2704 mod 102)^21385 mod 102 ==
2 * 4 * 16 * 52 * 52^21385 mod 102 ==
2 * 4 * 16 * 52^2 * 52^21384 mod 102 ==
etc etc

So this would be an approach to solving such big calculations, I hope you understood the method.

Groups

Now let’s have a look at Groups
Groups are a mathematical term for a set of elements with an operation defined for them.
This may sound rather vague, but you already know some groups :wink:

For example, the real numbers with addition (R+) is a group.
here the operation is ‘+’ as you know it and the numbers are the numbers as you know them. (1, 5, sqrt(5), pi etc).
These are a group because the following properties hold:

  • if a is in R+ and b is in R+, then so is a + b.
  • there is an element 0 in R+, such that for all x in R+, x + 0 = x (well obviously) This is the identity element.
  • for every a in R+, there is inv(a) in R+ (-a in this case) such that a + inv(a) = 0 (a + -a = 0)
  • for every a, b and c in R+, (a + b) + c = a + (b + c)

If this doesn’t make any sense, don’t worry, it just means that adding two numbers will result in a new number and that that number exists in the numbers.
Also it means that for every number, there is a unique number, such that adding that number to the number does nothing to the number (for 4 + 0 = 4).
And lastly it means that the order doesn’t matter:
(2 + 4) + 4 is the same as
2 + (4 + 4)

We will now look at the group Zn*
This may sound weird, but don’t mind the notation.
Here Z mean the whole numbers (3, 5, -2, 2323232 etc)
and * means multiplication.

The n here stands for the modulo.
That means if n is 15 for example you’d have the group:
Z15* (Just notation).
So now rather than R+ with addition we have Zn with multiplication (Zn*).

The following things now hold for Zn*

  1. If a and b are Zn* then so is a * b
  2. There is an element e in Zn* such that a * e = a (here it’s 1)
  3. For every a in Zn* there is an a^-1 such that a * a^-1 = e

Considering these criteria let’s look at Z8* (meaning n = 8 (Zn*) ).
So what elements would be in Z8*?
0 is not in it.
After all according to rule 3 there must be an element in Z8* such that 0 * x = 1.
This is never gonna be true so 0 can’t be in it.
1 is in it.
2 is not in it.
if 2 is in Z8* then so is 22 according to rule 1. But then so is 222 = 8 mod 8 = 0, but we already found 0 is not in it.
Another way to approach the question wether 2 is in it, is by looking at rule 3. There must be some x in Z8
such that
2 * x mod 8 = 1.
But since both 2 and 8 are even numbers, this is never gonna work out.
Then with the same logic we know that 4, 6 and 8 are not in Z8*.
After fiddling around a bit we will find that only 1,3,5 and 7 are in Z8*.
That means that there are 4 numbers in Z8*.
And that if you multiple any of these numbers the result will be one of these four numbers

The order of groups

As we found earlier there are 4 elements in Z8*. We then say that the order of Z8* is 4.
Notation: #Z8* = 4 or |Z8*| = 4. Normally the order of a group Zn* can be tedious to find, but there are a few special cases:

  • n is prime
  • n is the product of two or more prime numbers who are all different from each other.

In this case we can use Eulers Totient Function. the notation for this is called phi(n)

So let’s look at the different cases discussed earlier

  1. If n is prime, then phi(n) is simply n-1
  2. If n is the product of say p,q,r (n = pqr) then phi(n) is (p-1) * (q-1) * (r-1),
    If p,q,r,d… then just (p-1)(q-1)(r-1)(d-1)…

Now why would we even care about Eulers Totient Equation?
As it turns out, if n is the product of two prime numbers or n is prime the following holds:
Let x be an element of Zn*, then x^phi(n) mod n = 1.

Back to RSA
RSA works on this principle:
Find two very large prime numbers p and q, where p != q.
Then calculate n by multiplying p and q.
calculate phi(n) by doing (p-1)*(q-1) (Eulers function).
Now pick a nice public key e such that e is prime and e is rather small (like 62219).
Now calculate the inverse of e modulo phi(n)

Muh math again

Woops how do we calculate the inverse?
Ez, use python, but since that’s trivial let’s do it by hand using a more simple example.
Suppose you have p = 13 and q is 17.
Then n = 13 * 17 = 170 + 21 + 30 = 221.
Then phi(n) = 12 * 16 = 160 + 32 = 192.
Now suppose you’ve chosen e = 7.
So now you want to calculate inv(e) mod phi(n).
There is a neat little trick for this.
It goes as following:

  0 * 7 = 192 mod 192 (since 192 mod 192 = 0)
  1 * 7 =   7 mod 192
-27 * 7 =   3 mod 192 (0*7 - 27 * 1*7 -> 192 - 189 = 2)
 55 * 7 =   1 mod 192 (1*7 -2(-27 * 7) -> 7 - 2*3 = 1)  

So this means 55 is the inverse of 7 mod 192.
Don’t worry if you don’t really understand this technique, it takes some brainfarts until you get it, but it’s not that important considering we have a computer who is perfectly capable of doing all this work for is. check out python’s gmpy2 module.

Back to RSA

So now that we know how to calculate the inverse let’s look at that RSA example once again.
We had e = 62219 we use the computer to calculate inv(e) and find d.
So now we have n,e and d. This is all you need to the RSA to work.
Suppose you have a message m that you want to encrypt.
The procedure is as follows the server gives you (n,e), it’s public key.
You then calculate c = m^e mod n.
You then send this to the server.
The server receives c and calculates m by doing c^d mod n = m.
So why does this work?

Well remember that c = m^e.
And that d = e^-1 mod phi(n)

(m^e)^(e^-1 mod phi(n) ) ==
m^(e * e^-1 mod phi(n) ) ==
m^(1 mod phi(n)) ==
m^(phi(n)+1) ==
m^phi(n) * m^1 ==
1 * m^1 ==
m^1 ==
m

Awesome right?
Let’s look at our previous example again where we had p = 13, q = 17, e = 7 and d = 55 n = 221.
Suppose we want to encrypt the message _ (a single spacebar) then in ascii that is 0x20, meaning 32 in decimals, so m = 32.

Then we calculate 32^7 mod 221 = 59
we send 59 to the server and the server then calculates
59^55 mod 221 = 32, yay success!

This was probably a bit messy, so please give me loads of feedback and I will keep improving this article.

22 Likes

Very hard article to follow. However. You do cover a very interesting topic. Breaking the article up with breaks, bullet points, and using visual aids (images and such) to help visualise how it works would really help the readability.

Including a whiteboard video from youtube explaining groups would also make the article 1000x times better.

You asked for feedback :wink:

4 Likes

I changed a lot, tell me what you think

1 Like

These were helpful to me… Especially those videos.

4 Likes

The beginning is much better. Although around groups, it starts getting quite vague.

The group notation Zn*, n stands for modulo, Z means whole numbers. And * means multiplication.

So Z2* means I can pass it a value, say, 2 (to test if it exists in the group).

So I would do 2 (the value I passed) mod 2 (the 2 in Z2*) ? and get 0? So it’s not in the group? The explanation of groups here gets me twisted. Forgive me, but I am learning from groups strictly from this article. This is quite a realistic view of a noobie just viewing groups for the first time.

So now, eulers function.

phi(n) = (p-1)*(q-1)

So, how can we get n out of this? And what significance is this to us?

I’m still trying to figure out how to encrypt “send nudes” from this.

1 Like

This really explains things. Thanks for this.

phi(n) is used to calculate d, the private key.
n you already have, n = pq
Z2
is NOT a number or a function
It’s a set of elements.

Very helpful, thanks!

I think you’ve got a little typo in your calculations:

  0 * 7 = 192 mod 192 (since 192 mod 192 = 0)
  1 * 7 =   7 mod 192
-27 * 7 =   3 mod 192 (0*7 - 27 * 1*7 -> 192 - 189 = 2)
 55 * 7 =   1 mod 192 (1*7 -2(-27 * 7) -> 7 - 2*3 = 1)

At the end of line 3 you’re saying 192 - 189 = 2 :wink:.

Anyway, I really enjoy to see posts about maths! I’m excited for a complete description of AES… :grin:

Edit: I’ve read different explanations of RSA in the last years and this is the first time I think I’ve really got it… Great job!

1 Like

Love the idea of the tutorial, however I feel like phi(n) comes out of nowhere, so I’m adding a few lines in case some are interested :

As said, Zn is the ring of integers modulo n, ie : {0, 1, …, n - 1}
Now : (Zn)* is the group of integers in Zn that admit an inverse modulo n.
(so as to respect the 3rd property of groups, as mentionned in the post)

In fact the following equivalence holds true :

x is invertible modulo n <=> x is coprime with n

coprime ?

x and n are coprime <=> x and n share NO common prime factor.

For example :
20 = 2 * 2 * 5
and
15 = 3 * 5
share 5 as a common prime factor and thus aren’t coprime.

(The proof of the equivalence relies heavily on Bezout’s theorem, so I’m not putting it out here)

So basically we now know :

x is in (Zn)* <=> x and n are coprime

Now this is where Euler’s phi comes into play :
phi(n) = the number of x in Zn that are coprime with n
= the number of elements in (Zn)*

Thus, if p is prime :
phi§ = number of x in Zp such that x and p don’t share a common prime factor
But since p is prime, x and p have a factor in common <=> x is a multiple of p
<=> x = 0 mod p
Now since x is in {0, 1, … p - 1}, x is coprime with n <=> x != 0
<=> x is in {1, 2, … p - 1}
Hence phi§ = p - 1

Using the same idea, you can show that if n = p * q, with p and q 2 prime numbers
phi(n) = (p - 1)(q - 1)
By counting the number of multiples of p and q

TL; DR : phi(n) is the number of elements in (Zn)*

EDIT :
The proof of the decryption is actually incomplete :

Let x be an element of Zn*, then x^phi(n) mod n = 1.

However M doesn’t have to be in (Zn)*, so you can’t apply this theorem directly to show that “M^phi(n) = 1 mod n”

The usual way :
Using the Chinese Remainder Theorem :

x = M mod n <=> x = M mod p and x = M mod q

mod p :
We now simply show that M^(phi(n) * k + 1) = M mod p :
phi(n) = (p - 1)(q - 1)
Thus M ^ (phi(n) * k + 1) = M ^ ((p-1)(q-1)k + 1)
= M * (M ^ (p - 1) ) ^ (q-1)k

Now 2 cases :

  • either M is coprime with p, ie M isn’t a multiple of p
  • Or M = 0 mod p : M is a multiple of p

In the first case, we can use Fermat’s theorem :
M ^ (p-1) = 1 mod p
and thus M * (M ^ (p - 1)) ^ (q-1)k = M * 1 = M mod p

In the Second Case : M = 0 mod p
We have in particular M^(phi(n) * k + 1) = 0 = M mod p

Doing the same mod q and using the Chinese theorem, the proof is complete.

2 Likes

Wow that was a really helpful post!

I changed it earlier but forgot to save the edit and I went back today and pressed cancel :cry: :cry: :cry: :cry:

2 Likes

I have a little question. I take every character solo, right? What’s about encrypt every character from ascII to search them in the text for reading the message? It’s totally easy if I’m correct. But if it’s true someone already did it. What do you think?

No you’d take multiple characters: for example “hello” would become 0x68657a7a7d which would be some big number.

Ok thank you for this answer

This topic was automatically closed after 30 days. New replies are no longer allowed.