WTF is RFC5114?
We all know the importance of large prime numbers in cryptography, given that finding the prime factorization of a large number of the form n = pq requires an unreachable amount of computational resources.
But why does this RFC enforce a secure DH group (used not just in encryption but in zero-knowledge systems too, like FFS or Chaum-Pedersen), and why does the ANSI-RSA standard require “strong” primes on top of this? (There is a dispute about this, more below.)
So as you already know, RSA begins with finding a modulus N by multiplying two large primes.
But we all know that modular arithmetic is a really friendly area of math, given that congruence modulo a number is an equivalence relation, so we can do almost any kind of magic with it without breaking the result.
There are several examples of this magic, e.g. Fermat’s little theorem, which states that a^(p-1) ≡ 1 (mod p) for any prime p and any a not divisible by p. The consequence we care about: if a^M is congruent to 1 modulo a prime factor p of n, then gcd(a^M - 1, n) will be divisible by that factor.
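Just to see this magic in action before we get to the algorithm, here is a throwaway Python check (the numbers 13, 23 and the exponent are mine, picked purely for illustration):

```python
from math import gcd

# Fermat's little theorem: a^(p-1) ≡ 1 (mod p) when the prime p does not divide a
p, a = 13, 2
print(pow(a, p - 1, p))            # 1

# The consequence Pollard exploits: n = 13 * 23 = 299 and M is a multiple of p - 1,
# so a^M ≡ 1 modulo the factor 13, and that factor shows up in the gcd below.
n, M = 13 * 23, 12 * 5
print(gcd(pow(a, M, n) - 1, n))    # 13, a nontrivial factor of 299
```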
John Pollard came up with a special-purpose factorization algorithm that uses exactly this magic, called p-1. It happens that the basic RSA “rules” we defined (just pick two large primes) leave enough room for that special case to apply.
The algorithm goes like this (the actual example is from Wikipedia; a runnable sketch follows the walkthrough):
We want to get the prime factorization of 299
- First we pick a number called a “smoothness bound” for 299.
Smooth numbers are a concept written down by Leonard Adleman: a number X is Y-smooth if none of X’s prime factors is greater than Y.
Let this number be 5 for a start.
- Then we take every prime p up to our smoothness bound, raise it to the power floor(log_p(n)), and multiply the results together.
Formally: M is the product of p^floor(log_p(n)) over all primes p up to the bound B.
We selected the bound 5; the primes up to 5 are [2, 3, 5].
floor(log2(299)) = 8
floor(log3(299)) = 5
floor(log5(299)) = 3
So our equation is: M = 2^8 * 3^5 * 5^3
- Then we select a random a coprime to n (so that GCD(a,n) == 1, you can use the Euclidean algorithm here), let this be 2. I recommend always using 2 for IRL RSA cracking, since 2 is always coprime to an RSA modulus (the modulus is odd) and it’s faster to operate on powers of 2 on a CPU.
- We compute GCD(a^M - 1, n), which is 13.
If the result is 1, then a^M - 1 is also coprime to n; we select a larger bound and restart.
If the result is n, we need a smaller bound and restart.
13 is smaller than n and larger than 1, so this is our factor.
299 / 13 = 23, which is prime, so we are done with the factorization.
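Here is how the whole procedure looks as code. This is just a minimal Python sketch of the steps above (the function names are mine and there is zero optimization in it), not the mature implementation linked below:

```python
from math import gcd, log

def small_primes(bound):
    """All primes up to and including bound (a naive sieve is fine at this scale)."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def pollard_p_minus_1(n, bound, a=2):
    """Try to pull a nontrivial factor out of n using the given smoothness bound."""
    # M = product over primes p <= bound of p^floor(log_p(n))
    M = 1
    for p in small_primes(bound):
        M *= p ** int(log(n, p))
    d = gcd(pow(a, M, n) - 1, n)
    if d == 1:
        return None   # bound was too small: pick a larger one and retry
    if d == n:
        return None   # bound was too large: pick a smaller one and retry
    return d          # 1 < d < n, a nontrivial factor

print(pollard_p_minus_1(299, 5))   # -> 13, as in the walkthrough above
```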
So you may get the overall idea by now: with real-world large primes we won’t have such an easy job, and we’ll typically need to run this procedure several times (adjusting the bound) for a full factorization.
There’s an efficient, mature implementation of p-1 (and of several other factorization algorithms) here: https://gforge.inria.fr/forum/forum.php?forum_id=11510
The solution
Safe primes are primes of the form 2p+1 where p is a prime too. Because the p-1 algorithm’s efficiency depends not on p’s but on p-1’s prime factorization, satisfying this requirement means an attacker cannot pick a workable smoothness bound easily (if q = 2p+1, then q-1 = 2p trivially has a large prime factor, namely p).
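A quick way to check the safe-prime property in code (again just a demo snippet, leaning on sympy’s isprime for the primality tests):

```python
from sympy import isprime

def is_safe_prime(q):
    # q is a safe prime if q = 2p + 1 with p prime; then q - 1 = 2p has the large factor p
    return isprime(q) and isprime((q - 1) // 2)

print(is_safe_prime(23))   # True:  23 = 2*11 + 1, so 22 = 2 * 11
print(is_safe_prime(29))   # False: 28 = 2^2 * 7, only small factors
```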
Strong primes take this even further. In number theory these are primes p_n greater than the arithmetic mean of their neighboring primes, formally p_n > (p_(n-1) + p_(n+1)) / 2. In the cryptographic sense, which is the one the ANSI standard cares about, a strong prime p is one where both p-1 and p+1 have a large prime factor. We need these because a p+1 factorization method (due to Williams) and even a rho algorithm (due to Pollard) showed up later.
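To make the two senses of “strong” concrete, here is a small demo (the threshold of 10 and the example primes are arbitrary choices of mine, and sympy does the factoring):

```python
from sympy import factorint, isprime, nextprime, prevprime

def largest_prime_factor(m):
    return max(factorint(m))          # factorint returns {prime: exponent}

# Number-theory sense: p is strong if it beats the mean of its neighboring primes.
def is_strong_prime_nt(p):
    return isprime(p) and 2 * p > prevprime(p) + nextprime(p)

# Cryptographic sense: both p - 1 and p + 1 should have a large prime factor,
# so neither the p-1 nor the p+1 method gets an easy, smooth target.
def is_strong_prime_crypto(p, threshold):
    return (isprime(p)
            and largest_prime_factor(p - 1) >= threshold
            and largest_prime_factor(p + 1) >= threshold)

print(is_strong_prime_nt(17))               # True:  17 > (13 + 19) / 2
print(is_strong_prime_crypto(157, 10))      # True:  156 = 2^2*3*13, 158 = 2*79
print(is_strong_prime_crypto(257, 10))      # False: 256 = 2^8 is completely smooth
```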
The case with ANSI and stuff
So you may feel like destroying all your 4096-bit keys, but remember that this algorithm still has a runtime bottleneck of O(B * log(B) * log^2(n)), so if you have to increase B several times you’ve fallen back into the “computationally infeasible to crack” pit again. Still, that is not a real justification on its own, so the ANSI-RSA standard required implementations to use strong primes.
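For a rough feel of what that bound means in practice, a back-of-the-envelope calculation (the concrete B values are just numbers I plugged in):

```python
from math import log2

def p_minus_1_cost(B, modulus_bits):
    # Rough operation count following O(B * log(B) * log^2(n)), with log(n) ~ modulus bits
    return B * log2(B) * modulus_bits ** 2

# For a 4096-bit modulus the work explodes as the required smoothness bound B grows
for B in (10**6, 10**9, 10**12):
    print(f"B = {B:>13}: ~{p_minus_1_cost(B, 4096):.2e} operations")
```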
However
Even more efficient factorization algorithms have shown up since (tutorials on those later), so several other “standards” (insert the famous xkcd here) advise that using strong primes is just overhead, since they can’t provide security against the newest weapons anyway.