WTF is RFC5114?
We all know the importance of large prime numbers in cryptography, given that finding the prime factorization of a large number of the form n = pq requires an unreachable amount of computational resources.
But why does this RFC enforce a secure DH group (used not just in encryption but in zero-knowledge systems too, like FFS or Chaum-Pedersen), and why does the ANSI-RSA standard require “strong” primes on top of this? (There is a dispute about this, more below.)
So as you already know, RSA begins with finding a modulus N by multiplying two large primes.
But we all know that modular arithmetic is a really friendly area of math, given that congruence modulo a number is an equivalence relation, so we can do almost any kind of magic with it without breaking the result.
There are several examples of this magic, e.g. Fermat’s little theorem, which states that a^(p-1) ≡ 1 (mod p) for any prime p and any a not divisible by p. The consequence we care about: if a^M is congruent to 1 modulo a prime factor p of n, then gcd(a^M - 1, n) will be divisible by that factor.
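Just to see this magic in action before we get to the algorithm, here is a throwaway Python check (the numbers 13, 23 and the exponent are mine, picked purely for illustration):

```python
from math import gcd

# Fermat's little theorem: a^(p-1) ≡ 1 (mod p) when the prime p does not divide a
p, a = 13, 2
print(pow(a, p - 1, p))            # 1

# The consequence Pollard exploits: n = 13 * 23 = 299 and M is a multiple of p - 1,
# so a^M ≡ 1 modulo the factor 13, and that factor shows up in the gcd below.
n, M = 13 * 23, 12 * 5
print(gcd(pow(a, M, n) - 1, n))    # 13, a nontrivial factor of 299
```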
John Pollard came up with a special-purpose factorization algorithm that uses exactly this magic, called p-1. It happens that the basic RSA “rules” we defined (just pick two large primes) leave enough room for that special case to apply.
The algorithm goes like this (the actual example is from Wikipedia; a runnable sketch follows the walkthrough):
We want to get the prime factorization of 299
- First we pick a number called a “smoothness bound” for 299.
Smooth numbers are a concept written down by Leonard Adleman: a number X is Y-smooth if none of X’s prime factors is greater than Y.
Let this number be 5 for a start.
- Then we take every prime p up to our smoothness bound, raise it to the power floor(log_p(n)), and multiply the results together.
Formally: M is the product of p^floor(log_p(n)) over all primes p up to the bound B.
We selected the bound 5; the primes up to 5 are [2, 3, 5].
floor(log2(299)) = 8
floor(log3(299)) = 5
floor(log5(299)) = 3
So our equation is: M = 2^8 * 3^5 * 5^3
- Then we select a random a coprime to n (so that GCD(a,n) == 1, you can use the Euclidean algorithm here), let this be 2. I recommend always using 2 for IRL RSA cracking, since 2 is always coprime to an RSA modulus (the modulus is odd) and it’s faster to operate on powers of 2 on a CPU.
- We compute GCD(a^M - 1, n), which is 13.
If the result is 1, then a^M - 1 is also coprime to n; we select a larger bound and restart.
If the result is n, we need a smaller bound and restart.
13 is smaller than n and larger than 1, so this is our factor.
299 / 13 = 23, which is prime, so we are done with the factorization.
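Here is how the whole procedure looks as code. This is just a minimal Python sketch of the steps above (the function names are mine and there is zero optimization in it), not the mature implementation linked below:

```python
from math import gcd, log

def small_primes(bound):
    """All primes up to and including bound (a naive sieve is fine at this scale)."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def pollard_p_minus_1(n, bound, a=2):
    """Try to pull a nontrivial factor out of n using the given smoothness bound."""
    # M = product over primes p <= bound of p^floor(log_p(n))
    M = 1
    for p in small_primes(bound):
        M *= p ** int(log(n, p))
    d = gcd(pow(a, M, n) - 1, n)
    if d == 1:
        return None   # bound was too small: pick a larger one and retry
    if d == n:
        return None   # bound was too large: pick a smaller one and retry
    return d          # 1 < d < n, a nontrivial factor

print(pollard_p_minus_1(299, 5))   # -> 13, as in the walkthrough above
```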
So you may get the overall idea by now: with real-world large primes we won’t have such an easy job, and we’ll typically need to run this procedure several times (adjusting the bound) for a full factorization.
There’s an efficient, mature implementation of p-1 (and of several other factorization algorithms) here: https://gforge.inria.fr/forum/forum.php?forum_id=11510
The solution
Safe primes are primes of the form 2p+1 where p is a prime too. Because the p-1 algorithm’s efficiency depends not on p’s but on p-1’s prime factorization, satisfying this requirement means an attacker cannot pick a workable smoothness bound easily (if q = 2p+1, then q-1 = 2p trivially has a large prime factor, namely p).
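A quick way to check the safe-prime property in code (again just a demo snippet, leaning on sympy’s isprime for the primality tests):

```python
from sympy import isprime

def is_safe_prime(q):
    # q is a safe prime if q = 2p + 1 with p prime; then q - 1 = 2p has the large factor p
    return isprime(q) and isprime((q - 1) // 2)

print(is_safe_prime(23))   # True:  23 = 2*11 + 1, so 22 = 2 * 11
print(is_safe_prime(29))   # False: 28 = 2^2 * 7, only small factors
```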
Strong primes take this even further. In number theory these are primes p_n greater than the arithmetic mean of their neighboring primes, formally p_n > (p_(n-1) + p_(n+1)) / 2. In the cryptographic sense, which is the one the ANSI standard cares about, a strong prime p is one where both p-1 and p+1 have a large prime factor. We need these because a p+1 factorization method (due to Williams) and even a rho algorithm (due to Pollard) showed up later.
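To make the two senses of “strong” concrete, here is a small demo (the threshold of 10 and the example primes are arbitrary choices of mine, and sympy does the factoring):

```python
from sympy import factorint, isprime, nextprime, prevprime

def largest_prime_factor(m):
    return max(factorint(m))          # factorint returns {prime: exponent}

# Number-theory sense: p is strong if it beats the mean of its neighboring primes.
def is_strong_prime_nt(p):
    return isprime(p) and 2 * p > prevprime(p) + nextprime(p)

# Cryptographic sense: both p - 1 and p + 1 should have a large prime factor,
# so neither the p-1 nor the p+1 method gets an easy, smooth target.
def is_strong_prime_crypto(p, threshold):
    return (isprime(p)
            and largest_prime_factor(p - 1) >= threshold
            and largest_prime_factor(p + 1) >= threshold)

print(is_strong_prime_nt(17))               # True:  17 > (13 + 19) / 2
print(is_strong_prime_crypto(157, 10))      # True:  156 = 2^2*3*13, 158 = 2*79
print(is_strong_prime_crypto(257, 10))      # False: 256 = 2^8 is completely smooth
```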
The case with ANSI and stuff
So you may feel like destroying all your 4096-bit keys, but remember that this algorithm still has a runtime bottleneck of O(B * log(B) * log^2(n)), so if you have to increase B several times you’ve fallen back into the “computationally infeasible to crack” pit again. Still, that is not a real justification on its own, so the ANSI-RSA standard required implementations to use strong primes.
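For a rough feel of what that bound means in practice, a back-of-the-envelope calculation (the concrete B values are just numbers I plugged in):

```python
from math import log2

def p_minus_1_cost(B, modulus_bits):
    # Rough operation count following O(B * log(B) * log^2(n)), with log(n) ~ modulus bits
    return B * log2(B) * modulus_bits ** 2

# For a 4096-bit modulus the work explodes as the required smoothness bound B grows
for B in (10**6, 10**9, 10**12):
    print(f"B = {B:>13}: ~{p_minus_1_cost(B, 4096):.2e} operations")
```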
However
Even more efficient factorization algorithms have shown up since (tutorials on those later), so several other “standards” (insert the famous xkcd here) advise that using strong primes is just overhead, since they can’t provide security against the newest weapons anyway.