Leapfrogging to Find Primes and Factor Composites (the Inverse Sieve of Eratosthenes):

By Jeremy A.Smith BSc (Hons)

The code

The Lisp version (heavily commented, the recommended version)
The C++ version (not very well commented, only recommended to illustrate that it can be a very fast algorithm)

The Basics

The basic idea is that addition is by its nature more computationally efficient than multiplication or division, as both of the latter use many additions or subtractions. Also, using variables for each necessary prime is more efficient on memory than keeping track of every non-prime up to this point and every multiple of every prime. If we can generate prime numbers with the storage of just a 128-bit number (or larger) for each prime under the square root, we can reduce the time required to generate a prime number, from thousands of divisions, to just thousands of additions, which are roughly equivalent to one division with larger numbers. We can also reduce the memory by a huge factor.

The Sieve of Eratosthenes is fast and efficient up to a point, but the memory required to store the primality of 4 billion numbers is, representing each number with a single memory bit, approximately 512 megabytes. Modern desktop computers come with about this amount of RAM. However, for numbers above 4 billion, this becomes a prohibitive requirement.

As an example, the Leapfrogging algorithm can reliably find approximately 17,000 prime numbers per 5-second pass, above 1 trillion (which is 1,000,000,000,000), with a penalty of about 120 megabytes of memory, on a 1.2ghz Pentium 3. This is a rough implementation written in Clisp, a Common Lisp package. The author estimates that the memory requirement could be decreased by a factor of 10, and the speed by a factor of 100, in a C++ version using a bignum library.

In comparison, finding prime numbers up to and over 1 trillion, each number being stored as 1-bit, would require 125,000,000,000 bytes (about 125 gigabytes) of RAM to process using the Sieve of Eratosthenes.

As a bonus, the Leapfrog algorithm can factor composites it has processed, with no speed penalty, by associating a numeric entry in the MiniTable vector with a list of the primes that stepped onto this number and displaying them when the number is discarded from the MiniTable.

It is possible that existing algorithms are more useful than Leapfrog. The author is unsure whether using Elliptic-Curve factoring on each number above (say) 1 trillion would be faster than Leapfrog. But if a C++ implementation is 100 times faster, then it is doubtful whether any other method could go as fast as finding 1.7 million guaranteed prime numbers every 5 seconds with such a modest memory requirement. Certainly, the Leapfrog is a faster method than testing every new number against a huge primorial, which gets incredibly slow and will not easily factor composites that it finds.

Finally, Leapfrog is guaranteed to correctly check numbers in its domain as prime or composite. It is not probabilistic - what you see is most decisively what you get.

Is It Useful?

There are various methods to test for primality with an incredibly low rate of probable innacuracy, so it is not terribly useful to prime number researchers. However, the illustration (below) does make a good educational tool on prime number theory.

An Illustration (the Infinite Draughts Board)

As we can see in this picture:

The primes 2 and 3 are added by default.
The numbers indicate the position along the draughts board, and the coloured lines indicate the movement of the virtual draughts piece labelled with a prime.
Lines emanate from the squares of primes.
New primes with no corresponding line are denoted as primes.
Primorials, where all the lines meet up, are denoted 'P' (this is hard to see).

This image shows how simple the algorithm is, and elegant.

The algorithm, step-by-step

We start with the prime 2, and count up from 9 (3*3 = 9 so this is the start point). If we start with the prime 2 and count up from 9, we start off initialising the primes' variables to as close together as possible. In this case, 2 starts on 8 because round(9/2)=4 and 4*2 is 8. The prime starts on the closest multiple of itself.

The primorial begins as 2*1 = 2.

2 = 8

3 is required, because we've reached a power of 3 (at least a power >= 2). See section [below] for details. We increase the primorial from 2 to 2*3 = 6.

2 = 8
3 = 9

First, we increase 3:

3 = 12

Then we increase 2 until it's >= 12:

2 = [8],10,12

The number we didn't 'hit' with our prime variables was 11. This is a new prime, and is added to the new-primes list (nothing to do with the 'scratch primes' list which currently contains 2 and 3).

Leaping again. Note that the start value is enclosed in square brackets to easily show the gaps:

3 = [12],15
2 = [12],14,16

The number missing here is 13, so this is a new prime.

Leaping again:

3 = [15],18
2 = [16],18

The missing prime here is 17.

3 = [18],21
2 = [18],20,22

The prime here is 19.

3 = [21],24
2 = [22],24

23 is prime.

3 = [24],27
2 = [24],26,28

No primes here! But we have come across the square of a prime.

I haven't bothered up to now, but we need to occasionally add primes to the primes list we are using to leapfrog (not the 'new' primes which are the results). We do this by multiplying the highest prime by 2 and checking for squares in that list, after every leaping. This is 3, so we multiply it by 2. Then, we look at each square in that list:

3*3 = 9
4*4 = 16
5*5 = 25
6*6 = 36

We see that 5*5 is 25, and we have just hopped over 25, so we add the prime to the primes list and set its variable to 25, which is guaranteed to be a multiple of itself. The primorial becomes 30. One more example to show the addition of 5 to the scratch primes list:

5 = [25],30
3 = [27],30
2 = [28],30

First, 29 is prime. But here we also hit the primorial of all the primes to date, which doesn't happen very often (the next one is at 210).

The algorithm continues on until we run out of squared primes, which never happens as there is an infinite number of primes.

Algorithm:

Start with all the prime numbers below sqrt(n) in Primes (eg, 2,3,5)
For each prime number in Primes, set a variable to ((n/prime)-1) * prime
Make a hash-table or vector MiniTable which can associate a number with either a boolean or a list of primes
Start = n
Loop:
- The last prime is, eg, 5 (LastPrime)
- Add the variable of the last prime in Primes to MiniTable as True: MiniTable[LastPrime] = True
- Increment the variable of the last prime in Primes: Variables[LastPrime] += Prime
- Add the variable of the last prime in Primes to MiniTable as True: MiniTable[LastPrime] = True
- For all except the last prime in Primes, in reverse order (eg, 3,2):
  - Variable Last = NIL
  - While the variable of this prime is <= the variable of the last prime (eg, var[3] <= var[5]):
    - Add the variable of this prime to MiniTable as True: MiniTable[Prime] = True
    - Increment the variable of this prime: Variables[Prime] += Prime
    - Add the variable of this prime to MiniTable as True: MiniTable[Prime] = True
    - Set Last to the variable of this prime
- Now we see if we need to expand the domain
  - For Prime from the last prime in Primes (LastPrime) to LastPrime*2:
    - If Prime*Prime >= Start AND Prime*Prime < Last AND Prime is co-prime to Primorial AND Prime <> LastPrime
      - Add Prime to the end of Primes
      - Primorial = Primorial * Prime
      - MiniTable[Prime*Prime] = True
      - Initialise the variable for this new prime: Variables[Prime] = Prime*Prime
- Now we look for primes
  - For i from Start to Last
    - If Not MiniTable[i]
      - Add i to the new-primes list
    - If MiniTable[i]
      - Remove i from MiniTable to save memory
- Start = Last + 1

Notes for this paper

I have University experience in mathematics, but my degree is not in maths.