GPU : A Giga@lobal Processing Unit ?

The discrete logarithm problem

Description of the plugin crypto.dll

One-way functions are functions that are very easy to compute in one direction, but very difficult to invert. One of the simplest functions with that one-way property is the power modulo a number.
2 to the power of 5 is 32. Modulo is the simple operation that gives the remainder of a division between two numbers, e.g. 32 modulo 19 is 13.
2 to the power of 5 modulo 19 is 13.
2^5 mod 19 = 13

Easy way round

There are two ways offered by GPU to compute the power modulo a number.
Switch to the Computing page and type "2,5,19,powmod’, then click on "Compute locally" and see if the result matches. "powmod" is implemented in this way: a loop multiplying 2 for 5 times with itself; each time the modulo operation is performed. The source code of "powmod" can be found in basicdll.pas.

It turns out that there is a more efficient way to compute this special power. It uses the property that (2^3)^4 = 2^(3*4) = 2^12 . The previous plugin used n multiplications, if it were computing 2^n mod x. The plugin implemented in cryptodll.pas uses an algorithm called "Square and multiply". For the same task with n as power, only about logarithm with base 2 of n loops are needed. Try the following task "2,5,19,squareandmul" in GPU. Refer to cryptodll.pas to see how the Square and Multiply algorithm works in detail.

To compare the two plugins, measure how long it takes '11,60000000,5999,squareandmul' compared to '11,60000000,5999,powmod'.

Another weakness of "powmod", is that it can"t handle big numbers. Big jobs may result in different results, especially if you choosed a modulo that is higher than 32 bits.

Difficult way round

Now let"s take it the other way round. If someone asks you the following: how many times should I multiply 2 if I want to get 13 modulo 19? Ok, you know from the previous example, that the right answer is 5.

If you forgot it, you could just try to build a table and see where the number 13 comes up in the right column. But, good look if the modulus is 70383776563201.

For a computer, the approach of building such a table is too expensive in time and memory space. Here the one-way property shows itself clearly.

For this reason, GPU implements an algorithm that tries to solve the problem using the baby step / giant step algorithm. This algorithm uses only a table with a size of the square root of n, where n is the modulus. For further information, refer to the source code in the plugin cryptodll.pas. The function name is "discretelog".

Let us ask GPU to perform the computation: this is the packet sent out and it has a lot of parameters, explained later. Type "13,2,19,4,0,discretelog". Again, click on the "Compute locally" and see if the result is 5. The result 5 is called the discrete logarithm of 13.

You encountered the first three parameters above ("13,2,19"). In cryptography 13 is the most important parameter. Normally the other parameters are given from cryptographic algorithms, and what we search for is only the discrete logarithm. 2 is sometimes called the generator of the cyclic group, or in other texts, the primitive root, because it generates the integers between 1 and 18, if we exponentiate 2 with numbers between 1 and 18 modulo 19. Note that if we compute 2^19 mod 19, we get again 2, which is the same as we compute 2^1. This is the reason why this particular mathematical structure is called cyclic group.

Elements of the group are the ones between 1 and 18, and the order they come in the structure is given by the operation 2^x mod 19 performed over the numbers between 1 and 18.

The following table should clarify the situation (the table can be computed using squareandmul or powmod):

Better said, the result 5 is the discrete logarithm of 13 with primitive root 2 over the cyclic group with modulo 19 and multiplication as operator.

The order of the group is 18, because there are 18 different elements that can be generated. Please note in decyphering that it is difficult to choose a primitive root, and size of the group in order to get them to generate a whole group. There are some constraints for the modulo operator (19). The modulo number has to be relative prime to all other elements in the group. E.g if you choose 2 as primitive root, and 16 as modulo operator, you won"t get all numbers between 1 and 15 to be generated. The table here explains the situation :

This desaster happens because 16 and the group element 4 have the factor 2 in common. In conclusion, the modulus p has to be prime, in order to generate all numbers between 1 and p-1.

How to start a big computation

Before starting the big computation the last two parameters in "13,2,19,4,0,discretelog" need to be explained. The baby step, giant step algorithm uses a system with milestones. Milestones are simply elements of the group (defined in the section “Difficult way round”) in a fixed distance between them.

So in the first table milestones could be the bold ones: 2,4,8,16,13,7,14,9,17,15,11,3,6,12,5,10,1

Now the parameter 4 explains itself: this is the number of milestones the program creates.
The distance between the milestones is chose by the plugin itself; it is the square root of n, where n is the modulus. You shouldn't specify more than sqrt(n) milestones, or else the sorting algorithm won't work, because the algorithm generates duplicates and indexes are not powers of the generator anymore. Our plugin checks this, and if the number is too big, it defaults to sqrt(n).

We can now imagine computations where the memory space of one machine is not enough to contain all milestones. In fact, GPU normally doesn't digest more than one million milestones.

The idea is to choose at random a starting point, in the big circle built by the elements of the group, if they are ordered using the power function. The perimeter of the circle is named with real numbers from 0 to 1. In other words we normalize the element index (that is a number between 1 and the order of the group) to a real number between 0 and 1.

"0" in "13,2,19,4,0,discretelog" defines as starting point the first number in the table.
"0.9999" would define as starting point the last number in the table.

We now try the following packet
"13,2,19,2,rnd,discretelog". "rnd" is a function that returns a random real number between 0 and 1. By clicking on the "Compute locally" many times, sometimes one finds the result 5, but sometimes 0. By working with only two milestones the program doesn"t always find the discrete logarithm and returns 0.

Computing this packet "11412647538025,11,70383776563201,1000000,rnd,discretelog" will take about 10 minutes on a 1GHz machine (August 2002). Here we use one million milestones, although seven million (sqrt(70383776563201)=7 million) are needed to cover the whole circle of the huge cyclic group. So about 70 minutes are needed to solve the task.

Using rnd as parameter we get an easy way to parallelize the Baby step / giant step algorithm. By sending the above packet to a small network of GPUs, they will start to generate from a random starting point a part of the table needed to find out the discrete logarithm. Most of GPUs will respond with a "0" because they didn't find the discrete logarithm. However, soon or later, a GPU will generate a table that covers the discrete logarithm and will send back the correct result.

The logo was designed by Mark Grady. Graphics are by David A. Lucas.
© 2002-2003 the GPU Development team

Page generated in 1.54 milliseconds by PHP 4.4.3-dev on