Saturday, 5 December 2015

Primality Test, another problem

In previous article we looked into the problem on sorting, now we are going to look into another problem, the prime numbers.

What is it ?
A Prime Numbers is a natural number greater than 1 which has no positive divisor except 1 and itself. For example, 2,3,5,7,11,13,17,19,23 etc are prime numbers.

Why are we considering it here ?
One simple answer is that, to start off with a simple problem which is quite easy to understand
Another way to look at it is that, Prime Numbers are really very much useful in today's life and is a great tool in computer security and hashing functions.

How can a special set of number be a tool ?
Well, to understand this, we have to know a little bit of security. In today's world, many of use GMail, Facebook, Twitter, Online Bank accounts and one of the most common parts in all of these is security, you have to provide your credentials in the web page and if it is correct, you gain access to the portal. Also, one more thing to note is that, your bank account details are only visible to yourself. These all are handled through a special kind of study in computer science, known as Cryptography. In the field of cryptography, Prime Numbers play a major role.
Another major application of prime numbers are in the field of hashing, which basically is devoted to generate a unique piece of information. In case of  creating a unique hash, prime numbers are quite useful.
So, we'll be looking at the prime numbers in this article.

So, what's special in it ?
Obviously you are thinking that, 'It is very easy to find a prime number, you just have to check if it divides by any other integer or not.'
Well, the process is very easy if you have to think of a prime number in a small range. But, it is quite hard to find a prime number in large ranges. Cause, as the number goes up, the density of primes goes less.
Let's take an example to see how it looks like on increasing range and the number of primes in that range.

For numbers upto 10, number of primes are 4

For numbers upto 100, number of primes are 25

For numbers upto 1000, number of primes are 168

For numbers upto 10000, number of primes are 1229

For numbers upto 100000, number of primes are 9592

For numbers upto 1000000, number of primes are 78498

For numbers upto 10000000, number of primes are 664579

The statistics show you how the density of prime numbers go less in higher range.

So, you can think, if you have to find a prime number in higher range you have to do a lot of jobs and most probably you are not going to do that manually. We are most probably going to employ a computer to do that.

So, we are in need to find an efficient algorithm to deal with a higher range of numbers.

OK, let's start with a very basic algorithm of finding prime numbers which checks all numbers from 2 to n-1 as a proper divisor of a number n.

Process - 1
        public static boolean isPrime1(int num) {
  for (int i = 2; i < num; i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }

For example, to check primality of 36, we check with,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35

But if we check the comparisons, we can observe that there are some redundant checks in the system. Let's take out the divisors of 36, i.e 2, 3, 4, 6, 9, 12, 18.

So, we can observe that, we can improve the process by eliminating the higher order numbers. Simply we can check for the first half of n. Here is an implementation,

Process - 2
        public static boolean isPrime1(int num) {
  for (int i = 2; i < (num / 2); i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }


Now let's check if we can further improve the process. OK, let's prepare a table now,
2*18 = 18*2 = 36
3*12 = 12*3 = 36
4*9 = 9*4 = 36
6*6 = 6*6 = 36

So, we can check only upto 6, 36. In fact, not only 36, this happens to be true for any number. The divisors repeat themselves after n. Let's see an implementation here,

Process - 3
        public static boolean isPrime1(int num) {
  for (int i = 2; i * i <= num; i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }

Now, as per the patterns of prime numbers, we can see that any prime number can be expressed as (6*k + i). So, we can further improve our process by eliminating some of the numbers from the set of 2 to n. Now, we can check only for a number by 2, 3 and any form of 6k ± 1 \scriptstyle{}\leq\sqrt n. This process takes almost one third of the time than the first method to check for all numbers upto n. Let's look at the following implementation,

Process - 4
       public static boolean isPrime(int num) {
  if (num <= 1) {
   return false;
  } else if (num <= 3) {
   return true;
  } else if (num % 2 == 0 || num % 3 == 0) {
   return false;
  }
  int i = 5;
  while (i * i <= num) {
   if (num % i == 0 || num % (i + 2) == 0) {
    return false;
   }
   i = i + 6;
  }
  return true;
 }

Let's take an example here,

Let's say we have to check for the primality of number 9973, now according to process - 4, the following happens,
 Process - 4 Demonstration for 9973
Initially, 9973 is checked against 2 & 3. Check fails

Then we use a counter starting from 5,
9973 is checked against 5, 7 
Counter incremented to 11

9973 is checked against 11, 13 
Counter incremented to 17

9973 is checked against 17, 19 
Counter incremented to 23

9973 is checked against 23, 25 
Counter incremented to 29

9973 is checked against 29, 31 
Counter incremented to 35

9973 is checked against 35, 37 
Counter incremented to 41

9973 is checked against 41, 43 
Counter incremented to 47

9973 is checked against 47, 49 
Counter incremented to 53

9973 is checked against 53, 55 
Counter incremented to 59

9973 is checked against 59, 61 
Counter incremented to 65

9973 is checked against 65, 67 
Counter incremented to 71

9973 is checked against 71, 73 
Counter incremented to 77

9973 is checked against 77, 79 
Counter incremented to 83

9973 is checked against 83, 85 
Counter incremented to 89

9973 is checked against 89, 91 
Counter incremented to 95

9973 is checked against 95, 97 
Counter incremented to 101
The numbers marked in red are composite numbers. So, we could have safely ignored those comparisons. Cause we already had completed those checks. Because, a composite is the result of two prime number multiplication. So, if a number is divided by any composite number, the same goes true for the divisors of that composite number.
So, we can observe that, there are 9 redundant comparisons. So, we could have ignored 9 comparisons out of total 34 comparisons.

 Now, let's take another example of another prime number 99991. Let's see what happens,
Process - 4 Demonstration for 99991
Initially, 99991 is checked against 2 & 3. Check fails

Then we use a counter starting from 5,
99991 is checked against 5, 7 
Counter incremented to 11
99991 is checked against 11, 13 
Counter incremented to 17
99991 is checked against 17, 19 
Counter incremented to 23
99991 is checked against 23, 25 
Counter incremented to 29
99991 is checked against 29, 31 
Counter incremented to 35
99991 is checked against 35, 37 
Counter incremented to 41
99991 is checked against 41, 43 
Counter incremented to 47
99991 is checked against 47, 49 
Counter incremented to 53
99991 is checked against 53, 55 
Counter incremented to 59
99991 is checked against 59, 61 
Counter incremented to 65
99991 is checked against 65, 67 
Counter incremented to 71
99991 is checked against 71, 73 
Counter incremented to 77
99991 is checked against 77, 79 
Counter incremented to 83
99991 is checked against 83, 85 
Counter incremented to 89
99991 is checked against 89, 91 
Counter incremented to 95
99991 is checked against 95, 97 
Counter incremented to 101
99991 is checked against 101, 103 
Counter incremented to 107
99991 is checked against 107, 109 
Counter incremented to 113
99991 is checked against 113, 115 
Counter incremented to 119
99991 is checked against 119, 121 
Counter incremented to 125
99991 is checked against 125, 127 
Counter incremented to 131
99991 is checked against 131, 133 
Counter incremented to 137
99991 is checked against 137, 139 
Counter incremented to 143
99991 is checked against 143, 145 
Counter incremented to 149
99991 is checked against 149, 151 
Counter incremented to 155
99991 is checked against 155, 157 
Counter incremented to 161
99991 is checked against 161, 163 
Counter incremented to 167
99991 is checked against 167, 169 
Counter incremented to 173
99991 is checked against 173, 175 
Counter incremented to 179
99991 is checked against 179, 181 
Counter incremented to 185
99991 is checked against 185, 187 
Counter incremented to 191
99991 is checked against 191, 193 
Counter incremented to 197
99991 is checked against 197, 199 
Counter incremented to 203
99991 is checked against 203, 205 
Counter incremented to 209
99991 is checked against 209, 211 
Counter incremented to 215
99991 is checked against 215, 217 
Counter incremented to 221
99991 is checked against 221, 223 
Counter incremented to 227
99991 is checked against 227, 229 
Counter incremented to 233
99991 is checked against 233, 235 
Counter incremented to 239
99991 is checked against 239, 241 
Counter incremented to 245
99991 is checked against 245, 247 
Counter incremented to 251
99991 is checked against 251, 253 
Counter incremented to 257
99991 is checked against 257, 259 
Counter incremented to 263
99991 is checked against 263, 265 
Counter incremented to 269
99991 is checked against 269, 271 
Counter incremented to 275
99991 is checked against 275, 277 
Counter incremented to 281
99991 is checked against 281, 283 
Counter incremented to 287
99991 is checked against 287, 289 
Counter incremented to 293
99991 is checked against 293, 295 
Counter incremented to 299
99991 is checked against 299, 301 
Counter incremented to 305
99991 is checked against 305, 307 
Counter incremented to 311
99991 is checked against 311, 313 
Counter incremented to 317
Now, here we can observe that, we could have ignored 39 comparisons out of total 106 comparisons.

In fact, as the range goes on increasing, the number of redundant checks also keep on increasing.

Although Process - 4 is quite efficient and takes less number of comparisons, we can even speed up the process. We can further improve the process by eliminating the composite numbers upto n. If we can have a set of prime numbers within the range of n, we can further eliminate some more comparisons. The only drawback here is that, we have to mark a limit of the check. Let's say, we are having a set of all prime numbers upto 500. So, we can check primality for a number upto 250000.

So, we can see that, using different approaches, we can find the solution of a problem and also we found that using some mathematics, we can improve the approach.

But a million dollar question arises just at this point, 'How can you define that Process - 4 is better than Process - 1 ? Is it true that Process - 4 is the best one ?'

Well, to find answer of this question, we've to use some mathematics again, we will discuss over the process in our subsequent discussion known as 'Asymptotic Analysis'.

Till then, you can check with some prime numbers, generate series of prime numbers.
To ease your task, I have created an online tool. You can definitely take a look and play with this tool, Prime Series Generator.

Hope, you enjoyed this article, thanks for reading !!!

Prev     Next
Palash Kanti Kundu

No comments:

Post a Comment