Wednesday, 9 December 2015

Asymptotic Analysis - Part 1

Well, in our previous discussions, we found that how it is useful to study about algorithms. Now, let's dig down deeper. First, we want to know what does it mean to analyse an algorithm and  how do we do that ?

Analysis of Algorithms:  In computing, it is useful to know the complexity of any program will lead to solutions of some interesting questions like,

  • How long does it take to run a program with the given input set ?
  • How much space does it require to complete the problem ?
  • Is the problem solvable ?
Well, what do we do that having all those information ?
A good question indeed. To answer this question, let me take a hypothetical scenario of cleaning your house. You are in charge of cleaning your house and you have got some money in your hand to spend.

Now, what are the considerations, you want to do ?
  • How much time it should take to clean the house ?
  • What are the tools you should be using ?
  • When cleaning a room, do you need to shift all the furnitures in other room ?
  • Do you have enough space to shift the furnitures to other location ?
  • Is ir even possible to do that ?
So, you see, you are in a pretty good situation to take a decision on how to do that.

OK, let  me put some constraint on it, you have just one day to do that.

So, you are most probably to think in the direction where you will be vacating each room and clean that and again deploy all the furnitures you had in the room earlier.

Now, let me change the scenario, you have plenty of time and you have no extra room to clean your house, in that scenario, you mostly think of rearranging furnitures in the room and clean a part of the room and continue the stuff until the whole room gets clean and then you move to other room.

Now, let's put some serious condition that you have only one day and no extra room to clean your house.

What about this one ???

Well, its pretty complex to do that. I have to find a better way to do that.
Well, think of it...



Cool, I got some idea to do that, I will move my bed to ...
Congratulations !!! You have successfully found out the most efficient algorithm to clean your house.

Hey, but what does that relate to algorithm and computer science ? We do have plenty of resources nowadays. Quite fast CPU, Gigabytes of RAM. Why do we even care for time and space ?
Well, even with so fast computers and lots of resources, we are still in need of efficient algorithms and perform analysis of algorithms,

Let's talk about why we do that,

I had run a sorting algorithm which runs in a quadratic time (order of n2). So, as the input size grows larger, the running time grows quadratically and I had run the same sorting with another algorithm of order nlog(n).

If you take a look on previous article, you will find a table with run time of two different sorting algorithms with the same input set. You can obviously understand that, with 2.9 GHZ Processor Speed and dedicated 3 GB of RAM (Yes, I used 3 GB as the JVM Parameter for that program to run), the second algorithm (order of n2) failed to sort the numbers even after 9 days of execution while the first algorithm(order of nlogn), did that  in 8 seconds with an input size of only 50000000 numbers.

And sometimes, getting a correct answer late is considered as a wrong answer.

Well, that was a sort of test run against running time we can also have another parameter to test, the space usage. Many people use smartphones for day to day activities, may be some of you are reading this article on a smartphone or tablet. Many of the smartphones have less main memory than a desktop or laptop but in some scenarios, we are yet to perform the same set of operations on these devices too. So, in times, we have to think of the space usage along with the running time of the algorithms.

So, these are the basic needs of performing an analysis of algorithms where we take decisions which way to go. The situation is something like the below, where we have three sorting algorithms.

Conventions: Shorter line length means shorter running time, thicker line means more space usage.


So, we can see, that the first algorithm sorts the array in a long time with least memory usage and the third algorithm is fastest but it takes a greater memory footprint. While the second algorithm is efficient with a reasonable memory footprint. So, you need to understand, how much data is there, what are the expectation from the system.

That's all behind the decision making.

Now, we are in a good shape to understand that Analysis of Algorithms are necessary. Now let's see, how we can do that.

When analysing algorithms, we consider running time and space usage of a particular algorithm. For the time being, we'll just consider about the running time.
Now, to do that, we analyse on input size instead of,

  1. Running algorithms and measuring time, because for different machine the same algorithm can perform differently. Also, the programming language plays a key role in this.
  2. Count the steps in the  implementation, because the programming styles are different for different programmers.
As an alternative, we use Asymptotic Analysis to determine the complexity of an algorithm. Let's see what does this mean,

Asymptotic Analysis: Asymptotic  Analysis is a mathematical analysis which describes limiting behaviour.

In simple English, we can define that, it is a method of determine the rate of growth of running time as the input size grows.

So, we can tell that, if an algorithm runs in quadratically, then as the input grows in a linear fashion, the running time increases quadratically.

Well, that's all for now, we'll look into more depth in our subsequent articles.

Hope you enjoyed this article, happy learning...

Saturday, 5 December 2015

Primality Test, another problem

In previous article we looked into the problem on sorting, now we are going to look into another problem, the prime numbers.

What is it ?
A Prime Numbers is a natural number greater than 1 which has no positive divisor except 1 and itself. For example, 2,3,5,7,11,13,17,19,23 etc are prime numbers.

Why are we considering it here ?
One simple answer is that, to start off with a simple problem which is quite easy to understand
Another way to look at it is that, Prime Numbers are really very much useful in today's life and is a great tool in computer security and hashing functions.

How can a special set of number be a tool ?
Well, to understand this, we have to know a little bit of security. In today's world, many of use GMail, Facebook, Twitter, Online Bank accounts and one of the most common parts in all of these is security, you have to provide your credentials in the web page and if it is correct, you gain access to the portal. Also, one more thing to note is that, your bank account details are only visible to yourself. These all are handled through a special kind of study in computer science, known as Cryptography. In the field of cryptography, Prime Numbers play a major role.
Another major application of prime numbers are in the field of hashing, which basically is devoted to generate a unique piece of information. In case of  creating a unique hash, prime numbers are quite useful.
So, we'll be looking at the prime numbers in this article.

So, what's special in it ?
Obviously you are thinking that, 'It is very easy to find a prime number, you just have to check if it divides by any other integer or not.'
Well, the process is very easy if you have to think of a prime number in a small range. But, it is quite hard to find a prime number in large ranges. Cause, as the number goes up, the density of primes goes less.
Let's take an example to see how it looks like on increasing range and the number of primes in that range.

For numbers upto 10, number of primes are 4

For numbers upto 100, number of primes are 25

For numbers upto 1000, number of primes are 168

For numbers upto 10000, number of primes are 1229

For numbers upto 100000, number of primes are 9592

For numbers upto 1000000, number of primes are 78498

For numbers upto 10000000, number of primes are 664579

The statistics show you how the density of prime numbers go less in higher range.

So, you can think, if you have to find a prime number in higher range you have to do a lot of jobs and most probably you are not going to do that manually. We are most probably going to employ a computer to do that.

So, we are in need to find an efficient algorithm to deal with a higher range of numbers.

OK, let's start with a very basic algorithm of finding prime numbers which checks all numbers from 2 to n-1 as a proper divisor of a number n.

Process - 1
        public static boolean isPrime1(int num) {
  for (int i = 2; i < num; i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }

For example, to check primality of 36, we check with,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35

But if we check the comparisons, we can observe that there are some redundant checks in the system. Let's take out the divisors of 36, i.e 2, 3, 4, 6, 9, 12, 18.

So, we can observe that, we can improve the process by eliminating the higher order numbers. Simply we can check for the first half of n. Here is an implementation,

Process - 2
        public static boolean isPrime1(int num) {
  for (int i = 2; i < (num / 2); i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }


Now let's check if we can further improve the process. OK, let's prepare a table now,
2*18 = 18*2 = 36
3*12 = 12*3 = 36
4*9 = 9*4 = 36
6*6 = 6*6 = 36

So, we can check only upto 6, 36. In fact, not only 36, this happens to be true for any number. The divisors repeat themselves after n. Let's see an implementation here,

Process - 3
        public static boolean isPrime1(int num) {
  for (int i = 2; i * i <= num; i++) {
   if (num % i == 0) {
    return false;
   }
  }
  return true;
 }

Now, as per the patterns of prime numbers, we can see that any prime number can be expressed as (6*k + i). So, we can further improve our process by eliminating some of the numbers from the set of 2 to n. Now, we can check only for a number by 2, 3 and any form of 6k ± 1 \scriptstyle{}\leq\sqrt n. This process takes almost one third of the time than the first method to check for all numbers upto n. Let's look at the following implementation,

Process - 4
       public static boolean isPrime(int num) {
  if (num <= 1) {
   return false;
  } else if (num <= 3) {
   return true;
  } else if (num % 2 == 0 || num % 3 == 0) {
   return false;
  }
  int i = 5;
  while (i * i <= num) {
   if (num % i == 0 || num % (i + 2) == 0) {
    return false;
   }
   i = i + 6;
  }
  return true;
 }

Let's take an example here,

Let's say we have to check for the primality of number 9973, now according to process - 4, the following happens,
 Process - 4 Demonstration for 9973
Initially, 9973 is checked against 2 & 3. Check fails

Then we use a counter starting from 5,
9973 is checked against 5, 7 
Counter incremented to 11

9973 is checked against 11, 13 
Counter incremented to 17

9973 is checked against 17, 19 
Counter incremented to 23

9973 is checked against 23, 25 
Counter incremented to 29

9973 is checked against 29, 31 
Counter incremented to 35

9973 is checked against 35, 37 
Counter incremented to 41

9973 is checked against 41, 43 
Counter incremented to 47

9973 is checked against 47, 49 
Counter incremented to 53

9973 is checked against 53, 55 
Counter incremented to 59

9973 is checked against 59, 61 
Counter incremented to 65

9973 is checked against 65, 67 
Counter incremented to 71

9973 is checked against 71, 73 
Counter incremented to 77

9973 is checked against 77, 79 
Counter incremented to 83

9973 is checked against 83, 85 
Counter incremented to 89

9973 is checked against 89, 91 
Counter incremented to 95

9973 is checked against 95, 97 
Counter incremented to 101
The numbers marked in red are composite numbers. So, we could have safely ignored those comparisons. Cause we already had completed those checks. Because, a composite is the result of two prime number multiplication. So, if a number is divided by any composite number, the same goes true for the divisors of that composite number.
So, we can observe that, there are 9 redundant comparisons. So, we could have ignored 9 comparisons out of total 34 comparisons.

 Now, let's take another example of another prime number 99991. Let's see what happens,
Process - 4 Demonstration for 99991
Initially, 99991 is checked against 2 & 3. Check fails

Then we use a counter starting from 5,
99991 is checked against 5, 7 
Counter incremented to 11
99991 is checked against 11, 13 
Counter incremented to 17
99991 is checked against 17, 19 
Counter incremented to 23
99991 is checked against 23, 25 
Counter incremented to 29
99991 is checked against 29, 31 
Counter incremented to 35
99991 is checked against 35, 37 
Counter incremented to 41
99991 is checked against 41, 43 
Counter incremented to 47
99991 is checked against 47, 49 
Counter incremented to 53
99991 is checked against 53, 55 
Counter incremented to 59
99991 is checked against 59, 61 
Counter incremented to 65
99991 is checked against 65, 67 
Counter incremented to 71
99991 is checked against 71, 73 
Counter incremented to 77
99991 is checked against 77, 79 
Counter incremented to 83
99991 is checked against 83, 85 
Counter incremented to 89
99991 is checked against 89, 91 
Counter incremented to 95
99991 is checked against 95, 97 
Counter incremented to 101
99991 is checked against 101, 103 
Counter incremented to 107
99991 is checked against 107, 109 
Counter incremented to 113
99991 is checked against 113, 115 
Counter incremented to 119
99991 is checked against 119, 121 
Counter incremented to 125
99991 is checked against 125, 127 
Counter incremented to 131
99991 is checked against 131, 133 
Counter incremented to 137
99991 is checked against 137, 139 
Counter incremented to 143
99991 is checked against 143, 145 
Counter incremented to 149
99991 is checked against 149, 151 
Counter incremented to 155
99991 is checked against 155, 157 
Counter incremented to 161
99991 is checked against 161, 163 
Counter incremented to 167
99991 is checked against 167, 169 
Counter incremented to 173
99991 is checked against 173, 175 
Counter incremented to 179
99991 is checked against 179, 181 
Counter incremented to 185
99991 is checked against 185, 187 
Counter incremented to 191
99991 is checked against 191, 193 
Counter incremented to 197
99991 is checked against 197, 199 
Counter incremented to 203
99991 is checked against 203, 205 
Counter incremented to 209
99991 is checked against 209, 211 
Counter incremented to 215
99991 is checked against 215, 217 
Counter incremented to 221
99991 is checked against 221, 223 
Counter incremented to 227
99991 is checked against 227, 229 
Counter incremented to 233
99991 is checked against 233, 235 
Counter incremented to 239
99991 is checked against 239, 241 
Counter incremented to 245
99991 is checked against 245, 247 
Counter incremented to 251
99991 is checked against 251, 253 
Counter incremented to 257
99991 is checked against 257, 259 
Counter incremented to 263
99991 is checked against 263, 265 
Counter incremented to 269
99991 is checked against 269, 271 
Counter incremented to 275
99991 is checked against 275, 277 
Counter incremented to 281
99991 is checked against 281, 283 
Counter incremented to 287
99991 is checked against 287, 289 
Counter incremented to 293
99991 is checked against 293, 295 
Counter incremented to 299
99991 is checked against 299, 301 
Counter incremented to 305
99991 is checked against 305, 307 
Counter incremented to 311
99991 is checked against 311, 313 
Counter incremented to 317
Now, here we can observe that, we could have ignored 39 comparisons out of total 106 comparisons.

In fact, as the range goes on increasing, the number of redundant checks also keep on increasing.

Although Process - 4 is quite efficient and takes less number of comparisons, we can even speed up the process. We can further improve the process by eliminating the composite numbers upto n. If we can have a set of prime numbers within the range of n, we can further eliminate some more comparisons. The only drawback here is that, we have to mark a limit of the check. Let's say, we are having a set of all prime numbers upto 500. So, we can check primality for a number upto 250000.

So, we can see that, using different approaches, we can find the solution of a problem and also we found that using some mathematics, we can improve the approach.

But a million dollar question arises just at this point, 'How can you define that Process - 4 is better than Process - 1 ? Is it true that Process - 4 is the best one ?'

Well, to find answer of this question, we've to use some mathematics again, we will discuss over the process in our subsequent discussion known as 'Asymptotic Analysis'.

Till then, you can check with some prime numbers, generate series of prime numbers.
To ease your task, I have created an online tool. You can definitely take a look and play with this tool, Prime Series Generator.

Hope, you enjoyed this article, thanks for reading !!!

Prev     Next
Palash Kanti Kundu

Thursday, 26 November 2015

Sorting, the burning problem in hand...

After the introductory articles and preaching about Algorithms, Let's discuss about some practical scenarios where analysis of algorithms plays a key role.

Let's start our discussion with a simple thought. We all know about Dictionary and what it does, right ?

Oh yeah, dictionary is a place where we find the meaning of any word.

Good, now let me ask you the meaning of the word 'Modern'. So, what are you going to do ?

Hey, I will search the dictionary, probably in the middle of the pages. Cause, 'M' is almost in the middle of the dictionary.

Then I will look up at any word, let's say I found the word 'Modest' in that page, then I will look in the previous page, cause 'R' comes before 'S' and accordingly in some other 2/3 steps, I will find the word 'Modern'.

So, that's it, usually anyone in the world will basically do that. Because all of us know that the worrds are arranged in some particular fashion in the dictionary. We basically call this fashion as 'Sorted'.

Now, I would like you to put some overhead on the same problem. Just think that, the dictionary is not sorted at all and may be the dictionary has the following order of words,

Good, Place, Algorithm, Program, Zebra, Work, Bat, Cow, Ball, Billiards, Efficient, Alphabet, Access, Sort, Order, Monotonous, Migration, Immigration, External, Pouch, Internal ...........

Now, I ask you to find the meaning of the word 'Modern' from our this new dictionary.

Ohh Jesus !!! I can't do that. Please forgive me.
Well, that is what happens at a first glance, but if you think quietly, you will realize that you can find the meaning of any word from this dictionary too. Only thing you have to do is, you have to check all the words in the dictionary, until you get the word you are searching for.

Wait, that's basically none of use. If I cannot find a meaning when I require it, what is the use of finding it out ? Rather I will buy a new dictionary instead of using yours inefficient dictionary.
Here is $5 for you to buy the dictionary. ;)

Yes, that is what we basically want, a faster way to make things better and get the required element at ease.

Well, it is true that, 'Getting a right answer late is same as getting a wrong answer'.

That's where lies the beauty of sorting. Mostly, sorting helps in following two ways,
  1.  Easing the process of looking up an element from the pile
  2.  Enhancing the process of merging sequences
So, in today's world data has become huge, here is an article depicting the scenario. Just check the article and get an understanding, how huge the data has become globally. So, to find out the exact information from this data, we need a better and efficient way of searching the exact information from this huge pile of data.

Sorting helps this process in an efficient way. So, to cope up with the large data collection, we need efficient sorting process.

So, sorting has become a burning problem of today and in future also, it will be. Somewhere, I read this quote, 'We need an efficient sorting algorithm every one and half years'.

Hope that, you got an understanding on the usability of sorting. Next, we will be looking into some sorting algorithms and will analyze them.

Prev     Next
Palash Kanti Kundu

Wednesday, 18 November 2015

Motivation to learn about algorithms



If you have taken a look at my previous article, you must have observed that, I have taken some examples of the IT giants like Google, Microsoft, Facebook. So, you might be wondering, all these are related to the giants and they have found it very much useful to devise a new algorithm to solve the problem in hand, I don’t deal with such a huge data in my programs, so probably I don’t need it. Better have a coffee myself instead of reading this article.

Exactly the same I thought when I was pursuing my graduation and ignored this vital part of science. Honestly confessing, until today, I never bothered about this topic and whenever possible I had avoided the discuusion. But today, I found out how useful and interesting this field can be, I also could recall a scenario from my past IT experience. Somewhere there was a code, which was trying to sort an array of records and I remember, it was using a poor algorithm.

At the time of writing the code, the developer might not have thought of a scenario when the size of the incoming array would have crossed a limit this algorithm stands good for. So, when I was working on that piece of code, I faced the issue and the system was taking too long to respond.

No credit for me, I have not fixed the code.
:)

Credit goes to the senior team member who just took a look on the piece of code, removed the code and simply added Arrays.sort() method. I was astonished to see the performance improvement after that but could not figure out what he did on that piece of code.
Well almost after two years of that incident, I have found out the reason today. Basically, Java internally also uses algorithm to solve the case of sorting.

So, what’s the big deal, your code also had an algorithm?
Well, simply Java uses a better algorithm to solve the case of sorting.

So, how do you know about that now after such a long time?
Well, I was writing code for some algorithms and was astonished see the performance boost a good algorithm can provide you. Then I suddenly recalled the event and went to Java Docs to match my answer and my initial guess was correct that Java uses a better algorithm to solve problems.


So, I will be using the standard API, no problem. No need to study Algorithms…
Yes, I encourage you to use the standard API whenever possible. But, in some cases, you may also be stuck with a situation that I faced a long ago and you may not have a senior like you who can fix it in a minute.
To implement things in a better way, you need to figure out the existing worse way, right?

Hmm, makes sense.

Good that you are also in the same page I am into. Well, if you still have a doubt, just check out the following table which is a practical response I have got running two different algorithms on the same system, on the same data set.
Data Size
Run time for Algorithm 1 (in Milliseconds)
Run time for Algorithm 2 (in Milliseconds)
250
0 *
1
500
1
2
1000
0 *
4
1500
1
5
5000
2
26
10000
2
85
50000
11
2381
100000
12
7894
1000000
59
828873
50000000
8075
Running from November 18, 2015. I stopped the execution on November 27, 2015.

* This is inaccurate. It might have completed within a fraction of milliseconds, but definitely it has taken some time to complete but that is really negligible and tends to 0.

So, we can see that as the data size increases, the difference also increases significantly.

Actually, I earlier was stuck in the data range of 50,000. When the developer has wrote this code, he might not have thought of this scenario to be present and a difference between 2 milliseconds and 85 milliseconds is really not a very much notable difference but the difference between 11 milliseconds and 2381 milliseconds is quite observable.
So, you can get some idea why it is better to have knowledge about algorithms.


For your use, I have added all these code in my Git account, https://github.com/Palash90/algorithms.
You can check out the code to run it in your local system and  can play with it using different number sets.


Hope you enjoyed this article…

 Prev   Next
Palash Kanti Kundu

Tuesday, 13 October 2015

Necessity of Algorithms


Algorithms play a great role solving problems in faster and easier ways...

With the introduction, you might be wandering, why the hell you would study another subject and stuff your brain with knowledge...

Well, if I tell you that there are many things you do on a day to day basis, those you do by following an algorithm, then ?

Hmm, baking a cake is following an algorithm, I agree.
What about a more simple stuff like applying shampoo on your hair ?

HOW ???
Well, most of the shampoo container comes with the instruction to wash your hair, something like: wet hair, lather, rinse, repeat

Do you find the consequence and the problem with the algorithm ?
You are most likely to be running out of your shampoo and your time.

How ?
Well, as a human being, you can assume that, you have to repeat just once. So, this works fine for you. But computers have no brain to think or assume. Just take a look on this article.

Consequently, it fails for a computer and you are most likely to avoid this algorithm.

Ohh no, all my works are gone ?
Of course not, you can think better strategy and device a new algorithm.

Can you think an alternative ?
What about this one, wet hair, lather, rinse, repeat once ?

Perfect one.

Feeling happy to device the new algorithm ?

Well,  what did I do ? Can you call it an algorithm ?
Yes, it is a modified algorithm which considers a boundary to the steps to be followed. So, it is an algorithm.

Wow !!!
Yeah, exactly the same feeling I had when I first solved this problem in the days of my algorithm classes.

Seriously, when you think of anything, you get a better way to do it. Not only this simple Shampoo Algorithm, you everyday perform executions of lots of other algorithms as well. Only thing is that, you do not know that it was an algorithm. Again, if you think closely to subject, you can find better algorithms to solve a problem.

So, what are the criteria it should meet to be an algorithm ?
You simply has to find the answers, if you find all the answers as 'YES', it is an algorithm.
  • Is it unambiguous? 
  • Does it have defined inputs and outputs? 
  • Is it guaranteed to terminate? 
  • Does it produce the correct result? 

Now my question to you is, how am I going to be benefited by algorithms ?
Well, there are multiple reasons, you should know about algorithms. When you study about algorithms, your work is influenced by the knowledge you gain. So you will always try to find a better way to do your stuffs.
Let's discuss with some examples,
  • First of all, take the shampoo algorithm. If the wrapper editor would have thought about the problem himself, he was supposed to device the modified algorithm.
  • In many situations, you have to work with collections of data. You can decide which one to use, an Array, a List or a Queue depending on the requirement. Assuming you will be using a list, which one is better in that, a linked list or Hashed Linked List etc.
  • How you achieve a requirement with less number of operations or memory usage.
  • How to device an algorithm for a better unique key design.
And many more you can do with the knowledge of algorithms.


In fact, there are many real time applications that use algorithms to get the output in minimal steps and minimal resource use.
Below are some,
  1. How does Google find billions of search results that too most perfect according to your need ?
    Google use PageRank algorithm.
  2. How does Hangout/Skype sends/receives video over network so fast ?
    They use Compression algorithm.
  3. How does Facebook maintains so many user data and that too perfectly ?
    Facebook use News Feed Algorithm.
So, you see, whatever you think is very complex, there always remains a algorithm to help the stuff and team always to improve their algorithms to stick to the market.


Well, most of the times you will not find a direct impact of Algorithmic study in your coding. Because, almost all the popular language comes with built in API to deal with the stuffs like searching, hashing, sorting etc. But you will find yourself writing a better implementation.

Well, if time permits for you, take a look into this video





Prev   Next

Palash Kanti Kundu