Analysis of Algorithms

 

Ø   An algorithm is a clearly specified set of instructions the computer will follow to solve a problem.  We will look at the time analysis of algorithms, that is, how long an algorithm takes to run as a function of its input.

 

Ø   Start with a simple non-programming example, the stair-counting problem:

Suppose that you and a friend are at the top of a lighthouse and you wonder how many stairs there are to the bottom.  We'll look at three different methods that you could use to answer this question and we will analyze the time requirements of each.


Method 1: Walk down and keep a tally.   In this method, you take a pen and paper and walk down the stairs.  Each time you step down one stair you make a mark on the paper.  When you reach the bottom you run back up to the top of the lighthouse and show the paper to your friend.



Method 2: Walk down and let your friend keep a tally.  In this method you have no pen and paper so your friend keeps a tally by making marks in the dust at the top of the stairs.  You help by giving her a count of the stairs as you go down them; your procedure is as follows: each time you go down a step you lay your hat on that step and run back to your friend telling her to make a mark for that step.  You then run back down to your hat, move it to the next step and repeat the procedure.  This continues until you reach the bottom and you speed back up the steps one more time to look at the count.


Method 3: Ask the lighthouse keeper.  Just as the question is posed you see the lighthouse keeper and ask her what the stair count is.  She gives you a pen and paper and says, "The count is 2689, write it down so you don't forget."

 

Ø   Now that each method has been described we can analyze the time required by each.  Rather than measuring the actual elapsed time, we will count the operations performed while carrying out each method.  There are two kinds of operations that we will count, and we will count them equally even though they may take differing amounts of time:

(1) walking up or down one step counts as one operation

(2) making a mark or a symbol counts as one operation

 

Ø   In our time analysis we answer the question: How many operations are needed for each of the three methods?


Method 1:

Steps down:   2689
Steps up:   + 2689
Marks:      + 2689
------------------
Operations:   8067

 

Method 2:

Steps down:   3,616,705   (which is 1+2+...+2689)
Steps up:   + 3,616,705
Marks:      +     2,689
-----------------------
Operations:   7,236,099

 

Method 3:

Steps down:   0
Steps up:   + 0
Marks:      + 4
---------------
Operations:   4
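Each tally above can be reproduced with a short computation.  The following is a minimal C++ sketch (the variable names are our own, for illustration) that recomputes the three operation counts for a 2689-stair lighthouse:

#include <iostream>

int main()
{
    const long stairs = 2689;

    // Method 1: walk down once, run back up once, one mark per stair
    long method1 = stairs + stairs + stairs;                  // 8067

    // Method 2: 1+2+...+stairs steps down, the same number back up,
    // plus one mark per stair
    long steps = stairs * (stairs + 1) / 2;                   // 3,616,705
    long method2 = steps + steps + stairs;                    // 7,236,099

    // Method 3: no steps at all; one mark per digit of the answer
    long method3 = 0;
    for (long d = stairs; d > 0; d /= 10)
        method3++;                                            // 4

    std::cout << method1 << " " << method2 << " " << method3 << "\n";
    return 0;
}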

 

Ø   Doing a time analysis of an algorithm is similar to the analysis of the stair-counting methods: we measure time not as elapsed time but by counting the number of operations that must be performed, even though each operation may require a slightly different amount of time.  There is no precise definition of what constitutes an operation, although it should be a small step.  In analyzing programs, each program statement is typically considered to be one operation.

 


Ø   For most programs, the number of operations performed depends on the program's input.  This is also true of the stair-counting example: the number of operations performed depends on the number of stairs in the lighthouse.

 

Ø   When a method's time depends on the size of the input then that time can be expressed as a function of the input's size.  In the stair-counting example, if we let n represent the number of stairs in the lighthouse, then the time required for each method can be expressed as follows:

Method 1: 3n

Method 2: n + 2(1+2+3+...+n)

Method 3: the number of digits in n

 

Ø   The expression for the second method is not easy to interpret and can be simplified using the following trick:

    (1 + 2 + 3 + ... + n)
  + (n + ... + 3 + 2 + 1)
  -----------------------
    (n+1) + (n+1) + ... + (n+1)     (n columns, each summing to n+1)

The two rows together total n(n+1), so 1+2+3+...+n = n(n+1)/2.

Therefore, n + 2(1+2+3+...+n) can be written as:

n + 2n(n+1)/2 = n + n(n+1) = n² + 2n
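As a quick sanity check, the closed form can be compared against a brute-force sum.  This small sketch (ours, not part of the original analysis) confirms that n + 2(1+2+...+n) equals n² + 2n for the first few values of n:

#include <iostream>

int main()
{
    for (long n = 1; n <= 10; n++)
    {
        long sum = 0;
        for (long i = 1; i <= n; i++)     // compute 1+2+...+n directly
            sum += i;
        std::cout << n + 2 * sum << " == " << n * n + 2 * n << "\n";
    }
    return 0;
}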

 

Ø   The number of operations required for Method 3 is the number of digits in the number n.  When n is written in base 10 notation, this is approximately equal to the base-10 logarithm of n, log₁₀ n (more precisely, it is ⌊log₁₀ n⌋ + 1).
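Counting digits can itself be done with a short loop that repeatedly divides by 10; the helper below (our own, for illustration) also shows the logarithm formula giving the same answer:

#include <cmath>
#include <iostream>

// Number of base-10 digits in n, for n >= 1: one loop pass per digit
int digitCount(long n)
{
    int digits = 0;
    while (n > 0)
    {
        n /= 10;        // drop the last digit
        digits++;
    }
    return digits;
}

int main()
{
    std::cout << digitCount(2689) << "\n";                    // prints 4
    std::cout << std::floor(std::log10(2689)) + 1 << "\n";    // also prints 4
    return 0;
}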

 

Ø   With these simplifications the revised time analysis for each stair-counting method is:

Method 1: 3n

Method 2: n² + 2n

Method 3: log₁₀ n

 

                     Big-Oh Notation


Ø   Above we computed the exact number of operations required for each method, but often it is enough to know roughly how the number of operations is affected by the input size.  For example, if we apply the stair-counting methods to a lighthouse with 10 times as many steps as were in the first lighthouse, how would the time required by each counting method grow?

Method 1: the number of operations increases tenfold (from 3n to 3(10n)=30n).

Method 2: the number of operations increases approximately hundredfold (from about n² to about (10n)² = 100n²)

Method 3: the number of operations increases by 1 (from the number of digits in n to the number of digits in 10n)

 


Ø   We can express this kind of growth information using big-Oh notation.  In big-Oh notation, only the largest term in the formula is used, that is, the term with the largest exponent on n or the term that grows the fastest as n becomes large.  In big-Oh notation, all constant multipliers are omitted.  The time requirements of the stair-counting methods in big-Oh notation are as follows:

Method 1: O(n)        Linear time

Method 2: O(n²)       Quadratic time

Method 3: O(log n)    Logarithmic time

 

 


Ø   When a time analysis is expressed in big-Oh notation, the result is called the order of the algorithm.  The stair-counting example also illustrates one of the most important concepts behind the use of big-Oh notation: the order of the algorithm is generally more important than the speed of the processor.  For example, a very fast stair climber using a slow counting method is unlikely to beat a slowpoke who uses one of the faster counting methods.

 

Ø   The table from last week's lecture illustrates the fact that the growth rate (i.e., the order) of an algorithm is what matters most when n (i.e., the input size) becomes sufficiently large.


              Time Analysis of C++ Functions

Ø   An example of a linear-time algorithm: a function that finds the maximum value in an array of n elements.  In this example, an array of n elements is the input to the algorithm.

 

template <class Etype>
Etype max_element(const Etype array[ ], size_t size)
{
    Etype max = array[0];                       // one operation before the loop
    for (size_t i = 1; i < size; i++)           // body executes (size - 1) times
        max = (max < array[i]) ? array[i] : max;
    return max;                                 // one operation after the loop
}
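A minimal driver for the function (with our own sample data, and assuming the max_element template above is in scope) might look like this:

#include <iostream>

int main()
{
    int data[ ] = {3, 41, 26, 9, 58, 12};
    std::cout << max_element(data, 6) << "\n";   // prints 58
    return 0;
}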

 

Ø   Time analysis:


- The parameter size holds the size of the array (i.e., the input to the function), therefore, n is the value of size.

- Prior to the for-loop, there is one operation.

- After the for-loop there is one operation.

- The body of the for-loop consists of one statement (i.e., one operation) that is executed (n-1) times.

- Therefore, the total number of operations is 2 + (n-1) = n + 1, which is O(n) in big-Oh notation.

 

Ø   In fact, if the body of the for-loop had contained any fixed number of operations, say k operations, the function would require 2 + (n-1)k operations, which is still linear time.  This can be summarized as follows: a loop that does a fixed number of operations n times requires O(n) time.

 

Ø   An example of a quadratic-time algorithm: the insertion sort algorithm.

 

 


template <class Etype>
void insertionSort(Etype array[ ], size_t n)
{
    for (size_t p = 1; p < n; p++)          // outer loop executes n-1 times
    {
        Etype tmp = array[p];               // the element to be inserted
        size_t j;
        for (j = p; j > 0 && tmp < array[j-1]; j--)   // at most p passes
            array[j] = array[j-1];          // shift larger elements to the right
        array[j] = tmp;                     // drop tmp into its sorted position
    }
}
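Assuming the insertionSort template above is in scope, a short driver (the sample data is ours) would be:

#include <iostream>

int main()
{
    int data[ ] = {34, 8, 64, 51, 32, 21};
    insertionSort(data, 6);
    for (int i = 0; i < 6; i++)
        std::cout << data[i] << " ";    // prints: 8 21 32 34 51 64
    std::cout << "\n";
    return 0;
}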

 

Ø   Time analysis:

- There are 2 nested for-loops.  The body of the outer for-loop is executed n-1 times.


- The body of the inner for-loop is executed at most p times for each value of p.  Summing over all values of p gives a total of at most (1+2+3+...+n) = n(n+1)/2 = ½n² + ½n, which is O(n²).

 

Ø   A simple rule of thumb is that if you have 2 nested loops, each of which can be executed at most n times, then the algorithm is O(n²).

 

Ø   An example of a logarithmic-time algorithm: binary search.  If an input array is sorted and you want to find the index position of a particular value in that array, you can perform a binary search to find that position.

template <class Etype>
int binarySearch(const Etype array[ ], const Etype & value, size_t n)
{
    int low = 0, high = static_cast<int>(n) - 1, mid;
    while (low <= high)                  // search range is array[low..high]
    {
        mid = (low + high) / 2;          // midpoint of the current range
        if (array[mid] < value)
            low = mid + 1;               // discard the lower half
        else if (array[mid] > value)
            high = mid - 1;              // discard the upper half
        else
            return mid;                  // found: return its index
    }
    return -1;                           // this indicates the value was not found
}
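Note that binary search only works if the array is already sorted.  Assuming the binarySearch template above is in scope, a small driver (sample data ours) would be:

#include <iostream>

int main()
{
    int data[ ] = {2, 5, 9, 13, 21, 34, 55};          // must already be sorted
    std::cout << binarySearch(data, 13, 7) << "\n";   // prints 3
    std::cout << binarySearch(data, 10, 7) << "\n";   // prints -1 (not found)
    return 0;
}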

 

Ø   Time analysis:

- On each pass through the while-loop, the range of array positions to be searched is cut in half.


- Starting with n positions, after 1 pass through the while-loop we will have n/2 positions remaining to check; after 2 passes we will have ½ × n/2 = n/4 positions that still need to be checked; after 3 passes the number is reduced to n/8; and after k passes that number is n/2ᵏ.

- In the case of an unsuccessful search, the number of times, k, that the while-loop will be executed is bounded by the requirement that n/2ᵏ ≥ 1, that is, 2ᵏ ≤ n, thus k ≤ log₂ n.

- Therefore, the time required by this algorithm is O(log n) in the worst case.

 

Ø   A simple rule of thumb is that an algorithm is O(log n) if it repeatedly cuts the problem size in half using a constant number of operations each time.
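This rule can be seen directly by counting halvings.  The little sketch below (ours) counts how many times n can be cut in half before the problem size reaches 1:

#include <iostream>

int main()
{
    for (long n = 10; n <= 1000000; n *= 10)
    {
        int k = 0;
        for (long m = n; m > 1; m /= 2)   // one pass per halving
            k++;
        std::cout << "n = " << n << ": " << k << " halvings\n";
    }
    return 0;
}

For n = 1,000,000 the loop reports 19 halvings, close to log₂ 1,000,000 ≈ 19.9; each tenfold increase in n adds only 3 or 4 passes.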

 


Ø   In this discussion, we have talked about the worst-case performance of an algorithm.  This is measured by counting the maximum number of operations required over all inputs of a given size.  During a time analysis you may find that you are unable to provide an exact count of the operations required, but if the analysis is a worst-case analysis you may estimate the number of operations, always making sure that your estimate is on the high side.  Later on you will see average-case and best-case analyses as well.