Ø An algorithm is a clearly specified set of instructions that a computer follows to solve a problem. We will look at the time analysis of algorithms, that is, how long an algorithm takes to run as a function of its input size.
Ø Start with a simple non-programming example, the stair-counting problem:
Suppose that you and a friend are at the top of a lighthouse and you wonder how many stairs there are to the bottom. We'll look at three different methods that you could use to answer this question, and we will analyze the time requirements of each.
Method 1: Walk down and keep a tally. In this method, you take a pen and paper and walk down the stairs. Each time you step down one stair you make a mark on the paper. When you reach the bottom you run back up to the top of the lighthouse and show the paper to your friend.
Method 2: Walk down and let your friend keep a tally. In this method you have no pen and paper, so your friend keeps a tally by making marks in the dust at the top of the stairs. You help by giving her a count of the stairs as you go down them; your procedure is as follows: each time you go down a step you lay your hat on that step and run back to your friend, telling her to make a mark for that step. You then run back down to your hat, move it to the next step, and repeat the procedure. This continues until you reach the bottom, and you speed back up the steps one more time to look at the count.
Method 3: Ask the lighthouse keeper. Just as the question is posed, you see the lighthouse keeper and ask her what the stair count is. She gives you a pen and paper and says, "The count is 2689; write it down so you don't forget."
Ø Now that each method has been described, we can analyze the time required by each. Rather than measuring the actual elapsed time during each method, we will count the operations that occurred while carrying out each method. There are two kinds of operations that we will count, and we will count them equally even though they may take differing amounts of time:
(1) walking up or down one step is one operation
(2) making a mark or a symbol is counted as one operation
Ø In our time analysis we answer the question: How many operations are needed for each of the three methods?
Method 1:
    Steps down:    2689
    Steps up:    + 2689
    Marks:       + 2689
    -------------------
    Operations:    8067
Method 2:
    Steps down:    3,616,705   (which is 1+2+...+2689)
    Steps up:    + 3,616,705
    Marks:       +     2,689
    ------------------------
    Operations:    7,236,099
Method 3:
    Steps down:    0
    Steps up:    + 0
    Marks:       + 4
    ----------------
    Operations:    4
Ø Doing a time analysis of an algorithm is similar to the analysis of the stair-counting methods: we measure time not as elapsed time but by counting the number of operations that must be performed, even though the time required for each operation may differ slightly. There is no precise definition of what constitutes an operation, although it should be a small step. In analyzing programs, each program statement is typically considered to be one operation.
Ø For most programs, the number of operations performed depends on the program's input. This is also true of the stair-counting example: the number of operations performed depends on the number of stairs in the lighthouse.
Ø When a method's time depends on the size of the input then that time can be expressed as a function of the input's size. In the stair-counting example, if we let n represent the number of stairs in the lighthouse, then the time required for each method can be expressed as follows:
Method 1: 3n
Method 2: n + 2(1+2+3+...+n)
Method 3: the number of digits in n
Ø The expression for the second method is not easy to interpret and can be simplified using the following trick: write the sum forwards and backwards and add the two copies column by column:

      (1 + 2 + 3 + ... + n)
    + (n + ... + 3 + 2 + 1)
    -----------------------
            n(n+1)

Each of the n columns sums to n+1, so the two copies together total n(n+1), and therefore 1+2+3+...+n = n(n+1)/2. Hence, n + 2(1+2+3+...+n) can be written as:

    n + 2n(n+1)/2 = n + n(n+1) = n² + 2n
Ø The number of operations required for Method 3 is the number of digits in the number n. When n is written in base 10 notation, this is approximately equal to the base-10 logarithm of n, log₁₀ n.
Ø With these simplifications the revised time analysis for each stair-counting method is:
Method 1: 3n
Method 2: n² + 2n
Method 3: log₁₀ n
Ø Above we computed the exact number of operations required for each method, but often it is enough to know roughly how the number of operations is affected by the input size. For example, if we apply the stair-counting methods to a lighthouse with 10 times as many steps as the first lighthouse, how would the time required by each counting method grow?
Method 1: the number of operations increases tenfold (from 3n to 3(10n) = 30n).
Method 2: the number of operations increases approximately a hundredfold (from about n² to about (10n)² = 100n²).
Method 3: the number of operations increases by 1 (from the number of digits in n to the number of digits in 10n).
Ø We can express this kind of growth information using big-Oh notation. In big-Oh notation, only the largest term in the formula is used, that is, the term with the largest exponent on n or the term that grows the fastest as n becomes large. In big-Oh notation, all constant multipliers are omitted. The time requirements of the stair-counting methods in big-Oh notation are as follows:
Method 1: O(n)      Linear time
Method 2: O(n²)     Quadratic time
Method 3: O(log n)  Logarithmic time
Ø When a time analysis is expressed in big-Oh notation, the result is called the order of the algorithm. The stair-counting example also illustrates one of the most important concepts behind the use of big-Oh notation: the order of the algorithm is generally more important than the speed of the processor. For example, a very fast stair climber is unlikely to be faster than a slowpoke provided the slowpoke uses one of the faster counting methods.
Ø The table from last week's lecture illustrates the fact that the growth rate (i.e., the order) of an algorithm is what matters most when n (i.e., the input size) becomes sufficiently large.
Ø An example of a linear time algorithm: a function that finds the maximum value in an array of n elements. In this example, an array of n elements is the input to the algorithm.
template <class Etype>
Etype max_element(const Etype array[], size_t size)
{
    Etype max = array[0];
    for (size_t i = 1; i < size; i++)
        max = max < array[i] ? array[i] : max;
    return max;
}
Ø Time analysis:
- The parameter size holds the size of the array (i.e., the input to the function); therefore, n is the value of size.
- Prior to the for-loop, there is one operation.
- After the for-loop there is one operation.
- The body of the for-loop consists of one statement (i.e., one operation) that is executed n−1 times.
- Therefore, the total number of operations is 2 + (n−1), which is O(n) in big-Oh notation.
Ø In fact, if the body of the for-loop had contained any number of operations, say, k operations, the function would take 2 + (n−1)k operations, which is still linear time. This can be summarized as follows: a loop that does a fixed number of operations n times requires O(n) time.
Ø An example of a quadratic time algorithm: the insertion sort algorithm.
template <class Etype>
void insertionSort(Etype array[], size_t n)
{
    for (size_t p = 1; p < n; p++)
    {
        Etype tmp = array[p];
        size_t j;
        for (j = p; j > 0 && tmp < array[j-1]; j--)
            array[j] = array[j-1];
        array[j] = tmp;
    }
}
Ø Time analysis:
- There are two nested for-loops. The body of the outer for-loop is executed n−1 times (once for each p from 1 to n−1).
- The body of the inner for-loop is executed at most p times for each value of p. Summing over all values of p gives a total of at most (1+2+3+...+n) = n(n+1)/2 = ½n² + ½n = O(n²).
Ø A simple rule of thumb is that if you have two nested loops, each of which can be executed at most n times, then the algorithm is O(n²).
Ø An example of a logarithmic time algorithm: binary search. If an input array is sorted and you want to find the index position of a particular number in that array, you can perform a binary search to find that position.
template <class Etype>
int binarySearch(const Etype array[], const Etype& value, size_t n)
{
    int low = 0, high = n - 1, mid;
    while (low <= high)
    {
        mid = (low + high) / 2;
        if (array[mid] < value)
            low = mid + 1;
        else if (array[mid] > value)
            high = mid - 1;   // mid itself has been ruled out
        else
            return mid;
    }
    return -1;   // this indicates the value was not found
}
Ø Time analysis:
- Each pass through the while-loop, the range of array positions to be searched is cut in half.
- Starting with n positions, after 1 pass through the while-loop we have n/2 positions remaining to check; after 2 passes we have (1/2)(n/2) = n/4 positions that still need to be checked; after 3 passes the number is reduced to n/8, and after k passes that number is n/2ᵏ.
- In the case of an unsuccessful search, the number of times, k, that the while-loop is executed is bounded by the requirement that n/2ᵏ ≥ 1, that is, 2ᵏ ≤ n, and thus k ≤ log₂ n.
- Therefore, the time required by this algorithm is O(log n) in the worst case.
Ø A simple rule of thumb is that an algorithm is O(log n) if it repeatedly cuts the problem size in half using a constant number of operations each time.
Ø In this discussion, we have talked about the worst-case performance of an algorithm: we count the maximum number of operations required over all inputs of a given size. During a time analysis you may find that you are unable to provide an exact count of the operations required, but if the analysis is a worst-case analysis, you may estimate the number of operations, always making sure that your estimate is on the high side. Later on you will see average-case and best-case analyses as well.