CS 136 - Lecture 14

  1. More examples of recursion
    1. Mergesort
    2. Quicksort

More examples of recursion

Merge sort

Divide and conquer sort:
  /** 
    POST -- elementArray is sorted into non-decreasing order  
  **/
    public void sort(Comparable[] elementArray)
    {
        recMergeSort(0,elementArray.length -1,elementArray);
    }

    /**
        pre: first, last are legal indices of elementArray
        post:  elementArray[first..last] is sorted in 
                    non-decreasing order
    **/
    protected void recMergeSort(int first, int last, 
                                            Comparable[] elementArray)
    {
        int middle = (first+last)/2;    // middle index of array
        
        if (last - first > 0)        // more than 1 elt
        {
                // Sort first half of list  
            recMergeSort(first,middle,elementArray);   
                // Sort second half of list
            recMergeSort(middle+1,last,elementArray);  
                // Merge two halves
            mergeRuns(first,middle,last,elementArray);
        }
    }

Easy to show recMergeSort is correct if mergeRuns is.

Method mergeRuns is where all the work takes place.


/** 
    PRE -- sortArray[first..middle] and 
        sortArray[middle+1..last] are sorted,
        and each range is non-empty.
    POST -- sortArray[first..last] is sorted
**/                             
    protected void mergeRuns (int first, int middle, int last, 
                                            Comparable[] sortArray)
    {
        int elementCount = last-first+1;    // # elts in array
            // temp array to hold elts to be merged
        Comparable[] tempArray = new Comparable[elementCount];      

        // copy elts of sortArray into tempArray in preparation 
        // for merging 
        for (int index=0; index < elementCount; index++)
            tempArray[index] = sortArray[first+index];
        
        
        int outIndex = first;   // posn written to in sortArray 
        int run1 = 0;               // index of first, second runs
        int run2 = middle-first+1; 
        int endRun1 = middle-first; // end of 1st, 2nd runs
        int endRun2 = last-first;   

        // merge runs until one of them is exhausted 
        while (run1 <= endRun1 && run2 <= endRun2) 
        {
            if (tempArray[run1].lessThan(tempArray[run2]))  
            {   // if elt from run1 is smaller add it to sortArray
                sortArray[outIndex] = tempArray[run1];      
                run1++;
            }
            else        
            {               // add elt from run2 to sortArray
                sortArray[outIndex] = tempArray[run2];      
                run2++;
            }
            outIndex++;
        }  // while

        // Out of elts from one run, but other may have elts
        // add remaining elements from run1 if any left 
        while (run1 <= endRun1) 
        {
            sortArray[outIndex] = tempArray[run1];
            outIndex++;
            run1++;
        }

        // add remaining elements from run2 if any left 
        while (run2 <= endRun2) 
        {
            sortArray[outIndex] = tempArray[run2];
            outIndex++;
            run2++;
        }
    }
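
To see the merge step on concrete data, here is a minimal sketch that merges two adjacent sorted runs of an int array. It follows the same steps as mergeRuns but uses plain ints and < in place of Comparable and lessThan; the class name MergeRunsDemo and the sample data are made up for this example.

```java
import java.util.Arrays;

class MergeRunsDemo {
    // Merge a[first..middle] and a[middle+1..last], both already sorted.
    static void mergeRuns(int first, int middle, int last, int[] a) {
        // temp copy of both runs, so a[] can be overwritten safely
        int[] temp = Arrays.copyOfRange(a, first, last + 1);
        int run1 = 0, endRun1 = middle - first;          // first run inside temp
        int run2 = endRun1 + 1, endRun2 = last - first;  // second run inside temp
        int out = first;                                 // posn written to in a

        // take the smaller front elt until one run is exhausted
        while (run1 <= endRun1 && run2 <= endRun2) {
            if (temp[run1] < temp[run2]) a[out++] = temp[run1++];
            else                         a[out++] = temp[run2++];
        }
        // copy whatever is left over (at most one of these loops runs)
        while (run1 <= endRun1) a[out++] = temp[run1++];
        while (run2 <= endRun2) a[out++] = temp[run2++];
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 2, 3, 9};   // runs are a[0..2] and a[3..5]
        mergeRuns(0, 2, 5, a);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 4, 7, 9]
    }
}
```

Each pass through the first loop makes one compare and writes one element, so a merge of n elements makes at most n compares.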

Determine the complexity of recMergeSort.

Claim: complexity is O(n log n) for sort of n elements.

Easiest to prove this if n = 2^m for some m (logs are base 2, so log n = m).

Prove by induction on m that sort of n = 2^m elements takes <= n log n = 2^m * m compares.

Base case: m=0, so n = 1. Don't do anything, so 0 compares <= 2^0 * 0.

Inductive Hypothesis: Suppose claim is true for m-1.

Inductive Step: Will show that claim is true for m.

recMergeSort of n = 2^m elements proceeds by doing recMergeSort of two lists of size n / 2 = 2^(m-1), and then call of mergeRuns on list of size n = 2^m. mergeRuns makes at most one compare for each element it writes back, so at most 2^m compares.

Therefore,

#(compares) <= 2^(m-1) * (m-1) + 2^(m-1) * (m-1) + 2^m
            = 2*(2^(m-1) * (m-1)) + 2^m
            = 2^m * (m-1) + 2^m
            = 2^m * ((m-1) + 1)
            = 2^m * m

Therefore #(compares) <= 2^m * m = n log n

End of proof.
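
The recurrence behind the proof can also be checked numerically. The sketch below (class and method names are made up for illustration) computes the bound C(1) = 0, C(n) = 2*C(n/2) + n used in the inductive step and prints it next to n * m for n = 2^m.

```java
class MergeSortBound {
    // Bound on compares from the inductive step:
    // two sorts of size n/2, plus at most n compares for the merge.
    static long bound(long n) {
        return (n <= 1) ? 0 : 2 * bound(n / 2) + n;
    }

    public static void main(String[] args) {
        for (int m = 0; m <= 10; m++) {
            long n = 1L << m;   // n = 2^m
            System.out.println("m=" + m + "  n=" + n
                    + "  bound=" + bound(n) + "  n*m=" + (n * m));
        }
    }
}
```

For every power of two the last two columns agree, which matches the final line of the derivation.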


Quicksort

There is one last divide and conquer sorting algorithm: Quicksort.

While mergesort divided in half, sorted each half, and then merged (where all work is in the merge), Quicksort works in the opposite order.

That is, Quicksort splits the array (with lots of work), sorts each part, and then puts together (trivially).

/** 
  POST -- "elementArray" sorted into non-decreasing order  
**/
public void quicksort(Comparable[] elementArray)
{
    Q_sort(0, elementArray.length - 1, elementArray);   
}

/**
  PRE -- left <= right are legal indices of table.            
  POST -- table[left..right] sorted in non-decreasing order
**/
protected void Q_sort (int left, int right, Comparable[] table)
{
    if (right > left)   // More than 1 elt in table
    {
        int pivotIndex = partition(left,right,table);
        // table[left..pivotIndex-1] <= table[pivotIndex] <= table[pivotIndex+1..right]
        Q_sort(left, pivotIndex-1, table);      // Quicksort small elts
        Q_sort(pivotIndex+1, right, table);     // Quicksort large elts
    }
}
If partition works then Q_sort (and hence quicksort) clearly works.

Note: each recursive call is on a strictly smaller range, because the elt at pivotIndex is left out of both calls. (It is easy to get this wrong, in which case the recursion never terminates.)

Partition: Algorithm below starts out by ensuring the elt at the left edge of the table is <= the one at the right. This allows guards on the while loops to be simpler and speeds up the algorithm by about 20% or more. Other optimizations can make it even faster.

/**
    pre: left < right are legal indices of table
    post: table[left..pivotIndex-1] <= pivot 
            and pivot <= table[pivotIndex+1..right],
            where pivotIndex is the value returned
**/
protected int partition (int left, int right, Comparable[] table)
{
        Comparable tempElt;         // used for swaps
        int smallIndex = left;      // index of current posn in left (small elt) partition
        int bigIndex = right;       // index of current posn in right (big elt) partition
        
        if (table[bigIndex].lessThan(table[smallIndex]))    
        {   // put sentinel at table[bigIndex] so don't 
            // walk off right edge of table in loop below
            tempElt = table[bigIndex];
            table[bigIndex] = table[smallIndex];
            table[smallIndex] = tempElt;
        } 
        
        Comparable pivot = table[left]; // pivot is fst elt 
        // Now table[smallIndex] = pivot <= table[bigIndex]
        do
        {
            do                          // scan right from smallIndex 
                smallIndex++;   
            while (table[smallIndex].lessThan(pivot));

            do                          // scan left from bigIndex
                bigIndex--;
            while (pivot.lessThan(table[bigIndex]));
            
            // Now table[smallIndex] >= pivot >= table[bigIndex]
             
            if (smallIndex < bigIndex)   
            {   // if big elt to left of small element, swap them
                tempElt = table[smallIndex]; 
                table[smallIndex] = table[bigIndex];
                table[bigIndex] = tempElt;
            } // if 
        } while (smallIndex < bigIndex); 
        // Move pivot into correct pos'n bet'n small & big elts
        
        int pivotIndex = bigIndex;      // pivot goes where bigIndex got stuck
        
        // swap pivot elt w/small elt at pivotIndex
        tempElt = table[pivotIndex];            
        table[pivotIndex] = table[left];    
        table[left] = tempElt;
        
        return pivotIndex;  
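}

The basic idea of the algorithm: take the first elt as the pivot. Scan right from the left end for an elt >= pivot and scan left from the right end for an elt <= pivot; if the scans have not crossed, swap the two elts and keep scanning. When the scans cross, table[bigIndex] and everything to its left is <= pivot and everything to its right is >= pivot, so swapping the pivot with table[bigIndex] puts the pivot in its final position.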
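
One way to see this is to trace the scheme on a small array. The sketch below is an int version of partition (same steps, with < in place of lessThan); the class name, the swap helper, and the sample data are made up for illustration.

```java
import java.util.Arrays;

class PartitionDemo {
    static void swap(int[] t, int i, int j) {
        int tmp = t[i]; t[i] = t[j]; t[j] = tmp;
    }

    // Same scheme as partition above, on ints. Requires left < right.
    static int partition(int left, int right, int[] table) {
        int small = left, big = right;
        // sentinel: make sure the right edge holds an elt >= the left edge
        if (table[big] < table[small]) swap(table, small, big);
        int pivot = table[left];
        do {
            do small++; while (table[small] < pivot);  // scan right
            do big--;   while (pivot < table[big]);    // scan left
            if (small < big) swap(table, small, big);  // out-of-place pair
        } while (small < big);
        swap(table, left, big);   // pivot goes where big got stuck
        return big;
    }

    public static void main(String[] args) {
        int[] t = {5, 8, 2, 9, 1, 7};
        int p = partition(0, t.length - 1, t);
        System.out.println(p + " " + Arrays.toString(t)); // prints 2 [2, 1, 5, 9, 8, 7]
    }
}
```

In the trace, the first pass swaps 8 and 1, the second pass has the scans cross at indices 3 and 2, and the pivot 5 is then swapped into index 2.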

The complexity of QuickSort is harder to evaluate than MergeSort because the pivotIndex need not always be in the middle of the array (in the worst case pivotIndex = left or right).

Partition is clearly O(n) because every comparison results in smallIndex or bigIndex moving toward the other, and the loop quits when they cross.

In the best case the pivot element is always in the middle and the analysis results in O(n log n), exactly like MergeSort.

In the worst case the pivot is at the ends and QuickSort behaves like SelectionSort, giving O(n^2).

Careful analysis shows that QuickSort is O(n log n) in the average case (under reasonable assumptions on distribution of elements of array).
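
The gap between the worst case and the typical case shows up directly in the number of comparisons. The sketch below is only an illustration, not the code used to produce any timings: it is an int version of the Quicksort above with a counter added, run on an already ordered array (where the first-element pivot always ends up at the left end) and on a shuffled copy. The class name, the less helper, and the fixed random seed are made up for this example.

```java
import java.util.Random;

class QuickCountDemo {
    static long compares = 0;   // number of element comparisons so far

    static boolean less(int a, int b) { compares++; return a < b; }

    static void swap(int[] t, int i, int j) {
        int tmp = t[i]; t[i] = t[j]; t[j] = tmp;
    }

    // int version of partition above, with every comparison counted
    static int partition(int left, int right, int[] t) {
        int small = left, big = right;
        if (less(t[big], t[small])) swap(t, small, big);
        int pivot = t[left];
        do {
            do small++; while (less(t[small], pivot));
            do big--;   while (less(pivot, t[big]));
            if (small < big) swap(t, small, big);
        } while (small < big);
        swap(t, left, big);
        return big;
    }

    static void qSort(int left, int right, int[] t) {
        if (right > left) {
            int p = partition(left, right, t);
            qSort(left, p - 1, t);
            qSort(p + 1, right, t);
        }
    }

    // Sort a copy of the input and return the number of comparisons used.
    static long countCompares(int[] input) {
        int[] t = input.clone();
        compares = 0;
        qSort(0, t.length - 1, t);
        return compares;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] ordered = new int[n];
        for (int i = 0; i < n; i++) ordered[i] = i;
        int[] shuffled = ordered.clone();
        Random rng = new Random(1);   // fixed seed so runs are repeatable
        for (int i = n - 1; i > 0; i--) swap(shuffled, i, rng.nextInt(i + 1));
        System.out.println("ordered:  " + countCompares(ordered));
        System.out.println("shuffled: " + countCompares(shuffled));
    }
}
```

On the ordered input each partition only peels off one element, so the count grows roughly like n^2/2; on the shuffled input it typically stays within a small constant factor of n log n.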


Compare the algorithms with real data:
Algorithm   100 elts    100 elts    500 elts    500 elts    1000 elts   1000 elts
            unordered   ordered     unordered   ordered     unordered   ordered
Insertion   0.033       0.002       0.75        0.008       3.2         0.017
Selection   0.051       0.051       1.27        1.31        5.2         5.3
Merge       0.016       0.015       0.108       0.093       0.24        0.20
Quick       0.009       0.044       0.058       1.12        0.13        4.5

Notice that for Insertion or Selection sorts, doubling the size of the list increases the time by about 4 times (for the unordered case), whereas for Merge and Quick sorts it a bit more than doubles the time. Calculate (1000 log 1000) / (500 log 500) = 2 * (log 1000 / log 500) ~ 2 * (10/9) ~ 2.2
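
The same arithmetic can be checked in a few lines, together with the ratios of the measured unordered times from the table (1000 elts vs. 500 elts). Class and method names here are made up for illustration.

```java
class GrowthRatio {
    static double log2(double x) { return Math.log(x) / Math.log(2.0); }

    // Predicted growth in time going from n1 to n2 elts for an n log n sort.
    static double nLogNRatio(double n1, double n2) {
        return (n2 * log2(n2)) / (n1 * log2(n1));
    }

    public static void main(String[] args) {
        System.out.printf("n log n prediction: %.2f%n", nLogNRatio(500, 1000));
        System.out.printf("n^2 prediction:     %.2f%n",
                (1000.0 * 1000.0) / (500.0 * 500.0));
        // Measured unordered times from the table: 1000 elts / 500 elts
        System.out.printf("Insertion: %.2f%n", 3.2 / 0.75);
        System.out.printf("Selection: %.2f%n", 5.2 / 1.27);
        System.out.printf("Merge:     %.2f%n", 0.24 / 0.108);
        System.out.printf("Quick:     %.2f%n", 0.13 / 0.058);
    }
}
```

The quadratic sorts come out a little above 4 and the n log n sorts close to 2.2, in line with the predictions.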