| Reading | |
|---|---|
| Dale/Lewis: | beginning of 9.4: pp. 287-289 |
| These notes | |
Outline:
We assume that the items are originally sitting in consecutive memory locations, though in arbitrary order. Try sorting a list of 5 to 10 elements by hand, and think about what method you are using. Can you describe your method clearly enough that it is an algorithm?
Two simple iterative algorithms you might use are described next (selection sort and insertion sort), followed by a more complicated algorithm that is much more efficient for large lists (merge sort).
    procedure SelectionSort(List)
        N = 0
        Last = index of the last element in the list
        while N < Last
            MIN_INDEX_SO_FAR = N
            J = N + 1
            while J <= Last
                if List[J] < List[MIN_INDEX_SO_FAR]
                    MIN_INDEX_SO_FAR = J
                J = J + 1
            interchange List[N] and List[MIN_INDEX_SO_FAR]
            N = N + 1
Example: Starting with the list

    (D O S A M P L E)

we show the list after each time through the outer loop. The part at the beginning of the list that must already be sorted is shown to the left of the bar.

    (A | O S D M P L E)
    (A D | S O M P L E)
    (A D E | O M P L S)
    (A D E L | M P O S)
    (A D E L M | P O S)
    (A D E L M O | P S)
    (A D E L M O P | S)

At the end, the one remaining element must be the largest, so the whole list is sorted.
Advantage: Number of swaps is at most the length of the list.
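The SelectionSort pseudocode can be tried out with a minimal Python sketch. The function name `selection_sort` and the returned list of snapshots are illustrative choices, not part of the original notes; the loop structure follows the pseudocode, with a `for` loop standing in for the inner `while`.

```python
def selection_sort(items):
    """Sort items in place; return a snapshot of the list after each outer-loop pass."""
    snapshots = []
    last = len(items) - 1
    n = 0
    while n < last:
        min_index_so_far = n
        # Scan the unsorted part for the smallest remaining element.
        for j in range(n + 1, last + 1):
            if items[j] < items[min_index_so_far]:
                min_index_so_far = j
        # Interchange List[N] and List[MIN_INDEX_SO_FAR].
        items[n], items[min_index_so_far] = items[min_index_so_far], items[n]
        snapshots.append(list(items))
        n += 1
    return snapshots


letters = list("DOSAMPLE")
for snapshot in selection_sort(letters):
    print(" ".join(snapshot))
```

Each printed line corresponds to one pass of the outer loop, so the output can be compared against a hand trace.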
Efficiency Analysis
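If the length of the list is denoted by N, then Selection Sort always requires a number of operations that grows in proportion to N², though at most N-1 swaps.
The intuition is as follows: the first pass through the outer loop makes N-1 comparisons to find the minimum, the second pass makes N-2, and so on down to 1. The total number of comparisons is (N-1) + (N-2) + ... + 1 = N(N-1)/2, which grows in proportion to N².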
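One way to check the quadratic growth is to count the work directly. The sketch below is an illustration only: `count_selection_sort_work` is a hypothetical helper name, and it counts a swap only when two different positions are interchanged (the pseudocode interchanges unconditionally).

```python
def count_selection_sort_work(items):
    """Return (comparisons, swaps) used by selection sort on a copy of items."""
    items = list(items)          # work on a copy so the caller's list is untouched
    comparisons = 0
    swaps = 0
    for n in range(len(items) - 1):
        min_index = n
        for j in range(n + 1, len(items)):
            comparisons += 1
            if items[j] < items[min_index]:
                min_index = j
        if min_index != n:       # assumption: skip interchanging an element with itself
            items[n], items[min_index] = items[min_index], items[n]
            swaps += 1
    return comparisons, swaps


print(count_selection_sort_work("DOSAMPLE"))   # (28, 5): 28 = 8 * 7 / 2 comparisons
```

The comparison count depends only on the length of the list, not on its order, which is why Selection Sort always takes time proportional to N².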
    procedure InsertionSort (List)
        N = 1;
        while N <= index of the last element in the list
            Select List[N] as the "pivot" entry;
            Move the pivot entry to a temporary location leaving a hole in List;
            while there is a name above the hole AND that name is greater than the pivot
                move the name above the hole down into the hole, leaving a hole above the name
            Move the pivot entry into the hole in List;
            N = N + 1
Example: We show the results of sorting

    (D O S A M P L E)

after each time through the outer loop. The part at the beginning of the list that must already be sorted is shown to the left of the bar.

    (D O | S A M P L E)    O is already after D
    (D O S | A M P L E)    S is already after O
    (A D O S | M P L E)
    (A D M O S | P L E)
    (A D M O P S | L E)
    (A D L M O P S | E)
    (A D E L M O P S)

Let us illustrate the body of the outer loop just once, going from the third line above to the fourth. The next pivot is M:
    (A D O S M P L E)    start of outer loop
             M           save the pivot; make a hole (shown as _)
    (A D O S _ P L E)
             M
    (A D O _ S P L E)    inner loop: M < S, shift S
             M
    (A D _ O S P L E)    inner loop: M < O, shift O
    (A D M O S P L E)    M > D; inner loop done; insert M in the hole
Advantage: Can be very quick when original list was almost sorted.
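A minimal Python sketch of the InsertionSort pseudocode follows. The names `insertion_sort`, `pivot`, and `hole` are illustrative; "above the hole" in the pseudocode is taken to mean the position just before the hole in the list.

```python
def insertion_sort(items):
    """Sort items in place; return a snapshot of the list after each outer-loop pass."""
    snapshots = []
    for n in range(1, len(items)):
        pivot = items[n]         # move the pivot to a temporary location
        hole = n                 # index of the hole left behind
        # Shift larger entries down into the hole until the pivot's place is found.
        while hole > 0 and items[hole - 1] > pivot:
            items[hole] = items[hole - 1]
            hole -= 1
        items[hole] = pivot      # move the pivot into the hole
        snapshots.append(list(items))
    return snapshots


letters = list("DOSAMPLE")
for snapshot in insertion_sort(letters):
    print(" ".join(snapshot))
```

When the pivot is already larger than the entry before it, the inner loop does nothing, which is what makes the method fast on nearly sorted input.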
Efficiency Analysis
Insertion Sort has efficiency which depends very much on the
original order of the list. In the best case (when the list is nearly sorted),
the overall number of operations is proportional to N. However, in the
worst case, the overall number of swaps and other operations is proportional
to N².
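The gap between the best and worst cases can be seen by counting operations on an already sorted list and on a reversed one. This is a sketch under stated assumptions: `count_insertion_sort_work` is a hypothetical helper, and it counts comparisons against the pivot and shifts into the hole, nothing else.

```python
def count_insertion_sort_work(items):
    """Return (comparisons, shifts) used by insertion sort on a copy of items."""
    items = list(items)
    comparisons = 0
    shifts = 0
    for n in range(1, len(items)):
        pivot = items[n]
        hole = n
        while hole > 0:
            comparisons += 1
            if items[hole - 1] <= pivot:
                break                      # pivot belongs in the hole
            items[hole] = items[hole - 1]  # shift the larger entry into the hole
            shifts += 1
            hole -= 1
        items[hole] = pivot
    return comparisons, shifts


print(count_insertion_sort_work(range(100)))          # sorted input: (99, 0)
print(count_insertion_sort_work(range(99, -1, -1)))   # reversed input: (4950, 4950)
```

For 100 items, the sorted input needs N-1 = 99 comparisons, while the reversed input needs N(N-1)/2 = 4950.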
Merging two sorted lists
The algorithm is:
    MergeLists(list1, list2), returning the sorted list
        while both lists have more members
            if the head of the first list is less than the head of the second list
                remove the head of the first list, and append it to the sorted list
            else
                remove the head of the second list, and append it to the sorted list
        append whichever list is left to the sorted list
Example 1:
list1: 2 5 6 9 10 list2: 3 6
sorted: empty
2< 3 so put 2 in sorted list
list1: 5 6 9 10 list2: 3 6
sorted: 2
!(5< 3) so put 3 in sorted list
list1: 5 6 9 10 list2: 6
sorted: 2 3
5< 6 so put 5 in sorted list
list1: 6 9 10 list2: 6
sorted: 2 3 5
!(6< 6) so put 2nd list 6 in sorted list
list1: 6 9 10 list2: empty
sorted: 2 3 5 6
no more comparisons since one list is empty
append nonempty list1 to sorted list
sorted: 2 3 5 6 6 9 10
Example 2:
list1: 5 5 list2: 7 9 10 12
sorted: empty
5< 7 so put 5 in sorted list
list1: 5 list2: 7 9 10 12
sorted: 5
5< 7 so put 5 in sorted list
list1: empty list2: 7 9 10 12
sorted: 5 5
no more comparisons since one list is empty
append nonempty list2 to sorted list
sorted: 5 5 7 9 10 12
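Both examples can be reproduced with a short Python sketch of MergeLists. As an implementation choice that is not in the notes, it advances two indices instead of physically removing the head of each list; the comparisons are the same.

```python
def merge_lists(list1, list2):
    """Merge two already sorted lists into one new sorted list."""
    merged = []
    i = 0   # index of the current head of list1
    j = 0   # index of the current head of list2
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    # At most one of these slices is nonempty: append whichever list is left.
    merged.extend(list1[i:])
    merged.extend(list2[j:])
    return merged


print(merge_lists([2, 5, 6, 9, 10], [3, 6]))   # [2, 3, 5, 6, 6, 9, 10]
print(merge_lists([5, 5], [7, 9, 10, 12]))     # [5, 5, 7, 9, 10, 12]
```

Every pass through the loop appends exactly one element, so the work is proportional to the combined length of the two lists.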
The recursive procedure for sorting one unsorted list using MergeLists can be described as:
    procedure MergeSort (List)
        if (List has more than one entry) then
            (Apply the procedure MergeSort to sort the first half of the List;
             Apply the procedure MergeSort to sort the second half of the List;
             Apply the procedure MergeLists to the two halves
            )

The precise order of all the recursive calls is complicated, confusing, and not important for our purposes. It is important that the unsorted lists are repeatedly cut in half, and that corresponding pairs of smaller sorted lists are later merged into larger sorted lists.
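The recursive procedure translates almost line for line into Python. The sketch below is a minimal illustration, not a tuned implementation: it returns a new sorted list rather than sorting in place, and it repeats a small `merge_lists` helper so that the block runs on its own.

```python
def merge_lists(list1, list2):
    """Merge two sorted lists into one new sorted list."""
    merged = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    return merged + list1[i:] + list2[j:]


def merge_sort(items):
    """Return a new sorted list containing the entries of items."""
    if len(items) <= 1:
        return list(items)                    # nothing to do for 0 or 1 entries
    middle = len(items) // 2
    first_half = merge_sort(items[:middle])   # sort the first half
    second_half = merge_sort(items[middle:])  # sort the second half
    return merge_lists(first_half, second_half)


print(" ".join(merge_sort(list("DOSAMPLE"))))   # A D E L M O P S
```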
For example, suppose we start with the unsorted list of characters

    (D O S A M P L E)

Mark the ends of lists with parentheses. We do not yet know what the final sorted list will be. Indicate that by a list of empty spaces.

    (_ _ _ _ _ _ _ _)

The final list will be merged from two half lists.

    (_ _ _ _)(_ _ _ _)

And each of those lists will be merged from half size lists.

    (_ _)(_ _)(_ _)(_ _)

Which in turn are merged from half sized lists.

    (_)(_)(_)(_)(_)(_)(_)(_)

Showing all of this so far together we have

    (D O S A M P L E)           original unsorted list
    (_ _ _ _ _ _ _ _)           place for final sorted list
    (_ _ _ _)(_ _ _ _)
    (_ _)(_ _)(_ _)(_ _)
    (_)(_)(_)(_)(_)(_)(_)(_)

The bottom lists are all of length 1, and hence do not require sorting, so we can fill them in with the original elements of the list. Note that the MergeSort algorithm only does something to a list if the list has length greater than 1.

    (D O S A M P L E)           original unsorted list
    (_ _ _ _ _ _ _ _)           place for final sorted list
    (_ _ _ _)(_ _ _ _)
    (_ _)(_ _)(_ _)(_ _)
    (D)(O)(S)(A)(M)(P)(L)(E)

Now we can work out the sorted lists of length 2 at the next level, by merging the pairs underneath them.

    (_ _ _ _ _ _ _ _)
    (_ _ _ _)(_ _ _ _)
    (D O)(A S)(M P)(E L)
    (D)(O)(S)(A)(M)(P)(L)(E)

And again, merging lists of length 2 into lists of length 4.

    (_ _ _ _ _ _ _ _)
    (A D O S)(E L M P)
    (D O)(A S)(M P)(E L)
    (D)(O)(S)(A)(M)(P)(L)(E)

And finally merging the lists of length 4 into a list of length 8.

    (A D E L M O P S)
    (A D O S)(E L M P)
    (D O)(A S)(M P)(E L)
    (D)(O)(S)(A)(M)(P)(L)(E)

In the end

    (D O S A M P L E)

is sorted into

    (A D E L M O P S)

You will be expected to be able to do this algorithm by filling out such a table. Start with the original list and unfilled lists split repeatedly in two.

    (D O S A M P L E)
    (_ _ _ _ _ _ _ _)
    (_ _ _ _)(_ _ _ _)
    (_ _)(_ _)(_ _)(_ _)
    (_)(_)(_)(_)(_)(_)(_)(_)

Then work your way up from the bottom merging pairs of lists, to get the final answer.

    (A D E L M O P S)
    (A D O S)(E L M P)
    (D O)(A S)(M P)(E L)
    (D)(O)(S)(A)(M)(P)(L)(E)

A bigger example is sorting

    (T H E B I G B A D W O L F A T E)

The solution, following the method above, is shown at the end of these notes.
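To check a hand-filled table, the rows can be generated from the bottom up. The sketch below is a convenience for checking and makes some assumptions that are not in the notes: the helper names `merge_table` and `format_row` are hypothetical, the list length must be a power of two (as in both examples here), and the standard-library `heapq.merge` stands in for MergeLists.

```python
from heapq import merge   # standard-library merge of already sorted inputs


def merge_table(items):
    """Return the table rows, top (sorted list) to bottom (single entries).

    Assumes len(items) is a power of two, so every level pairs up evenly.
    """
    level = [[x] for x in items]          # bottom row: lists of length 1
    rows = [level]
    while len(level) > 1:
        # Merge neighbouring pairs of lists to build the next row up.
        level = [list(merge(level[k], level[k + 1]))
                 for k in range(0, len(level), 2)]
        rows.append(level)
    return rows[::-1]


def format_row(row):
    """Write one row in the (A D)(O S) notation used in these notes."""
    return "".join("(" + " ".join(group) + ")" for group in row)


for row in merge_table("DOSAMPLE"):
    print(format_row(row))
```

Calling `merge_table("THEBIGBADWOLFATE")` produces the rows for the bigger example in the same way.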
Efficiency Analysis
Merge Sort guarantees that the overall number of operations
used will be proportional to N log₂ N.
The difference between sorting in time proportional to N log₂ N
versus N² is dramatic
(as was the difference between searching in log₂ N operations
rather than N operations).
| N | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 |
|---|---|---|---|---|---|
| N log₂ N (approx.) | 660 | 10,000 | 130,000 | 1,660,000 | 20,000,000 |
| N² | 10,000 | 1,000,000 | 100,000,000 | 10,000,000,000 | 1,000,000,000,000 |
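The table entries can be recomputed with a few lines of Python. This is a sketch only; `growth_row` is a hypothetical helper name, and the N log₂ N values it prints are unrounded versions of the approximate figures in the table.

```python
import math


def growth_row(n):
    """Return (N, N log2 N rounded to the nearest integer, N squared)."""
    return n, round(n * math.log2(n)), n * n


for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    size, n_log_n, n_squared = growth_row(n)
    print(f"{size:>9,}  {n_log_n:>12,}  {n_squared:>21,}")
```

For N = 100 this gives 664 against 10,000; at N = 1,000,000 the ratio between the two columns is roughly 50,000 to 1.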
Proof: Let's look at an explanation of this guarantee.
If we first consider the number of operations which are used by the MergeLists procedure, we claim that the total is proportional to the sum of the lengths of the two input lists.
Now, if we analyze the overall MergeSort computation, we can look at the hierarchy of subproblems, as shown above.
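The two key facts are:
1. At each level of the hierarchy, the lists being merged have total length N, so the merging work for that level is proportional to N.
2. Because the lists are cut in half at each level, there are about log₂ N levels.
Multiplying the two gives a total number of operations proportional to N log₂ N.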
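The level-by-level accounting can be checked by instrumenting merge sort to count how many elements its merges append in total. The sketch below is an illustration under stated assumptions: `merge_sort_counting` is a hypothetical name, and "operations" is simplified to "elements appended during merging".

```python
def merge_sort_counting(items):
    """Return (sorted_list, total number of elements appended by all merges)."""
    if len(items) <= 1:
        return list(items), 0
    middle = len(items) // 2
    left, left_moves = merge_sort_counting(items[:middle])
    right, right_moves = merge_sort_counting(items[middle:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    # This merge appended len(merged) elements, on top of the work below it.
    return merged, left_moves + right_moves + len(merged)


result, moves = merge_sort_counting(list("THEBIGBADWOLFATE"))
print(moves)   # 64 = 16 elements per level * 4 levels (log2 of 16)
```

With 16 elements, each of the 4 merging levels handles all 16 elements once, which gives 16 × 4 = 64.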
For the bigger example, start with

    (T H E B I G B A D W O L F A T E)
    (_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
    (_ _ _ _ _ _ _ _)(_ _ _ _ _ _ _ _)
    (_ _ _ _)(_ _ _ _)(_ _ _ _)(_ _ _ _)
    (_ _)(_ _)(_ _)(_ _)(_ _)(_ _)(_ _)(_ _)
    (_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)(_)

And end up with

    (A A B B D E E F G H I L O T T W)
    (A B B E G H I T)(A D E F L O T W)
    (B E H T)(A B G I)(D L O W)(A E F T)
    (H T)(B E)(G I)(A B)(D W)(L O)(A F)(E T)
    (T)(H)(E)(B)(I)(G)(B)(A)(D)(W)(O)(L)(F)(A)(T)(E)