Sorting Algorithms

You do not need to memorize all of the following, but you should be familiar with them and spend some time writing up your own implementation of them.

Terms

Stability

Sorting algorithms are considered to be stable if two elements with equal keys remain in the same relative order as they appear in the input and the output.

Adaptive

An adaptive algorithm takes advantage of any existing values that are presorted in the input set and tend to sort faster. The amount of disorder in the input set has a negative effect on the ultimate time taken to complete the sorting of said input set.

In-Place

A sorting algorithm that occupies the same storage space as the input set. Typically, it requires one or two additional temp variables but does not require large chunks of additional memory to store subsets of the input set while sorting.

Online

An online sorting algorithm does not require all of the elements to be present at the beginning of the sorting operation. They can be provided as a stream, or one-by-one and the algorithm must keep additional items sorted as they are provided to the input set.

Insertion sort

More efficient for a relatively smaller number of elements. Similar to selection sort. If provided with a nearly sorted input, it is very efficient. Relatively speaking, each element is examined and checked to see if the element before it is greater in value that it is. If so, it is swapped and thus continues down the array until the correct position is found. It can be more memory r/w intensive than selection sort, so in cases where r/w are slow selection sort may be favored. Advantageous over selection sort and other methods as it is online

  • Worst, Avg: O(n ^ 2)
  • Best case, O(n)
  • Online, adaptive, stable, in-place

Example: 13, 24, 9, 8, 6

We typically start at the beginning of the array and work towards the end of the array, each time, we sort down (towards the 0 index) the value until it is in it’s correct place, such that it is less than the element to it’s right.

Assume a for i = 0, i < lengh; i++ semantic.

  1. i = 13, there is nothing to the left, so i++
  2. i = 24, i > i-1, so nothing to do, i++
  3. i = 9, !(i > i-1), so we swap 9 and 24.
  4. Then we check to see whether the i-1 element from 9 is less than 9
  5. It is not so we swap those two
  6. The first three elements in the array are now, 9, 13, 24, the sorted sub-set.
  7. i = 8 . . . we continue checking the ith element to i-1 until we have sorted the entire array.

Quicksort

A divide and conquer algo. Divide the array into two sub arrays via a partioning algorithm (pick the middle element/value or a random value). Then sort all elements less than the pivot to the left of the pivot, all greater than or equal to the right. Then recursively run the same function on the values to the left of the pivot. Can be problematic in that it can result in resource starvation and stack-overflow problems as the result of it’s recursive nature.

Mergesort

A divide and conquer algo. Recursively split the list until you have lists that are 1 element and merge the result.

  • Worst case: O (n log n)
  • Avg case: O (n log n)
  • Best case: O (n log n)
  • Stable (most implementations)
  • http://en.wikipedia.org/wiki/Merge_sort
  • http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Merge_sort.html
  • http://www.ee.ryerson.ca/~courses/coe428/sorting/mergesort.html

Heapsort

Is an inplace algorithm with a worst, best, and average complexity of O(n log n).

Can be though of as an improved selection sort. It is an improvement because of the use of the semi-sorted heap data structure and that there is no longer a linear approach required to compare all items to each other (not O(n^2)).

It is essentially a two part algorithm. We will assume in this example that this is a max heap and not a min heap

  1. Build the heap
  2. Sort the heap

We capitalize on the fundamental nature of the heap that the largest item is always at the root. And, when we remove the largest item and then reform the remaining heap the next largest item bubbles to the top. Utilizing an array to store our heap we divide the array into an unsorted and sorted segments. Initially, the entire array is unsorted.

In each sort operation, i, we remove the largest item which is at the front of the array, and one of the smallest from the end of the array and swap them. The first largest item that we put at the end of the array (last element of array – i) and it begins creating the sorted segment of the array, reducing the unsorted segment by one. The smaller element removed from the end of the array is placed at the root of the array and then sorted downward until the heap is reformed. This completes one sort cycle.

After each sort cycle the next largest element is added to the last-element-of-the-array – ith position in the array and we slowly grow the sorted section of the array.

Once we have removed all of the elements from the heap and added them back to the array we are left with a completely sorted array.

BubbleSort

Selection Sort

Similar to insertion sort, however, you must iterate through the entire array each time, to make sure that you have found the lowest next element, whereas in insertion sort, once you have met the criteria of a > b you can stop iterating for that element and move on to the next. An advantage is that it is simple to implement.

Worst, Avg, Best is O(n^2)

The basic algorithm:

  • Start with two sets, a sorted set on the left and an unsorted set on the right
  • Initially, the sorted set is 0 size, and the unsorted set is the rest of the array
  • For each element in the array:
  • Look for the smallest element in the unsorted set and and add it to the end of the sorted set
  • Continue until unsorted set is 0 size (reach the end of the array)
public void (int[] arr) {
   temp int = 0;          
   int currentMin = INT_MAX;
   int currentMinPosition = 0;

   // Start at the beginning of the array and look for the smallest value
   // from our current position to the end.
   for (int i = 0; i < arr.length-1; i++) {
      
      // Look for the smallest value from i to the end
      for (int j = i; j < arr.length; j++) {
         if (arr[j] < currentMin) {
            currentMin         = arr[j];
            currentMinPosition = j;
         }
      }

      // Once we have found the current minimum position in the unsorted
      // segment of the array, swap it with the current ith position
      temp   = arr[i];
      arr[i] = arr[j];
      arr[j] = temp;
   }
}

Topological Sorting

Nodes in a acyclic graph are placed in an order consistent with directed the edges of the graph. Such that for every directed edge a -> b, a appears before b in the result. The graph must be a DAG to be sorted in this fashion.

Nodes are visited (added to the result) only after all of their connected nodes, those upstream from them in the DAG, have been visited. Thus, a node visited later, must come later when topologically sorted.

The algorithm:

  1. Start with a node that does not have any incoming edges, or an indegree of zero.
  2. Add it to the result and remove it from the graph
  3. Then search for another node with a zero indegree
  4. Add it to the result and remove it from the graph
  5. Continue looping until the graph has zero elements remaining OR we cannot find a node with an indegree of zero

There are one of two possible outcomes:

  1. If the graph has zero elements we completed our ordering and it was indeed a DAG
  2. If we cannot find an additional node with an indegree of zero we have a a cyclic graph and cannot generate a topological sorting of it.

The implementation of the algorithm does not require destroying the original graph. We can either utilize a hashmap to keep track of the indegee value of each node, or a node structure that includes that property along with it.