2014-03-30

On efficiency and sorting


Comparing programs.

When there are two or more programs or functions that solve the same problem equally well, it's necessary to find ways to compare their performance. The average time taken to reach the solution is likely the first metric one would choose to compare. Unfortunately, average run time can be challenging to calculate for even moderately complex functions that take multiple inputs, or where multiple characteristics of a single input affect the run time.
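As a rough illustration (a sketch of my own, not a method from the course), Python's built-in timeit module can compare two functions on the same input. The function names sort_a and sort_b and the test data here are made up for the example.

    import random
    import timeit

    def sort_a(items):
        # hypothetical implementation: Python's built-in sort on a copy
        return sorted(items)

    def sort_b(items):
        # hypothetical implementation: a simple insertion sort
        result = []
        for item in items:
            i = len(result)
            while i > 0 and result[i - 1] > item:
                i -= 1
            result.insert(i, item)
        return result

    data = [random.randint(0, 10000) for _ in range(1000)]

    # time each function on the same list, repeated to smooth out noise
    print(timeit.timeit(lambda: sort_a(data), number=100))
    print(timeit.timeit(lambda: sort_b(data), number=100))

Even here the numbers only describe this particular input; a different list size or ordering could change which function looks faster.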

In some cases it is sufficient to look at the worst case instead of the average case. This at least lets you determine that a program will solve any problem of a given size and type within a certain maximum time. In computer science the worst case run time is symbolized by T_P(n), where P is the name of the program and n is the size of its input.
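To make "worst case" concrete, here is a linear search (my own example, not one from the course notes): its worst case run time grows in direct proportion to the length of the list, since the target might not be in the list at all.

    def linear_search(items, target):
        """Return the index of target in items, or -1 if it is absent."""
        for i in range(len(items)):
            if items[i] == target:
                return i          # best case: found at the front in one step
        return -1                 # worst case: every one of the n items is examined

    # Worst case input: the target is not in the list at all.
    print(linear_search([3, 1, 4, 1, 5], 9))   # examines all 5 items, prints -1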

The "big oh".

For further ease of comparison, programs can be grouped by a simple function that describes their worst case run times. This classification of time complexity based on input size is called "big oh" and is symbolized by a capital letter 𝒪, in a script font where possible. For example, a function with input size n and a worst case run time of 7n³ - n² + 8n + 9000 is said to be in 𝒪(n³), because as the input grows extremely large the highest order term is sufficient to describe its overall behavior.
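A quick numerical check (a sketch I added to illustrate the point) shows why the lower order terms stop mattering: as n grows, 7n³ - n² + 8n + 9000 divided by n³ settles toward the constant 7, so the whole expression grows like a constant times n³.

    def t(n):
        # the example worst case run time from above
        return 7 * n**3 - n**2 + 8 * n + 9000

    for n in [10, 100, 1000, 10000]:
        print(n, t(n) / n**3)
    # the printed ratios approach 7.0 as n grows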

The 𝒪 function is technically an upper bound on the worst case run time, so any program that is in 𝒪(n²) is also in 𝒪(n³) and even in 𝒪(2ⁿ).

Comparing methods of sorting.

Sorting algorithms are a good place to start looking at efficiency because they are relatively easy to explain and visualize, and they have been the subject of a great deal of efficiency analysis over the years.

While there are algorithms that sort specific types of data in specific ways in linear time, taking n steps to sort a list of n items, general comparison sorting is limited to times in 𝒪(n lg n). In the cases we examined in CSC148 this kind of run time was achieved by dividing or partitioning the items to be sorted recursively, so that sublists of the items could be sorted (a single item list takes zero steps to sort) and then merged in n steps. Any list of items can be divided in half log₂ n times, leading to log₂ n rounds of merging, each taking about n steps.
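A merge sort along those lines might look like the sketch below (my own version, not the exact code from the course): merge does the roughly n step combining, and mergesort does the recursive halving.

    def merge(left, right):
        """Combine two sorted lists into one sorted list in about n steps."""
        result = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        # one of the two lists is exhausted; append whatever remains
        result.extend(left[i:])
        result.extend(right[j:])
        return result

    def mergesort(items):
        """Sort items by halving, sorting each half, and merging the halves."""
        if len(items) <= 1:
            return items          # a single item list takes zero steps to sort
        mid = len(items) // 2
        return merge(mergesort(items[:mid]), mergesort(items[mid:]))

    print(mergesort([5, 2, 9, 1, 7, 3]))   # prints [1, 2, 3, 5, 7, 9]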


Card file image (CC BY 2.0) Glyn Lowe. Merge sort animation (Wikimedia Commons) Swfung8.