Saturday 19 January 2013

Sorting Algorithms in Java

As the focus on data structures and algorithm increases. The Sorting algorithm are now in more focus esp in Interviews and in other development tasks we do.The purpose of writing this post is to provide information and provide sample code for all the sorting algorithm on one page or redirect on relevant pages. The post will not add anything that is not already available on Internet it will just aggregate the relevant information under one post.

Prior knowledge
One must know basic Java  and the commonly data structures supported in Java.

Basics :

Definitions of all the sorting algorithms and there basic algorithm is given  in this Sorting algorithm page of wiki. These basic definitions of sorting algorithms are sufficient to explain the basic crux of each algorithm.


Sorting algorithms :

1. Bubble sort : Bubble sort is a simple sorting algorithm. The algorithm starts at the beginning of the data set. It compares the first two elements, and if the first is greater than the second, it swaps them. It continues doing this for each pair of adjacent elements to the end of the data set. It then starts again with the first two elements, repeating until no swaps have occurred on the last pass.
  •   When to use : 1. In case the list size to be sorted is small  2.In case there only few or one element is not in position.
Coding Example  : Coding example for bubble sort can be found here.
  •  This algorithm's average and worst case performance is O(n2), so it is rarely used to sort large, unordered, data sets.

2.Selection Sort :  One of the simplest sorting algorithms works as follows: First, find the smallest item in the array and exchange it with the first entry (itself if the first entry is already the smallest). Then, find the next smallest item and exchange it with the second entry. Continue in this way until the entire array is sorted. This method is called selection sort because it works by repeatedly selecting the smallest remaining item.
  •   When to use : 1. In case the list size to be sorted is small  2.In case there only few or one element is not in position. 3. The maximum number of swaps can be n so this sorting is useful in case of expensive swaps. 
 Coding example : The coding example can be found here.
  • This algorithm's complexity is O(n2)

Difference between bubble sort and selection sort : Both the sorting algorithm sounds same.But the key difference is that bubble sort compares adjacent elements till list is fully sorted whereas the selection sort finds the smallest item and places it at the first entry.

3. Insertion Sort : The best definition and explanation for insertion sort is here.


  When to use : 1. The insertion sort can be very fast and efficient when used with smaller arrays. 2. The maximum number of swaps can be n so this sorting is useful in case of expensive swaps.3. Efficient for data sets that are partially sorted

Coding example : The  algorithm is here

int j;
for (j=1;j<=n;j++ ){
    int y = data[j];
    for (int i=j-1; i>=0 && y<data[i] ;i--){
      data[i+1]= data[i];
      data[i]= y;
  }
}
 
Consider an example of array [6,5,7,2].

1. The variable J starts an iteration from position 1 i.e 5 till 2 position 3
2. On each iteration of J we will start comparing the elements with elements at position less than J
3. if the elements is less the positions are swapped.
  • This algorithm's worst case complexity is O(n2) and best case [already sorted array]is O(n).

4. Shell Sort : If you study the insertion sort carefully what it does is it start comparison with the adjacent elements i.e the loop iterates with a gap size of 1. The shell sort suggests that instead of comparing adjacent elements and iterating over a loop with gap size of 1 you should iteration loop with a fixed gap size and that gap size should decrease with every loop.

Coding example : The only change in the algorithm from insertion sort is that there will be an higher loop on the insertion sort code and instead of iterating over elements with gap size of 1 you will iterate over element with different gap sizes till the gap size is 1.The code example is here or here
  • This algorithm's time complexity depends on the selection of gap sizes

5. Combo Sort : Combo Sort has same relation with bubble sort as Shell sort have with insertion sort. In case of combo sort the elements are sorted using bubble sort algorithm however they are not sorted by comparing adjacent element rather the elements that are at a gap from the element . This gap is decided by a shrink factor [generally 1.3] .

Coding example : The algorithm  is here.
  • This algorithm's worst case complexity is O(n2) and best case [already sorted array]is O(n).

6. Merge Sort : Also called as "Divide and conquer algorithm". The animation representation of the algorithm is here.  
Conceptually, a merge sort works as follows
  1. Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
  2. Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining. This will be the sorted list.
When to use : 1. This algorithm scales well for large lists also. 

Coding example : The coding example can be found here.
  • This algorithm's worst case  and best case complexity is O(n log n).

  In Java, the Arrays.sort() methods use merge sort or a tuned quicksort depending on the datatypes and for implementation efficiency switch to insertion sort when fewer than seven array elements are being sorted.


7. Heap Sort : It is variant of Selection sort but accomplish the sort by using a data structure called a heap [a special type of binary tree]. The details for implementation of heap sort are present here and also here . heap sort is not as popular as quick sort algorithm for sorting.

8. Quick Sort : Quicksort is a divide and conquer algorithm which relies on a partition operation: to partition an array an element called a pivot is selected. All elements smaller than the pivot are moved before it and all greater elements are moved after it. This can be done efficiently in linear time and in-place. The lesser and greater sublists are then recursively sorted.

Coding example : The algorithm is here.
  • This algorithm's average case and best case complexity is O(n log n). Worst case is O(n2). The choice of pivot can affects the performance of the algorithm. Prefer way of choosing the pivot is middle or the median of the input array.
9. Counting Sort : Counting sort is applicable when each input is known to belong to a particular set, S, of possibilities. The algorithm runs in O(|S| + n) time and O(|S|) memory where n is the length of the input. It works by creating an integer array of size |S| and using the ith bin to count the occurrences of the ith member of S in the input. Each input is then counted by incrementing the value of its corresponding bin. Afterward, the counting array is looped through to arrange all of the inputs in order.
  • The problem with counting sort is memory requirement  as we have to make S number of integer arrays and then stores their count values.

10. Bucket Sort : It is a generalization of counting sort. However the bucket sort's basic principal that is divide the element in to buckets opens up the comparison of this Sort with other sorts. If the size of the bucket in your implementation is 1 as in this example this effectively becomes a counting sort. However if the number of buckets are  two then it effectively becomes a quick sort.

Coding example :  The coding examples are here and here




No comments:

Post a Comment