If you have a file with one string per line, which sorting algorithm would you use to sort the file?
Assume the available memory RAM is of memory, where .
This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try
refreshing the page, (b) enabling javascript if it is disabled on your browser and,
finally, (c)
loading the
non-javascript version of this page
. We're sorry about the hassle.
The expected answer is External sort.
Bubble sort is only good for data sets that are already almost sorted.
Otherwise, it is time-consuming because it requires O( n 2 ) time, where n is the size of input data.
A size limit of 2 . 2 5 G B , tells you that, don’t bring all the data into memory. You only bring part of the data into memory.
A l g o r i t h m :
Assume that you have N M B of memory available, where N ≤ 1 0 0 .
Divide the 2 . 2 5 G B file into K chunks, where N × K = 2 . 2 5 G B .
Bring each chunk into memory and sort the N M B lines, as usual, using any known O ( N l o g N ) algorithm. Save the lines back to the file.
Bring the next chunk into memory and sort.
Once you’re done, merge them one by one.
The above algorithm is known as e x t e r n a l s o r t algorithm.