Burst Sort Algorithm in C++
Burst Sort (usually written burstsort) is a string-sorting algorithm introduced by Ranjan Sinha and Justin Zobel in 2003. It combines a trie-based distribution step, similar in spirit to radix sort, with a quicksort-style sort for small groups of strings. Burst Sort is especially effective on large sets of strings, where it is typically faster than either quicksort or radix sort used alone.
How does Burst Sort work?
Burst Sort achieves its speed by combining ideas from radix sort and quicksort. The algorithm works in two stages.
In the first stage, the distribution step, each string is inserted into a trie according to its leading characters, so that it lands in a bucket together with the other strings that share the same prefix. In the second stage, each bucket is sorted on the characters that remain after that prefix, usually with a quicksort variant, and the buckets are read out in trie order. How does this differ from radix sort? A plain radix sort keeps distributing strings character by character, whereas Burst Sort subdivides a group only when its bucket grows too large and finishes each small group with a comparison-based sort.
Burstsort and its variants are cache-efficient algorithms for sorting strings. First published in 2003, they are variants of the traditional radix sort that are faster on large data sets of common strings. Optimized versions of the algorithm were published in later years.
Burstsort algorithms store string prefixes in a trie whose end nodes are growable arrays of pointers (called buckets) to the strings that share that prefix; only the suffixes beyond the prefix still need to be sorted. Some variants copy the string tails into the buckets instead of storing pointers. The sort gets its name because a bucket "bursts" into a new trie node with smaller buckets once it grows beyond a set threshold. To save memory, a more recent variant uses a bucket index with smaller sub-buckets.
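The trie-and-bucket structure described above can be sketched as follows. This is a minimal illustration rather than a tuned implementation: the names BurstNode, burst_insert, burst_bucket, burst_collect and burstsort_sketch are hypothetical, the threshold of 4 is chosen only to make bursting easy to observe, and each bucket is sorted with std::sort instead of the multikey quicksort that real implementations tend to use.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Illustrative value only; practical thresholds are far larger.
constexpr std::size_t kBurstThreshold = 4;

struct BurstNode {
    bool is_bucket = true;                              // a node is a bucket until it bursts
    std::vector<const std::string*> bucket;             // strings sharing this node's prefix
    std::vector<const std::string*> ended;              // strings that end exactly at this node
    std::array<std::unique_ptr<BurstNode>, 256> child;  // one slot per byte value
};

// Turn an over-full bucket into a trie node and redistribute its strings
// into child buckets keyed on the character at position `depth`.
void burst_bucket(BurstNode& node, std::size_t depth) {
    std::vector<const std::string*> old;
    old.swap(node.bucket);
    node.is_bucket = false;
    for (const std::string* s : old) {
        assert(s->size() >= depth);
        if (s->size() == depth) { node.ended.push_back(s); continue; }
        unsigned char c = static_cast<unsigned char>((*s)[depth]);
        if (!node.child[c]) node.child[c] = std::make_unique<BurstNode>();
        node.child[c]->bucket.push_back(s);
    }
}

// Walk down the trie along the string's prefix until a bucket is reached.
void burst_insert(BurstNode& root, const std::string& s) {
    BurstNode* node = &root;
    std::size_t depth = 0;
    while (!node->is_bucket) {
        if (depth == s.size()) { node->ended.push_back(&s); return; }
        unsigned char c = static_cast<unsigned char>(s[depth]);
        if (!node->child[c]) node->child[c] = std::make_unique<BurstNode>();
        node = node->child[c].get();
        ++depth;
    }
    node->bucket.push_back(&s);
    if (node->bucket.size() > kBurstThreshold) burst_bucket(*node, depth);
}

// In-order traversal: ended strings first, then children in byte order.
// Buckets are sorted on the suffix that starts at `depth`.
void burst_collect(BurstNode& node, std::size_t depth, std::vector<std::string>& out) {
    if (node.is_bucket) {
        std::sort(node.bucket.begin(), node.bucket.end(),
                  [depth](const std::string* a, const std::string* b) {
                      return a->compare(depth, std::string::npos,
                                        *b, depth, std::string::npos) < 0;
                  });
        for (const std::string* s : node.bucket) out.push_back(*s);
        return;
    }
    for (const std::string* s : node.ended) out.push_back(*s);
    for (auto& c : node.child)
        if (c) burst_collect(*c, depth + 1, out);
}

// Returns a sorted copy; the input must stay alive while the trie holds pointers into it.
std::vector<std::string> burstsort_sketch(const std::vector<std::string>& input) {
    BurstNode root;
    for (const std::string& s : input) burst_insert(root, s);
    std::vector<std::string> out;
    out.reserve(input.size());
    burst_collect(root, 0, out);
    return out;
}
```

Calling burstsort_sketch on a vector of strings returns the same ordering as std::sort on that vector. The fixed 256-slot child array keeps the sketch short at the cost of memory, which is one of the things more careful implementations address.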
Most implementations delegate the task of sorting the bucket contents to multikey quicksort, an extension of three-way radix quicksort. Because the input has been divided into buckets of strings with common prefixes, each bucket can be sorted in a cache-efficient way.
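For readers who have not met multikey quicksort, here is a minimal sketch of the classic three-way partitioning scheme applied to strings. The function names char_at and multikey_quicksort are illustrative, and the first element is used as the pivot for simplicity; a practical version would pick the pivot more carefully and switch to insertion sort for tiny ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character at position d as an unsigned value, or -1 once the string has ended.
inline int char_at(const std::string& s, std::size_t d) {
    return d < s.size() ? static_cast<unsigned char>(s[d]) : -1;
}

// Sort a[lo..hi] (inclusive), assuming all strings already agree on their first d characters.
void multikey_quicksort(std::vector<std::string>& a,
                        std::ptrdiff_t lo, std::ptrdiff_t hi, std::size_t d) {
    if (hi <= lo) return;
    std::ptrdiff_t lt = lo, gt = hi, i = lo + 1;
    const int pivot = char_at(a[lo], d);
    // Three-way partition on the character at position d.
    while (i <= gt) {
        const int c = char_at(a[i], d);
        if (c < pivot)      std::swap(a[lt++], a[i++]);
        else if (c > pivot) std::swap(a[i], a[gt--]);
        else                ++i;
    }
    assert(lt <= gt);  // the pivot's own group is never empty
    multikey_quicksort(a, lo, lt - 1, d);                   // smaller character at d
    if (pivot >= 0) multikey_quicksort(a, lt, gt, d + 1);   // equal: move to next character
    multikey_quicksort(a, gt + 1, hi, d);                   // larger character at d
}

// Convenience wrapper that sorts the whole vector.
void multikey_quicksort(std::vector<std::string>& a) {
    multikey_quicksort(a, 0, static_cast<std::ptrdiff_t>(a.size()) - 1, 0);
}
```

The middle partition is the reason this suits burstsort buckets: strings that agree on the current character are never compared on that character again, so the shared prefix inside a bucket is skipped rather than re-examined.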
Burstsort was introduced as a sort that is similar to MSD radix sort. It runs faster because it is cache-aware: the trie structure keeps strings with related prefixes close to one another in memory. It also takes advantage of characteristics that strings typically have in real-world data. Asymptotically it is the same as radix sort, with a time complexity of O(wn), where w is the word length and n is the number of strings being sorted. Because of its better memory access pattern, however, it tends to run about twice as fast on large sets of strings.
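For comparison, a plain MSD radix sort can be sketched as below. It is included only as a baseline to show where the O(wn) bound comes from: each string is touched once per character position until it is separated from its neighbours. The names radix_key, msd_sort and msd_radix_sort are illustrative, and this version moves whole strings through an auxiliary array rather than using any cache-aware layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Key for position d: 0 if the string has ended, otherwise byte value + 1 (range 0..256).
inline std::size_t radix_key(const std::string& s, std::size_t d) {
    return d < s.size() ? static_cast<std::size_t>(static_cast<unsigned char>(s[d])) + 1 : 0;
}

// Sort the half-open range a[lo, hi), assuming its strings agree on the first d characters.
void msd_sort(std::vector<std::string>& a, std::vector<std::string>& aux,
              std::size_t lo, std::size_t hi, std::size_t d) {
    if (hi - lo < 2) return;
    std::array<std::size_t, 258> count{};
    // Count how often each key occurs (shifted by one slot).
    for (std::size_t i = lo; i < hi; ++i) ++count[radix_key(a[i], d) + 1];
    // Prefix sums: count[k] becomes the number of strings whose key is smaller than k.
    for (std::size_t r = 0; r < 257; ++r) count[r + 1] += count[r];
    // Distribute into aux; afterwards count[k] is the end offset of group k.
    for (std::size_t i = lo; i < hi; ++i)
        aux[lo + count[radix_key(a[i], d)]++] = std::move(a[i]);
    for (std::size_t i = lo; i < hi; ++i) a[i] = std::move(aux[i]);
    // Recurse into each non-terminated group; group 0 holds identical, fully consumed strings.
    for (std::size_t r = 1; r <= 256; ++r)
        msd_sort(a, aux, lo + count[r - 1], lo + count[r], d + 1);
}

// Convenience wrapper that sorts the whole vector.
void msd_radix_sort(std::vector<std::string>& a) {
    std::vector<std::string> aux(a.size());
    assert(aux.size() == a.size());
    msd_sort(a, aux, 0, a.size(), 0);
}
```

Every level of recursion allocates and scans a 258-entry count array even for a group of two strings. Burstsort avoids much of that per-group overhead by stopping at the bucket and handing the remainder to a comparison-based sort.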
Example:
#include <cstddef>
#include <iostream>
#include <vector>
using namespace std;

// Merge two sorted arrays into a single sorted array.
vector<int> merge(const vector<int>& a1, const vector<int>& a2)
{
    vector<int> a3;
    a3.reserve(a1.size() + a2.size());
    size_t i = 0, j = 0;
    // Repeatedly take the smaller front element of the two inputs.
    while (i < a1.size() && j < a2.size()) {
        if (a1[i] < a2[j])
            a3.push_back(a1[i++]);
        else
            a3.push_back(a2[j++]);
    }
    // Copy whatever is left in either input.
    while (i < a1.size())
        a3.push_back(a1[i++]);
    while (j < a2.size())
        a3.push_back(a2[j++]);
    return a3;
}

// Split a single array into two halves.
vector<vector<int> > split(const vector<int>& b)
{
    vector<vector<int> > sb(2);
    size_t mid = b.size() / 2;
    for (size_t i = 0; i < mid; i++)
        sb[0].push_back(b[i]);
    for (size_t i = mid; i < b.size(); i++)
        sb[1].push_back(b[i]);
    return sb;
}

// Note: despite its name, this function is a top-down merge sort on integers.
// It illustrates divide-and-conquer sorting, not the trie-and-bucket
// burstsort for strings described above.
vector<int> burstSort(const vector<int>& b)
{
    // Base case: an array of size 0 or 1 is already sorted.
    // (Checking only for size 1 would recurse forever on an empty array.)
    if (b.size() <= 1)
        return b;
    // Split the array into two sub-arrays.
    vector<vector<int> > s = split(b);
    // Sort both sub-arrays recursively.
    vector<int> a1 = burstSort(s[0]);
    vector<int> a2 = burstSort(s[1]);
    // Combine the sorted sub-arrays into a single sorted array.
    return merge(a1, a2);
}

// Driver code
int main()
{
    vector<int> arr = { 12, 25, 32, 72, 28, 4, 6 };
    arr = burstSort(arr);
    for (size_t i = 0; i < arr.size(); i++)
        cout << arr[i] << " ";
    return 0;
}
Output:
4 6 12 25 28 32 72
Conclusion
The cost of growing and bursting buckets, however, is a bottleneck. The string burstsorts documented so far store only pointers in a bucket and leave the keys where they were originally found, which causes two problems. First, the average key length varies greatly from bucket to bucket. A low burst threshold allows buckets that address long keys to be sorted inside the cache, but it also causes buckets holding shorter keys to burst inefficiently. Second, although the pointers are localized in buckets, sorting a bucket requires frequent access to the keys, which stay scattered in memory at their original locations.
Since only the terminal bytes of each key that are actually examined are needed, it is inefficient to bring all the keys referenced by a bucket into the cache when burst thresholds are low. Moreover, every key is accessed twice: once for bucket assignment and once more when it is loaded into the cache during bucket sorting.