What is the B+ Tree in Data Structures?
We all know that the B+ tree in data structures is nothing but just an extended version of the B tree. It allows the smooth working of all the operations such as insertion, deletion, traversing, and several others. They are generally known for their excellent application in storing and managing a large amount of data in the storage space rather than fitting them all into the main memory.
We already know the fact that the storage space of the main memory is very restricted, so in that case, the nodes present on the inside of the tree will definitely be stored on the main memory of the system while the leaf nodes that are present will definitely be stored in the secondary memory. The leaf nodes that are present on the B+ tree are generally stored in the form of a basic or singly linked list in order to make the usage and queries and operations performed by the same more and more efficient.
The nodes of the B+ tree that are initially present on the inside of the tree are mainly known as the index nodes of the tree. We know that in the B+ tree, the nodes are in a linked manner and so this is the reason why getting access to them becomes quite easy. There is one thing about the B+ tree which is that in these kinds of trees, there might be unnecessary keys present when it comes to search keys.
Why do we require a B+ tree?
- The B+ tree is known for its mechanism in acquiring the increment and decrement in the result.
- In B+ trees, the keys present in them are generally used for their assistance by initializing the search mechanism.
- In these kinds of trees, we can quite easily put a large amount of data on the page as in this, the data or records are not linked with the internal nodes and this is the reason why we can directly access the data on the tree and not on the leaf node.
- A complete full inspection of all the major records in a tree that basically acquires a single and basic linear pass as we know that all the leaf nodes present in a B+ tree are interconnected or linked with each other.
Uses
- These kinds of trees are used in database indexing.
- We can also perform multilevel indexing by making use of the indexing attribute of the B tree and it will prove to be very beneficial for our work.
- The B+ tree is best known for its managing skills as it uses partially occupied containers that eventually lead to maximized applications of various operations such as insertion and deletion.
- With the help of the B+ tree, it becomes a lot easier to collect an adequate amount of data, and that too within limited disk access.
- Keys are generally used for their major role in indexing attributes.
- We can perform the search operation quicker as the information that we want to access is stored in the leaf nodes.
- The major benefit is the height of the tree is in a systematic order and less as compared to the B tree.
- We can access the data stored in this kind of tree directly.
Properties
- The very first property is that all the nodes that are present in the tree are at the same magnitude or say level.
- Next, another trait of the B+ tree is that the root must have at least two children or nodes.
- In order to count the keys that are present in this kind of tree, we use the following formulas. To find out the maximum key present within a node we use m -1 and to find the least number of keys present within a node we use m / 2 – 1.
- Another trait of the B+ tree is that each and every node present in the tree can either acquire a maximum number of nodes that will be calculated by m children or a minimum number of nodes that will be calculated by m / 2 – 1.
Algorithm
In this section of the article, we will encounter the steps that will provide us with an idea of the working of the B+ tree.
- First, we have to call the search method and check if the data or records are on the B+ tree.
- Next, if there is a case where the current search matches with the given record then the exact same result is returned and provided to the user via the display.
- And if there is a case where the current search is not found in the given record then it will display on the screen and return the value “Record couldn’t be found”.
- The exact same rules are followed for other operations as well.
Implementation
#include <climits>
#include <fstream>
#include <iostream>
#include <sstream>
using namespace std;
int MAX = 3;
// BP node
class Node {
bool IS_LEAF;
int *key, size;
Node **ptr;
friend class BPTree;
public:
Node();
};
// BP tree
class BPTree {
Node *root;
void insertInternal(int, Node *, Node *);
Node *findParent(Node *, Node *);
public:
BPTree();
void search(int);
void insert(int);
void display(Node *);
Node *getRoot();
};
Node::Node() {
key = new int[MAX];
ptr = new Node *[MAX + 1];
}
BPTree::BPTree() {
root = NULL;
}
// Search operation
void BPTree::search(int x) {
if (root == NULL) {
cout << "Tree is empty\n";
} else {
Node *cursor = root;
while (cursor->IS_LEAF == false) {
for (int i = 0; i < cursor->size; i++) {
if (x < cursor->key[i]) {
cursor = cursor->ptr[i];
break;
}
if (i == cursor->size - 1) {
cursor = cursor->ptr[i + 1];
break;
}
}
}
for (int i = 0; i < cursor->size; i++) {
if (cursor->key[i] == x) {
cout << "Found\n";
return;
}
}
cout << "Not found\n";
}
}
// Insert Operation
void BPTree::insert(int x) {
if (root == NULL) {
root = new Node;
root->key[0] = x;
root->IS_LEAF = true;
root->size = 1;
} else {
Node *cursor = root;
Node *parent;
while (cursor->IS_LEAF == false) {
parent = cursor;
for (int i = 0; i < cursor->size; i++) {
if (x < cursor->key[i]) {
cursor = cursor->ptr[i];
break;
}
if (i == cursor->size - 1) {
cursor = cursor->ptr[i + 1];
break;
}
}
}
if (cursor->size < MAX) {
int i = 0;
while (x > cursor->key[i] && i < cursor->size)
i++;
for (int j = cursor->size; j > i; j--) {
cursor->key[j] = cursor->key[j - 1];
}
cursor->key[i] = x;
cursor->size++;
cursor->ptr[cursor->size] = cursor->ptr[cursor->size - 1];
cursor->ptr[cursor->size - 1] = NULL;
} else {
Node *newLeaf = new Node;
int virtualNode[MAX + 1];
for (int i = 0; i < MAX; i++) {
virtualNode[i] = cursor->key[i];
}
int i = 0, j;
while (x > virtualNode[i] && i < MAX)
i++;
for (int j = MAX + 1; j > i; j--) {
virtualNode[j] = virtualNode[j - 1];
}
virtualNode[i] = x;
newLeaf->IS_LEAF = true;
cursor->size = (MAX + 1) / 2;
newLeaf->size = MAX + 1 - (MAX + 1) / 2;
cursor->ptr[cursor->size] = newLeaf;
newLeaf->ptr[newLeaf->size] = cursor->ptr[MAX];
cursor->ptr[MAX] = NULL;
for (i = 0; i < cursor->size; i++) {
cursor->key[i] = virtualNode[i];
}
for (i = 0, j = cursor->size; i < newLeaf->size; i++, j++) {
newLeaf->key[i] = virtualNode[j];
}
if (cursor == root) {
Node *newRoot = new Node;
newRoot->key[0] = newLeaf->key[0];
newRoot->ptr[0] = cursor;
newRoot->ptr[1] = newLeaf;
newRoot->IS_LEAF = false;
newRoot->size = 1;
root = newRoot;
} else {
insertInternal(newLeaf->key[0], parent, newLeaf);
}
}
}
}
// Insert Operation
void BPTree::insertInternal(int x, Node *cursor, Node *child) {
if (cursor->size < MAX) {
int i = 0;
while (x > cursor->key[i] && i < cursor->size)
i++;
for (int j = cursor->size; j > i; j--) {
cursor->key[j] = cursor->key[j - 1];
}
for (int j = cursor->size + 1; j > i + 1; j--) {
cursor->ptr[j] = cursor->ptr[j - 1];
}
cursor->key[i] = x;
cursor->size++;
cursor->ptr[i + 1] = child;
} else {
Node *newInternal = new Node;
int virtualKey[MAX + 1];
Node *virtualPtr[MAX + 2];
for (int i = 0; i < MAX; i++) {
virtualKey[i] = cursor->key[i];
}
for (int i = 0; i < MAX + 1; i++) {
virtualPtr[i] = cursor->ptr[i];
}
int i = 0, j;
while (x > virtualKey[i] && i < MAX)
i++;
for (int j = MAX + 1; j > i; j--) {
virtualKey[j] = virtualKey[j - 1];
}
virtualKey[i] = x;
for (int j = MAX + 2; j > i + 1; j--) {
virtualPtr[j] = virtualPtr[j - 1];
}
virtualPtr[i + 1] = child;
newInternal->IS_LEAF = false;
cursor->size = (MAX + 1) / 2;
newInternal->size = MAX - (MAX + 1) / 2;
for (i = 0, j = cursor->size + 1; i < newInternal->size; i++, j++) {
newInternal->key[i] = virtualKey[j];
}
for (i = 0, j = cursor->size + 1; i < newInternal->size + 1; i++, j++) {
newInternal->ptr[i] = virtualPtr[j];
}
if (cursor == root) {
Node *newRoot = new Node;
newRoot->key[0] = cursor->key[cursor->size];
newRoot->ptr[0] = cursor;
newRoot->ptr[1] = newInternal;
newRoot->IS_LEAF = false;
newRoot->size = 1;
root = newRoot;
} else {
insertInternal(cursor->key[cursor->size], findParent(root, cursor), newInternal);
}
}
}
// Find the parent
Node *BPTree::findParent(Node *cursor, Node *child) {
Node *parent;
if (cursor->IS_LEAF || (cursor->ptr[0])->IS_LEAF) {
return NULL;
}
for (int i = 0; i < cursor->size + 1; i++) {
if (cursor->ptr[i] == child) {
parent = cursor;
return parent;
} else {
parent = findParent(cursor->ptr[i], child);
if (parent != NULL)
return parent;
}
}
return parent;
}
// Print the tree
void BPTree::display(Node *cursor) {
if (cursor != NULL) {
for (int i = 0; i < cursor->size; i++) {
cout << cursor->key[i] << " ";
}
cout << "\n";
if (cursor->IS_LEAF != true) {
for (int i = 0; i < cursor->size + 1; i++) {
display(cursor->ptr[i]);
}
}
}
}
// Get the root
Node *BPTree::getRoot() {
return root;
}
int main() {
BPTree node;
node.insert(5);
node.insert(15);
node.insert(25);
node.insert(35);
node.insert(45);
node.insert(55);
node.insert(40);
node.insert(30);
node.insert(20);
node.display(node.getRoot());
node.search(15);
}
Output:
