Hashing in DBMS: Hashing is the technique of the database management system, which directly finds the specific data location on the disk without using the concept of index structure.
In the database systems, data is stored at the blocks whose data address is produced by the hash function. That location of memory where hash files stored these records is called as data bucket or data block.
In this technique, the hash function uses that column, which contains the primary key constraint for generating the address of the data block.
Why is Hashing important?
Following are the various situations when we need to apply hashing in DBMS:
- When the data in the database is so massive, it is difficult to search all the index values, and then we require to reach the desired data block’s destination.
- Without using the indexing technique, it is an ideal method to find the direct location of the data.
- This technique is also helpful for implementing dictionaries.
Essential Terms used in Hashing Technique
Following are the various terms or terminologies used in this technique:
- Bucket Overflow: Bucket overflow is that condition when buckets are filled with the hash result. It is the stage of fatal for any static function.
- Data Bucket: That location of memory where the hash file stored the records. It is also referred to as a Unit of storage.
- Double Hashing: It is a computer programming method, which is similar to the method of linear probing.
- Hash function: This mapping function matches all the set of search keys to those addresses where the actual records are located. It is an easy mathematics function.
- Hash Index: Hash Index is an address of the block of data. It comes from the prefix of the hash value.
- Key: A key in the Database Management system (DBMS) is a field or set of fields that helps the relational database users to uniquely identify the row/records of the database table.
- Linear Probing: It is a concept in which the next available block of data is used for inserting the new record instead of overwriting the older data block.
- Quadratic Probing: It is a method that helps users to determine or find the address of a new data bucket.
Types of Hashing
In the database management system, hashing is mainly categorized into the following two types:
1. Static Hashing
2. Dynamic Hashing
In this type of hashing, the address of the resultant data bucket always will be the same. Let’s take an example; if we want to generate the address for Product_ID = 76 using the mod (5) hash function, it always provides the result in the same bucket address as 3. In memory, the number of data buckets always remains the same or constant.
Operations of Static Hashing
Following are the four operations in the static hashing:
1. Search a Record
2. Insert a Record
3. Delete a Record
4. Update a Record
Search a Record
When we need to search a record, then the same hash function helps us retrieve the bucket’s address where the data is stored.
Insert a Record
When we need to insert the new record in the table, the hash function automatically generated the address bucket for the record based on the hash key.
Delete a Record
When we want to delete the record from the table, we will read or fetch the record, which we want to be deleted using the hash function. And, after that, we have to remove the record for that memory address.
Update a Record
When we want to update the existing record in the table, then, by using the hash function firstly, we have to search the record, which is to be updated. Then the data record is updated automatically.
Following are the two different methods which overcome the problem of bucket overflow situation:
1. Open Hashing
2. Closed Hashing
In this method of hashing, the next free data block is selected for entering the new record, rather than overwriting the data to the previous block. This mechanism or method is also known as linear probing.
Example of open hashing: Suppose C2 is a new record that we want to insert, then the hash function generates the address as 301 for the record C2. But the generated address is already full. So, this method searches the next free data bucket, which is 601 and allocates the record C2 to it.
In this method of hashing, when the data bucket is filled with data, then the new bucket is selected for the same hash result and is linked with the previously filled bucket. This mechanism is called as Overflow-Chaining.
Example of Closed Hashing: Suppose C3 is a new record, which we want to insert into the table. Then, the static hash function automatically generates the address as 130. But, this data bucket is full and cannot receive new data for storing. So, closed hashing adds the new data bucket at the end of 130 and automatically links it. And, then the new C3 record is entered into the new data bucket.
This type of hashing is introduced to solve the basic problem of static hashing. The problem of static hashing technique is that it does not reduce or enlarge its size dynamically according to the database size.
In this type of hashing, the data buckets are dynamically added or removed as the records in the database increase or decrease. This type of hashing is also called as extended hashing. The hash function is created for generating a large number of values.