Collision in hashing
- In hashing, a hash function computes an array index from a key.
- The resulting hash value is the index at which the key is stored in the hash table.
- The hash function can return the same hash value for two or more keys.
- When two or more keys are given the same hash value, it is called a collision. To handle this collision, we use collision resolution techniques.
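As a small illustration of the points above, the sketch below uses a modulo hash function, which is only one possible choice. The name `simple_hash` and the table size of 10 are assumptions made for this example.

```python
TABLE_SIZE = 10  # assumed table size for this illustration


def simple_hash(key: int) -> int:
    """Map an integer key to a slot index in the range 0..TABLE_SIZE-1."""
    return key % TABLE_SIZE


# 12 and 22 are different keys, but both map to slot 2: a collision.
print(simple_hash(12))  # 2
print(simple_hash(22))  # 2
```

Any hash function that maps a large set of keys onto a smaller set of slots can produce collisions like this one.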
Collision resolution techniques
There are two types of collision resolution techniques.
- Separate chaining (open hashing)
- Open addressing (closed hashing)
In separate chaining, each slot of the hash table holds a linked list. When a collision occurs in a slot, the new key is inserted into that slot's linked list. The keys linked from a slot look like a chain, which gives the technique its name. It is most useful when we do not know in advance how many keys will be inserted or deleted.
- Its worst-case complexity for searching is O(n), where n is the number of keys. This happens when all keys hash to the same slot.
- Its worst-case complexity for deletion is O(n).
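Separate chaining can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class name `ChainedHashTable`, the modulo hash function, and the use of a Python list in place of a real linked list are all assumptions made for brevity.

```python
class ChainedHashTable:
    """Hash table that resolves collisions with one chain per slot."""

    def __init__(self, size: int = 10):
        self.size = size
        # Each slot holds its own chain; a Python list stands in for the linked list.
        self.slots = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        # Assumed hash function: key modulo table size.
        return key % self.size

    def insert(self, key: int) -> None:
        chain = self.slots[self._index(key)]
        if key not in chain:
            chain.append(key)

    def search(self, key: int) -> bool:
        # Scans one chain only; O(n) in the worst case when all keys share a slot.
        return key in self.slots[self._index(key)]

    def delete(self, key: int) -> bool:
        chain = self.slots[self._index(key)]
        if key in chain:
            chain.remove(key)
            return True
        return False


table = ChainedHashTable(size=10)
for key in (12, 22, 32, 7):
    table.insert(key)
print(table.slots[2])  # [12, 22, 32]
```

The keys 12, 22, and 32 all land in slot 2 and form a single chain, so searching for 32 has to walk past the other two keys first.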
Advantages of separate chaining
- It is easy to implement.
- The hash table never fills up, so we can always add more elements to a chain.
- It is less sensitive to the choice of hash function.
Disadvantages of separate chaining
- The cache performance of chaining is poor.
- Memory is wasted, because some slots of the hash table may never be used.
- It requires extra space for the links between elements.
Open addressing is a collision resolution method in which all keys are stored in the hash table itself; no key is stored outside the table. Therefore, the size of the hash table must always be greater than or equal to the number of keys. It is also called closed hashing.
The following techniques are used in open addressing:
- Linear probing
- Quadratic probing
- Double hashing
In linear probing, when a collision occurs, we probe the next slot in sequence, and we keep probing until an empty slot is found. The worst-case time to search for an element is O(table size). Linear probing gives the best cache performance of the three techniques, but it suffers from clustering. Its main advantage is that the probe sequence is easy to compute.
Disadvantages of linear probing
- The main problem is clustering: occupied slots form long contiguous runs.
- Once clusters have formed, it takes a long time to find an empty slot.
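A rough sketch of insertion with linear probing is shown below. The function name `linear_probe_insert`, the modulo hash function, and the table size of 10 are assumptions made for this example.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key with linear probing and return the slot that was used."""
    size = len(table)
    home = key % size  # assumed hash function: key modulo table size
    for i in range(size):
        slot = (home + i) % size  # step to the next slot, wrapping around
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table: List[Optional[int]] = [None] * 10
slots = [linear_probe_insert(table, key) for key in (12, 22, 32, 13)]
print(slots)  # [2, 3, 4, 5]
```

The key 13 hashes to slot 3, but the run of occupied slots created by 12, 22, and 32 pushes it to slot 5. This is the clustering effect described above.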
In quadratic probing, when a collision occurs, we probe the i²-th slot from the original position in the i-th iteration, that is, slot (hash(x) + i²) mod table size, and we keep probing until an empty slot is found. The cache performance of quadratic probing is worse than that of linear probing, but quadratic probing reduces the clustering problem.
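The quadratic probe sequence can be sketched as follows. The function name `quadratic_probe_insert`, the modulo hash function, and the table size of 11 are assumptions made for this example.

```python
from typing import List, Optional


def quadratic_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key with quadratic probing and return the slot that was used."""
    size = len(table)
    home = key % size  # assumed hash function: key modulo table size
    for i in range(size):
        slot = (home + i * i) % size  # offset grows as 0, 1, 4, 9, ...
        if table[slot] is None:
            table[slot] = key
            return slot
    # The quadratic sequence need not visit every slot, so give up after size attempts.
    raise RuntimeError("no empty slot found in the probe sequence")


table: List[Optional[int]] = [None] * 11
slots = [quadratic_probe_insert(table, key) for key in (12, 23, 34)]
print(slots)  # [1, 2, 5]
```

All three keys hash to slot 1, but the growing offsets spread them out instead of packing them into adjacent slots.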
In double hashing, we use a second hash function, hash2(x), and probe the slot (hash(x) + i * hash2(x)) mod table size in the i-th iteration. Computing two hash functions takes more time. Double hashing has poor cache performance, but it does not suffer from clustering.
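A minimal sketch of double hashing follows. The two hash functions used here, `key % size` and `7 - (key % 7)`, are one common textbook pairing and are assumptions of this example, as are the function names and the table size of 11.

```python
from typing import List, Optional


def hash2(key: int) -> int:
    """Second hash function (assumed); returns a step between 1 and 7, never 0."""
    return 7 - (key % 7)


def double_hash_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key with double hashing and return the slot that was used."""
    size = len(table)
    home = key % size  # first hash function (assumed)
    step = hash2(key)
    for i in range(size):
        slot = (home + i * step) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty slot found in the probe sequence")


table: List[Optional[int]] = [None] * 11
slots = [double_hash_insert(table, key) for key in (12, 23, 34)]
print(slots)  # [1, 6, 2]
```

The keys 12, 23, and 34 share the home slot 1, but each key gets its own step size (2, 5, and 1), so their probe sequences diverge immediately. The second hash function must never return 0, otherwise the probe would stay on the same slot.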