Hashing and its Applications

Hashing

Hashing refers to transforming data of any size into a fixed-size value, called a hash value or digest, using a hash function. With a cryptographic hash function the transformation is one-way, so even if the hashed data is leaked for some reason, no one should be able to recover the original from it. Hash values are used to store data in various data structures and in places where the information is not supposed to be decoded or needs to be stored safely.

Hashing is widely used to protect and organize data in different programs. Unlike encryption, hashing needs no secret key and has no decryption step. The value being hashed is often called the 'key', and its hash value is usually a mixed, reduced-length form of the original value. The reduced-length value is generally used for searching for particular records, making the program much faster and simpler.

Hashing can also be used for database indexing, where records are located through the hash of their keys, which makes retrieval much faster. In general, a key can be anything that identifies the information: a combination of symbols, digits or letters, or even something as unique as a fingerprint. There are various applications of hashing; let us look at them in more detail.

Various applications of hashing

1. Password verification

2. Integrity Checks/Message digests

3. Data Structures

4. Compiler Operation

5. Rabin-Karp algorithm

Password Verification

Ever wondered what happens if the database of a social media website gets hacked? Would the hacker have access to your password and be able to retrieve all your information? If the website stores passwords properly, the answer is no.

Today, well-designed applications use hash functions to protect stored credentials. The passwords of a website's users are not stored in the database as plain text. They are not encrypted either, because encrypted data can be recovered by anyone who obtains the decryption key. The secure way to keep passwords is to store only their hashes.

For example:

A one-way hash function such as MD5 is applied so that only the hashed value of your password is stored. Once the password is hashed, there is no practical way to retrieve the original value from the hash. To verify a login, the website hashes the attempted password and compares the result with the stored hash value. An attacker who steals the database, on the other hand, would have to try every possible password until the hash of a guess matches a stored hash. In practice MD5 is no longer recommended for passwords, because it is fast to compute and has known collision attacks; slow, salted functions such as bcrypt, scrypt, Argon2 or PBKDF2 are preferred. You can find MD5 and many other hash functions in Python's hashlib module.

The significant properties of a hash function like MD5 that you can see here are:

  1. Fixed size: whatever the length or format of the input, the hashed value has a fixed size (128 bits for MD5).
  2. Near-uniqueness: different inputs almost always map to different hash values. Collisions must exist, because infinitely many inputs share a finite set of outputs, but a good hash function makes them very hard to find.
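
The store-and-verify flow described above can be sketched in Python. This is a minimal illustration, not a production recipe: it uses PBKDF2 from hashlib rather than MD5, and the function names, the 16-byte salt and the iteration count are assumptions chosen for the example. The last line uses SHA-256 to show that the digest length does not depend on the input length.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor (an assumption, tune for real use)


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_hash); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # random salt so equal passwords get different hashes
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(attempt, salt)
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # matching password
print(verify_password("wrong guess", salt, stored))    # non-matching password

# Fixed-size output: a 1-byte and a 10,000-byte input give digests of equal length.
print(len(hashlib.sha256(b"a").hexdigest()), len(hashlib.sha256(b"a" * 10_000).hexdigest()))
```

Verification never decodes the stored value; it only repeats the hashing and compares the two results.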

Integrity checks or message digest

  • A hash collision occurs when two different inputs produce the same hash value. With a well-designed hash algorithm collisions are extremely rare; hence, hash functions are well suited to integrity checks.
  • Sometimes a user needs to check whether two received files have the same content. Comparing the files byte by byte would be slow and messy, especially for large files, but comparing their hash values is quick.
  • Another example is storing data online. When you upload data to one of the cloud services available today, an integrity check tells you whether someone has altered the file's content.
  • To do this, calculate the hash value of the file before uploading it to the cloud storage platform.
  • After uploading, download the file again and calculate its hash value. If someone has tampered with the file, that change will alter the hash value of the downloaded file.
  • Tampering with a file without changing its hash value is practically impossible. This makes hashing a simple way to check the integrity of such platforms.
  • One of the many cryptographic hash algorithms used for integrity checks or message digests is SHA256.
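
As a rough sketch of the check described in the list above, the following Python code computes the SHA-256 digest of a file in chunks and compares two files by digest. The function names and the chunk size are assumptions made for this example.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """Treat two files as identical when their digests match."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

For the cloud storage case, you would record `sha256_of_file(...)` before uploading and compare it with the digest of the downloaded copy.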

Data structures

Various data structures in different programming languages use hash tables. The main motive behind the approach is to reduce the time complexity of searching such data structures to constant time on average. Key-value pairs are stored in a table in such a way that each key is unique and maps to exactly one value, although two different keys may map to the same value. Some examples of data structures which use this approach are:

  1. Unordered set and an Unordered map in C++.
  2. Dictionary in Python.
  3. Hash set and Hash map in Java.
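
To make the idea concrete, here is a small hash table sketch in Python that resolves collisions by chaining. It is only an illustration of the principle; the class name `ChainedHashTable` and its fixed capacity are assumptions, and real implementations such as Python's dict also resize and use more refined layouts.

```python
class ChainedHashTable:
    """Toy hash table: each bucket is a list of (key, value) pairs."""

    def __init__(self, capacity: int = 8):
        self._buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # The hash code of the key selects the bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # keys are unique: overwrite the old value
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = ChainedHashTable()
table.put("alice", 30)
table.put("bob", 30)    # two different keys may map to the same value
table.put("alice", 31)  # the same key again replaces its value
print(table.get("alice"), table.get("bob"), table.get("carol"))
```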

Linking file name and path together

Searching for a particular file stored on your system is a very common operation, and for search operations hash values have proven to be a good choice. Linking a file's name to its directory path therefore helps locate it in the system quickly.

Today, modern file systems allow you to store many files in the same directory, but operations such as following paths, listing files, or finding a file in a directory reveal a lot about the file system's performance.

Now the common question is how to design a file system that offers a high overall file capacity while keeping each directory small and performance high. One answer is File Name Hashing: the process of deriving a reproducible, known path from the name of your file. For example, a file named JavaTpoint.GIF would be stored in the file system as:

 /J/Ja/JavaTpoint.GIF

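This approach provides a lot of subdirectories. Assume that file names start with two letters and that case is ignored: there are 26 first-level directories, each holding 26 second-level directories. If we limit the number of files in each second-level directory to 1000, this gives a capacity of

(26 * 26) * 1000 = 676K files.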

This technique can be used to solve various problems in traditional file systems.

However, using the name of your file directly to create the path can make the distribution in your file system unbalanced. For instance, if almost 90% of files start with the letters 'WA', this approach will put all of them in the same directory:

 /W/WA/

This is where hashing comes in. Deriving the directories from the hash code of the file name string, rather than from the name itself, spreads the files evenly across the directories.

Hashing file names is an effective technique for creating a well-distributed file system with a wide range of directories. Access stays convenient because the path is computed directly from the name instead of being searched for.

Hence, to find a particular file, calculate the hash code of its name, derive the path from it, and use that path to access the file. This technique is used in various professional file systems today.
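
The two layouts can be compared with a short Python sketch. The function names are hypothetical, and using SHA-256 and the first hex characters of the digest for the directory levels are assumptions made for illustration; a real file system may use a different hash function and directory depth.

```python
import hashlib


def path_from_name(file_name: str) -> str:
    """Name-based layout from the text: directories are prefixes of the name."""
    return f"/{file_name[:1]}/{file_name[:2]}/{file_name}"


def path_from_hash(file_name: str) -> str:
    """Hash-based layout: directories are prefixes of the name's hex digest."""
    digest = hashlib.sha256(file_name.encode("utf-8")).hexdigest()
    return f"/{digest[:1]}/{digest[:2]}/{file_name}"


print(path_from_name("JavaTpoint.GIF"))  # /J/Ja/JavaTpoint.GIF

# Names sharing the prefix 'WA' pile up under /W/WA/ in the name-based layout,
# while the hash-based layout tends to scatter them across directories.
for name in ("WA0001.jpg", "WA0002.jpg", "WA0003.jpg"):
    print(path_from_name(name), path_from_hash(name))
```

Because the same name always produces the same digest, the path can be recomputed at lookup time without storing any index.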

Compiler operation

Everyone uses compilers. But you may not know that hashing is one of the key techniques used in compiler design to make compilers efficient. Has the question ever crossed your mind: how does your compiler identify every keyword in a roughly constant amount of time? Whichever keyword it is, how is it identified in a matter of nanoseconds?

As of now, we can expect you to know the answer to this.

YES, you are absolutely correct!

Hashing is used here too.

The compiler has to process keywords differently from other identifiers and literals in your program, so an efficient way to distinguish between them is required to compile the program successfully. For this purpose, all the keywords are stored in a set when the compiler is designed. The set is implemented using a hash table, which computes a hash code for every keyword and provides constant-time search on average.

Compilers hence work efficiently and provide you with a great experience.
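
A toy version of this keyword lookup can be written in Python, whose built-in frozenset is backed by a hash table. The keyword list below is a small, incomplete sample of C keywords and the `classify` function is a hypothetical helper; a real compiler's lexer is considerably more involved.

```python
# A partial, illustrative sample of C keywords (not the full list).
C_KEYWORDS = frozenset({
    "int", "char", "float", "double", "void", "if", "else",
    "for", "while", "do", "return", "break", "continue", "struct",
})


def classify(token: str) -> str:
    """Classify a token as keyword, literal, or identifier."""
    if token in C_KEYWORDS:  # hash-based membership test, constant time on average
        return "keyword"
    if token.isdigit():      # only plain integer literals are handled in this sketch
        return "literal"
    return "identifier"


for token in ("int", "count", "42", "return"):
    print(token, "->", classify(token))
```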

Rabin Karp algorithm

One of the vital applications of hashing is the Rabin-Karp algorithm, a well-known string searching technique used to find one or more patterns in a text. Plagiarism checkers can use this algorithm to find similarities between different texts efficiently, saving your time. The Rabin-Karp algorithm computes a hash of the pattern and a rolling hash of each window of the text with the same length, and compares the actual characters only when the two hash values match.
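
A single-pattern version of the algorithm can be sketched in Python as follows. The polynomial rolling hash with `base` 256 and a large prime `mod` is a common choice but an assumption here, and the function name is illustrative.

```python
from typing import List


def rabin_karp(text: str, pattern: str, base: int = 256, mod: int = 1_000_000_007) -> List[int]:
    """Return every index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    matches = []
    for i in range(n - m + 1):
        # Compare characters only on a hash match, which rules out false
        # positives caused by hash collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Roll the hash: drop text[i], shift, then append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches


print(rabin_karp("abracadabra", "abra"))  # [0, 7]
```

Updating the window hash takes constant time per position, so the expected running time is linear in the combined length of the text and the pattern.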