How to Calculate the Levenshtein Distance Between Two Strings in Java Using Recursion?
The Levenshtein Distance, also known as the "edit distance”. The Levenshtein Distance measures the minimum number of operations (insertion, deletion, or substitution) required to transform one string into another.
Given two strings, the task is to find the minimum number of edits (insertions, deletions, or substitutions) required to make them equal.
Example 1
Input: str1 = ”kitten” str2 = “sitting”
Output: 3
Explanation: str1 is converted into str2 by replacing ‘k’ with ‘s’, 'e' with 'i', and insert 'g' at the end of the first string.
Example 2
Input: str1 = “apple” str2 = “orange”
Output: 5
Explanation: str1 is converted into str2 by replacing 'a' with 'o', inserting 'r' and 'a', and adding 'n' and 'g' to transform the first string into the second.
Example 3
Input: str1 = “Intention” str2 = “execution”
Output: 5
Explanation: str1 is converted into str2 by Changing 'i' to 'e', inserting 'x', 'e', and 'u', and replacing 'n' with 'c'.
Example 4
Input: str1 = “abcd” str2 = “dcba”
Output: 4
Explanation: str1 is converted into str2 by replacing each character in one string with its corresponding character in the other requires 4 edits, 'a' with 'd', 'b' with 'c', 'c' with 'b', and 'd' with 'a'.
Approach: Using Recursion
Levenshtein Distance algorithm implemented using recursion, to calculate the minimum number of operations required to transform one string into another. The algorithm works by considering three possible operations: insertion, deletion, and substitution.
Algorithm
Step 1: Initialize a recursive function that calculates the Levenshtein Distance between the substrings of the input strings.
Step 2: Base cases are defined for when one of the strings becomes empty, where the distance is simply the length of the other string.
Step 3: If the last characters of the strings are equal, no operation is needed, and the function recurses on the remaining substrings.
Step 4: If the last characters are different, three recursive calls are made for insertion, deletion, and substitution, and the minimum distance among these operations is returned.
Step 5: Return the minimum distance required to transform the entire strings.
Filename: LevenshteinDistance.java
public class LevenshteinDistance { // Method to calculate Levenshtein Distance between two strings public static int calculateDistance(String str1, String str2) { return calculateDistanceHelper(str1, str2, str1.length(), str2.length()); } // Helper method for recursive calculation of Levenshtein Distance private static int calculateDistanceHelper(String str1, String str2, int m, int n) { // Base cases: if one string is empty, return the length of the other string if (m == 0) { return n; // If str1 is empty, insert all characters of str2 } if (n == 0) { return m; // If str2 is empty, delete all characters of str1 } // If last characters of the strings are equal, no operation needed if (str1.charAt(m - 1) == str2.charAt(n - 1)) { return calculateDistanceHelper(str1, str2, m - 1, n - 1); } // Consider all three operations: insertion, deletion, substitution int insert = calculateDistanceHelper(str1, str2, m, n - 1); // Insertion int delete = calculateDistanceHelper(str1, str2, m - 1, n); // Deletion int substitute = calculateDistanceHelper(str1, str2, m - 1, n - 1); // Substitution // Return the minimum distance among the three operations return 1 + Math.min(Math.min(insert, delete), substitute); } // Main method to test the Levenshtein Distance calculation public static void main(String[] args) { String str1 = "kitten"; String str2 = "sitting"; // Calculate and print the Levenshtein Distance int distance = calculateDistance(str1, str2); System.out.println("Levenshtein Distance between "" + str1 + "" and "" + str2 + "" is " + distance); } }
Output
Levenshtein Distance between "kitten" and "sitting" is 3
Time Complexity: The time complexity of the Levenshtein Distance algorithm using recursion is O(3^n), where n is the length of the input string. This is because for each character position in the strings, three recursive calls are made (insertion, deletion, substitution).
Space Complexity: The space complexity of the algorithm is also O(3^n), where n is the length of the input string. This is because recursive calls and the stack space required to store intermediate results during computation. However, this can be optimized using dynamic programming to store and reuse previously computed distances, reducing the space complexity to O(m*n).