Finding duplicates in an array is a common programming problem that can be solved using various approaches. One efficient method is to use a bit array to keep track of the presence or absence of elements. In Java, you can use a BitSet to implement a bit array. A bit array, also known as a bitset or bit vector, is a data structure that represents an array of bits, where each bit can be either 0 or 1. It is a compact way to store information about the presence or absence of elements in a set.

Approach 1: Simple Bit Array Approach

The Bit Array approach for finding duplicates in an array is a technique that combines the principles of set-based operations and bitwise manipulation. The approach involves initializing a data structure, often a BitSet or boolean array, to represent the presence or absence of elements within a specified range. By iterating through the array and manipulating the bits or boolean values at corresponding positions, the algorithm efficiently identifies duplicate elements.

Algorithm

Step 1: Create a BitSet or boolean array named bitSet to represent the presence or absence of elements. Create an empty list named duplicates to store the duplicate elements.

Step 2: For each element value in the input array arr, calculate the adjusted index (index = value - 1) to map the array values to the zero-based index of the data structure.

Step 3: Identify and Store Duplicates: For each element value, if bitSet at index position is set then add value to the duplicates list.

Step 4: Else, set the bit at index position in bitSet. Return the duplicates list containing the duplicate elements.

Filename: FindDuplicatesUsingBitArray.java

import java.util.BitSet;

import java.util.ArrayList;

import java.util.List;

public class FindDuplicatesUsingBitArray {

    public static List<Integer> findAndStoreDuplicates(int[] arr) {

        BitSet bitSet = new BitSet();

        List<Integer> duplicates = new ArrayList<>();

        for (int value : arr) {

            int position = value - 1;  // Adjust to 0-based index

            if (bitSet.get(position)) {

                // Duplicate found, add to the output list

                duplicates.add(value);

            } else {

                // Set the bit corresponding to the current element

                bitSet.set(position);

            }

        }

        return duplicates;

    }

    public static void main(String[] args) {

        int[] arr = {1, 2, 3, 4, 5, 2, 7, 8, 3, 9, 10, 1};

        List<Integer> duplicates = findAndStoreDuplicates(arr);

        System.out.println("Duplicates in the array: " + duplicates);

    }

}

Output:

Duplicates in the array: [2, 3, 1]

Time Complexity

The time complexity of the Set or Bit Array approach for finding duplicates in an array is O(N), where N is the number of elements in the array. This linear time complexity arises from iterating through the array once and performing constant-time operations for each element.

Space Complexity

The space complexity of the Set or Bit Array approach is O(R), where R is the range of non-negative integers in the array. This complexity arises from the use of a BitSet or boolean array to represent the presence or absence of elements.

Approach 2: Optimized Bit Array Approach

The Optimized Bit Array approach for finding duplicates in an array introduces a more memory-efficient strategy by using an array, BIT_ARRAY, to represent the presence or absence of elements. Each index of BIT_ARRAY corresponds to 32 elements, allowing for the efficient addressing of a larger range of values.

Algorithm

Step 1: Create an integer array named BIT_ARRAY with a size of 1000, initialized to 0. Create an empty list named duplicates to store duplicate elements.

Step 2: Iterate through the array, for each element ARR[i] in the input array arr, Calculate the index in BIT_ARRAY corresponding to the address of ARR[i] (index = ARR[i] / 32).

Step 3: Find the bit within the index (BIT_NO) representing the address of ARR[i] (BIT_NO = ARR[i] & 31).

Step 4: Check for duplicates, retrieve the value at BIT_ARRAY[2^BIT_NO] and check if it is equal to 1. If true, the current element is a duplicate, so append ARR[i] to duplicates.

Step 5: If false, set BIT_ARRAY[2^BIT_NO] to 1. Return the list duplicates containing the duplicate elements.

Filename: FindDuplicatesUsingOptimizedBitArray.java

import java.util.ArrayList;

import java.util.List;

public class FindDuplicatesUsingOptimizedBitArray {

    public static List<Integer> findAndStoreDuplicates(int[] arr) {

        int[] BIT_ARRAY = new int[1000];

        List<Integer> duplicates = new ArrayList<>();

        for (int i = 0; i < arr.length; i++) {

            int index = arr[i] / 32;

            int bitNo = arr[i] & 31;

            int val = BIT_ARRAY[index] & (1 << bitNo);

            if (val != 0) {

                // Duplicate found, add to the output list

                duplicates.add(arr[i]);

            } else {

                // Set the bit corresponding to the current element

                BIT_ARRAY[index] |= (1 << bitNo);

            }

        }

        return duplicates;

    }

    public static void main(String[] args) {

        int[] arr = {1, 2, 3, 4, 5, 2, 7, 8, 3, 9, 10, 1};

        List<Integer> duplicates = findAndStoreDuplicates(arr);

        System.out.println("Duplicates in the array: " + duplicates);

    }

}

Output:

Duplicates in the array: [2, 3, 1]

Time Complexity

The time complexity of the Optimized Set or Bit Array approach for finding duplicates in an array is O(N), where N is the number of elements in the array. This linear time complexity arises from iterating through the array once, and the operations performed for each element are constant time.

Space Complexity

The space complexity is O(1), which means it is constant. The primary data structure used is the BIT_ARRAY, which has a fixed size of 1000. The size of this array is independent of the size of the input array arr. As a result, the space required remains constant, making the algorithm space-efficient.

Approach 3: BitSet and HashSet

This approach involves maintaining a BitSet (or Bit Array) to mark the presence or absence of elements in the array. The BitSet is used to identify and store unique duplicates efficiently. The algorithm iterates through the array, checks for duplicates using the BitSet, and stores unique duplicates in a HashSet. Finally, the unique duplicates are printed.

Algorithm

Step 1: Initialize Bit Array: Create an instance of the DuplicateFinder class, which initializes a bit array (bitArray) with a size based on the given array's maximum possible value.

Step 2: Create a HashSet named Duplicates to store unique duplicates found during the iteration.

Step 3: Iterate Through the Input Array: For each element inputArray[i] in the given input array: Calculate currentNumber as inputArray[i] - 1 to adjust for 1-based indexing.

Step 4: Check if the corresponding bit for currentNumber is already set in the bit array using isBitSet method. If set, add currentNumber + 1 to the Duplicates HashSet. If not set, set the corresponding bit using the setBit method.

Step 5: Print the contents of the Duplicates HashSet, which contains unique duplicate values. In the main method, create an example input array (inputArray). Call the findAndPrintDuplicates method with the input array.

Filename: DuplicateFinder.java

import java.util.HashSet;

public class DuplicateFinder {

    private int[] bitArray;

    public DuplicateFinder(int arraySize) {

        bitArray = new int[(arraySize >> 5) + 1];

    }

    public boolean isBitSet(int position) {

        int index = (position >> 5);

        int bitNumber = (position & 0x1F);

        return (bitArray[index] & (1 << bitNumber)) != 0;

    }

    public void setBit(int position) {

        int index = (position >> 5);

        int bitNumber = (position & 0x1F);

        bitArray[index] |= (1 << bitNumber);

    }

    public static void findAndPrintDuplicates(int[] inputArray) {

        DuplicateFinder duplicateFinder = new DuplicateFinder(320000);

        HashSet<Integer> Duplicates = new HashSet<>();

        for (int i = 0; i < inputArray.length; i++) {

            int currentNumber = inputArray[i] - 1;

            if (duplicateFinder.isBitSet(currentNumber)) {

                // Duplicate found, add to the set

                     Duplicates.add(currentNumber + 1);

            } else {

                duplicateFinder.setBit(currentNumber);

            }

        }

        // Print the duplicates

        System.out.print("Duplicates in the array: ");

        for (int duplicate : Duplicates) {

            System.out.print(duplicate + " ");

        }

    }

    public static void main(String[] args) {

        int[] inputArray = {1, 6, 4, 10, 9, 5, 4, 5, 6, 4};

        findAndPrintDuplicates(inputArray);

    }

}

Output:

Duplicates in the array: 4 5 6

Time Complexity

The time complexity of the provided algorithm is O(N), where N is the size of the input array. The for loop iterates through each element in the input array once, performing constant-time operations within the loop. The operations inside the loop involve bitwise operations (isBitSet, setBit), which have constant time complexity.

Space Complexity

The space complexity is primarily determined by the size of the bit array. The space complexity is O(M), where M is the maximum possible value in the input array. The HashSet Duplicates also contributes to the space complexity, but its size is determined by the number of unique duplicates, which is usually smaller than the size of the input array.

Conclusion

In conclusion, the Bit Array approach offers an elegant solution to the problem of finding duplicates, particularly when memory efficiency is crucial. It strikes a balance between time complexity and space complexity, making it well-suited for scenarios where a compact representation of element presence is required.

← Prev Next →

Java Tutorial Index

Java Loops

Java Programs

Java Sorting

Java OOPs Concepts

Java Strings

Java Exceptions

Garbage Collection

Multithreading

Java IO

Serialization

Networking

AWT

Swing

Java Collections

Java Generics

Java Annotations

Java JDBC

Java Differences

How to

Java 8 Features

Java 9 Features

Java 12

Java 13

Java 14

Java 15

Java 16

Java 17

Java Math Methods

Java String Methods

Java Conversion

Java Keywords

Java Problems

Java Questions

Java Interview Questions

Misc

Find Duplicates Using Bit Array in Java

Approach 1: Simple Bit Array Approach

Algorithm

Approach 2: Optimized Bit Array Approach

Algorithm

Approach 3: BitSet and HashSet

Algorithm

Conclusion