How to check NaN values in pandas

Pandas is one of the most widely used Python libraries for working with data sets. It provides functions for analyzing, cleaning, exploring, and manipulating data. The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

Python with Pandas is used in many academic and commercial fields, including finance, economics, statistics, and analytics.

Let us see how to check NaN values in Pandas.

NaN stands for Not a Number and represents a missing value in the data. It is a special floating-point value, and NaN itself cannot be converted to an integer. NaN values that go unnoticed can lead to misleading results in data analysis, so it is important to check for them and handle them before relying on the output. Two methods are commonly used to check for NaN values:

isnull().values.any() method
isnull().sum() method

Let us now look at each of these methods.
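Before going through the two methods, the short sketch below illustrates why a dedicated check such as isnull() is needed at all. It is only an illustration: the variable names are made up for this example, and it assumes a standard pandas and NumPy installation.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is stored as a float
is_float = isinstance(value, float)

# NaN never compares equal to anything, not even to itself,
# so "value == np.nan" cannot be used to find it
equals_itself = value == value

# Converting NaN to an integer raises a ValueError
try:
    int(value)
    converts_to_int = True
except ValueError:
    converts_to_int = False

# pd.isnull() (also available as pd.isna()) detects it reliably
detected = bool(pd.isnull(value))

print(is_float, equals_itself, converts_to_int, detected)
```

Because the equality test is always False for NaN, the methods below rely on isnull() instead of a comparison.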

Method 1: isnull().values.any() method

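In this method, isnull() marks each value as True if it is NaN and False otherwise, and .values.any() then reduces that to a single True or False telling us whether at least one NaN is present. We import NumPy because np.nan is used to put the missing values into the example data. Let us look at an example below.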

Example

# importing required libraries
import pandas as pd
import numpy as np
 
 
num = {'Int': [15, 25, 35, 60, 55, np.nan,
                    75, np.nan, 95, 120, np.nan]}
 
# Creating the data frame
df = pd.DataFrame(num, columns=['Int'])
 
# using the method
check_nan = df['Int'].isnull().values.any()
 
# printing the result
print(check_nan)

Output

True

The check above only tells us whether a NaN is present. To get the exact positions of the NaN values, we use isnull() on its own.

Example

# importing required libraries
import pandas as pd
import numpy as np
 
 
num = {'Int': [15, 25, 35, 60, 55, np.nan,
                    75, np.nan, 95, 120, np.nan]}
 
# Creating the data frame
df = pd.DataFrame(num, columns=['Int'])
 
# using the method
check_nan = df['Int'].isnull()
 
# printing the result
print(check_nan)

Output:

0     False
1     False
2     False
3     False
4     False
5      True
6     False
7      True
8     False
9     False
10     True
Name: Int, dtype: bool

Here the code is the same as before, except that df['Int'].isnull().values.any() is replaced with df['Int'].isnull(), which returns True at every position that holds a NaN.

In the first example above, we imported the pandas and NumPy libraries. Next, we created a dictionary called "num" with one column, 'Int', holding numbers and NaN values, and built a data frame from it. We then applied isnull().values.any() to the column to check for NaN values, stored the result in check_nan, and printed it.
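When only the positions are of interest, the True/False column returned by isnull() can be used as a mask. The following is a minimal sketch of that idea using the same data as above; the names mask, nan_positions, and nan_rows are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Int': [15, 25, 35, 60, 55, np.nan,
                           75, np.nan, 95, 120, np.nan]})

# Boolean mask: True where the value is NaN
mask = df['Int'].isnull()

# Index labels of the rows that hold NaN
nan_positions = df.index[mask].tolist()
print(nan_positions)

# The same mask can also select the affected rows
nan_rows = df[mask]
print(len(nan_rows))
```

With this data the NaN values sit at index 5, 7, and 10, which matches the True entries in the isnull() output.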

Method 2: isnull().sum() method

In this method, we use isnull().sum() to get the number of NaN values in the given data set. isnull() produces True for each NaN, and sum() counts those True values. As before, NumPy is needed only to supply np.nan for the example data. Let us look at an example.

Example

# importing required libraries
import pandas as pd
import numpy as np
 
# creating a data set
num = {'Int': [15, 25, 35, 48, 50, np.nan,
                    75, np.nan, 98, 120, np.nan]}
 
# Creating the data frame
df = pd.DataFrame(num, columns=['Int'])
 
# applying the required  method
count_nan = df['Int'].isnull().sum()
 
# printing the number of values present
# in the column
print('Number of NaN values present: ' + str(count_nan))

Output:

Number of NaN values present: 3

We first imported the required libraries, pandas and NumPy. Then we created a data set and built a data frame from it. Next we applied isnull().sum() to the column, and finally we printed the result. The input data set contains 3 NaN values, so the code prints "Number of NaN values present: 3".
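The same calls also work on a whole data frame rather than a single column. The sketch below assumes a small two-column data frame invented for this illustration (the 'Label' column is not part of the examples above) and shows the count per column, the overall count, and the any() check side by side.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame with missing values in two columns
df = pd.DataFrame({
    'Int': [15, np.nan, 35, np.nan],
    'Label': ['a', 'b', None, 'd'],
})

# Number of missing values in each column
per_column = df.isnull().sum()
print(per_column)

# Total number of missing values in the whole data frame
total = int(df.isnull().sum().sum())
print(total)

# True if the data frame contains at least one missing value
any_nan = bool(df.isnull().values.any())
print(any_nan)
```

Note that isnull() also treats None as missing, so the None in the 'Label' column is counted along with the NaN values.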

Conclusion

These are two simple ways to check for NaN values in a data set. The first method, isnull().values.any(), tells us only whether any NaN values are present, while isnull() on its own shows the positions of those values. The second method, isnull().sum(), gives the number of NaN values in the data set. Both are among the easiest ways to check for NaN values in pandas.