Convert Float to Int in Python using Pandas
Introduction
To work with large amounts of data in Python, we need a suitable tool, and pandas is a popular choice. pandas is an open-source library that makes it easy to load and analyze large datasets.
With pandas we can create a DataFrame. A DataFrame is a table that holds the contents of a dataset, which lets us view raw, hard-to-read data as a neatly organized table. Let us now learn how to convert the data types of the columns in such a dataset.
Creation of DataFrames
Creating a DataFrame takes a few steps:
- Install pandas (skip this step if it is already installed)
The command to install pandas is:
pip install pandas
or
python -m pip install pandas
- Open IDLE or any other Python editor
- Import pandas
We can import pandas into the program with the statement import pandas as pd
- Create a dictionary, list, or tuple that holds the data. If the data already lives in a file, load it into the program with a command like the following (read_csv already returns a DataFrame, so the next step is not needed in that case):
ds=pd.read_csv('File-Name')
- Build the DataFrame with a command like:
df=pd.DataFrame(dict_name/tuple_name/list_name,columns=['Col_1','Col_2',…])
Example:
# step 3: import pandas
import pandas as pd
# step 4: create a dictionary that holds the data
DataSet={
    'players':['ROOT','PUJARA','ROHIT'],
    'performance':[564,227,368],
    'average':[94.00,32.43,52.57]
}
# step 5: create the DataFrame
df=pd.DataFrame(DataSet)
print(df)
print(df.dtypes)
Output:
   players  performance  average
0     ROOT          564    94.00
1   PUJARA          227    32.43
2    ROHIT          368    52.57
players         object
performance      int64
average        float64
dtype: object
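The CSV route mentioned in the steps can be sketched in the same way. This is only a minimal sketch: the CSV text is made up to mirror the dictionary above, and it is read from an in-memory io.StringIO buffer so that the example runs without a file on disk. With a real dataset, the file path would be passed to read_csv instead.

```python
import io
import pandas as pd

# Hypothetical CSV content that mirrors the dictionary example
csv_text = """players,performance,average
ROOT,564,94.00
PUJARA,227,32.43
ROHIT,368,52.57
"""

# read_csv returns a DataFrame directly, so no pd.DataFrame() call is needed
ds = pd.read_csv(io.StringIO(csv_text))
print(ds)
print(ds.dtypes)
```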
We have now seen how to create a DataFrame in Python using pandas. Because the column names are already given as the keys of the dictionary, there is no need to pass them separately when building the DataFrame.
The next task is to convert the data types of the columns in this dataset.
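When the data is a list of tuples rather than a dictionary, there are no keys to take the column names from, so they are passed through the columns parameter. A minimal sketch, reusing the same values (the variable name rows is only illustrative):

```python
import pandas as pd

# Each tuple is one row; the tuples carry no column names
rows = [
    ('ROOT', 564, 94.00),
    ('PUJARA', 227, 32.43),
    ('ROHIT', 368, 52.57),
]

# columns= supplies the column names as a list of strings
df = pd.DataFrame(rows, columns=['players', 'performance', 'average'])
print(df)
print(df.dtypes)
```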
Our first conversion is from float values to integer values.
- Explicit Type Conversion:
The first approach is explicit type conversion: we loop over the column and convert each value with int(). This is the way a beginner might try to change the data type of the data.
Let us understand this method with the help of an example.
Example:
# step 3: import pandas
import pandas as pd
# step 4: create a dictionary that holds the data
DataSet={
    'players':['ROOT','PUJARA','ROHIT'],
    'performance':[564,227,368],
    'average':[94.00,32.43,52.57]
}
# step 5: create the DataFrame
df=pd.DataFrame(DataSet)
print(df)
print('average datatype is',df['average'].dtypes)
# convert each value one at a time (chained indexing, kept here on purpose)
for i in range(len(df['average'])):
    df['average'][i]=int(df['average'][i])
print(df['average'].values)
print('average datatype is',df['average'].dtypes)
Output:
   players  performance  average
0     ROOT          564    94.00
1   PUJARA          227    32.43
2    ROHIT          368    52.57
average datatype is float64
Warning (from warnings module):
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
[94. 32. 52.]
average datatype is float64
But this method is not suitable and does not give the result we want. The values [94.00,32.43,52.57] have been changed into [94. 32. 52.]: only the digits after the decimal point have been dropped, while the data type remains float64, as before. The column itself is still a float column, so each integer assigned into it is stored as a float again.
So do not use this method to change data types in a DataFrame. The warning does not mean that a DataFrame is read-only. It appears because df['average'][i] is chained indexing: df['average'] may return a copy of the column, so the assignment may change that copy instead of the original DataFrame.
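To see that the unchanged data type is not caused by the warning, the same loop can be written with df.loc, which assigns into the DataFrame itself rather than through chained indexing. This is a sketch for illustration only, not a recommended way to convert a column:

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['ROOT', 'PUJARA', 'ROHIT'],
    'average': [94.00, 32.43, 52.57],
})

# df.loc[row, column] assigns in place, avoiding chained indexing
for i in range(len(df)):
    df.loc[i, 'average'] = int(df.loc[i, 'average'])

print(df['average'].values)
# The column is still float64: each int is stored back as a float
print('average datatype is', df['average'].dtype)
```

The values are truncated, but the column keeps its float64 type, which is why a whole-column conversion is needed.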
- astype() inbuilt Method:
This method changes the data type from float to int directly in the DataFrame. astype() is a built-in method of pandas Series and DataFrame objects, and it can convert float to int, int to float, and so on. When it converts float to int, the fractional part is discarded rather than rounded.
Syntax:
This is the syntax for astype() method:
DataFrame_name['column_name']=DataFrame_name['column_name'].astype(data_type)
Example:
# df is the DataFrame created in the first example
print(df)
print('average datatype is',df['average'].dtypes)
df['average']=df['average'].astype(int)
print(df.dtypes)
print(df)
Output:
   players  performance  average
0     ROOT          564    94.00
1   PUJARA          227    32.43
2    ROHIT          368    52.57
average datatype is float64
players         object
performance      int64
average          int32
dtype: object
   players  performance  average
0     ROOT          564       94
1   PUJARA          227       32
2    ROHIT          368       52
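Note that 52.57 became 52, not 53: astype(int) truncates toward zero. If the nearest whole number is wanted instead, the values can be rounded first. A minimal sketch, where the extra value -1.7 is made up to show how negative numbers behave:

```python
import pandas as pd

# The three averages from the example plus one hypothetical negative value
averages = pd.Series([94.00, 32.43, 52.57, -1.7])

# astype(int) drops the fractional part (truncates toward zero)
truncated = averages.astype(int)

# round() first if the nearest whole number is wanted instead
rounded = averages.round().astype(int)

print(truncated.tolist())
print(rounded.tolist())
```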
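Note that astype(int) uses the platform's default integer size, which is why this example reports int32; on many systems the same code reports int64. The astype() method can also be applied to several columns at once.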
Syntax (for multiple columns):
DataFrame_name=DataFrame_name.astype(
    {'column1':datatype1,'column2':datatype2,…}
)
Example:
# df is the DataFrame created in the first example
print(df)
df=df.astype({'average':int,'performance':float})
print(df.dtypes)
print(df)
Output:
   players  performance  average
0     ROOT          564    94.00
1   PUJARA          227    32.43
2    ROHIT          368    52.57
players         object
performance    float64
average          int32
dtype: object
   players  performance  average
0     ROOT        564.0       94
1   PUJARA        227.0       32
2    ROHIT        368.0       52
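One case the examples above do not cover is a float column with missing values. A plain astype(int) cannot represent NaN and raises an error, while the nullable 'Int64' dtype (capital I) keeps the gap as <NA>. The sketch below uses a made-up column with one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
averages = pd.Series([94.0, np.nan, 52.0])

try:
    averages.astype(int)
    failed = False
except ValueError as exc:
    # NaN has no integer representation, so the conversion is refused
    failed = True
    print('astype(int) failed:', exc)

# The nullable 'Int64' dtype keeps the missing value as <NA>
nullable = averages.astype('Int64')
print(nullable)
```

This works here because the remaining values are already whole numbers. For values with a fractional part, rounding first (for example with round()) is likely to be needed before converting to 'Int64'.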
Conclusion:
This is how the data type of a column can be changed from float to int inside a DataFrame using only pandas. Prefer astype() over converting the values one at a time in a loop.