How to Change the Names of Columns in Python
Introduction
To work with large amounts of data in Python, we need a tool. A tool available for this in Python is Pandas, an open-source library that makes it easy to load, analyze, and manipulate large datasets.
With Pandas, we can create a Data Frame. A Data Frame is a table that holds the contents of a dataset in labeled rows and columns, which turns a raw, hard-to-read dataset into a neatly formatted table. In this article, let us learn how to change the names of the columns in a Data Frame.
Creation of Data Frames
To create a Data Frame, follow these steps:
1. Install Pandas (skip this step if it is already installed)
The command to install Pandas is
pip install pandas
or
python -m pip install pandas
2. Open IDLE or any other Python editor or interpreter
3. The first thing we have to do is import Pandas
We can import Pandas into the program with the command import pandas as pd
4. Create a dictionary, list, or tuple to build the Data Frame from. If we already have a dataset in a CSV file, we can load it into the program with a command like the following (read_csv already returns a Data Frame)
ds = pd.read_csv('File-Name')
5. Now create the Data Frame with a command like the following, where data is the dictionary, list, or tuple from step 4
df = pd.DataFrame(data, columns=['Col_1', 'Col_2', ...])
Example:
'''
1.) Installed the pandas library
2.) Opened a Python editor
3.) Import the pandas library
'''
import pandas as pd
# 4.) Create the dataset as a dictionary
DataSet = {
    'Players': ['Root', 'Smith', 'Kohli', 'Kane'],
    'Matches': [24, 11, 15, 6],
    'Runs': [2595, 779, 756, 491],
    'AVG': [60.25, 45.82, 29.07, 49.10],
    '100s': [11, 1, 0, 1]
}
# 5.) Create the Data Frame
df = pd.DataFrame(DataSet)
print(df)
Output:
   Players  Matches  Runs    AVG  100s
0     Root       24  2595  60.25    11
1    Smith       11   779  45.82     1
2    Kohli       15   756  29.07     0
3     Kane        6   491  49.10     1
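Step 5 also mentions building a Data Frame from a list or tuple together with a columns argument. As a minimal sketch of that form (the names rows and df_rows are only illustrative and do not appear elsewhere in this article), the same table can be built from a list of tuples:

```python
import pandas as pd

# Each tuple is one row; the names in `columns` label the tuple positions.
rows = [
    ('Root', 24, 2595, 60.25, 11),
    ('Smith', 11, 779, 45.82, 1),
    ('Kohli', 15, 756, 29.07, 0),
    ('Kane', 6, 491, 49.10, 1),
]
df_rows = pd.DataFrame(rows, columns=['Players', 'Matches', 'Runs', 'AVG', '100s'])

print(df_rows.columns.tolist())
```

With a dictionary, the keys become the column names, so the columns argument is only needed when the data itself carries no names.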
Now we have created a dataset and a Data Frame using Pandas. Next, let us see how to change the names of the columns using the built-in features of a Data Frame.
The ways to change the names of columns are:
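1. Changing the Names of Columns Using the columns Attribute
This is one way to change the names of the columns. columns is a built-in attribute of a Data Frame. Assigning a list of new names to it replaces all the column names at once, so the list must contain exactly one name per column.
Syntax:
DataFrame_name.columns = ['col1_name', 'col2_name', ...]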
Example:
Let us apply this to the Data Frame created previously.
print(df)
df.columns = ['Sportsmen', 'Matches_Played', 'Runs_Scored', 'Average', 'Centuries']
print(df)
Output:
   Players  Matches  Runs    AVG  100s
0     Root       24  2595  60.25    11
1    Smith       11   779  45.82     1
2    Kohli       15   756  29.07     0
3     Kane        6   491  49.10     1
   Sportsmen  Matches_Played  Runs_Scored  Average  Centuries
0       Root              24         2595    60.25         11
1      Smith              11          779    45.82          1
2      Kohli              15          756    29.07          0
3       Kane               6          491    49.10          1
Explanation:
The above output shows the change in the names of the columns.
Here,
Players -> Sportsmen
Matches -> Matches_Played
Runs -> Runs_Scored
AVG -> Average
100s -> Centuries
This is how the columns attribute is used to change the names of the columns.
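One detail worth checking when assigning to columns is the length of the list. The sketch below uses a small illustrative Data Frame (df_demo and error_message are hypothetical names, not part of the example above) to show that pandas raises a ValueError when the number of new names does not match the number of columns:

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Players': ['Root', 'Smith'],
    'Matches': [24, 11],
    'Runs': [2595, 779],
})

error_message = None
try:
    # Only two names for three columns: pandas rejects the assignment.
    df_demo.columns = ['Sportsmen', 'Matches_Played']
except ValueError as err:
    error_message = str(err)
    print('Assignment failed:', error_message)

# The original names are still in place after the failed assignment.
print(df_demo.columns.tolist())
```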
2. Rename Method
This is also a built-in method of a Data Frame. It changes the names of the columns listed in a dictionary that maps each old name to its new name.
Syntax:
DataFrame_name = DataFrame_name.rename(columns={'old column name1': 'new column name1', 'old column name2': 'new column name2'})
Example:
We again use the Data Frame created in the first example, with its original column names.
print(df)
df = df.rename(columns={'AVG': 'Average', '100s': 'Centuries'})
print(df)
Output:
   Players  Matches  Runs    AVG  100s
0     Root       24  2595  60.25    11
1    Smith       11   779  45.82     1
2    Kohli       15   756  29.07     0
3     Kane        6   491  49.10     1
   Players  Matches  Runs  Average  Centuries
0     Root       24  2595    60.25         11
1    Smith       11   779    45.82          1
2    Kohli       15   756    29.07          0
3     Kane        6   491    49.10          1
This method is more convenient than the previous one. With the columns attribute, we have to supply a new name for every column.
With the rename method, we can change only the columns we specify and leave the rest untouched.
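A few further behaviors of rename can be sketched as follows. The Data Frame df_demo and the column name 'Wickets' are made up for illustration. The sketch shows that keys not found among the columns are ignored by default, that errors='raise' turns this into a KeyError, and that a function can be passed in place of a dictionary:

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Players': ['Root', 'Smith'],
    'AVG': [60.25, 45.82],
    '100s': [11, 1],
})

# Keys that do not match any column ('Wickets' here) are ignored by default.
renamed = df_demo.rename(columns={'AVG': 'Average', 'Wickets': 'Wkts'})
print(renamed.columns.tolist())

# With errors='raise', an unknown key raises a KeyError instead.
unknown_key_raised = False
try:
    df_demo.rename(columns={'Wickets': 'Wkts'}, errors='raise')
except KeyError as err:
    unknown_key_raised = True
    print('Unknown column:', err)

# A function is applied to every column name.
lowered = df_demo.rename(columns=str.lower)
print(lowered.columns.tolist())
```

Because rename returns a new Data Frame, df_demo itself keeps its original column names unless the result is assigned back to it.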
3. Using the Indexing Method
In this method, we access a column name directly by its position and change it with the help of the assignment operator.
Syntax:
DataFrame_name.columns.values[index_value] = 'new column name'
Example:
print(df)
df.columns.values[3] = 'Average'
print(df)
Output:
   Players  Matches  Runs    AVG  100s
0     Root       24  2595  60.25    11
1    Smith       11   779  45.82     1
2    Kohli       15   756  29.07     0
3     Kane        6   491  49.10     1
   Players  Matches  Runs  Average  100s
0     Root       24  2595    60.25    11
1    Smith       11   779    45.82     1
2    Kohli       15   756    29.07     0
3     Kane        6   491    49.10     1
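Writing into columns.values modifies the array behind the column index in place, while pandas documents its Index objects as immutable, so this approach may not behave the same way in every pandas version. A more cautious sketch of renaming by position, which is an alternative and not the method shown above, copies the names into a list, changes one entry, and assigns the list back (df_demo and names are illustrative):

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Players': ['Root', 'Smith'],
    'Matches': [24, 11],
    'AVG': [60.25, 45.82],
})

# Copy the current names into a plain Python list.
names = df_demo.columns.tolist()

# Change the name at position 2, then assign the whole list back.
names[2] = 'Average'
df_demo.columns = names

# The column can now be selected by its new name.
print(df_demo['Average'].tolist())
```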
These are the ways in which we can change the names of columns in a Pandas Data Frame in Python.