Pandas Commands

In this article, we will discuss Pandas commands. Before looking at them, it helps to know what the pandas library in Python is.

What is Pandas in Python?

Pandas is a Python library typically used for data analysis and manipulation. Its central data structure is the DataFrame.

Pandas is used to manage tabular data. In particular, it offers data structures and tools for working with numerical tables and time series. The name "Pandas" originates from "panel data". Pandas is widely used for data manipulation, for visualization, and for preparing data for machine learning models. It provides two main data structures:

  1. Dataframes
  2. Series

How to Install Pandas

Pandas does not come by default when you install Python. To install pandas, open your command prompt and run the following command:

pip install pandas

Or

If you are using Anaconda, open the Anaconda prompt and run the command given below:

conda install pandas

To install a specific pandas version:

conda install pandas=0.20.3

NOTE: At the time of writing, Pandas supported Python versions 3.8, 3.9, and 3.10. Check the pandas documentation for the Python versions supported by the release you install.

Getting Started with Pandas

After installing Pandas, import it at the top of your code. Keep the following important points about pandas in mind:

  • Pandas uses a DataFrame to store a table. A DataFrame is a 2-dimensional data structure, and its size is mutable.
  • Every column in a DataFrame is a Series.
  • A DataFrame or Series can be used to perform operations by applying methods.
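The three points above can be checked with a short sketch. The table and its column names (planet, radius_km) are made up for this illustration and are not part of the article's CSV files.

```python
import pandas as pd

# Illustrative table; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "planet": ["mercury", "venus", "earth"],
    "radius_km": [2439.7, 6051.8, 6371.0],
})

column = df["radius_km"]        # selecting one column gives a Series
print(type(df).__name__)        # DataFrame
print(type(column).__name__)    # Series
print(column.max())             # methods are applied directly to the Series
```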

Some Useful Commands of Pandas

To work with tabular data, pandas provides many functions. Let’s have a look at some of those commands.

1. Commands to read tabular data

Pandas supports many file formats, such as CSV files, Excel files, JSON files, and SQL databases. It has a different function for each format, and all of these read functions have the prefix "read_". Each read function returns a DataFrame.

General syntax for reading tabular data:

Reading a csv file:

my_data = pandas.read_csv("path of file")

Reading an Excel file:

my_data = pandas.read_excel("path of file", sheet_name="name of sheet")

Reading a JSON file:

my_data = pandas.read_json("path of file")

Example: Let’s understand it by taking an example.

Suppose we have tabular data about the solar system stored in CSV file format. Suppose the file is named solarSystem.csv and is in the same directory as our script. Let’s try to read it.

# code goes from here
import pandas as pd


my_data = pd.read_csv("solarSystem.csv")
print(my_data)

Output:

    planet      mass(kg)  distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  1.99 * 10^10                        0.0                      0.0
1  mercury    3.3 * 10^3                       57.9                     70.0
2    venus      4.9*10^4                      108.0                    108.9
3    earth       6* 10^4                      150.0                    152.1
4     mars      6.4*10^3                      228.0                    249.0
5  jupiter      1.9*10^7                      779.0                    817.0
6   saturn      5.7*10^6                     1433.0                   1500.0
7   uranus      8.7*10^5                     2867.0                   3000.0
8  neptune     1.02*10^6                     4500.0                   4540.0
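If you do not have a CSV file at hand, read_csv can also read from an in-memory text buffer. The following is a minimal sketch, assuming a small made-up table with the columns planet and radius_km.

```python
import io

import pandas as pd

# A tiny CSV held in a string; the contents are illustrative only.
csv_text = """planet,radius_km
mercury,2439.7
venus,6051.8
earth,6371.0
"""

# read_csv accepts a file path or any file-like object.
my_data = pd.read_csv(io.StringIO(csv_text))
print(my_data)
print(my_data.dtypes)   # radius_km is parsed as a floating-point column
```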

2.    Commands to write data in the file

Using these commands, we can write a DataFrame to an Excel or CSV file.

To write to an Excel file, use the following command:

DataFrame.to_excel("path of the file/file_name.xlsx")

Note: to write in Excel format, you must install the openpyxl module with pip.

To write to a CSV file, use the following command:

DataFrame.to_csv("path of the file/file_name.csv")

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


data = pd.DataFrame([
        ["pid 1", 3, 150],
        ["pid 2", 3, 500],
        ["pid 3", 2, 650],
        ["pid 4", 2, 150]
    ]
    , columns=["product_id", "category", "price"]
)
data.to_excel("products.xlsx", sheet_name="sheet1", index=False)

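Output:

Running this code creates the file products.xlsx in the current directory. It contains the table above on a sheet named sheet1.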
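The CSV writer can be tried without touching the disk: when to_csv is called without a path, it returns the CSV text as a string. A minimal sketch, reusing two rows of the product table above:

```python
import io

import pandas as pd

data = pd.DataFrame({"product_id": ["pid 1", "pid 2"], "price": [150, 500]})

# With no path given, to_csv returns the CSV content as a string.
csv_text = data.to_csv(index=False)
print(csv_text)

# Reading the text back recovers the same table.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```

Passing index=False leaves the row index out of the file, which is usually what you want when the index is just 0, 1, 2, and so on.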

3.    Commands for getting data from Table

To select a particular column:

column_1 = df["column_name"]

Or

column_1 = df.get("column_name")

Example: Let’s understand it by taking an example.

Let’s try to fetch the names of all planets from the above Table.

# code goes from here
#import pandas
import pandas as pd


my_data = pd.read_csv("solarSystem.csv")


planets = my_data["planet"]
print(planets)

Output:

0        sun
1    mercury
2      venus
3      earth
4       mars
5    jupiter
6     saturn
7     uranus
8    neptune
Name: planet, dtype: object

To select a particular row:

rows = df[df["column_name"] > value]   # any Boolean condition on the rows

Example: Let’s understand it by taking an example.

Let’s select the rows with a distance greater than 300 from the table.

# code goes from here
import pandas as pd


my_data = pd.read_csv("solarSystem.csv")


planets = my_data[my_data["distance from sun(10^6 m)"] > 300]
print(planets)

Output:

       planet   mass(kg)      distance from sun(10^6 m)  maximum distance(10^6m)
5      jupiter   1.9*10^7                      779.0                    817.0
6      saturn   5.7*10^6                     1433.0                   1500.0
7      uranus   8.7*10^5                     2867.0                   3000.0
8     neptune  1.02*10^6                   4500.0                   4540.0

To select particular rows and columns:

new_data = df.loc[df["column_name"] > value, ["column_name"]]

Or

new_data = df.iloc[start_row : end_row, start_column : end_column]

Example:

Let’s select the name and mass of every planet whose distance from the sun is greater than 300.

# code goes from here
#import pandas
import pandas as pd


my_data = pd.read_csv("solarSystem.csv")


planets = my_data.loc[my_data["distance from sun(10^6 m)"] > 300, ["planet", "mass(kg)"]]
print(planets)

Output:

    planet   mass(kg)
5  jupiter   1.9*10^7
6   saturn   5.7*10^6
7   uranus   8.7*10^5
8  neptune  1.02*10^6
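The example above uses loc only. The following sketch contrasts loc (selection by condition and column name) with iloc (selection by integer position). The table is a small illustrative one with made-up column names.

```python
import pandas as pd

df = pd.DataFrame({
    "planet": ["mercury", "venus", "earth", "mars"],
    "distance": [57.9, 108.0, 150.0, 228.0],
    "radius_km": [2439.7, 6051.8, 6371.0, 3389.5],
})

# loc: rows chosen by a Boolean condition, columns chosen by name.
by_label = df.loc[df["distance"] > 100, ["planet", "radius_km"]]

# iloc: rows and columns chosen by position; the end of each slice is excluded.
by_position = df.iloc[0:2, 0:2]

print(by_label)
print(by_position)
```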

4.    Transform command in pandas

To modify the values of a DataFrame or Series by applying a function:

df = df.transform(function)

Example: Let’s capitalize the names of planets

# code goes from here
#import pandas
import pandas as pd


my_data = pd.read_csv("solarSystem.csv")


my_data["planet"] = my_data["planet"].transform(lambda x: str(x).capitalize())
print(my_data["planet"])

Output:

0        Sun
1    Mercury
2      Venus
3      Earth
4       Mars
5    Jupiter
6     Saturn
7     Uranus
8    Neptune
Name: planet, dtype: object
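For string columns, the same result can usually be obtained with the vectorised .str accessor instead of a lambda. The sketch below compares both approaches on a small illustrative Series.

```python
import pandas as pd

planets = pd.Series(["sun", "mercury", "venus"], name="planet")

# transform applies the function to every element of the Series.
with_transform = planets.transform(lambda x: str(x).capitalize())

# The .str accessor offers the same operation without a lambda.
with_str = planets.str.capitalize()

print(with_transform.tolist())
print(with_str.tolist())
```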

5.    Command to merge two data frames

The merge command combines two data frames into a single data frame. We can also choose which columns to use as the common key.

We write this command as:

pd.merge(dataframe_1, dataframe_2)

Let’s understand this by implementing an example given below:

Example: Let’s try to merge a new table PlanetRadius with the old planet table.

# code goes from here
#import pandas
import pandas as pd


df1 = pd.read_csv("solarSystem.csv")
df2 = pd.read_csv("PlanetRadius.csv")


new_data = pd.merge(df1, df2)
print(new_data)


Output:

planet      	mass(kg)  ...  maximum distance(10^6m)  radius(km)
0      sun  	1.99 * 10^10  ...                   0.0         	NaN
1  mercury    3.3 * 10^3  ...                     	70.0      	2439.7
2    venus      4.9*10^4  ...                    	108.9     	6051.8
3    earth       6* 10^4  ...                    	152.1      	6371.0
4     mars      6.4*10^3  ...                    	249.0      	3389.5
5  jupiter      1.9*10^7  ...                    	817.0     	69911.0
6   saturn      5.7*10^6  ...                   	1500.0     	58232.0
7   uranus      8.7*10^5  ...                   	3000.0     	25362.0
8  neptune     1.02*10^6  ...                   	4540.0     	24622.0


[9 rows x 5 columns]

Note: By default, merge() joins the two frames on all the columns they have in common. To specify the common column explicitly, pass its name with the on parameter. We write this command as:

pd.merge(dataframe_1, dataframe_2, on="Column_name")

Example: Let’s understand this by implementing an example given below:

# code goes from here
#import pandas
import pandas as pd


df1 = pd.read_csv("solarSystem.csv")
df2 = pd.read_csv("PlanetRadius.csv")


new_data = pd.merge(df1, df2, on="planet")
print(new_data)

Output:

planet      	mass(kg)  ...  maximum distance(10^6m)  radius(km)
0      sun  	1.99 * 10^10  ...                   0.0         	NaN
1  mercury    3.3 * 10^3  ...                     	70.0      	2439.7
2    venus      4.9*10^4  ...                    	108.9     	6051.8
3    earth       6* 10^4  ...                    	152.1      	6371.0
4     mars      6.4*10^3  ...                    	249.0      	3389.5
5  jupiter      1.9*10^7  ...                    	817.0     	69911.0
6   saturn      5.7*10^6  ...                   	1500.0     	58232.0
7   uranus      8.7*10^5  ...                   	3000.0     	25362.0
8  neptune     1.02*10^6  ...                   	4540.0     	24622.0


[9 rows x 5 columns]
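merge() performs an inner join by default, so rows without a match in both frames are dropped. The sketch below, on two small illustrative tables, shows the how and indicator parameters, which control and report which rows are kept.

```python
import pandas as pd

left = pd.DataFrame({"planet": ["earth", "mars", "jupiter"],
                     "distance": [150.0, 228.0, 779.0]})
right = pd.DataFrame({"planet": ["earth", "mars", "venus"],
                      "radius_km": [6371.0, 3389.5, 6051.8]})

# Default (inner) merge: only planets present in both tables.
inner = pd.merge(left, right, on="planet")

# Outer merge keeps every planet; indicator=True adds a "_merge" column
# saying whether a row came from the left table, the right table, or both.
outer = pd.merge(left, right, on="planet", how="outer", indicator=True)

print(inner)
print(outer)
```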

6. Command to join DataFrames

You can combine two data frames into a single DataFrame by joining them on their index. This command is similar to the merge command, with one difference: by default, join matches rows on the index, while merge matches rows on columns.

We write this command as:

dataframe_1.join(dataframe_2)

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd
import numpy as np
df1 = pd.read_csv("solarSystem.csv")
# name the new column "radius"; without columns= it would be labelled 0
radius = pd.DataFrame([np.nan, 2439.7, 6051.8, np.nan, 3389.5, 69911, np.nan, 25362, 24622], columns=["radius"])


df1 = df1.join(radius)
print(df1)

Output:

    planet      mass(kg)  ...  maximum distance(10^6m)   radius
0      sun  	1.99 * 10^10  ...                   0.0      	NaN
1  mercury    	3.3 * 10^3  ...                     	70.0   		2439.7
2    venus      	4.9*10^4  ...                    	108.9   	6051.8
3    earth       	6* 10^4  ...                    	152.1      	NaN
4     mars      	6.4*10^3  ...                    	249.0   	3389.5
5  jupiter      	1.9*10^7  ...                    	817.0  		69911.0
6   saturn      	5.7*10^6  ...                   	1500.0      	NaN
7   uranus      	8.7*10^5  ...                   	3000.0  	25362.0
8  neptune     	1.02*10^6  ...                   	4540.0  	24622.0


[9 rows x 5 columns]

We can use the join command to perform a left, right, inner, or outer join of the DataFrames.

To right-join two DataFrames, we write the command as:

dataframe_1.join(dataframe_2, how='right')
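The difference between a left join (the default) and a right join is easiest to see when the two indexes only partly overlap. A minimal sketch with illustrative data, using planet names as the index:

```python
import pandas as pd

distances = pd.DataFrame({"distance": [150.0, 228.0]}, index=["earth", "mars"])
radii = pd.DataFrame({"radius_km": [3389.5, 69911.0]}, index=["mars", "jupiter"])

# Left join (default): keep the index of the calling DataFrame.
left_join = distances.join(radii)

# Right join: keep the index of the DataFrame passed as the argument.
right_join = distances.join(radii, how="right")

print(left_join)    # earth has no radius, so it gets NaN
print(right_join)   # jupiter has no distance, so it gets NaN
```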

7. Command to Concatenate DataFrames

DataFrames can be concatenated together. You can choose the axis along which to concatenate.

We write this command as:

pd.concat([dataframe_1,dataframe_2])

Example: Let’s understand this by implementing an example given below:

# code  goes from here
import pandas as pd
df = pd.read_csv("solarSystem.csv")


pluto = pd.DataFrame([['pluto', '1.3 * 10^22']], columns=["planet", "mass(kg)"])


new_df = pd.concat([df, pluto])
print(new_df)

Output:

    planet      mass(kg)       distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  	1.99 * 10^10             0.0                      	0.0
1  mercury    	3.3 * 10^3        	57.9                     	70.0
2    venus      	4.9*10^4                 	108.0                    	108.9
3    earth       	6* 10^4                  	150.0                    	152.1
4     mars      	6.4*10^3                  	228.0                    	249.0
5  jupiter      	1.9*10^7                  	779.0                    	817.0
6   saturn      	5.7*10^6                  	1433.0                   	1500.0
7   uranus      	8.7*10^5                  	2867.0                   	3000.0
8  neptune     	1.02*10^6                	        4500.0                   	4540.0
0    pluto   	1.3 * 10^22              	         NaN                      	NaN
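In the output above, the appended row keeps its original index 0, so the index contains a duplicate. The sketch below shows two options that are often useful: ignore_index=True to renumber the rows, and axis=1 to concatenate side by side. The tables are small illustrative ones.

```python
import pandas as pd

a = pd.DataFrame({"planet": ["uranus", "neptune"],
                  "radius_km": [25362.0, 24622.0]})
b = pd.DataFrame({"planet": ["pluto"]})

# Stack rows and renumber the index 0, 1, 2 instead of 0, 1, 0.
stacked = pd.concat([a, b], ignore_index=True)

# axis=1 places the frames next to each other, aligned on the index.
extra = pd.DataFrame({"distance": [2867.0, 4500.0]})
side_by_side = pd.concat([a, extra], axis=1)

print(stacked)
print(side_by_side)
```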

8. Info Command

This command prints the index dtype, the column dtypes, the number of non-null values in each column, and the DataFrame's memory usage.

We write this command as:

DataFrame.info()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd
df = pd.read_csv("solarSystem.csv")
print(df.info())

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9 entries, 0 to 8
Data columns (total 4 columns):
 #   Column                Non-Null Count  	Dtype  
---  ------                     		-------------- 	 -----  
 0   planet                     9 non-null      object 
 1   mass(kg)                   9 non-null      object 
 2   distance from sun(10^6 m) 	 9 non-null      float64
 3   maximum distance(10^6m)   	 9 non-null      float64
dtypes: float64(2), object(2)
memory usage: 416.0+ bytes
None

9. Describe Command

This command gives us a DataFrame's summary statistics. For numerical columns, the descriptive statistics exclude null values and cover the central tendency, the minimum and maximum values, and the spread of the distribution. For categorical columns, the summary includes information such as the count and the frequency of the most common value.

We write this command as:

DataFrame.describe()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd
df = pd.read_csv("solarSystem.csv")
print(df.describe())

Output:

               distance from sun(10^6 m)  	maximum distance(10^6m)
count          	9.000000                 	9.000000
mean            1124.766667              	1159.666667
std             1574.548305              	1601.856057
min             0.000000                 	0.000000
25%             108.000000               	108.900000
50%             228.000000               	249.000000
75%             1433.000000              	1500.000000
max             4500.000000              	4540.000000

10.  Replace Command in Pandas

The replace command in pandas replaces the specified values, wherever they occur in the DataFrame, with a new value.

We write this command as:

#use the inplace keyword and set its value to True for a permanent replacement.
DataFrame.replace([value_1, value_2], numpy.nan, inplace=True)

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd
import numpy as np
df1 = pd.read_csv("solarSystem.csv")
print("before replacing \n")
print(df1)
df1.replace([228, 1433], np.nan, inplace=True)
print("\nafter replacing\n")
print(df1)

Output:

before replacing 


    planet      mass(kg)         distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  	1.99 * 10^10            	0.0                      		0.0
1  mercury    	3.3 * 10^3                	57.9                     		70.0
2    venus      	4.9*10^4                  	108.0                    		108.9
3    earth       	6* 10^4                    	150.0                    		152.1
4     mars      	6.4*10^3                  	228.0                    		249.0
5  jupiter      	1.9*10^7                  	779.0                    		817.0
6   saturn      	5.7*10^6                  	1433.0                   		1500.0
7   uranus      	8.7*10^5                  	2867.0                   		3000.0
8  neptune     	1.02*10^6                	4500.0                   		4540.0


after replacing


    planet      mass(kg)         distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  	1.99 * 10^10         	0.0                      		0.0
1  mercury    	3.3 * 10^3                	57.9                     		70.0
2    venus      	4.9*10^4                  	108.0                    		108.9
3    earth       	6* 10^4                    	150.0                    		152.1
4     mars      	6.4*10^3                  	NaN                    		249.0
5  jupiter      	1.9*10^7                  	779.0                    		817.0
6   saturn      	5.7*10^6                  	NaN                   		1500.0
7   uranus      	8.7*10^5                  	2867.0                   		3000.0
8  neptune     	1.02*10^6                	4500.0                   		4540.0
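Because replace() looks at every column, a value such as 228 would also be replaced if it appeared in another column. One way to limit the replacement to a single column is a nested dictionary of the form {column: {old: new}}. A minimal sketch with illustrative column names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "planet": ["mars", "saturn"],
    "distance": [228.0, 1433.0],
    "max_distance": [249.0, 1500.0],
})

# Replace 228.0 with NaN in the "distance" column only.
# Without inplace=True, replace returns a new DataFrame.
cleaned = df.replace({"distance": {228.0: np.nan}})

print(cleaned)
print(df)   # the original is unchanged
```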

11.  Data Munging in Pandas

Here, we change the format of a specific piece of data. For example, we can convert a “.csv file” into an “.html file”.

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("solarSystem.csv")
df.to_html("solarSystem.html", index=False)

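Output:

Running this code creates the file solarSystem.html in the current directory. It contains the solar system table as an HTML table.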

12.  Command to find Mean, Median and Mode

Using these commands, we can find the mean, median, and mode of each column in a data frame.

If we want to find the mean of each column, then write the command as:

DataFrame.mean()

If we want to find the median of each column, then write the command as:

DataFrame.median()

If we want to find the mode of each column, then write the command as:

DataFrame.mode()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("solarSystem.csv")
mean_distance = df["distance from sun(10^6 m)"].mean()
median_of_distance = df["distance from sun(10^6 m)"].median()
mode_of_distance = df["distance from sun(10^6 m)"].mode()
print(
    f'''
mean of the distance from the sun:
{mean_distance}


median of the distance from the sun:
{median_of_distance}


mode of the distance from the sun:
{mode_of_distance}
'''
)

Output:

mean of the distance from the sun:
1124.7666666666667


median of the distance from the sun:
228.0


mode of the distance from the sun:
0       0.0
1      57.9
2     108.0
3     150.0
4     228.0
5     779.0
6    1433.0
7    2867.0
8    4500.0
Name: distance from sun(10^6 m), dtype: float64
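In the output above, mode() lists every distance because each value occurs exactly once, so all of them tie for most frequent. The sketch below uses the product table from the writing example, where some values repeat. It also passes numeric_only=True, because calling mean() on a whole DataFrame that contains text columns may raise an error in recent pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "product_id": ["pid 1", "pid 2", "pid 3", "pid 4"],
    "category": [3, 3, 2, 2],
    "price": [150, 500, 650, 150],
})

# Restrict the calculation to numeric columns.
print(df.mean(numeric_only=True))

# median returns a single number for a column.
print(df["price"].median())

# mode returns a Series, because several values can tie.
print(df["price"].mode().tolist())      # one most frequent value
print(df["category"].mode().tolist())   # two values tie
```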

13.  Isnull command in Pandas

This command returns a DataFrame of Boolean values with the same shape as the original: True where a value is null and False otherwise.

We write this command as:

DataFrame.isnull()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("PlanetRadius.csv")
print(df.isnull())

Output:

	planet  	radius(km)
0   	False        	True
1   	False       	False
2   	False       	False
3   	False       	False
4   	False       	False
5   	False       	False
6   	False       	False
7   	False       	False
8   	False       	False
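The Boolean table is often combined with other operations. The sketch below counts the null values per column and selects the rows that contain a null, using a small illustrative table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "planet": ["sun", "mercury", "venus"],
    "radius_km": [np.nan, 2439.7, 6051.8],
})

# True counts as 1, so summing gives the number of nulls per column.
null_counts = df.isnull().sum()
print(null_counts)

# Use the Boolean column as a filter to see the affected rows.
rows_with_null = df[df["radius_km"].isnull()]
print(rows_with_null)
```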

14.  Fillna command in Pandas

With the help of this command, we can fill null values with a certain value.

For example, a dataset's NaN values can all be filled with the mean of their column.

We write this command as:

DataFrame.fillna(DataFrame.mean(numeric_only=True), inplace=True)

You can also apply it to a single column.

We write this command for the column as:

DataFrame['column_name'] = DataFrame['column_name'].fillna(DataFrame['column_name'].mean())

Example: Let’s understand this by implementing an example given below:

import pandas as pd


df = pd.read_csv("PlanetRadius.csv")
print("before updating\n")
print(df)
df["radius(km)"] = df["radius(km)"].fillna("unknown")
print("\nafter updating\n")
print(df)

Output:

before updating


    	planet  	radius(km)
0      	sun         	NaN
1  	mercury     	2439.7
2    	venus      	6051.8
3    	earth      	6371.0
4     	mars      	3389.5
5  	jupiter     	69911.0
6   	saturn     	58232.0
7   	uranus     	25362.0
8  	neptune     	24622.0


after updating


    	planet 		radius(km)
0      	sun    		unknown
1  	mercury     	2439.7
2    	venus     	6051.8
3    	earth     	6371.0
4    	mars     	3389.5
5  	jupiter    	69911.0
6   	saturn    	58232.0
7   	uranus    	25362.0
8  	neptune    	24622.0
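The example above fills the gap with the text "unknown", which turns the column into a mixed text and number column. Filling with the column mean keeps it numeric. A minimal sketch on an illustrative table:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "planet": ["sun", "mercury", "venus"],
    "radius_km": [np.nan, 2439.7, 6051.8],
})

# mean() skips NaN, so it is the average of the two known radii.
mean_radius = df["radius_km"].mean()

# Assign the result back instead of relying on inplace=True.
df["radius_km"] = df["radius_km"].fillna(mean_radius)
print(df)
```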

15. Head command in Pandas

This command returns the first n rows of the data frame. By default, n is 5.

We write this command as:

DataFrame.head()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("solarSystem.csv")
print(df.head())

Output:

    planet      mass(kg)         distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  	1.99 * 10^10         	 0.0                      		0.0
1  mercury    	3.3 * 10^3                	57.9                     		70.0
2    venus      	4.9*10^4                  	108.0                    		108.9
3    earth       	6* 10^4                    	150.0                    		152.1
4     mars      	6.4*10^3                  	228.0                    		249.0

16.  Tail command in Pandas

This command returns the last n rows of the data frame. By default, n is 5.

We write this command as:

DataFrame.tail()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("solarSystem.csv")
print(df.tail())

Output:

    planet     mass(kg)       distance from sun(10^6 m)  maximum distance(10^6m)
4     mars   6.4*10^3                      228.0                    	249.0
5  jupiter   1.9*10^7                      779.0                    	817.0
6   saturn   5.7*10^6                     1433.0                   	1500.0
7   uranus  8.7*10^5                     2867.0                   	3000.0
8  neptune 1.02*10^6                   4500.0                   	4540.0
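Both head() and tail() accept the number of rows as an argument. A short sketch on an illustrative ten-row table:

```python
import pandas as pd

df = pd.DataFrame({"n": range(10)})

first_three = df.head(3)   # rows 0, 1, 2
last_two = df.tail(2)      # rows 8, 9
default = df.head()        # five rows when no number is given

print(first_three)
print(last_two)
print(len(default))
```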

17.  Shape command in Pandas

This command returns the total number of rows and columns as a tuple, which describes the data frame's shape. Note that shape is an attribute, so it is written without parentheses.

We write this command as:

DataFrame.shape

Example: Let’s understand this by implementing an example given below:

#code goes from here
#import pandas
import pandas as pd
#reads .csv file
df = pd.read_csv("solarSystem.csv")


#It prints rows and columns in a data frame
print(df.shape)

Output:

(9, 4)

This output shows 9 rows and 4 columns in the given data frame.

18.  Insert command in Pandas

To include a new column in a DataFrame, we can use the insert() command in Pandas. It accepts three keywords: column (the name of the new column), value (a list of the column's values), and loc (the index of the position at which to insert the column).

We write this command as:

DataFrame.insert(column = 'define_name', value = [], loc=0)

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd
import numpy as np


df = pd.read_csv("solarSystem.csv")
radius = [np.nan, 2439.7, 6051.8, 6371.0, 3389.5, 69911, 58232, 25362, 24622]
df.insert(column="radius", value=radius, loc=1)
print(df)

Output:

    planet   	radius  ... distance from sun(10^6 m)  maximum distance(10^6m)
0      sun      	NaN  ...                   	0.0                      	0.0
1  mercury   	2439.7  ...                 	57.9                     	70.0
2    venus   	6051.8  ...                 	108.0                    	108.9
3    earth   	6371.0  ...                 	150.0                    	152.1
4     mars   	3389.5  ...                 	228.0                    	249.0
5  jupiter  	69911.0  ...               	779.0                    	817.0
6   saturn  	58232.0  ...               	1433.0                   	1500.0
7   uranus  	25362.0  ...               	2867.0                   	3000.0
8  neptune  	24622.0  ...               	4500.0                   	4540.0


[9 rows x 5 columns]

19.  Sub command in Pandas

The sub() command allows us to subtract a specific number from each value in the Data frame. The operation is applied element-wise to every item.

We write this command as:

DataFrame['column'].sub(number)

Here, number is any value we want to subtract from the column values in the Data frame.

Example: Let’s understand this by implementing an example given below:

#code goes from here
#import pandas
import pandas as pd


df = pd.read_csv("PlanetRadius.csv")
print("before updating\n")
print(df)


#subtract 20 from the given column and print the result


df["radius(km)"] = df["radius(km)"].sub(20)
print("\nafter updating\n")
print(df)

Output:

before updating


    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


after updating


    planet  radius(km)
0      sun         NaN
1  mercury      2419.7
2    venus      6031.8
3    earth      6351.0
4     mars      3369.5
5  jupiter     69891.0
6   saturn     58212.0
7   uranus     25342.0
8  neptune     24602.0

In the above example, we subtract 20 from the radius (km) column and print a new data frame after subtracting.

20.  Add command in Pandas

The add() command allows us to add a specific number to each value in the Data frame. The operation is applied element-wise to every item.

We write this command as:

DataFrame['column'].add(number)

Here, number is any value we want to add to the column values in the Data frame.

Example: Let’s understand this by implementing an example given below:

#code goes from here
#import pandas
import pandas as pd
 
df = pd.read_csv("PlanetRadius.csv")
print("before updating\n")
print(df)
 
#add 20 to the given column and print the result
 
df["radius(km)"] = df["radius(km)"].add(20)
print("\nafter updating\n")
print(df)

Output:

before updating


    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


after updating


    planet  radius(km)
0      sun         NaN
1  mercury      2459.7
2    venus      6071.8
3    earth      6391.0
4     mars      3409.5
5  jupiter     69931.0
6   saturn     58252.0
7   uranus     25382.0
8  neptune     24642.0

In the above example, we add 20 to the radius(km) column and print a new data frame after adding.

21.  Mul command in Pandas

The mul() command allows us to multiply each value in the Data frame by a specific number. The operation is applied element-wise to every item.

We write this command as:

DataFrame['column'].mul(number)

Here, number is any value by which we want to multiply the column values in the Data frame.

Example: Let’s understand this by implementing an example given below:

#code goes from here
#import pandas
import pandas as pd
 
df = pd.read_csv("PlanetRadius.csv")
print("before updating\n")
print(df)
 
#multiply the given column by 20 and print the result
 
df["radius(km)"] = df["radius(km)"].mul(20)
print("\nafter updating\n")
print(df)

Output:

before updating


    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


after updating


    planet  radius(km)
0      sun         NaN
1  mercury     48794.0
2    venus    121036.0
3    earth    127420.0
4     mars     67790.0
5  jupiter   1398220.0
6   saturn   1164640.0
7   uranus    507240.0
8  neptune    492440.0

In the above example, we multiply the radius (km) column by 20 and print a new data frame after multiplying.

22.  Div command in Pandas

The div() command allows us to divide each value in the Data frame by a specific number. The operation is applied element-wise to every item.

We write this command as:

DataFrame['column'].div(number)

Here, number is any value by which we want to divide the column values in the Data frame.

Example: Let’s understand this by implementing an example given below:

#code goes from here
#import pandas
import pandas as pd
 
df = pd.read_csv("PlanetRadius.csv")
print("before updating\n")
print(df)
 
#divide the given column by 20 and print the result
 
df["radius(km)"] = df["radius(km)"].div(20)
print("\nafter updating\n")
print(df)


Output:

before updating


    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


after updating


    planet  radius(km)
0      sun         NaN
1  mercury     121.985
2    venus     302.590
3    earth     318.550
4     mars     169.475
5  jupiter    3495.550
6   saturn    2911.600
7   uranus    1268.100
8  neptune    1231.100

In the above example, we divide the radius (km) column by 20 and print a new data frame after dividing.
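The four arithmetic commands behave like the ordinary operators -, +, * and /. What the method form adds is the fill_value parameter, which substitutes a value for NaN before the calculation. A minimal sketch on an illustrative Series:

```python
import numpy as np
import pandas as pd

radius = pd.Series([np.nan, 6371.0, 3389.5])

# The method and the operator give the same result; NaN stays NaN.
print(radius.sub(20).equals(radius - 20))

# fill_value replaces NaN with 0 before adding, so no NaN remains.
filled_add = radius.add(20, fill_value=0)
print(filled_add.tolist())

doubled = radius.mul(2)
print(doubled.tolist())
```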

23.  Count command in Pandas

This command returns the number of non-null items in each column.

We write this command as:

DataFrame.count()

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd
df = pd.read_csv("PlanetRadius.csv")
#prints count of total number of items in column
print(df.count())

Output:

planet        9
radius(km)    8
dtype: int64

In the above example, the count() command prints the number of non-null items in each column of the PlanetRadius data frame. The radius(km) column shows 8 because one of its values is NaN.

24.  Std command in Pandas

You can determine the standard deviation of each numeric column in a DataFrame using pandas' std() command. It can also be applied to a single column (a Series), as in the example below.

We write this command as:

DataFrame.std()

Example: Let’s understand this by implementing an example given below:

# code goes from here
import pandas as pd


df = pd.read_csv("PlanetRadius.csv")


sd = df["radius(km)"].std()
print(sd)

Output:

26191.747578347215

In the above example, the std() command prints the standard deviation of the radius(km) column.
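By default, std() computes the sample standard deviation (dividing by n - 1). Passing ddof=0 gives the population standard deviation (dividing by n). The numbers below are made up so that the population value is easy to check by hand.

```python
import pandas as pd

values = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sample_sd = values.std()            # divides by n - 1 (the default, ddof=1)
population_sd = values.std(ddof=0)  # divides by n

print(sample_sd)
print(population_sd)
```

The mean of these values is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2.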

25.  Sort_values command in Pandas

This command allows us to sort values, ascending or descending, according to a certain column.

We write this command as:

new_DataFrame = DataFrame.sort_values(by = "column", ascending = False)

Example: Let’s understand this by implementing an example given below:

import pandas as pd


df = pd.read_csv("PlanetRadius.csv")


new_df = df.sort_values("radius(km)")
print(f'''before updating
{df}


after updating
{new_df}
''')

Output:

before updating
    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


after updating
    planet  radius(km)
1  mercury      2439.7
4     mars      3389.5
2    venus      6051.8
3    earth      6371.0
8  neptune     24622.0
7   uranus     25362.0
6   saturn     58232.0
5  jupiter     69911.0
0      sun         NaN


In the above example, we sort the rows of the PlanetRadius data frame by the radius(km) column in ascending order with the sort_values() command. The row with the NaN value is placed last.
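The example sorts in ascending order, which is the default. The sketch below shows a descending sort with ascending=False, and na_position, which controls where NaN rows are placed. The table is a small illustrative one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "planet": ["sun", "mercury", "earth", "mars"],
    "radius_km": [np.nan, 2439.7, 6371.0, 3389.5],
})

# Largest radius first; NaN rows still go last by default.
descending = df.sort_values(by="radius_km", ascending=False)

# Ascending order, but with NaN rows moved to the top.
na_first = df.sort_values(by="radius_km", na_position="first")

print(descending)
print(na_first)
```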

26.  Where command in Pandas

The where() command accepts a condition and keeps the values in rows where the condition is true. Values in the remaining rows are replaced with NaN.

We write this command as:

DataFrame.where(DataFrame['Column_name'] < number)

Here, number is any value we want to compare the column against.

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd


df = pd.read_csv("PlanetRadius.csv")
#keep values less than 4000 in the radius(km) column
new_df = df.where(df["radius(km)"] < 4000)


#print the data frame before and after using the where command
print(f'''before updating
{df}


after updating
{new_df}
''')


Output:

     before updating
    planet  radius(km)
0      sun         NaN
1  mercury      2439.7
2    venus      6051.8
3    earth      6371.0
4     mars      3389.5
5  jupiter     69911.0
6   saturn     58232.0
7   uranus     25362.0
8  neptune     24622.0


      after updating
    planet  radius(km)
0      NaN         NaN
1  mercury      2439.7
2      NaN         NaN
3      NaN         NaN
4     mars      3389.5
5      NaN         NaN
6      NaN         NaN
7      NaN         NaN
8      NaN         NaN

The above example keeps the rows whose radius(km) value is less than 4000 in the PlanetRadius data frame and replaces the values in all other rows with NaN using the where() command.
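where() keeps the shape of the data frame, which distinguishes it from Boolean indexing, where non-matching rows are removed. It also accepts a replacement value through its other parameter. A minimal sketch on an illustrative table:

```python
import pandas as pd

df = pd.DataFrame({
    "planet": ["mercury", "venus", "mars"],
    "radius_km": [2439.7, 6051.8, 3389.5],
})
is_small = df["radius_km"] < 4000

masked = df.where(is_small)   # same shape; non-matching rows become NaN
filtered = df[is_small]       # non-matching rows are dropped

# Replace non-matching values with 4000.0 instead of NaN.
capped = df["radius_km"].where(is_small, other=4000.0)

print(masked)
print(filtered)
print(capped.tolist())
```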

27.  Query command in Pandas

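The query() command in pandas selects rows using an expression written as a string, most commonly a Boolean condition on the columns. As the example below shows, an expression that is just an index number returns the row with that index label, although DataFrame.loc[] is the more common way to do this.

For example, if we want to obtain the fourth row's item (index 3), then we write this command: DataFrame.query('3')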

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd
df = pd.read_csv("PlanetRadius.csv")
#print sixth row’s item 
print(df.query('5')) 


Output:

planet        jupiter
radius(km)    69911.0
Name: 5, dtype: object

In the above example, it prints the sixth row’s item in the PlanetRadius.csv data frame by using the query() command.
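The more typical use of query() is filtering rows with a Boolean expression. The sketch below assumes a simple column name, radius_km, chosen for this illustration so that it can be written directly in the expression. A Python variable can be referenced with the @ prefix.

```python
import pandas as pd

df = pd.DataFrame({
    "planet": ["mercury", "venus", "mars"],
    "radius_km": [2439.7, 6051.8, 3389.5],
})

# Rows where the condition in the string is true.
small = df.query("radius_km < 4000")

# The same filter, with the limit taken from a Python variable.
limit = 4000
small_again = df.query("radius_km < @limit")

# Selecting a single row by its index label is usually done with loc.
second_row = df.loc[1]

print(small)
print(second_row)
```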

28.  melt command in Pandas

The melt() command in pandas converts a DataFrame's columns into individual rows, reshaping the data from wide format to long format.

Each row of the result holds one column name (variable) and one of its values (value), which lets you inspect the value assigned to each column explicitly.

We write this command as:

new_DataFrame = DataFrame.melt()

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd
df = pd.read_csv("PlanetRadius.csv")
new_df=df.melt()
#print new data frame after using melt() command
print(new_df)


Output:

      variable    value
0       planet      sun
1       planet  mercury
2       planet    venus
3       planet    earth
4       planet     mars
5       planet  jupiter
6       planet   saturn
7       planet   uranus
8       planet  neptune
9   radius(km)      NaN
10  radius(km)   2439.7
11  radius(km)   6051.8
12  radius(km)   6371.0
13  radius(km)   3389.5
14  radius(km)  69911.0
15  radius(km)  58232.0
16  radius(km)  25362.0
17  radius(km)  24622.0

In the above example, we convert DataFrame's columns to individual rows and print new dataFrame after using the melt() command.
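In practice, melt() is usually called with id_vars, so that an identifying column such as the planet name is kept on every row instead of being melted with the rest. A minimal sketch on an illustrative table:

```python
import pandas as pd

df = pd.DataFrame({
    "planet": ["earth", "mars"],
    "distance": [150.0, 228.0],
    "radius_km": [6371.0, 3389.5],
})

# Keep "planet" as an identifier and melt the other two columns.
long_df = df.melt(id_vars="planet", var_name="measure", value_name="value")
print(long_df)
```

Each planet now appears once per measure, which is the long format that many plotting and grouping operations expect.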

29.  Corr command in Pandas

Do you need to determine whether float or integer columns are correlated? You can accomplish it with pandas' corr() command.

We write this command as:

DataFrame.corr()

The above code generates a new DataFrame containing the correlation coefficient for each pair of integer or float columns.
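The article gives no example for corr(), so here is a minimal sketch. The columns x, y and z are made up so that the expected correlations are obvious: y rises exactly with x, and z falls exactly as x rises.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [2, 4, 6, 8],   # y = 2 * x, perfectly positively correlated
    "z": [4, 3, 2, 1],   # z = 5 - x, perfectly negatively correlated
})

# Pearson correlation between every pair of columns.
corr = df.corr()
print(corr)
```

The result is a square table with one row and one column per numeric column. If the DataFrame also contains text columns, passing numeric_only=True restricts the calculation to the numeric ones.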

30.  Drop command in Pandas

In a DataFrame, pandas' drop() command deletes particular rows or columns. To use it, provide the row index labels or the column names to delete, either with the index and columns keywords or with labels and axis.

We write this command as:

DataFrame.drop()

Example: Let’s understand this by implementing an example given below:

#code goes from here
import pandas as pd
df = pd.read_csv("solarSystem.csv")
print("Before using the drop command\n")
print(df)
new_df = df.drop(columns=['planet', 'mass(kg)'])
print("\nAfter using the drop command\n")
print(new_df)


Output:

Before using the drop command


    planet      mass(kg)  distance from sun(10^6 m)  maximum distance(10^6m)
0      sun  1.99 * 10^10                        0.0                      0.0
1  mercury    3.3 * 10^3                       57.9                     70.0
2    venus      4.9*10^4                      108.0                    108.9
3    earth       6* 10^4                      150.0                    152.1
4     mars      6.4*10^3                      228.0                    249.0
5  jupiter      1.9*10^7                      779.0                    817.0
6   saturn      5.7*10^6                     1433.0                   1500.0
7   uranus      8.7*10^5                     2867.0                   3000.0
8  neptune     1.02*10^6                     4500.0                   4540.0


After using the drop command


   distance from sun(10^6 m)  maximum distance(10^6m)
0                        0.0                      0.0
1                       57.9                     70.0
2                      108.0                    108.9
3                      150.0                    152.1
4                      228.0                    249.0
5                      779.0                    817.0
6                     1433.0                   1500.0
7                     2867.0                   3000.0
8                     4500.0                   4540.0

In the above example, we delete the planet and mass(kg) columns using the drop() command and print the new DataFrame after deleting.
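The example above removes columns. Rows are removed the same way by passing their index labels with the index keyword. A minimal sketch on an illustrative table:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "planet": ["sun", "mercury", "venus"],
    "radius_km": [np.nan, 2439.7, 6051.8],
})

# Drop the row whose index label is 0; the other labels are kept as they are.
without_sun = df.drop(index=0)

# Drop a column by name, as in the example above.
names_only = df.drop(columns=["radius_km"])

print(without_sun)
print(names_only)
```

drop() returns a new DataFrame, so the original df still contains all three rows and both columns.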