Iterate over rows in Pandas

In this tutorial, we will learn how to iterate over rows using Pandas in Python.

Pandas DataFrame:

A Pandas DataFrame is a structure that stores two-dimensional data together with its row and column labels. DataFrames are widely used in data science and in other fields that handle large amounts of data.

DataFrames are comparable to Excel spreadsheets or SQL tables. Because they are built on NumPy and integrate closely with the rest of the Python data ecosystem, they are often faster and more convenient to work with programmatically than spreadsheets.

Pandas DataFrames include:

  • Rows and columns of two-dimensional data.
  • Row and column labels.
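
As a minimal sketch of these two ingredients, the snippet below builds a small DataFrame and inspects its data and its labels. The column names and the string row labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data: two columns and three rows with custom row labels.
frame = pd.DataFrame(
    {"x": [2, 3, 4], "y": [100, 100, 100]},
    index=["a", "b", "c"],
)

row_labels = list(frame.index)       # labels of the rows
column_labels = list(frame.columns)  # labels of the columns
n_rows, n_cols = frame.shape         # size of the two-dimensional data

print(row_labels, column_labels, n_rows, n_cols)
```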

Create a Pandas DataFrame

Creating DataFrame using dictionary:

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
# Create the DataFrame; the single value 100 is repeated for every row
f = pd.DataFrame(d)
print(f)

Output:

   x    y
0  2  100
1  3  100
2  4  100

Dictionary keys become the column names of the DataFrame, and dictionary values become the data in the corresponding columns. The values may be lists, tuples, or other array-like types. A single value, such as 100 above, is repeated down the whole column.
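
To illustrate the different value types, here is a small sketch that mixes a list, a tuple, and a single value in one dictionary. The column names are hypothetical and chosen only to show where each column comes from.

```python
import pandas as pd

# Hypothetical columns built from a list, a tuple, and a single value.
frame = pd.DataFrame({
    "from_list": [2, 3, 4],
    "from_tuple": (5, 6, 7),
    "from_scalar": 100,  # repeated for every row
})

print(frame)
scalar_column = frame["from_scalar"].tolist()
```

The list and the tuple must have the same length, since each one supplies one value per row.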

Iteration

Using Index labels:

With the default integer index, we can loop over the row labels and look up each value by column name and row label.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
for i in range(len(f)):
    print(f['x'][i])

Output:

2
3
4   
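
The loop above works because the default index is 0, 1, 2, so range(len(f)) happens to produce valid row labels. As a hedged sketch for a DataFrame with other labels (the string index here is an assumption for illustration), we can iterate over f.index instead:

```python
import pandas as pd

# Same data as above, but with hypothetical string row labels.
f = pd.DataFrame({'x': [2, 3, 4], 'y': 100}, index=['a', 'b', 'c'])

values = []
for label in f.index:
    # Look up the value in column 'x' by its row label.
    values.append(int(f['x'][label]))

print(values)
```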

Using the loc Indexer:

  • The .loc indexer accesses a DataFrame's rows and columns by their labels. It is used with square brackets rather than called like a function.
  • By supplying the row label and the column name, we can read each row's values inside a loop.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
for i in range(len(f)):
    print(f.loc[i, 'x'], f.loc[i, 'y'])

Output:

2 100
3 100
4 100
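
Label-based access also works with a single row label, which returns the whole row as a Series. The sketch below assumes a hypothetical string index to make clear that labels, not positions, are being used.

```python
import pandas as pd

# Hypothetical string row labels instead of the default 0, 1, 2.
f = pd.DataFrame({'x': [2, 3, 4], 'y': 100}, index=['a', 'b', 'c'])

pairs = []
for label in f.index:
    row = f.loc[label]  # the whole row, as a Series indexed by column name
    pairs.append((int(row['x']), int(row['y'])))

print(pairs)
```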

Using the iloc Indexer:

The .iloc indexer accesses the rows and columns of a DataFrame by integer position rather than by label. Like .loc, it is used with square brackets.

To iterate over the rows of the DataFrame, we supply the integer positions of the row and the column.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
for i in range(len(f)):
    print(f.iloc[i, 0], f.iloc[i, 1])

Output:

2 100
3 100
4 100
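
Because positions are independent of labels, the same loop works whatever the index looks like. The following sketch assumes a hypothetical index of 10, 20, 30 and looks up the position of column 'y' with columns.get_loc() instead of hard-coding it.

```python
import pandas as pd

# Hypothetical non-default row labels; positions are still 0, 1, 2.
f = pd.DataFrame({'x': [2, 3, 4], 'y': 100}, index=[10, 20, 30])

y_pos = f.columns.get_loc('y')  # integer position of column 'y'

collected = []
for i in range(len(f)):
    collected.append((int(f.iloc[i, 0]), int(f.iloc[i, y_pos])))

print(collected)
```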

Using iterrows() Method:

The iterrows() method iterates over the rows of a DataFrame, yielding each row as a pair of its index label and a Series holding the row's values.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
for i, row in f.iterrows():
    print(row)      

Output:

x      2
y    100
Name: 0, dtype: int64
x      3
y    100
Name: 1, dtype: int64
x      4
y    100
Name: 2, dtype: int64
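
One consequence of receiving each row as a Series is that a row has a single dtype, so values from columns of different types may be converted. The sketch below assumes a hypothetical float column 'z' next to the integer column 'x' to show the effect.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
f = pd.DataFrame({'x': [2, 3, 4], 'z': [0.5, 1.5, 2.5]})

index, first_row = next(f.iterrows())

# Column 'x' is stored as integers, but the row Series is float,
# so the value of 'x' inside the row has been upcast.
print(f['x'].dtype, first_row.dtype, first_row['x'])
```

If the original column types matter, itertuples() is usually the safer choice, since it does not build a Series per row.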

Using the itertuples() Method:

The itertuples() method iterates over the rows of a DataFrame as named tuples. The syntax is DataFrame.itertuples(index=True, name='Pandas').

Parameters:

  • index: boolean (default: True). Determines whether the row's index label is returned. If True, the index label is the first element of the tuple.
  • name: string or None (default: 'Pandas'). The name given to the returned named tuples. If None, regular tuples are returned.
  • Returns: An iterator over the rows.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
for row in f.itertuples():
    print(row)

Output:

Pandas(Index=0, x=2, y=100)
Pandas(Index=1, x=3, y=100)
Pandas(Index=2, x=4, y=100)
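
As a short sketch of how the two parameters change what the loop receives, the snippet below reads fields by attribute name and then asks for plain tuples without the index. The variable names totals and plain are chosen here for illustration.

```python
import pandas as pd

f = pd.DataFrame({'x': [2, 3, 4], 'y': 100})

# Fields of each named tuple can be read by attribute name.
totals = [row.x + row.y for row in f.itertuples()]

# index=False drops the index label; name=None yields regular tuples.
plain = list(f.itertuples(index=False, name=None))

print(totals)
print(plain)
```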

Using the apply() Method:

The apply() method applies a function along an axis of the DataFrame, that is, to each column or to each row.

The syntax is DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs).

Parameters:

  • func: The function to apply to each column or row.
  • axis: 0 or 1 (default: 0). With 0, func is applied to each column; with 1, it is applied to each row.
  • raw: boolean (default: False). If False, each row or column is passed to func as a pandas Series; if True, it is passed as a NumPy array.
  • args: A tuple of positional arguments to pass to func in addition to the row or column.
  • kwargs: Additional keyword arguments to pass to func.
  • Returns: A pandas Series or a DataFrame.
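
To make the args and keyword parameters concrete, here is a minimal sketch that applies a function to each row. The helper weighted_sum and its parameters weight and offset are hypothetical and exist only for this example.

```python
import pandas as pd

f = pd.DataFrame({'x': [2, 3, 4], 'y': 100})

def weighted_sum(row, weight, offset=0):
    """Hypothetical helper: combine the two columns of one row."""
    return row['x'] * weight + row['y'] + offset

# args supplies weight positionally; offset is forwarded as a keyword.
result = f.apply(weighted_sum, axis=1, args=(10,), offset=1)

print(result.tolist())
```

With axis=1 and a function that returns one value per row, the result is a Series with one entry per row.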

With axis=1, the function passed to apply() receives each row as a Series, so we can read the row's values by column label, much as in the index labels approach.

Code:

import numpy as np
import pandas as pd
d = {'x': [2, 3, 4], 'y': 100}
f = pd.DataFrame(d)
# Convert each value to a string before joining; the values are integers
print(f.apply(lambda row: str(row['x']) + ', ' + str(row['y']), axis=1))

Output:

0    2, 100
1    3, 100
2    4, 100
dtype: object