Ways to Add Rows to a Dataframe
A dataframe is a fundamental data structure used in data manipulation and analysis, particularly in programming languages like Python through libraries such as pandas. It resembles a table in which data is organized into rows and columns, allowing for structured representation and easy manipulation of datasets. Each column typically holds one type of data, such as numbers or strings, while each row represents an individual record or observation.
Dataframes provide a flexible way to manage, filter, transform, and analyze data, offering functionality for data cleaning, aggregation, and statistical operations. They are widely used in data preprocessing, exploratory data analysis, and modeling tasks, making them a cornerstone for data scientists and researchers working with structured data.
Methods to Add Rows to a Dataframe
Here are several ways to add rows to a dataframe, along with explanations, Python code examples, sample outputs, and notes on the time complexity of each approach.
Method 1: Using 'loc' with a Dictionary
The 'loc' indexer lets you add a new row to a dataframe by assigning a dictionary to a new row label. The dictionary keys are column names and the values are the corresponding data for each column.
Code
import pandas as pd
# Create a sample dataframe
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# New row data
new_data = {'Name': 'Charlie', 'Age': 28}
# Add the new row with loc; dictionary keys are matched to column names
df.loc[len(df)] = new_data
print(df)
Output
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
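Time Complexity: Writing the values of a single row with 'loc' and a dictionary takes O(n) time, where n is the number of columns in the dataframe. In practice, enlarging a dataframe typically allocates new storage and copies the existing data, so the cost of each added row also grows with the number of existing rows m, roughly O(m * n).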
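As a small illustration of the key-to-column matching described above, the sketch below lists the dictionary keys in a different order from the columns. It assumes a reasonably recent pandas version, in which a dictionary assigned through 'loc' to a new label is aligned on the column names. If in doubt, wrapping the dictionary in pd.Series(...) gives the same alignment.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Keys are deliberately listed in a different order from the columns
new_data = {'Age': 28, 'Name': 'Charlie'}

# Assign to a label that does not exist yet, which enlarges the dataframe
df.loc[len(df)] = new_data

print(df)
```

Because the values are matched by key rather than by position, the order of the dictionary entries should not matter here.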
Method 2: Using 'loc' with a List
This approach adds a new row to the dataframe from a list of values, where each value corresponds, in order, to a column in the dataframe.
Code
import pandas as pd
# Create a sample dataframe
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# New row data
new_row = ['Charlie', 28]
# Add new row using loc
df.loc[len(df)] = new_row
print(df)
Output
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
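Time Complexity: Writing the values of a single row with 'loc' and a list takes O(n) time, where n is the number of columns in the dataframe. Because enlarging a dataframe typically copies the existing data, the cost of each added row also grows with the number of existing rows.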
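Two pitfalls of this pattern are worth a quick sketch. First, len(df) is only a safe new label when the index is the default 0, 1, 2, ... range; with any other integer index it may name an existing row and overwrite it. Second, the list must have exactly one value per column. The index values and the names 'overwritten', 'extended', and 'mismatch_error' below are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# A dataframe whose index does not start at 0
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]}, index=[1, 2])

# len(df) is 2 and label 2 already exists, so this overwrites Bob's row
overwritten = df.copy()
overwritten.loc[len(overwritten)] = ['Charlie', 28]

# A label one past the current maximum adds a genuinely new row
extended = df.copy()
extended.loc[extended.index.max() + 1] = ['Charlie', 28]

# A list whose length differs from the number of columns is rejected
mismatch_error = None
try:
    df.copy().loc[99] = ['Dana']
except ValueError as exc:
    mismatch_error = exc

print(overwritten)
print(extended)
print(mismatch_error)
```

When the index is not a default range, choosing the new label explicitly is the safer option.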
Method 3: Using 'append' Method
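The 'append' method concatenates dataframes. You can create a new dataframe for the row you want to add and then append it to the original dataframe. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this example runs only on older pandas versions; on current versions, use the 'concat' function instead.
Code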
import pandas as pd
# Create a sample dataframe
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# New row data
new_row = pd.DataFrame({'Name': ['Charlie'], 'Age': [28]})
# Append the new row (DataFrame.append exists only in pandas versions before 2.0)
df = df.append(new_row, ignore_index=True)
print(df)
Output
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
Time Complexity: The 'append' method creates a new dataframe and copies the existing data into it, resulting in an average time complexity of O(m * n), where m is the number of rows and n is the number of columns in the dataframe.
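For code that has to run on pandas 2.0 or later, where DataFrame.append is no longer available, one possible migration is a small wrapper around pd.concat. The function name 'append_row' is a hypothetical helper introduced here for illustration; it is not part of pandas.

```python
import pandas as pd


def append_row(df, row):
    """Hypothetical helper: return a new dataframe with `row` (a dict) added at the end."""
    # Wrap the dict in a one-row dataframe, then concatenate and renumber the index
    return pd.concat([df, pd.DataFrame([row])], ignore_index=True)


df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df = append_row(df, {'Name': 'Charlie', 'Age': 28})
print(df)
```

Like the old 'append' method, this sketch returns a new dataframe rather than modifying the original in place.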
Method 4: Using 'concat' Function
The 'concat' function concatenates multiple dataframes. You can create a new dataframe for the row you want to add and then concatenate it with the original dataframe.
Code
import pandas as pd
# Create a sample dataframe
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# New row data
new_row = pd.DataFrame({'Name': ['Charlie'], 'Age': [28]})
# Concatenate dataframes using concat function
df = pd.concat([df, new_row], ignore_index=True)
print(df)
Output
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
Time Complexity: The 'concat' function has an average time complexity of O(m * n), where m is the total number of rows and n is the number of columns in the concatenated dataframes.
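Since each call to 'concat' copies the existing data, adding many rows one at a time repeats that cost for every row. A common way to avoid this, sketched below, is to collect the new rows in a plain Python list and concatenate once at the end. The extra names and ages are made-up sample data, and 'pending_rows' is an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Collect the new rows as dictionaries instead of growing the dataframe in a loop
pending_rows = []
for name, age in [('Charlie', 28), ('Dana', 41), ('Eve', 35)]:
    pending_rows.append({'Name': name, 'Age': age})

# Build one dataframe from the collected rows and concatenate a single time
df = pd.concat([df, pd.DataFrame(pending_rows)], ignore_index=True)
print(df)
```

With this pattern the existing data is copied once, rather than once per added row.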
Method 5: Using 'DataFrame.loc' and 'at'
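You can assign a pandas Series through the 'DataFrame.loc' indexer to add a new row under a label of your choice; the Series index is matched to the dataframe's column names. The related 'at' accessor reads or writes a single cell, so it suits setting one value at a time rather than a whole row.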
Code
import pandas as pd
# Create a sample dataframe
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
# New row data
new_row = pd.Series(['Charlie', 28], index=df.columns)
# Add the new row using loc; the Series index is aligned with the columns
df.loc[len(df)] = new_row
print(df)
Output
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   28
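Time Complexity: Writing the values of a single row this way takes O(n) time, where n is the number of columns in the dataframe. As with the other 'loc' methods, enlarging the dataframe typically copies the existing data, so the cost of each added row also grows with the number of existing rows.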
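For completeness, here is a minimal sketch of the cell-by-cell route with 'at'. It relies on 'at' enlarging the dataframe when the row label does not exist yet, which leaves the other columns of the new row empty (NaN) until they are filled. The final cast back to int is an optional cleanup step added for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

new_label = len(df)  # safe here because the index is the default 0, 1, ...

# The first assignment creates the row; 'Age' is missing until it is set,
# which turns the integer column into a float column
df.at[new_label, 'Name'] = 'Charlie'
df.at[new_label, 'Age'] = 28

# Optional cleanup: restore the integer dtype once every cell is filled
df['Age'] = df['Age'].astype(int)

print(df)
```

Setting cells individually is convenient for one or two values, but assigning a whole Series or list through 'loc' avoids the temporary missing values and the dtype change.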