This tutorial will teach you how to drop an index column using Pandas. You will learn how to do this with the .reset_index() and .set_index() DataFrame methods, as well as how to read and write CSV files without an index.
Pandas creates an index whenever you construct a DataFrame. These indices are often useful, but there may be times when you simply want to remove them. Pandas offers several methods for doing this, either after a DataFrame has been loaded or while loading it.
What is a Pandas Index Column?
The Pandas index is comparable to Excel row numbers, but that comparison undersells it: the index is much more than a row number. Like an address or a dictionary’s key, the row index provides access to the DataFrame’s records.
Pandas will automatically construct an index unless a specific index is supplied. This index begins at 0 and continues up to the length of the DataFrame minus 1. This form of index is called a RangeIndex (as it represents the values from the range function). However, if you are working with specific data, such as time series data, you may wish to index your data by a different column.
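As a small illustration of the default index, the sketch below builds a DataFrame without supplying an index and inspects what Pandas creates. The column names and values are made up for this example.

```python
import pandas as pd

# Hypothetical sample data; no index is supplied
df = pd.DataFrame({'City': ['Oslo', 'Lima', 'Cairo'], 'Temp': [4, 19, 28]})

# Pandas falls back to a RangeIndex running from 0 to len(df) - 1
print(type(df.index).__name__)
print(list(df.index))
```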
Loading a Sample Pandas Dataframe
To load a sample Pandas DataFrame, you can use the pd.DataFrame.from_dict() constructor provided by the Pandas library.
Here’s an example of how to construct a basic DataFrame with four columns and seven rows of data, using the ‘Name’ column as the index:
# Loading a Sample Pandas Dataframe
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Sarah', 'Emma', 'Lisa'],
'Age': [21, 38, 34, 32, 76, 58, 43],
'Gender': ['female', 'male', 'male', 'male', 'female', 'female', 'female'],
'Income': [20000, 80000, 70000, 150000, 40000, 35000, 50000]
}).set_index('Name')
print(df.head())
Output:
# Age Gender Income
#Name
#Alice 21 female 20000
#Bob 38 male 80000
#Charlie 34 male 70000
#David 32 male 150000
#Sarah 76 female 40000
This sets the ‘Name’ column as the index of the DataFrame.
Now you have loaded a sample Pandas DataFrame and set its index using .set_index(). You can use other columns as the index by replacing ‘Name’ with the desired column name.
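To make the dictionary-key comparison concrete, here is a minimal sketch that looks up a record by its index label with .loc. It uses a shortened, three-row version of the sample data, which is an assumption made to keep the example small.

```python
import pandas as pd

# Shortened version of the sample data, indexed by 'Name'
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [21, 38, 34],
}).set_index('Name')

# Look up a value by its index label, much like a dictionary key
alice_age = df.loc['Alice', 'Age']
print(alice_age)
```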
Dropping a Pandas Index Column Using reset_index
The simplest approach to remove an index from a Pandas DataFrame is to use the Pandas .reset_index() method. By default, the method resets the index and generates a RangeIndex (from 0 to the length of the DataFrame minus 1). Additionally, the method inserts the old DataFrame index into a column within the DataFrame.
Here’s an example of how to reset the index of a DataFrame df with .reset_index(), which moves ‘Name’ back into a column and restores the default integer index:
# Resetting a dataframe index with .reset_index()
df = df.reset_index()
print(df.head())
Output:
# Name Age Gender Income
#0 Alice 21 female 20000
#1 Bob 38 male 80000
#2 Charlie 34 male 70000
#3 David 32 male 150000
#4 Sarah 76 female 40000
However, what if we wanted to remove the DataFrame index and not retain it? Then, we could use the drop=True
argument to instruct Pandas to reset the index and discard the original values. Let’s see what this looks like:
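# Drop a Pandas DataFrame index with the .reset_index() method
# The previous step moved 'Name' into a column, so restore it as the index first
df = df.set_index('Name')
df = df.reset_index(drop=True)
print(df.head())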
Output:
# Age Gender Income
#0 21 female 20000
#1 38 male 80000
#2 34 male 70000
#3 32 male 150000
#4 76 female 40000
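One detail worth keeping in mind is that .reset_index() returns a new DataFrame rather than changing the existing one, which is why the examples above assign the result back to df. A minimal sketch with a two-row sample (the variable name dropped is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [21, 38]}).set_index('Name')

# reset_index returns a new DataFrame; df itself is unchanged here
dropped = df.reset_index(drop=True)

print(df.index.name)          # the original still uses 'Name' as its index
print(list(dropped.columns))  # the new DataFrame has no 'Name' column
```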
Dropping a Pandas Index in a Multi-Index DataFrame
Pandas also enables you to deal with multi-index DataFrames, which have multiple columns representing the index. This indicates that each record has two or more distinct identifiers. Let’s create a sample MultiIndex DataFrame:
# Creating a MultiIndex DataFrame
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Sarah', 'Emma', 'Lisa'],
'Age': [21, 38, 34, 32, 76, 58, 43],
'Gender': ['female', 'male', 'male', 'male', 'female', 'female', 'female'],
'Income': [20000, 80000, 70000, 150000, 40000, 35000, 50000]
}).set_index(['Name', 'Gender'])
print(df.head())
Output:
# Age Income
# Name Gender
# Alice female 21 20000
# Bob male 38 80000
# Charlie male 34 70000
# David male 32 150000
# Sarah female 76 40000
Calling .reset_index() without arguments resets every level of a MultiIndex. If you wish to remove a single level, however, you must use the level= parameter. Let’s see how we can move the 'Income' index level back into a column while keeping its values:
# Dropping a Single MultiIndex and Keeping Values
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Sarah', 'Emma', 'Lisa'],
'Age': [21, 38, 34, 32, 76, 58, 43],
'Gender': ['female', 'male', 'male', 'male', 'female', 'female', 'female'],
'Income': [20000, 80000, 70000, 150000, 40000, 35000, 50000]
}).set_index(['Name', 'Income'])
df = df.reset_index(level='Income')
print(df.head())
Output:
# Income Age Gender
#Name
#Alice 20000 21 female
#Bob 80000 38 male
#Charlie 70000 34 male
#David 150000 32 male
#Sarah 40000 76 female
Similarly, we can discard the values of a single index level by passing in the drop=True parameter, as demonstrated below:
# Dropping a Single MultiIndex Level and Dropping Values
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Sarah', 'Emma', 'Lisa'],
'Age': [21, 38, 34, 32, 76, 58, 43],
'Gender': ['female', 'male', 'male', 'male', 'female', 'female', 'female'],
'Income': [20000, 80000, 70000, 150000, 40000, 35000, 50000]
}).set_index(['Name', 'Income'])
df = df.reset_index(level='Income', drop=True)
print(df.head())
Output:
# Age Gender
#Name
#Alice 21 female
#Bob 38 male
#Charlie 34 male
#David 32 male
#Sarah 76 female
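For comparison with the single-level examples, the sketch below inspects the levels of a MultiIndex and then resets all of them at once by calling .reset_index() without a level argument. It uses a shortened three-row sample, and the variable name flat is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Gender': ['female', 'male', 'male'],
    'Age': [21, 38, 34],
}).set_index(['Name', 'Gender'])

# A MultiIndex has one level per index column
print(df.index.nlevels)
print(list(df.index.names))

# Without level=, every index level moves back into a regular column
flat = df.reset_index()
print(list(flat.columns))
```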
Dropping a Pandas Index Column Using set_index
We can also apply a workaround: create a column that simply replicates the default index pattern and use it as the index. First, we create a column containing the values from 0 to the DataFrame’s length minus 1. This can be done with the .assign() method, which adds a column to a Pandas DataFrame. The .set_index() method is then used to make the new column the DataFrame’s index.
Example:
# Delete a Pandas DataFrame index with .set_index()
import pandas as pd
df = pd.DataFrame.from_dict({
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Sarah', 'Emma', 'Lisa'],
'Age': [21, 38, 34, 32, 76, 58, 43],
'Gender': ['female', 'male', 'male', 'male', 'female', 'female', 'female'],
'Income': [20000, 80000, 70000, 150000, 40000, 35000, 50000]
})
# Add an 'Index' column holding 0..len(df)-1, then make it the index
df = df.assign(Index=range(len(df))).set_index('Index')
print(df.head())
Output:
# Name Age Gender Income
#Index
#0 Alice 21 female 20000
#1 Bob 38 male 80000
#2 Charlie 34 male 70000
#3 David 32 male 150000
#4 Sarah 76 female 40000
We started by using the .assign() method to create a column called “Index.” Then, we chained the .set_index() method to make the new column the index. This replaces the old index and discards it.
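As a rough check of how this workaround relates to .reset_index(drop=True), the following sketch applies both to a small DataFrame indexed by 'Name' and compares the results. The variable names via_assign and via_reset are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [21, 38, 34],
}).set_index('Name')

# Workaround: build a 0..n-1 column and promote it to the index
via_assign = df.assign(Index=range(len(df))).set_index('Index')

# Direct approach: discard the old index
via_reset = df.reset_index(drop=True)

# Both yield the same index values; the old 'Name' index is gone in each
print(list(via_assign.index) == list(via_reset.index))
print(via_assign.index.name, via_reset.index.name)
```

One visible difference is that the workaround leaves the index named 'Index', while .reset_index(drop=True) leaves it unnamed.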
Read a CSV File into a Pandas DataFrame without an Index
You may come across CSV files that aren’t formatted correctly, like ones where the last character in a row is a delimiter. This is how they might look:
Name,Income,Gender
Alice,20000,female,
Bob,80000,male,
Charlie,70000,male,
David,150000,male,
Sarah,76,female,
Because of the trailing comma, each data row has one more field than the header, so Pandas treats the first value in each row as the index. The file will look like this when we read it into a DataFrame:
# Reading a malformed .csv file with Pandas
import pandas as pd
df = pd.read_csv('example.csv')
print(df.head())
Output:
# Name Income Gender
# Alice 20000 female NaN
# Bob 80000 male NaN
# Charlie 70000 male NaN
# David 150000 male NaN
# Sarah 76 female NaN
This is not what we want: the names have become the index, and every other value has shifted one column to the left. We would like the data to be aligned with the correct columns and the empty trailing field to be ignored.
Let’s pass index_col=False to read_csv():
# Reading a malformed .csv file with index_col=False
import pandas as pd
df = pd.read_csv('example.csv', index_col=False)
print(df.head())
Output:
# Name Income Gender
#0 Alice 20000 female
#1 Bob 80000 male
#2 Charlie 70000 male
#3 David 150000 male
#4 Sarah 76 female
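The example above depends on a file named example.csv. As a self-contained sketch, the same behavior can be tried with an in-memory string via io.StringIO, followed by writing the result back out without an index using to_csv(index=False). The two-row sample text and the variable names raw and csv_text are assumptions for illustration, and some Pandas versions may emit a warning about the mismatched row length when index_col=False is used.

```python
import io
import pandas as pd

# Malformed CSV text: every data row ends with an extra delimiter
raw = (
    "Name,Income,Gender\n"
    "Alice,20000,female,\n"
    "Bob,80000,male,\n"
)

# index_col=False stops Pandas from using the first column as the index
df = pd.read_csv(io.StringIO(raw), index_col=False)
print(list(df.columns))

# index=False leaves the row index out of the written CSV;
# with no path given, to_csv returns the CSV text as a string
csv_text = df.to_csv(index=False)
print(csv_text)
```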
Wrap up
In this tutorial, you learned how to remove an index column using Pandas. You learned how to delete an index using the Pandas .reset_index() and .set_index() methods. You also discovered how to read and write CSV data without an index. Working with Pandas indices is a valuable skill as you learn to manipulate data with Pandas.
You can find more information about Pandas .reset_index() in the official documentation.
If you haven’t installed Python yet, check out how to install it HERE.
Thanks for reading. Happy coding!