Practical Methods to Reindex Rows of DataFrame with Pandas

Easily reindex rows of a DataFrame

Reindexing in Pandas means changing the index labels of the rows or columns of a DataFrame. It is useful when you want to rearrange the order of rows or introduce new rows with specific labels.

This tutorial will guide you through the process of reindexing rows step by step, helping you understand how to perform this operation efficiently using Pandas.

1. Create a Sample DataFrame

For the purpose of this tutorial, let’s import pandas and create a sample DataFrame to work with:

import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df = pd.DataFrame(data)
df

By default, Pandas assigns a numerical index starting from 0. In this example, the index runs from 0 to 3.
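To check the default index directly, you can inspect `df.index`. The snippet below is a minimal sketch that rebuilds the same sample data so it runs on its own; the name `default_labels` is only illustrative.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df = pd.DataFrame(data)

# With no index supplied, pandas uses a RangeIndex that counts rows from 0
print(df.index)

# Materialize the labels as a plain list for inspection
default_labels = list(df.index)
print(default_labels)
```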

2. Create a New Index

We can also supply our own index labels when building the DataFrame. In this example, let’s use the letters ‘a’ to ‘d’.

# create the list of new index labels
index = ['a', 'b', 'c', 'd']

# build the DataFrame with these labels as its index,
# in place of the default numerical index
df_new1 = pd.DataFrame(data, index=index)
df_new1
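If the DataFrame already exists, one option is to assign the labels to its `index` attribute instead of rebuilding it from the raw data. The sketch below assumes the same sample data; `df_relabelled` is a hypothetical name, and the copy is there only so the original `df` stays untouched.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df = pd.DataFrame(data)

# Work on a copy so the original frame keeps its numerical index
df_relabelled = df.copy()

# The list must have exactly one label per row
df_relabelled.index = ['a', 'b', 'c', 'd']

# Rows can now be selected by their new labels
print(df_relabelled.loc['c'])
```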

3. Set a Column as the index

We can also set a column as the index using set_index(). For example, let’s set the ‘Name’ column as the index in our example.

df_new2 = df.set_index('Name')
df_new2

4. Set a Column as the Index without Removing the Column

In the above example, the ‘Name’ column was removed from the columns when it became the index. In this example, we’ll set the column as the index and also keep it as a regular column by setting the parameter drop=False.

df_new3 = df.set_index('Name', drop=False)
df_new3
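To see what drop=False changes, you can compare the columns and the index name of the result. This is a small self-contained sketch using the same sample data; `alice_age` is an illustrative name.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df = pd.DataFrame(data)

df_new3 = df.set_index('Name', drop=False)

# 'Name' is now the index name and is still listed among the columns
print(df_new3.index.name)
print(list(df_new3.columns))

# Label-based lookup works through the index
alice_age = df_new3.loc['Alice', 'Age']
print(alice_age)
```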

5. Set a New Index without Removing Default Index

In the above two examples, the default numerical index was replaced. In this example, we’ll keep it by setting append=True, which adds ‘Name’ as a second index level next to the numerical one. The result has a MultiIndex.

df_new4 = df.set_index('Name', drop=True, append=True)
df_new4
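Because append=True yields a two-level index, rows are addressed with a tuple of labels, one per level. The following sketch, built on the same sample data, shows one way to inspect the levels and select a row; the name `row` is illustrative.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df = pd.DataFrame(data)

df_new4 = df.set_index('Name', append=True)

# The index now has two levels: the unnamed numerical one and 'Name'
print(df_new4.index.nlevels)
print(list(df_new4.index.names))

# Select a single row with a (number, name) tuple
row = df_new4.loc[(2, 'Bob')]
print(row)
```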

6. Reindex the rows

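To reindex the rows, you can use the reindex() method. It takes a list or other array-like object containing the new index labels and returns a new DataFrame whose rows follow that order. Labels missing from the current index produce rows filled with NaN, and existing labels left out of the list are dropped. In our example, we’ll create a new order for the rows.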

# set the name as the index
df_new = df.set_index('Name')

# create the index order
new_order = ['Bob', 'Alice', 'Eve', 'John']

# reindex with the new order
df_new5 = df_new.reindex(new_order)
df_new5

You can observe that the rows have been rearranged according to the new order.
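The reindex() method can also introduce rows for labels that do not exist yet. The sketch below uses a made-up label, ‘Zoe’, purely for illustration, and the name `with_new_row` is likewise hypothetical.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df_new = pd.DataFrame(data).set_index('Name')

# 'Zoe' is not in the current index, so her row is filled with NaN;
# 'John' and 'Eve' are not requested, so their rows are left out
with_new_row = df_new.reindex(['Bob', 'Alice', 'Zoe'])
print(with_new_row)

# The missing values can be located with isna()
print(with_new_row.isna().any(axis=1))
```

Note that the original `df_new` is unchanged, since reindex() returns a new object.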

7. Reset the index

We can easily restore the default numerical index with reset_index(). In the following example, let’s set ‘Name’ as the index to replace the numerical index, and then change it back to the numerical one.

# original numerical index
df
# set the name as the index
df_new = df.set_index('Name')
df_new
# reset the index to the default (0, 1, 2, ...)
# and move 'Name' back into the columns
df_new6 = df_new.reset_index(level=['Name'])
df_new6
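By default, reset_index() moves the old index back into the columns. If you would rather discard it, the drop=True parameter does that. The sketch below contrasts the two behaviors on the same sample data; `restored` and `dropped` are illustrative names.

```python
import pandas as pd

data = {'Name': ['John', 'Alice', 'Bob', 'Eve'],
        'Age': [25, 28, 32, 21],
        'City': ['New York', 'Paris', 'London', 'Sydney']}
df_new = pd.DataFrame(data).set_index('Name')

# Default behavior: 'Name' returns as the first column
restored = df_new.reset_index()
print(list(restored.columns))

# drop=True: the old index is discarded instead of becoming a column
dropped = df_new.reset_index(drop=True)
print(list(dropped.columns))
```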

Conclusion

Reindexing rows in Pandas is a valuable technique for controlling the index labels of a DataFrame. It lets you assign new labels, use a column as the index, rearrange rows, or introduce new rows with specific labels, and then restore the default index when needed. By following the steps in this tutorial, you can organize your DataFrame to meet your analysis and visualization needs.

Originally published at https://medium.com/@shouke.wei on June 18, 2023.
