Pivot table reset index python

Pivot tables in Pandas and Handling Multi-Index Data with Hands-On Examples in Python

Learn how to pivot a Pandas DataFrame and get meaningful insights

A pivot table is a data manipulation tool that rearranges a table and sometimes aggregates the values for easy analysis.

In this article, we’ll look at the Pandas pivot_table function and how to use the various parameters it offers. We’ll explore a real-world dataset from Kaggle to illustrate when and how to use the pivot_table function.

Advantages of a pivot table

  • You can group the data by one or more columns and then summarize the values using various statistics such as mean, sum, and count.
  • It has an easy-to-use syntax that intuitively allows for simple to complex data transformations.

Pivot_table syntax

pandas.pivot_table(data, 
values=None,
index=None,
columns=None,
aggfunc='mean',
fill_value=None,
margins=False,
dropna=True,
margins_name='All',
observed=False,
sort=True)

At least one of the first two parameters below (index or columns) must be provided to group the data.

  • index : column(s) to group the data row-wise.
  • columns : column(s) to group the data column-wise.
  • values : column(s) to aggregate using the aggfunc function.
  • aggfunc : function (or list of functions) used to aggregate the values.
  • fill_value : value to replace missing values with.
  • dropna : whether to remove entire rows or columns that contain only NaN values.
  • margins : whether to include row and column subtotals.
  • margins_name : label for the row and column subtotals.
  • observed : whether to show only observed values for categorical groupers.
  • sort : whether to sort the resulting…
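The parameters listed above can be seen working together in a small sketch. The sales DataFrame and its column names (region, product, units) are made up for illustration and are not part of the Kaggle dataset mentioned earlier.

```python
import pandas as pd

# Hypothetical sample data (not from the article's dataset).
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "A"],
    "units": [10, 4, 7, 3],
})

table = pd.pivot_table(
    sales,
    values="units",        # column to aggregate
    index="region",        # row-wise grouping
    columns="product",     # column-wise grouping
    aggfunc="sum",         # aggregation applied to each group
    fill_value=0,          # South sold no product B, so that cell becomes 0
    margins=True,          # add subtotal row and column
    margins_name="Total",  # label used for the subtotals
)
print(table)
```

Here fill_value only affects the empty South/B cell, while margins adds a Total row and a Total column computed from the original data.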


How to Convert Pandas Pivot Table to DataFrame

You can use the following syntax to convert a pandas pivot table to a pandas DataFrame:

df = pivot_name.reset_index()

The following example shows how to use this syntax in practice.

Example: Convert Pivot Table to DataFrame

Suppose we have the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'],
                   'points': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

  team position  points
0    A        G      11
1    A        G       8
2    A        F      10
3    A        F       6
4    B        G       6
5    B        G       5
6    B        F       9
7    B        F      12

We can use the following code to create a pivot table that displays the mean points scored by team and position:

#create pivot table
df_pivot = pd.pivot_table(df, values='points', index='team', columns='position')

#view pivot table
df_pivot

position     F    G
team
A          8.0  9.5
B         10.5  5.5

We can then use the reset_index() function to convert this pivot table to a pandas DataFrame:

#convert pivot table to DataFrame
df2 = df_pivot.reset_index()

#view DataFrame
df2

  team     F    G
0    A   8.0  9.5
1    B  10.5  5.5

The result is a pandas DataFrame with two rows and three columns.

We can also use the following syntax to rename the columns of the DataFrame:

#rename the columns
df2.columns = ['team', 'Forward_Pts', 'Guard_Pts']

#view updated DataFrame
df2

  team  Forward_Pts  Guard_Pts
0    A          8.0        9.5
1    B         10.5        5.5
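One detail is easy to miss in the example above: right after reset_index(), the column axis still carries the name of the pivoted column (position). The sketch below, which reuses the same sample data, shows one way to check for it and clear it with rename_axis() without renaming every column by hand.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "position": ["G", "G", "F", "F", "G", "G", "F", "F"],
    "points": [11, 8, 10, 6, 6, 5, 9, 12],
})
df_pivot = pd.pivot_table(df, values="points", index="team", columns="position")

df2 = df_pivot.reset_index()
name_before = df2.columns.name       # still 'position' at this point

# Clear only the name of the column axis; the column labels stay unchanged.
df2 = df2.rename_axis(None, axis=1)
name_after = df2.columns.name        # None

print(name_before, name_after)
print(df2)
```

Assigning a plain list to df2.columns, as in the renaming step above, also discards this axis name as a side effect.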


How to get rid of multilevel index after using pivot table pandas in Python?

When working with pivot tables in pandas, it is common to end up with a MultiIndex on the columns. This can make it more difficult to work with the data, as certain operations and methods may not work as expected. In this case, we will discuss how to remove the multilevel index and return to a single level index on the columns.

Method 1: Reset Index

To get rid of a multilevel row index after using a pivot table in pandas, you can use the reset_index() method. This method moves the index levels into regular columns and resets the index of the DataFrame to the default integer index. Here's how you can use it:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two', 'one', 'two'],
                   'C': ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'y'],
                   'D': [1, 2, 3, 4, 5, 6, 7, 8],
                   'E': [2, 4, 6, 8, 10, 12, 14, 16]})

#aggfunc defaults to 'mean'
pivot_table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'B'], columns=['C'])

#move the index levels A and B into regular columns
pivot_table = pivot_table.reset_index()
print(pivot_table)

     A    B    D         E
C              x    y     x     y
0  bar  one  5.0  NaN  10.0   NaN
1  bar  two  NaN  6.0   NaN  12.0
2  foo  one  1.0  2.0   2.0   4.0
3  foo  two  5.0  6.0  10.0  12.0

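In this example, we first create a sample DataFrame df. We then create a pivot table using the pd.pivot_table() function. The resulting pivot table has a multilevel row index (A, B) and, because two value columns were requested, a two-level column header as well. The reset_index() method moves A and B into regular columns and restores the default integer index. The two-level column header is left in place and can be flattened with one of the methods below. Finally, we print the resulting pivot table.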
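To see exactly what reset_index() does and does not change in Method 1, a quick check of the number of index and column levels before and after the call can help. This is a small sketch on the same sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "foo", "foo", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "two", "one", "two"],
    "C": ["x", "y", "x", "y", "x", "y", "x", "y"],
    "D": [1, 2, 3, 4, 5, 6, 7, 8],
    "E": [2, 4, 6, 8, 10, 12, 14, 16],
})
pivot_table = pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns=["C"])

# Before: two index levels (A, B) and two column levels (value name, C).
levels_before = (pivot_table.index.nlevels, pivot_table.columns.nlevels)

flat = pivot_table.reset_index()

# After: a single integer index, but the columns still have two levels.
levels_after = (flat.index.nlevels, flat.columns.nlevels)

print(levels_before, levels_after)
print(flat.columns.tolist())
```

The former index columns show up as tuples such as ('A', '') because reset_index() pads the missing lower column level with an empty string.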

Method 2: Dropping Level

To get rid of a multilevel column index after using a pivot table in pandas, you can use the droplevel() method. This method allows you to drop one or more levels from a multi-level column index. Here's how to do it:

Step 1: Create a pivot table using pandas

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'two', 'one', 'two'],
                   'C': ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'y'],
                   'D': [1, 2, 3, 4, 5, 6, 7, 8]})

#values is passed as a list so that the columns become a two-level index (D, C);
#with a plain string the columns would already have a single level
pivot_table = df.pivot_table(values=['D'], index=['A', 'B'], columns=['C'], aggfunc='sum')

Step 2: Drop the first level of the column index using droplevel()

pivot_table.columns = pivot_table.columns.droplevel(0)

Step 3: Print the resulting pivot table

print(pivot_table)

C           x     y
A   B
bar one  10.0   NaN
    two   NaN  12.0
foo one   1.0   2.0
    two   5.0   6.0

In this example, we first created a pivot table using pandas. Then, we dropped the first level of the column index using the droplevel() method. Finally, we printed the resulting pivot table without the multilevel column index.

Note that you can use the droplevel() method to drop any level of a multi-level column index. Simply replace the 0 in the call with the position or name of the level you want to drop.
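As a rough illustration of choosing which level to drop, the sketch below compares dropping by position with dropping by name on the same sample data. It assumes values is passed as a list so that the columns really have two levels; the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar", "foo", "foo", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "two", "one", "two"],
    "C": ["x", "y", "x", "y", "x", "y", "x", "y"],
    "D": [1, 2, 3, 4, 5, 6, 7, 8],
})
pt = df.pivot_table(values=["D"], index=["A", "B"], columns=["C"], aggfunc="sum")

# The outer level (the value name) is unnamed, the inner level is named 'C'.
level_names = list(pt.columns.names)

by_position = pt.columns.droplevel(0)   # drops the outer level, keeps x / y
by_name = pt.columns.droplevel("C")     # drops the inner level, keeps D / D

print(level_names)
print(list(by_position))
print(list(by_name))
```

Dropping the inner level leaves duplicate labels here, which is usually why the outer level is the one removed when only a single value column was pivoted.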

Method 3: Using columns.levels and columns.codes

To get rid of a multilevel column index after using a pivot table in pandas, you can rebuild the column names from the columns.levels and columns.codes attributes of the MultiIndex. The codes attribute was called labels before pandas 0.24, and the old name was removed in pandas 1.0. Here is the complete code:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
                   'B': ['one', 'one', 'two', 'two', 'one', 'one'],
                   'C': ['x', 'y', 'x', 'y', 'x', 'y'],
                   'D': [1, 3, 2, 5, 4, 1]})

#values is passed as a list so that the columns form a two-level index (D, C)
pivot_table = df.pivot_table(index=['A', 'B'], columns='C', values=['D'])

#levels holds the unique labels of each level, codes holds their positions per column
levels = pivot_table.columns.levels
codes = pivot_table.columns.codes

#join the two levels into single names such as x_D and y_D
pivot_table.columns = levels[1][codes[1]] + '_' + levels[0][codes[0]]
pivot_table = pivot_table.rename_axis(None, axis=1)

#flatten the columns first, then move A and B out of the index
pivot_table = pivot_table.reset_index()

This will give you a DataFrame with a single-level column index: A, B, x_D and y_D.
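A different way to build the same kind of flattened names, shown here only as an alternative sketch and not as the method described above, is to iterate over the column tuples and join them. It assumes values is passed as a list so that each column label is a (value, C) pair.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar", "bar", "bar"],
    "B": ["one", "one", "two", "two", "one", "one"],
    "C": ["x", "y", "x", "y", "x", "y"],
    "D": [1, 3, 2, 5, 4, 1],
})
pt = df.pivot_table(index=["A", "B"], columns="C", values=["D"])

# Each column label is a tuple like ('D', 'x'); join it as '<C>_<value>'.
pt.columns = [f"{c}_{value}" for value, c in pt.columns.to_flat_index()]

# Flatten the columns first, then turn A and B into ordinary columns.
pt = pt.reset_index()
print(pt)
```

Flattening before reset_index() avoids having to special-case the ('A', '') and ('B', '') tuples that would otherwise appear among the column labels.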

Method 4: Using rename_axis() and reset_index()

To get rid of a multilevel index after using a pivot table in pandas, you can combine the rename_axis() and reset_index() methods. rename_axis(None, axis=1) removes the name of the column axis (C here), and reset_index() turns the index levels A and B into regular columns. Here is the full code:

import pandas as pd

df = pd.read_csv('data.csv')
table = pd.pivot_table(df, values='value', index=['A', 'B'], columns=['C'], aggfunc='sum')

#remove the column-axis name; the index names are kept so that reset_index() can use them as column headers
table = table.rename_axis(None, axis=1)
table = table.reset_index()
print(table)

   A  B  X  Y
0  a  c  1  2
1  b  c  3  4
2  b  d  5  6
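Since data.csv is not included with the article, the following sketch uses a small in-memory DataFrame as a stand-in. The rows are an assumption, chosen only so that the result matches the table printed above.

```python
import pandas as pd

# Hypothetical stand-in for data.csv (the real file's contents are unknown).
df = pd.DataFrame({
    "A": ["a", "a", "b", "b", "b", "b"],
    "B": ["c", "c", "c", "c", "d", "d"],
    "C": ["X", "Y", "X", "Y", "X", "Y"],
    "value": [1, 2, 3, 4, 5, 6],
})

table = pd.pivot_table(df, values="value", index=["A", "B"], columns=["C"], aggfunc="sum")

# Drop the column-axis name 'C', then move A and B out of the index.
table = table.rename_axis(None, axis=1).reset_index()
print(table)
```

The index names are left in place before reset_index() so that the new columns are labeled A and B.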


How to turn a Pandas pivot table to a dataframe (with Example)?

EasyTweaks.com

In this tutorial we’ll explore a simple recipe that you can use to reshape the structure of a Pandas pivot table into a simple tabular looking DataFrame. Remember, the pandas pivot table is already a DataFrame – it is just arranged differently so there is no need to convert it, just to reshape its structure as needed using some simple Python code.

Creating the example dataset

We’ll get started by creating a random dataset that you can use to follow along this example. First off, we’ll import the Pandas library and then initialize our DataFrame.

import pandas as pd

language = ['R', 'C#', 'Python', 'R', 'Python', 'Kotlin', 'R', 'R']
office = ['BAR', 'LON', 'PAR', 'LON', 'LON', 'BAR', 'LON', 'BAR']
salary = [111.0, 120.0, 125.0, 120.0, 89.0, 126.0, 89.0, 118.0]
interviews = dict(office=office, language=language, salary=salary)

# construct the DataFrame
df = pd.DataFrame(data=interviews)

# pivot the data (the parameter is fill_value, not fill_values)
pvt_tab = df.pivot_table(values='salary', index='office',
                         columns='language',
                         aggfunc='sum',
                         fill_value='')
print(pvt_tab)

Pivot table to tabular DataFrame structure

We'll start by resetting the Pandas pivot table index:

pvt_tab.reset_index()

This results in a DataFrame in which office is a regular column and a default integer index appears on the left-hand side.

And then use the melt DataFrame method to convert the pivot from its originally wide form into a longer form DataFrame:

pvt_tab.reset_index().melt(id_vars = ['office']) 

Note that we use the id_vars parameter to specify the office column as the identifier variable. The result has one row for each combination of office and language.

Important note: Remember to reset the index of the pivot table before converting it to long form with melt. Failing to do that will result in the following KeyError:

KeyError: "The following 'id_vars' are not present in the DataFrame: ['your_field']"
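The reshaping steps above can be sketched end to end as follows. This is an illustrative version that leaves out fill_value so the salary column stays numeric, and that passes value_name (a standard melt parameter not used in the original) to label the melted values.

```python
import pandas as pd

language = ["R", "C#", "Python", "R", "Python", "Kotlin", "R", "R"]
office = ["BAR", "LON", "PAR", "LON", "LON", "BAR", "LON", "BAR"]
salary = [111.0, 120.0, 125.0, 120.0, 89.0, 126.0, 89.0, 118.0]
df = pd.DataFrame(dict(office=office, language=language, salary=salary))

pvt_tab = df.pivot_table(values="salary", index="office",
                         columns="language", aggfunc="sum")

# Reset the index first so that 'office' is a real column, then melt to long form.
long_df = pvt_tab.reset_index().melt(id_vars=["office"], value_name="salary")
print(long_df)

# Melting without resetting the index fails, because 'office' is still the index.
try:
    pvt_tab.melt(id_vars=["office"])
    melt_failed = False
except KeyError as err:
    melt_failed = True
    print("KeyError:", err)
```

The variable column is named language automatically because melt picks up the name of the column axis left over from the pivot.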
