- How to rename column in Pandas
- Step 1: Rename all column names in Pandas DataFrame
- Step 2: Rename specific column names in Pandas
- Step 3: Rename column names in Pandas with lambda
- Step 4: Rename column names in Pandas with str methods
- Step 5: Rename multi-level column names in DataFrame
- Resources
- pandas.DataFrame.rename
- How To Change Column Names and Row Indexes in Pandas?
- 1. How to Rename Columns in Pandas?
- 2. Pandas rename function to Rename Columns
- How To Change Row Names/Indexes in Pandas?
- How To Change Column Names and Row Indexes Simultaneously in Pandas?
How to rename column in Pandas
In this short guide, I’ll show you how to rename columns in a Pandas DataFrame.
(1) rename a single column
df.rename(columns={'A': 'First'}, inplace=True)
(2) rename multiple columns
column_map = {'A': 'First', 'B': 'Second'}
df = df.rename(columns=column_map)
(3) rename multi-index columns
cols = pd.MultiIndex.from_tuples([(0, 1), (0, 2)])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)
df.columns = pd.MultiIndex.from_tuples([('A', 'B'), ('A', 'C')])
(4) rename all columns
df.columns = ['First', 'Second', '3rd', '4th', '5th']
In the next sections, I’ll review the steps to apply the above syntax in practice and a few exceptional cases.
Let’s say that you have the following DataFrame with random numbers generated by:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,10,size=(5, 5)), columns=list('ABCDF'))
|   | A | B | C | D | F |
|---|---|---|---|---|---|
| 0 | 4 | 8 | 9 | 0 | 1 |
| 1 | 5 | 8 | 6 | 1 | 0 |
| 2 | 7 | 9 | 8 | 1 | 1 |
| 3 | 6 | 8 | 3 | 8 | 9 |
| 4 | 6 | 0 | 2 | 8 | 8 |
If you would like to learn more about creating a DataFrame with random numbers, see: How to Create a Pandas DataFrame of Random Integers
Step 1: Rename all column names in Pandas DataFrame
Column names in a Pandas DataFrame can be accessed through the .columns attribute (df.columns):
Index(['A', 'B', 'C', 'D', 'F'], dtype='object')
The same attribute can be used to rename all columns in Pandas.
df.columns = ['First', 'Second', '3rd', '4th', '5th']
Index(['First', 'Second', '3rd', '4th', '5th'], dtype='object')
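Assigning to .columns replaces every label at once, so the list must contain exactly one name per existing column. The following is a minimal sketch, assuming a five-column DataFrame like the one above; the variable names new_names and message are illustrative only.

```python
import numpy as np
import pandas as pd

# Same shape as the example DataFrame: 5 rows, 5 columns named A, B, C, D, F
df = pd.DataFrame(np.random.randint(0, 10, size=(5, 5)), columns=list('ABCDF'))

# A list with the right number of names replaces every column label
new_names = ['First', 'Second', '3rd', '4th', '5th']
df.columns = new_names

# A list with the wrong number of names is rejected with a ValueError
try:
    df.columns = ['First', 'Second']
    message = None
except ValueError as exc:
    message = str(exc)

print(list(df.columns))
print(message)
```

The failed assignment leaves the previous column names in place.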
Step 2: Rename specific column names in Pandas
To rename specific columns in Pandas, use the .rename method. Let’s work with the first DataFrame, whose columns are named A, B, etc.
To rename the two columns A and B to First and Second, we can use the following code:
column_map = {'A': 'First', 'B': 'Second'}
df = df.rename(columns=column_map)
Index(['First', 'Second', 'C', 'D', 'F'], dtype='object')
Note: If any of the column names in the mapping are missing from the DataFrame, they are skipped without any error or warning, because the errors parameter defaults to 'ignore'.
Note 2: Instead of the syntax df = df.rename(columns=column_map) you can use df.rename(columns=column_map, inplace=True), which modifies df directly and returns None.
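The two notes above can be checked with a short sketch. It assumes a small three-column DataFrame made up for illustration; the names renamed, raised and result are not from the original article.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=['A', 'B', 'C'])

# 'Z' is not a column: with the default errors='ignore' it is silently skipped
renamed = df.rename(columns={'A': 'First', 'Z': 'Last'})
print(list(renamed.columns))  # ['First', 'B', 'C']

# errors='raise' turns the missing label into a KeyError
try:
    df.rename(columns={'A': 'First', 'Z': 'Last'}, errors='raise')
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# inplace=True modifies df itself and returns None
result = df.rename(columns={'A': 'First'}, inplace=True)
print(result, list(df.columns))  # None ['First', 'B', 'C']
```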
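The columns argument also accepts a function, for example to strip leading whitespace from every column name:
df.rename(columns=lambda x: x.lstrip())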
Step 3: Rename column names in Pandas with lambda
Sometimes you may want to replace a character or apply some other function to every column name. In this example we will change all column names from uppercase to lowercase:
df = df.rename(columns=lambda x: x.lower())
Index(['a', 'b', 'c', 'd', 'f'], dtype='object')
Because the mapper can be any callable, this approach is suitable for more complex transformations and logic.
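When the logic outgrows a one-line lambda, a named function can be passed in the same way. The sketch below is one possible approach, not part of the original guide: to_snake_case is a hypothetical helper and the column names are made up for illustration.

```python
import re
import pandas as pd

def to_snake_case(name):
    """Hypothetical helper: turn 'Total Sales' or 'unitPrice' into snake_case."""
    name = name.strip()
    # insert an underscore between a lowercase letter/digit and an uppercase letter
    name = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', name)
    # collapse runs of whitespace and hyphens into a single underscore
    name = re.sub(r'[\s\-]+', '_', name)
    return name.lower()

df = pd.DataFrame([[1, 2, 3]], columns=['Total Sales', 'unitPrice', ' Order-ID '])

# rename calls the function once per column label
df = df.rename(columns=to_snake_case)
print(list(df.columns))  # ['total_sales', 'unit_price', 'order_id']
```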
Step 4: Rename column names in Pandas with str methods
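You can apply str methods to Pandas columns. For example, we can add a prefix to each column name with a regex. Recent pandas versions treat the pattern as a literal string unless regex=True is passed, and the pattern is anchored so that it matches each name exactly once:
df.columns = df.columns.str.replace(r'^(.+)$', r'Column \1', regex=True)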
Working with the original DataFrame will give us:
Index(['Column A', 'Column B', 'Column C', 'Column D', 'Column F'], dtype='object')
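The .str methods can also be chained to clean up messy headers in one pass. This is a minimal sketch with made-up column names; it also shows DataFrame.add_prefix, which covers the simple prefix case without a regex.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=[' First Name', 'Last Name ', 'Zip Code'])

# .str methods are vectorized over the whole Index and can be chained
df.columns = (
    df.columns
    .str.strip()                         # drop surrounding whitespace
    .str.lower()                         # normalize case
    .str.replace(' ', '_', regex=False)  # literal (non-regex) replacement
)
print(list(df.columns))  # ['first_name', 'last_name', 'zip_code']

# add_prefix returns a new DataFrame with the prefix applied to every column
prefixed = df.add_prefix('col_')
print(list(prefixed.columns))  # ['col_first_name', 'col_last_name', 'col_zip_code']
```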
Step 5: Rename multi-level column names in DataFrame
Finally, let’s check how to rename columns when the DataFrame has a MultiIndex. Let’s create a DataFrame like this:
import pandas as pd
cols = pd.MultiIndex.from_tuples([(0, 1), (0, 2)])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)
If we check the column names with df.columns, we get a MultiIndex containing the tuples (0, 1) and (0, 2).
The MultiIndex columns can be renamed with:
df.columns = pd.MultiIndex.from_tuples([('A', 'B'), ('A', 'C')])
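As an alternative to rebuilding the whole MultiIndex, rename accepts a level argument that restricts the mapping to one level. A minimal sketch, reusing the two-level columns from this step:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([(0, 1), (0, 2)])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Rename only labels in the second level (level=1); the first level is untouched
df = df.rename(columns={1: 'B', 2: 'C'}, level=1)

# Then rename the label in the first level (level=0)
df = df.rename(columns={0: 'A'}, level=0)

print(df.columns.tolist())  # [('A', 'B'), ('A', 'C')]
```

Restricting by level matters when the same label appears in more than one level and only one of them should change.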
Resources
pandas.DataFrame.rename
Function / dict values must be unique (1-to-1). Labels not contained in a dict / Series will be left as-is. Extra labels listed don’t throw an error.
Parameters
mapper dict-like or function
Dict-like or function transformations to apply to that axis’ values. Use either mapper and axis to specify the axis to target with mapper, or index and columns.
index dict-like or function
Alternative to specifying axis (mapper, axis=0 is equivalent to index=mapper).
columns dict-like or function
Alternative to specifying axis (mapper, axis=1 is equivalent to columns=mapper).
axis {0 or ‘index’, 1 or ‘columns’}, default 0
Axis to target with mapper. Can be either the axis name (‘index’, ‘columns’) or number (0, 1). The default is ‘index’.
copy bool, default True
Also copy underlying data.
inplace bool, default False
Whether to modify the DataFrame rather than creating a new one. If True then value of copy is ignored.
level int or level name, default None
In case of a MultiIndex, only rename labels in the specified level.
errors {‘ignore’, ‘raise’}, default ‘ignore’
If ‘raise’, raise a KeyError when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed. If ‘ignore’, existing keys will be renamed and extra keys will be ignored.
Returns
DataFrame or None
DataFrame with the renamed axis labels or None if inplace=True.
Raises
KeyError
If any of the labels is not found in the selected axis and errors=‘raise’.
DataFrame.rename supports two calling conventions:
- (index=index_mapper, columns=columns_mapper, ...)
- (mapper, axis={'index', 'columns'}, ...)
We highly recommend using keyword arguments to clarify your intent.
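To see why keyword arguments are recommended, the two conventions can be compared side by side. This is a small sketch on the same two-column DataFrame used in the examples; the variable names by_keyword, by_axis and unchanged are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Convention 1: keyword arguments name each axis explicitly
by_keyword = df.rename(columns={"A": "a"}, index={0: "x"})

# Convention 2: a mapper plus axis, which targets one axis per call
by_axis = df.rename({"A": "a"}, axis="columns").rename({0: "x"}, axis="index")

print(by_keyword.equals(by_axis))  # True

# Pitfall: a bare mapper with no axis targets the index, so no column is renamed
unchanged = df.rename({"A": "a"})
print(list(unchanged.columns))  # ['A', 'B']
```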
Rename columns using a mapping:
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
>>> df.rename(columns={"A": "a", "B": "c"})
   a  c
0  1  4
1  2  5
2  3  6
Rename index using a mapping:
>>> df.rename(index={0: "x", 1: "y", 2: "z"})
   A  B
x  1  4
y  2  5
z  3  6
Cast index labels to a different type:
>>> df.index
RangeIndex(start=0, stop=3, step=1)
>>> df.rename(index=str).index
Index(['0', '1', '2'], dtype='object')
>>> df.rename(columns={"A": "a", "B": "b", "C": "c"}, errors="raise")
Traceback (most recent call last):
KeyError: ['C'] not found in axis
Using axis-style parameters:
>>> df.rename(str.lower, axis='columns')
   a  b
0  1  4
1  2  5
2  3  6
>>> df.rename({1: 2, 2: 4}, axis='index')
   A  B
0  1  4
2  2  5
4  3  6
How To Change Column Names and Row Indexes in Pandas?
One of the most common operations when cleaning data or doing exploratory data analysis is fixing column names or row names. This post covers:
- How to rename columns of pandas dataframe?
- How to change row names or row indexes of a pandas dataframe?
# import pandas
>import pandas as pd
Let us use the gapminder data from the Software Carpentry website.
# link to gapminder data
>data_url = 'http://bit.ly/2cLzoxH'
# read data from url as pandas dataframe
>gapminder = pd.read_csv(data_url)
Let us check the column names and the first three rows of the dataframe using the head function.
>print(gapminder.head(3))
       country  year       pop continent  lifeExp   gdpPercap
0  Afghanistan  1952   8425333      Asia   28.801  779.445314
1  Afghanistan  1957   9240934      Asia   30.332  820.853030
2  Afghanistan  1962  10267083      Asia   31.997  853.100710
We can also use the columns attribute to get the column names.
>gapminder.columns
Index(['country', 'year', 'pop', 'continent', 'lifeExp', 'gdpPercap'], dtype='object')
1. How to Rename Columns in Pandas?
One can change the column names of a pandas dataframe in at least two ways. One way to rename columns in Pandas is to use df.columns from Pandas and assign new names directly.
For example, if you have the names of columns in a list, you can assign the list to column names directly.
To change the columns of gapminder dataframe, we can assign the list of new column names to gapminder.columns as
>gapminder.columns = ['country','year','population', 'continent','life_exp','gdp_per_cap']
This will assign the names in the list as column names for the data frame “gapminder”. We can check that the dataframe has the new column names using the head() function.
>gapminder.head(3)
       country  year  population continent  life_exp  gdp_per_cap
0  Afghanistan  1952     8425333      Asia    28.801   779.445314
1  Afghanistan  1957     9240934      Asia    30.332   820.853030
2  Afghanistan  1962    10267083      Asia    31.997   853.100710
A problem with this approach is that one has to supply names for all the columns in the data frame, which makes it cumbersome when we want to change the name of just one column.
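With direct assignment, changing a single name still means supplying the complete list. The sketch below is an illustration rather than the author's method: it uses a small stand-in DataFrame instead of the downloaded gapminder data, and the variable new_columns is illustrative.

```python
import pandas as pd

# Small stand-in with three of the gapminder columns (a single row, for illustration)
df = pd.DataFrame([['Afghanistan', 1952, 8425333]],
                  columns=['country', 'year', 'pop'])

# Copy the current names, change one entry, and assign the full list back
new_columns = list(df.columns)
new_columns[new_columns.index('pop')] = 'population'
df.columns = new_columns

print(list(df.columns))  # ['country', 'year', 'population']
```

This works, but it takes three steps and depends on looking up the position of the name.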
2. Pandas rename function to Rename Columns
Another way to change column names in pandas is to use the rename function. It is usually the better choice, because one can change the names of specific columns without having to list them all.
To change column names using the rename function in Pandas, one needs to specify a mapper: a dictionary with the old names as keys and the new names as values. Here is an example that changes several column names using a dictionary. We will also use inplace=True to change column names in place.
>gapminder.rename(columns={'pop':'population', 'lifeExp':'life_exp', 'gdpPercap':'gdp_per_cap'}, inplace=True)
>print(gapminder.columns)
Index(['country', 'year', 'population', 'continent', 'life_exp', 'gdp_per_cap'], dtype='object')
>gapminder.head(3)
       country  year  population continent  life_exp  gdp_per_cap
0  Afghanistan  1952     8425333      Asia    28.801   779.445314
1  Afghanistan  1957     9240934      Asia    30.332   820.853030
2  Afghanistan  1962    10267083      Asia    31.997   853.100710
One of the biggest advantages of the rename function is that we can use it to change as many or as few column names as we want.
Let us change the name of a single column.
>gapminder.rename(columns={'pop':'population'}, inplace=True)
>print(gapminder.columns)
Index(['country', 'year', 'population', 'continent', 'lifeExp', 'gdpPercap'], dtype='object')
>gapminder.head(3)
       country  year  population continent  lifeExp   gdpPercap
0  Afghanistan  1952     8425333      Asia   28.801  779.445314
1  Afghanistan  1957     9240934      Asia   30.332  820.853030
2  Afghanistan  1962    10267083      Asia   31.997  853.100710
The Pandas rename function can also take a function as input instead of a dictionary. For example, we can write a lambda function that takes each current column name and keeps only its first four characters as the new column name.
>gapminder.rename(columns=lambda x: x[0:4], inplace=True)
>gapminder.head(3)
          coun  year       pop  cont    life        gdpP
0  Afghanistan  1952   8425333  Asia  28.801  779.445314
1  Afghanistan  1957   9240934  Asia  30.332  820.853030
2  Afghanistan  1962  10267083  Asia  31.997  853.100710
How To Change Row Names/Indexes in Pandas?
Another good thing about the pandas rename function is that we can also use it to change row indexes or row names.
We just need to use the index argument to specify that we want to change the index, not the columns.
For example, to change row names 0 and 1 to ‘zero’ and ‘one’ in our gapminder dataframe, we will construct a dictionary with the old row index names as keys and the new row index names as values.
>gapminder.rename(index={0:'zero', 1:'one'}, inplace=True)
>print(gapminder.head(4))
          country  year       pop continent  lifeExp   gdpPercap
zero  Afghanistan  1952   8425333      Asia   28.801  779.445314
one   Afghanistan  1957   9240934      Asia   30.332  820.853030
2     Afghanistan  1962  10267083      Asia   31.997  853.100710
3     Afghanistan  1967  11537966      Asia   34.020  836.197138
We can see that just first two rows have new names as we intended.
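The index argument accepts a function as well as a dictionary. The sketch below uses a small made-up DataFrame rather than the downloaded gapminder data, so the values and the names partial and labelled are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'country': ['Afghanistan', 'Albania', 'Algeria'],
                   'year': [1952, 1952, 1952]})

# A dict renames only the listed row labels
partial = df.rename(index={0: 'zero', 1: 'one'})
print(list(partial.index))  # ['zero', 'one', 2]

# A function is applied to every row label
labelled = df.rename(index=lambda i: f'row_{i}')
print(list(labelled.index))  # ['row_0', 'row_1', 'row_2']
```

Neither call uses inplace=True, so df keeps its original 0, 1, 2 index.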
How To Change Column Names and Row Indexes Simultaneously in Pandas?
With pandas’ rename function, one can also change column names and row names simultaneously by passing both the columns and index arguments with the corresponding mapper dictionaries.
Let us change the column name “lifeExp” to “life_exp” and also the row indices 0 and 1 to “zero” and “one”.
>gapminder.rename(columns={'lifeExp':'life_exp'}, index={0:'zero', 1:'one'}, inplace=True)
>print(gapminder.head(4))
          country  year       pop continent  life_exp   gdpPercap
zero  Afghanistan  1952   8425333      Asia    28.801  779.445314
one   Afghanistan  1957   9240934      Asia    30.332  820.853030
2     Afghanistan  1962  10267083      Asia    31.997  853.100710
3     Afghanistan  1967  11537966      Asia    34.020  836.197138