How to Get Column Names of Pandas DataFrame?
To get the column names of a DataFrame, use the DataFrame.columns property.
The syntax to use the columns property of a DataFrame is
DataFrame.columns
The columns property returns an object of type Index. You can access individual names by position or with any looping technique in Python.
Examples
1. Print DataFrame column names
In this example, we get the dataframe column names and print them.
Python Program
import pandas as pd

# Initialize a DataFrame
df = pd.DataFrame(
    [['Amol', 72, 67, 91],
     ['Lini', 78, 69, 87],
     ['Kiku', 74, 56, 88],
     ['Ajit', 54, 76, 78]],
    columns=['name', 'physics', 'chemistry', 'algebra'])

# Get the DataFrame column names
cols = df.columns

# Print the column names
print(cols)
Index(['name', 'physics', 'chemistry', 'algebra'], dtype='object')
2. Access individual column names using index
You can access individual column names using the index.
Python Program
import pandas as pd

# Initialize a DataFrame
df = pd.DataFrame(
    [['Amol', 72, 67, 91],
     ['Lini', 78, 69, 87],
     ['Kiku', 74, 56, 88],
     ['Ajit', 54, 76, 78]],
    columns=['name', 'physics', 'chemistry', 'algebra'])

# Get the DataFrame column names
cols = df.columns

# Print the column names
for i in range(len(cols)):
    print(cols[i])
name
physics
chemistry
algebra
3. Print column names using For loop
You can use a For loop to iterate over the column names of DataFrame.
Python Program
import pandas as pd

# Initialize a dataframe
df = pd.DataFrame(
    [['Amol', 72, 67, 91],
     ['Lini', 78, 69, 87],
     ['Kiku', 74, 56, 88],
     ['Ajit', 54, 76, 78]],
    columns=['name', 'physics', 'chemistry', 'algebra'])

# Get the DataFrame column names
cols = df.columns

# Print the column names using For loop
for column in cols:
    print(column)
name
physics
chemistry
algebra
Summary
In this Pandas Tutorial, we extracted the column names from a DataFrame using the DataFrame.columns property.
Pandas Get Column Names from DataFrame
How do you get or print pandas DataFrame column names? You can get them as an array using the DataFrame.columns.values property, and convert that array to a list with tolist(). Each column in a pandas DataFrame has a label/name that identifies the values it holds. Getting the column names is useful when you want to access all columns by name programmatically or manipulate the values of all columns. In this article, I will explain different ways to get column names from pandas DataFrame headers with examples.
To get a list of columns from the DataFrame header, use DataFrame.columns.values.tolist(). Below is an explanation of each part of the statement.
- .columns returns an Index object with column names. This preserves the order of column names.
- .columns.values returns an array, which has a .tolist() method that returns a list of column names.
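As a rough illustration of the two steps above, the following sketch walks through each stage of the statement. The small `df` is a made-up stand-in for illustration, not the article's dataset.

```python
import pandas as pd

# Hypothetical two-column frame used only for illustration
df = pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [22000, 26000]})

index_obj = df.columns                  # pandas Index, keeps the column order
array_obj = df.columns.values           # array holding the same labels
list_obj = df.columns.values.tolist()   # plain Python list

print(type(index_obj).__name__)
print(list_obj)
```

The list is usually the most convenient form when the names are passed to other Python code, while the Index is what pandas itself works with.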
1. Quick Examples of Get Column Names
Following are some quick examples of how to get column names from a pandas DataFrame. If you want to print a result to the console, just use the print() statement.
# Below are some quick examples

# Get the list of all column names from headers
column_names = list(df.columns.values)

# Get the list of all column names from headers
column_names = df.columns.values.tolist()

# Using list(df.columns) to get the column headers as a list
column_names = list(df.columns)

# Using list(df) to get the list of all Column Names
column_names = list(df)

# Dataframe show all columns sorted list
column_names = sorted(df)

# Iterate over all Column Header Labels
for column_headers in df.columns:
    print(column_headers)

# Get column headers using the keys() method
column_names = df.keys().values.tolist()

# Get all numeric columns
numeric_columns = df._get_numeric_data().columns.values.tolist()

# Simple Pandas Numeric Columns Code (int64 columns only)
numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
Create a pandas DataFrame from a dict with a few rows and with the column names Courses, Fee, Duration and Discount.
import pandas as pd
import numpy as np

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)
2. pandas Get Column Names
You can get the column names from a pandas DataFrame using df.columns.values and pass the result to the Python list() function to get a list. Once you have the list, you can print it using print(). Here is what happens in this statement: the df.columns attribute returns an Index object, which is the basic object that stores axis labels. The Index object provides a property, Index.values, that returns its data as an array; in our case, it returns the column names as an array.
Note that df.columns preserves the order of the columns as-is.
To convert an array of column names into a list, use either .tolist() on the array object or list(array object).
# Get the list of all column names from headers
column_headers = list(df.columns.values)
print("The Column Header :", column_headers)
# Output: The Column Header : ['Courses', 'Fee', 'Duration', 'Discount']
You can also use df.columns.values.tolist() to get the DataFrame column names.
# Get the list of all column names from headers
column_headers = df.columns.values.tolist()
print("The Column Header :", column_headers)
3. Use list(df) to Get Column Names from DataFrame
Use list(df) to get the column header from pandas DataFrame. You can also use list(df.columns) to get column names.
# Using list(df.columns) to get the column headers as a list
column_headers = list(df.columns)

# Using list(df) to get the list of all Column Names
column_headers = list(df)
4. Get Column Names in Sorted Order
To get a list of column names in sorted order, use the sorted(df) function. This function returns the column names in alphabetical order.
# Dataframe show all columns sorted list
col_headers = sorted(df)
print(col_headers)
This yields the output below. Notice that the order differs from the earlier examples.
# Output: ['Courses', 'Discount', 'Duration', 'Fee']
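One caveat worth keeping in mind: `sorted()` compares strings by code point, so uppercase labels sort before lowercase ones. The sketch below uses a hypothetical frame with mixed-case labels to show the difference, along with one possible case-insensitive variant using `key=str.lower`.

```python
import pandas as pd

# Hypothetical frame with mixed-case column labels
df = pd.DataFrame({"Fee": [22000], "courses": ["Spark"], "Duration": ["30days"]})

default_order = sorted(df)                  # uppercase labels come first
caseless_order = sorted(df, key=str.lower)  # ignores case when comparing

print(default_order)   # ['Duration', 'Fee', 'courses']
print(caseless_order)  # ['courses', 'Duration', 'Fee']
```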
5. Access All Column Names by Iterating
Sometimes you may need to iterate over all columns and apply some function to each one. You can do this as shown below.
# Iterate over all Column Header Labels
for column_headers in df.columns:
    print(column_headers)
# Output:
# Courses
# Fee
# Duration
# Discount
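As a sketch of applying a function per column while iterating, the example below counts missing values in each column. The frame is a shortened, hypothetical version of the one used in this article, and `missing_per_column` is a name introduced only for this illustration.

```python
import pandas as pd

# Shortened, hypothetical version of the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Duration": ["30days", None, "50days"],
    "Discount": [1000, 2300, 1000],
})

# Look up each column by name and count its missing values
missing_per_column = {}
for column_name in df.columns:
    missing_per_column[column_name] = int(df[column_name].isna().sum())

print(missing_per_column)  # {'Courses': 0, 'Duration': 1, 'Discount': 0}
```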
6. Get Column Headers Using the keys() Method
df.keys() is another approach to get all column names from a pandas DataFrame. It returns the same Index object as df.columns, so use .values.tolist() to convert the result to a list.
# Get column header using keys() method
column_headers = df.keys().values.tolist()
print("The Column Header :", column_headers)
# Output: The Column Header : ['Courses', 'Fee', 'Duration', 'Discount']
7. Get All Numeric Column Names
Sometimes while working on analytics, you may need to work only on numeric columns, so you need all columns of a specific data type. For example, you can get all columns of numeric data type using the undocumented function df._get_numeric_data(). The leading underscore marks it as private, so it may change between pandas versions.
# Get all numeric columns
numeric_columns = df._get_numeric_data().columns.values.tolist()
print(numeric_columns)
Another simple way to find numeric columns is to filter df.dtypes, for example df.dtypes[df.dtypes == "int64"].index. Note that this matches only int64 columns, not float columns.
# Simple Pandas Numeric Columns Code
numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
This yields the same output as above, because both numeric columns in this DataFrame (Fee and Discount) are int64.
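A documented alternative is `DataFrame.select_dtypes`, which is part of the public pandas API. The sketch below is one possible way to use it; the `Rating` column is a hypothetical addition, included only to show that float columns are picked up as well.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000, 25000],
    "Rating": [4.5, 3.9],  # hypothetical float column, not in the article's data
})

# All numeric columns (integers and floats)
numeric_columns = df.select_dtypes(include="number").columns.tolist()

# The dtype filter from above only matches int64
integer_only = df.dtypes[df.dtypes == "int64"].index.tolist()

print(numeric_columns)  # ['Fee', 'Rating']
print(integer_only)     # ['Fee']
```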
9. Complete Example of pandas Get Columns Names
import pandas as pd
import numpy as np

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)

# Get the list of all column names from headers
column_headers = list(df.columns.values)
print("The Column Header :", column_headers)

# Get the list of all column names from headers
column_headers = df.columns.values.tolist()
print("The Column Header :", column_headers)

# Using list(df.columns) to get the column headers as a list
column_headers = list(df.columns)

# Using list(df) to get the list of all Column Names
column_headers = list(df)

# Dataframe show all columns sorted list
col_headers = sorted(df)
print(col_headers)

# Iterate over all Column Header Labels
for column_headers in df.columns:
    print(column_headers)

# Get column header using keys() method
column_headers = df.keys().values.tolist()
print("The Column Header :", column_headers)

# Get all numeric columns
numeric_columns = df._get_numeric_data().columns.values.tolist()
print(numeric_columns)

# Simple Pandas Numeric Columns Code (int64 columns only)
numeric_columns = df.dtypes[df.dtypes == "int64"].index.values.tolist()
print(numeric_columns)
Conclusion
In this article, you have learned how to get or print the column names using df.columns, list(df), and df.keys(), how to get all column names of integer type, and how to get column names in sorted order.
How to Get Column Names in Pandas Dataframe – Definitive Guide
Pandas dataframe is a two-dimensional data structure used to store data in rows and columns format. Each column will have headers/names.
You can get column names in a pandas dataframe using the df.columns attribute.
In this tutorial, you’ll learn the different methods available to get column names from the pandas dataframe.
If You’re in a Hurry
You can use the code snippet below to get column names from a pandas dataframe.
df.columns
You’ll see all the column names from the dataframe printed as an Index.
An Index is an immutable sequence used for indexing.
Index(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity', 'Available_Since_Date'], dtype='object')
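Because an Index is immutable, assigning to one of its positions fails. The sketch below illustrates this with a cut-down, hypothetical version of the sample dataframe, and shows `DataFrame.rename` as one way to change a label instead.

```python
import pandas as pd

# Cut-down, hypothetical version of the sample dataframe
df = pd.DataFrame({"product_name": ["Keyboard"], "Unit_Price": [500.0]})

try:
    df.columns[0] = "name"  # item assignment on an Index is not allowed
    mutated = True
except TypeError:
    mutated = False

# Renaming goes through the dataframe and returns a new object
renamed = df.rename(columns={"product_name": "name"})

print(mutated)                # False
print(list(renamed.columns))  # ['name', 'Unit_Price']
```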
If You Want to Understand Details, Read on…
In this tutorial, you’ll learn the different methods available to get the pandas dataframe column headers for various purposes.
Sample Dataframe
This is the sample dataframe used throughout the tutorial.
import pandas as pd

data = {"product_name": ["Keyboard", "Mouse", "Monitor", "CPU", "Speakers", pd.NaT],
        "Unit_Price": [500, 200, 5000, 10000, 250.50, 350],
        "No_Of_Units": [5, 5, 10, 20, 8, pd.NaT],
        "Available_Quantity": [5, 6, 10, "Not Available", pd.NaT, pd.NaT],
        "Available_Since_Date": ['11/5/2021', '4/23/2021', '08/21/2021', '09/18/2021', '01/05/2021', pd.NaT]
        }

df = pd.DataFrame(data)

# Converting one column as float to demonstrate dtypes
# (Unit_Price is assumed to be the intended column; it is the only
# float64 column in the dtypes listing later in this tutorial)
df = df.astype({"Unit_Price": float})

df
Dataframe Looks Like
| | product_name | Unit_Price | No_Of_Units | Available_Quantity | Available_Since_Date |
|---|---|---|---|---|---|
| 0 | Keyboard | 500.0 | 5 | 5 | 11/5/2021 |
| 1 | Mouse | 200.0 | 5 | 6 | 4/23/2021 |
| 2 | Monitor | 5000.0 | 10 | 10 | 08/21/2021 |
| 3 | CPU | 10000.0 | 20 | Not Available | 09/18/2021 |
| 4 | Speakers | 250.5 | 8 | NaT | 01/05/2021 |
| 5 | NaT | 350.0 | NaT | NaT | NaT |
Now, let’s see how to get the column headers.
Pandas Get Column Names
In this section, you’ll see how to get column names using different methods.
Using Columns
The columns attribute of the dataframe returns the column labels of the dataframe.
df.columns
Index(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity', 'Available_Since_Date'], dtype='object')
Get Column Names as Array
You can get the column names as an array using the .columns.values property of the dataframe.
df.columns.values
You’ll see the column headers returned as an array.
array(['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity', 'Available_Since_Date'], dtype=object)
Pandas Get List From Dataframe Columns Headers
You can get column names as a list by using the .columns.values property and converting it to a list using the tolist() method, as shown below.
df.columns.values.tolist()
You’ll see the column headers returned as a list.
['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity', 'Available_Since_Date']
Another way to get column headers as a list is by using the built-in list() function.
You can pass the dataframe object to list(). It’ll return the column headers as a list.
columns_list = list(df)
columns_list
You’ll see the column headers displayed as a list.
['product_name', 'Unit_Price', 'No_Of_Units', 'Available_Quantity', 'Available_Since_Date']
This is how you can get pandas column names as a list.
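As a quick check that these approaches agree, the sketch below compares them on a cut-down, hypothetical version of the sample dataframe. It also includes `df.columns.tolist()`, which calls `tolist()` on the Index directly.

```python
import pandas as pd

# Cut-down, hypothetical version of the sample dataframe
df = pd.DataFrame({
    "product_name": ["Keyboard"],
    "Unit_Price": [500.0],
    "No_Of_Units": [5],
})

via_values = df.columns.values.tolist()  # array, then list
via_list = list(df)                      # iterating a dataframe yields its column labels
via_index = df.columns.tolist()          # list straight from the Index

print(via_values == via_list == via_index)  # True
```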
Pandas List Column Names and Types
In this section, you’ll learn how to list column names and types of each column of the dataframe.
You can do this by using the dtypes attribute. It returns a series with the data type of each column in the dataframe.
df.dtypes
You’ll see the column name and the data type of each column printed as a series.
product_name             object
Unit_Price              float64
No_Of_Units              object
Available_Quantity       object
Available_Since_Date     object
dtype: object
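If you need the names and types as pairs rather than as printed output, one option is to iterate over the series with `items()`. This is a minimal sketch on a cut-down, hypothetical version of the sample dataframe; `names_and_types` is a name introduced only for this illustration.

```python
import pandas as pd

# Cut-down, hypothetical version of the sample dataframe
df = pd.DataFrame({
    "product_name": ["Keyboard", "Mouse"],
    "Unit_Price": [500.0, 200.0],
    "No_Of_Units": [5, 5],
})

# dtypes is a Series indexed by column name, so items() yields (name, dtype) pairs
for column_name, dtype in df.dtypes.items():
    print(f"{column_name}: {dtype}")

# The same pairs collected into a dictionary of strings
names_and_types = {name: str(dtype) for name, dtype in df.dtypes.items()}
```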
Pandas Get Column Names by Index
In this section, you’ll learn how to get column names by using its index.
- You can get the name at a specific position by indexing the columns attribute, for example df.columns[2].
- The index is 0 based. Hence, if you use 2, you’ll get the column in the third position.
You’ll see the column header in the third position, which is No_Of_Units in the sample dataframe.
This is how you can get a single column header using the index.
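The sketch below shows positional access on an empty dataframe that only carries the sample column names. It also shows negative indexing, slicing, and the reverse lookup with `Index.get_loc`, which returns the position of a given label.

```python
import pandas as pd

# Empty dataframe carrying only the sample column names
df = pd.DataFrame(columns=["product_name", "Unit_Price", "No_Of_Units",
                           "Available_Quantity", "Available_Since_Date"])

third = df.columns[2]                         # 0-based, so this is the third column
last = df.columns[-1]                         # negative positions count from the end
first_two = df.columns[:2].tolist()           # slicing returns an Index
position = df.columns.get_loc("No_Of_Units")  # label to position

print(third)      # No_Of_Units
print(last)       # Available_Since_Date
print(first_two)  # ['product_name', 'Unit_Price']
print(position)   # 2
```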
Pandas Get Column Names Based on Condition
In this section, you’ll learn how to get column names based on conditions.
- This can be useful when you want to identify columns that contain specific values. It is also known as getting column names by value.
- For example, if you need to get column names which have the value 5 in any cell, then you can use the following example.
df.columns[
    (df == 5)        # mask: True where a cell equals 5
    .any(axis=0)     # mask: True for columns with at least one match
]
In the sample dataframe, the columns No_Of_Units and Available_Quantity contain the value 5. Hence, you’ll see the two columns printed as an Index.
Index(['No_Of_Units', 'Available_Quantity'], dtype='object')
This is how you can get column names based on value.
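To make the two masking steps easier to follow, the sketch below splits them into separate variables. It uses a small, hypothetical numeric frame rather than the full sample dataframe.

```python
import pandas as pd

# Small hypothetical frame; two of its columns contain the value 5
df = pd.DataFrame({
    "Unit_Price": [500.0, 200.0],
    "No_Of_Units": [5, 10],
    "Available_Quantity": [6, 5],
})

cell_mask = df == 5                  # boolean dataframe, True where a cell equals 5
column_mask = cell_mask.any(axis=0)  # boolean Series, one entry per column
matching = df.columns[column_mask].tolist()

print(matching)  # ['No_Of_Units', 'Available_Quantity']
```

The same pattern works for other conditions, for example `df.isna()` in place of `df == 5` to find columns with missing values.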