How to find a value in columns of your pandas DataFrame?
In today’s tutorial we’ll learn how to find specific single or multiple values across columns of your pandas DataFrame.
We’ll first import the pandas data analysis library. Make sure pandas is installed in your Python development environment (Jupyter, PyCharm or others) before following along.
We’ll now define a simple pandas DataFrame that you can use to follow along with this example.
import pandas as pd
month = ['July', 'October', 'September', 'July', 'November', 'July']
language = ['Java', 'Python', 'Python', 'Javascript', 'R', 'Javascript']
office = ['Istanbul', 'New York', 'Osaka', 'Toronto', 'New York', 'Hong Kong']  # defined but not used in the DataFrame
salary = [181.0, 203.0, 163.0, 181.0, 121.0, 132.0]
hr_campaign = dict(month=month, language=language, salary=salary)
interviews_data = pd.DataFrame(data=hr_campaign)
Let’s take a look at the first five rows of the data:
| | month | language | salary |
|---|---|---|---|
| 0 | July | Java | 181.0 |
| 1 | October | Python | 203.0 |
| 2 | September | Python | 163.0 |
| 3 | July | Javascript | 181.0 |
| 4 | November | R | 121.0 |
Finding specific value in Pandas DataFrame column
Let’s assume that we would like to find interview data related to Python candidates. We’ll define our search criteria and filter the pandas DataFrame accordingly.
value = 'Python'
mask = interviews_data['language'].str.contains(value)
interviews_data[mask]
| | month | language | salary |
|---|---|---|---|
| 1 | October | Python | 203.0 |
| 2 | September | Python | 163.0 |
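Note that `str.contains` matches substrings, so a search term such as 'Java' would also match 'Javascript'. The following is a minimal sketch of that difference, rebuilding the example DataFrame so it runs on its own; the names `substring_mask` and `exact_mask` are illustrative and not part of the original tutorial.

```python
import pandas as pd

interviews_data = pd.DataFrame({
    "month": ["July", "October", "September", "July", "November", "July"],
    "language": ["Java", "Python", "Python", "Javascript", "R", "Javascript"],
    "salary": [181.0, 203.0, 163.0, 181.0, 121.0, 132.0],
})

# Substring match: 'Java' is also found inside 'Javascript'.
# regex=False treats the search term as literal text.
substring_mask = interviews_data["language"].str.contains("Java", regex=False)

# Exact match: only rows whose language is exactly 'Java'.
exact_mask = interviews_data["language"] == "Java"

substring_rows = interviews_data[substring_mask]
exact_rows = interviews_data[exact_mask]

print(substring_rows.index.tolist())  # [0, 3, 5]
print(exact_rows.index.tolist())      # [0]
```

If you need whole-value matches only, comparing with `==` (or using `isin`) is usually the safer choice.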
Search for multiple strings in a column
In the same fashion we can find multiple strings. We’ll first define a Python list containing values to search for and then subset the DataFrame.
value_lst = ['Java', 'Python']
mask = interviews_data['language'].isin(value_lst)
interviews_data[mask]
This also returns a DataFrame:
| | month | language | salary |
|---|---|---|---|
| 0 | July | Java | 181.0 |
| 1 | October | Python | 203.0 |
| 2 | September | Python | 163.0 |
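`isin` only matches whole values. If partial or case-insensitive matches on several terms are needed, one possible approach is to join the terms into a single regular expression. This is a sketch under that assumption; the search terms and the names `terms`, `pattern` and `partial_mask` are hypothetical.

```python
import re

import pandas as pd

languages = pd.Series(["Java", "Python", "Python", "Javascript", "R", "Javascript"])

# Hypothetical partial search terms; re.escape guards against regex metacharacters.
terms = ["java", "py"]
pattern = "|".join(re.escape(term) for term in terms)

# case=False makes the match case-insensitive.
partial_mask = languages.str.contains(pattern, case=False, regex=True)
matched = languages[partial_mask].tolist()

print(matched)  # ['Java', 'Python', 'Python', 'Javascript', 'Javascript']
```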
Filter column by value greater than
The same logic applies when selecting rows according to a numeric condition. In the following example we retrieve the rows whose salary expectation is at or above the maximum salary threshold defined by the company:
max_salary = 170
mask = interviews_data['salary'] >= max_salary
interviews_data[mask]
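As a rough illustration of related comparisons, the sketch below shows a strict "greater than" filter and a filter that combines two conditions with `&`. The threshold of 181.0 and the variable names are chosen for illustration only.

```python
import pandas as pd

interviews_data = pd.DataFrame({
    "month": ["July", "October", "September", "July", "November", "July"],
    "language": ["Java", "Python", "Python", "Javascript", "R", "Javascript"],
    "salary": [181.0, 203.0, 163.0, 181.0, 121.0, 132.0],
})
max_salary = 170

# Strict comparison: rows with salary exactly 181.0 are excluded.
strictly_above = interviews_data[interviews_data["salary"] > 181.0]

# Combine two conditions with &; each condition needs its own parentheses.
combined_mask = (interviews_data["salary"] >= max_salary) & (
    interviews_data["language"] == "Javascript"
)
high_paid_js = interviews_data[combined_mask]

print(strictly_above.index.tolist())  # [1]
print(high_paid_js.index.tolist())    # [3]
```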
Searching for a string and return the index
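We can return the index of the relevant rows using the index attribute of the filtered DataFrame. With the salary mask from the previous section:
interviews_data[mask].index
This will return the following index (pandas 2.0 and later display it as Index([0, 1, 3], dtype='int64')):
Int64Index([0, 1, 3], dtype='int64')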
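The same idea works for a string search. Here is a small sketch, assuming we want the index labels of the rows whose language contains 'Python'; the variable names are illustrative.

```python
import pandas as pd

interviews_data = pd.DataFrame({
    "month": ["July", "October", "September", "July", "November", "July"],
    "language": ["Java", "Python", "Python", "Javascript", "R", "Javascript"],
    "salary": [181.0, 203.0, 163.0, 181.0, 121.0, 132.0],
})

# Boolean mask of the rows whose language contains the search string.
string_mask = interviews_data["language"].str.contains("Python", regex=False)

# Indexing the Index object with the mask keeps only the matching labels.
python_index = interviews_data.index[string_mask]

print(python_index.tolist())  # [1, 2]
```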
Find values based on other column values
In the next use case, we’ll use the query DataFrame method to find the salaries pertaining to the Python candidates:
value = 'Python'
python_salaries = interviews_data.query('language == @value')[['month', 'salary']]
python_salaries
This will render the following DataFrame subset:
| | month | salary |
|---|---|---|
| 1 | October | 203.0 |
| 2 | September | 163.0 |
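For comparison, the same subset can be obtained with `.loc` and a boolean mask instead of `query`. This sketch only illustrates the equivalence; neither form is presented as the preferred one.

```python
import pandas as pd

interviews_data = pd.DataFrame({
    "month": ["July", "October", "September", "July", "November", "July"],
    "language": ["Java", "Python", "Python", "Javascript", "R", "Javascript"],
    "salary": [181.0, 203.0, 163.0, 181.0, 121.0, 132.0],
})
value = "Python"

# query: the @ prefix refers to a Python variable from the surrounding scope.
python_salaries = interviews_data.query("language == @value")[["month", "salary"]]

# .loc: row mask first, then the list of columns to keep.
loc_salaries = interviews_data.loc[
    interviews_data["language"] == value, ["month", "salary"]
]

print(python_salaries.equals(loc_salaries))  # True
```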
Replace values found in column
After finding specific string or numeric values, we can also use the replace DataFrame method to replace values as needed. In this very trivial example we’ll replace the string “Javascript” with the string “JavaScript”.
max_salary = 170
mask = interviews_data['salary'] >= max_salary
high_salaries = interviews_data[mask].replace('Javascript', 'JavaScript')
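It may help to remember that `replace` returns a new object and leaves the original untouched unless the result is assigned back. The sketch below applies it to a single column; the name `fixed_language` is an assumption made for illustration.

```python
import pandas as pd

interviews_data = pd.DataFrame({
    "month": ["July", "October", "September", "July", "November", "July"],
    "language": ["Java", "Python", "Python", "Javascript", "R", "Javascript"],
    "salary": [181.0, 203.0, 163.0, 181.0, 121.0, 132.0],
})

# Replace whole-cell values in one column; the result is a new Series.
fixed_language = interviews_data["language"].replace("Javascript", "JavaScript")

print(fixed_language.tolist())
# ['Java', 'Python', 'Python', 'JavaScript', 'R', 'JavaScript']

# The original column is unchanged until we assign the result back.
print(interviews_data["language"].tolist()[3])  # Javascript
```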
How to Search a Value Within a Pandas DataFrame Column?
Python Pandas Code Example to Search for a Value in a DataFrame Column
When working with a large dataset in a machine learning or data science project, you often need to search for certain values in one feature and then retrieve the corresponding values from other features. Searching for values within a dataset might sound complicated, but pandas makes it easy.
The pandas code in this example does the following:
1. Creates a data dictionary and converts it into a DataFrame.
2. Uses the where function to select the desired values. The pandas.DataFrame.where() function works like an if-then idiom: it keeps each value where the condition is True and replaces it with NaN where the condition is False.
Python Pandas Sample Code to Find Value in DataFrame
The steps below walk through the pandas code for searching for a value within a DataFrame column.
Step 1 — Import the library
We only import the pandas library, which is all this code example needs.
Step 2 — Setting up the Data
We have created a dictionary of data and passed it to pd.DataFrame to make a dataframe with columns ‘first_name’, ‘last_name’, ‘age’, ‘Comedy_Score’ and ‘Rating_Score’.
Step 3 — Searching the Values in the DataFrame
We search the Rating_Score feature for values less than 50 and, for those rows, select the corresponding values in Comedy_Score.
The output is as shown below:
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
0    9.0
1    7.0
2    8.0
3    NaN
4    NaN
Name: Comedy_Score, dtype: float64
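The code listing for the three steps does not appear above. A sketch that is consistent with the printed output might look like the following. The variable names `raw_data`, `df`, `low_rated_comedy` and `matching_only` are assumptions; the data values are copied from the output.

```python
import pandas as pd

# Step 2: build the DataFrame from a dictionary (values taken from the output above).
raw_data = {
    "first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"],
    "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"],
    "age": [42, 38, 36, 41, 35],
    "Comedy_Score": [9, 7, 8, 8, 5],
    "Rating_Score": [25, 25, 49, 62, 70],
}
df = pd.DataFrame(raw_data)
print(df)

# Step 3: keep Comedy_Score where Rating_Score < 50; other rows become NaN.
low_rated_comedy = df["Comedy_Score"].where(df["Rating_Score"] < 50)
print(low_rated_comedy)

# Alternative: boolean indexing drops the non-matching rows instead of filling NaN.
matching_only = df.loc[df["Rating_Score"] < 50, "Comedy_Score"]
print(matching_only.tolist())  # [9, 7, 8]
```

Because `where` inserts NaN for the non-matching rows, the integer column is returned as float64, which matches the dtype in the output.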
Search Value in pandas DataFrame in Python (2 Examples)
In this tutorial you’ll learn how to locate a specific value in a pandas DataFrame in the Python programming language.
Example Data & Add-On Libraries
In order to use the functions of the pandas library, we first have to import pandas:
import pandas as pd # Import pandas library in Python
As a next step, I also need to create some example data:
data = pd.DataFrame({'x1': range(80, 73, -1),  # Create pandas DataFrame
                     'x2': ['a', 'b', 'c', 'a', 'c', 'c', 'b'],
                     'x3': range(27, 20, -1)})
print(data)  # Print pandas DataFrame
Table 1 shows that our pandas DataFrame consists of seven rows and three columns.
Example 1: Return Matrix of Logicals Indicating Location of Particular Value
In Example 1, I’ll illustrate how to create and print a data matrix containing logical values that indicate whether a data cell contains a particular value.
Let’s assume that we want to find out which elements in our example DataFrame contain the character ‘b’. Then, we can apply the isin function as shown below:
search_result_1 = data.isin(['b'])  # Create matrix of logical values
print(search_result_1)  # Print output
By executing the previous Python syntax, we have constructed Table 2, i.e. a matrix of logicals that indicates the locations of the element ‘b’ in our input data set.
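If the actual positions are more useful than the full matrix of logicals, one option is to convert the True cells into (row label, column name) pairs. The sketch below uses NumPy's `where` for that; the name `locations` is illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"x1": range(80, 73, -1),
                     "x2": ["a", "b", "c", "a", "c", "c", "b"],
                     "x3": range(27, 20, -1)})

search_result_1 = data.isin(["b"])  # Matrix of logical values

# np.where returns the row and column positions of every True cell.
row_pos, col_pos = np.where(search_result_1.to_numpy())

# Translate positions into index labels and column names.
locations = [(data.index[r], data.columns[c]) for r, c in zip(row_pos, col_pos)]

print(locations)  # row 1 and row 6 of column 'x2'
```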
Example 2: Test if Value is Contained in pandas DataFrame Column
In this example, I’ll illustrate how to check if an entire column contains a certain value at least once.
To achieve this, we have to use the any function in addition to the isin function as shown below:
search_result_2 = data.isin(['b']).any()  # Check by column
print(search_result_2)  # Print output
# x1    False
# x2     True
# x3    False
# dtype: bool
The previous output shows that only the second variable x2 contains the character ‘b’.
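Building on that result, a few related summaries can be derived from the same logical matrix. The sketch below lists the matching column names, checks the whole DataFrame at once, and counts the matches per column; the variable names are assumptions.

```python
import pandas as pd

data = pd.DataFrame({"x1": range(80, 73, -1),
                     "x2": ["a", "b", "c", "a", "c", "c", "b"],
                     "x3": range(27, 20, -1)})

found = data.isin(["b"])

# Names of the columns that contain the value at least once.
columns_with_b = found.any()[found.any()].index.tolist()

# Single True/False answer for the entire DataFrame.
anywhere = bool(found.any().any())

# Number of matching cells in each column (True counts as 1).
counts = found.sum()

print(columns_with_b)  # ['x2']
print(anywhere)        # True
print(counts["x2"])    # 2
```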
Video, Further Resources & Summary
In this tutorial, I have shown how to search, find, and locate a specific value in a pandas DataFrame in the Python programming language. In case you have additional questions and/or comments, don’t hesitate to let me know in the comments section.
Pandas: Select Rows Where a Value Appears in Any Column
Often you may want to select the rows of a pandas DataFrame in which a certain value appears in any of the columns.
Fortunately, this is easy to do with the pandas .any function. This tutorial walks through several examples of using this function in practice.
Example 1: Find a Value in Any Column
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})
#view DataFrame
print(df)
#   points  assists  rebounds
#0      25        5        11
#1      12        7         8
#2      15        7        10
#3      14        9         6
#4      19       12         6
The following syntax shows how to select all rows of the DataFrame that contain the value 25 in any of the columns:
df[df.isin([25]).any(axis=1)]
#   points  assists  rebounds
#0      25        5        11
The following syntax shows how to select all rows of the DataFrame that contain the values 25, 9 or 6 in any of the columns:
df[df.isin([25, 9, 6]).any(axis=1)]
#   points  assists  rebounds
#0      25        5        11
#3      14        9         6
#4      19       12         6
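The mask can also be inverted with `~` to select the rows in which none of the values appear. This is a minimal sketch of that variation, rebuilding the example DataFrame so it runs on its own; the names `mask`, `with_values` and `without_values` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"points": [25, 12, 15, 14, 19],
                   "assists": [5, 7, 7, 9, 12],
                   "rebounds": [11, 8, 10, 6, 6]})

# True for rows where at least one column holds 25, 9 or 6.
mask = df.isin([25, 9, 6]).any(axis=1)

with_values = df[mask]      # rows that contain one of the values
without_values = df[~mask]  # rows that contain none of them

print(with_values.index.tolist())     # [0, 3, 4]
print(without_values.index.tolist())  # [1, 2]
```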
Example 2: Find a Character in Any Column
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'position': ['G', 'G', 'F', 'F', 'C']})
#view DataFrame
print(df)
#   points  assists position
#0      25        5        G
#1      12        7        G
#2      15        7        F
#3      14        9        F
#4      19       12        C
The following syntax shows how to select all rows of the DataFrame that contain the character G in any of the columns:
df[df.isin(['G']).any(axis=1)]
#   points  assists position
#0      25        5        G
#1      12        7        G
The following syntax shows how to select all rows of the DataFrame that contain the values G or C in any of the columns:
df[df.isin(['G', 'C']).any(axis=1)]
#   points  assists position
#0      25        5        G
#1      12        7        G
#4      19       12        C
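Keep in mind that `isin` compares whole cell values, so it would not find 'G' inside a longer string such as 'PG'. One possible workaround, sketched below, casts every column to text and applies `str.contains` column by column. The position labels 'PG' and 'SG' are hypothetical data used only to show the difference.

```python
import pandas as pd

# Hypothetical variant of the example data with two-letter position labels.
df = pd.DataFrame({"points": [25, 12, 15, 14, 19],
                   "assists": [5, 7, 7, 9, 12],
                   "position": ["PG", "SG", "F", "F", "C"]})

# Whole-value match: no cell is exactly 'G', so nothing is selected.
exact_rows = df[df.isin(["G"]).any(axis=1)]

# Substring match: cast each column to str, then search it for 'G'.
contains_g = df.astype(str).apply(lambda col: col.str.contains("G", regex=False))
substring_rows = df[contains_g.any(axis=1)]

print(exact_rows.index.tolist())      # []
print(substring_rows.index.tolist())  # [0, 1]
```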