Python dataframe delete row

How to Remove Rows from DataFrame in Pandas

You can remove rows from a data frame using the following approaches.

Method 1: Using the drop() method

To remove single or multiple rows from a DataFrame in Pandas, you can use the drop() method by specifying the index labels of the rows you want to remove.

    import pandas as pd

    # Create a sample DataFrame
    data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]}
    df = pd.DataFrame(data)

    # Remove row with index label 1
    df = df.drop(1)
    print("Removing a single row")
    print(df)

    # Remove rows with index labels 2 and 3
    df = df.drop([2, 3])
    print("Removing multiple rows")
    print(df)

Output:

    Removing a single row
       A  B   C
    0  1  5   9
    2  3  7  11
    3  4  8  12
    Removing multiple rows
       A  B  C
    0  1  5  9

Method 2: Using boolean indexing

You can use boolean indexing to filter out the rows you want to remove based on a condition.

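    import pandas as pd

    # Create a sample DataFrame
    data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]}
    df = pd.DataFrame(data)

    # Remove rows where the 'B' column is less than or equal to 6
    # by keeping only the rows where 'B' is greater than 6
    df = df[df['B'] > 6]
    print(df)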

Method 3: Using the query() method

You can use the query() method to remove rows based on a query expression.

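    import pandas as pd

    # Create a sample DataFrame
    data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]}
    df = pd.DataFrame(data)

    # Remove rows where the 'C' column is less than or equal to 9
    # by keeping only the rows where 'C' is greater than 9
    df = df.query('C > 9')
    print(df)

Output:

       A  B   C
    1  2  6  10
    2  3  7  11
    3  4  8  12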

Method 4: Using the iloc[] indexer

Use the iloc[] indexer to select rows based on their integer locations.

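    import pandas as pd

    # Create a sample DataFrame
    data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]}
    df = pd.DataFrame(data)

    # Remove the first row by keeping the rows from position 1 onward
    df = df.iloc[1:]
    print(df)

Output:

       A  B   C
    1  2  6  10
    2  3  7  11
    3  4  8  12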

Method 5: Using the drop() method with the index attribute

You can remove rows by their index label(s) with the drop() method combined with the index attribute.

    import pandas as pd

    # Create a sample DataFrame
    data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]}
    df = pd.DataFrame(data)

    # Remove the first row (position 0, index label 0)
    df = df.drop(df.index[0])

    # Remove the rows now at positions 1 and 2 (index labels 2 and 3)
    df = df.drop(df.index[[1, 2]])
    print(df)


Pandas Drop Rows From DataFrame Examples

You can drop (remove/delete) rows from a DataFrame with the pandas.DataFrame.drop() method. The axis param specifies which axis to remove from; the default, axis=0, removes rows. Use axis=1 or the columns param to remove columns. By default, pandas returns a copy of the DataFrame with the rows removed; use inplace=True to remove them from the existing DataFrame instead.

In this article, I will cover how to remove rows by labels, by index position, and by ranges, how to drop rows in place, and how to drop rows with None, NaN & Null values, with examples. If you have duplicate rows, use drop_duplicates() to drop them from the pandas DataFrame.

1. Pandas.DataFrame.drop() Syntax – Drop Rows & Columns

  • labels – Single label or list-like. It’s used with axis param.
  • axis – Defaults to 0. Use 1 to drop columns and 0 to drop rows.
  • index – Use to specify rows. Accepts single label or list-like.
  • columns – Use to specify columns. Accepts single label or list-like.
  • level – int or level name, optional, use for Multiindex.
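  • inplace – Default False, which returns a copy of the DataFrame. When True, it drops the rows or columns in place (on the current DataFrame) and returns None.
  • errors – 'ignore' or 'raise', default 'raise'. With 'ignore', labels that don't exist are skipped and only the existing labels are dropped.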
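As a minimal sketch of how the errors param behaves, assuming a small made-up DataFrame (the column names, values and the missing label 'r9' are hypothetical, not the article's data):

```python
import pandas as pd

# Hypothetical data for illustration; not the article's DataFrame
df = pd.DataFrame(
    {"course": ["Spark", "Pandas", "Java"], "fee": [200, 150, 300]},
    index=["r1", "r2", "r3"],
)

# Default errors='raise': a label that is not in the index raises KeyError
try:
    df.drop(index=["r1", "r9"])
    raised = False
except KeyError:
    raised = True

# errors='ignore': existing labels are dropped, missing ones are skipped
df_ignored = df.drop(index=["r1", "r9"], errors="ignore")

print(raised)                     # True
print(df_ignored.index.tolist())  # ['r2', 'r3']
```

errors='ignore' is convenient when the list of labels comes from elsewhere and may be stale, at the cost of hiding typos in label names.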

Let’s create a DataFrame, run some examples and explore the output. Note that our DataFrame contains index labels for rows which I am going to use to demonstrate removing rows by labels.

    # technologies is the dict of column data for the example (not shown here)
    indexes = ['r1', 'r2', 'r3', 'r4']
    df = pd.DataFrame(technologies, index=indexes)
    print(df)

2. pandas Drop Rows From DataFrame Examples

By default drop() method removes rows ( axis=0 ) from DataFrame. Let’s see several examples of how to remove rows from DataFrame.

2.1 Drop rows by Index Labels or Names

One of the advantages of pandas is that you can assign labels/names to rows, similar to column names. If your DataFrame has row labels (index labels), you can specify which rows to remove by their label names.
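A minimal sketch of dropping by label follows. Only the row labels r1 to r4 come from the article; the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Drop the rows labelled r2 and r4; drop() returns a new DataFrame
df1 = df.drop(["r2", "r4"])
print(df1.index.tolist())  # ['r1', 'r3']
```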

Alternatively, you can write the same statement using the index keyword argument.

You can also pass labels together with axis=0.

  • As you can see, using labels with axis=0 is equivalent to using index=label names.
  • axis=0 means rows. By default the drop() method uses axis=0, so you don't have to specify it to remove rows. To remove columns, explicitly specify axis=1 or columns.
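The equivalent spellings can be compared side by side. This is a sketch on hypothetical data (column names and values are assumptions; only the labels r1 to r4 follow the article):

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Three ways to drop the same two rows
positional = df.drop(["r2", "r4"])                # labels as first argument
by_index = df.drop(index=["r2", "r4"])            # index keyword
by_labels = df.drop(labels=["r2", "r4"], axis=0)  # labels plus axis

print(positional.equals(by_index) and by_index.equals(by_labels))  # True
```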

2.2 Drop Rows by Index Number (Row Number)

Similarly, you can use the drop() method to remove rows by index position. drop() has no param for positions, so you need to look up the row labels at those positions and pass them to drop(). df.index gives you the row labels for the positions you want to delete.

  • df.index.values returns all row labels as a NumPy array.
  • df.index[[1,3]] gets you the row labels for the 2nd and 4th rows; passing these to drop() removes those rows. Note that Python indexing starts from zero.

Yields the same output as section 2.1. In order to remove the first row, you can use df.drop(df.index[0]) , and to remove the last row use df.drop(df.index[-1]) .
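A sketch of the position-to-label lookup, again on hypothetical data (column names and values are assumptions):

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Translate positions 1 and 3 into their labels, then drop those labels
labels = df.index[[1, 3]]
df1 = df.drop(labels)

# First and last row by position
no_first = df.drop(df.index[0])
no_last = df.drop(df.index[-1])

print(labels.tolist())          # ['r2', 'r4']
print(df1.index.tolist())       # ['r1', 'r3']
print(no_first.index.tolist())  # ['r2', 'r3', 'r4']
print(no_last.index.tolist())   # ['r1', 'r2', 'r3']
```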

2.3 Delete Rows by Index Range

You can also remove rows by specifying an index range. For example, df.drop(df.index[2:]) removes all rows from the 3rd row onward.
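A short sketch of dropping a range of positions, assuming a made-up DataFrame with the row labels r1 to r4:

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# df.index[2:] holds the labels from the 3rd row to the end
df1 = df.drop(df.index[2:])
print(df1.index.tolist())  # ['r1', 'r2']
```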

2.4 Delete Rows when you have Default Indexes

By default pandas assigns a sequence number to each row, also called the index; it starts at zero and increments by 1 for every row. If you are not using custom index labels, the pandas DataFrame uses these sequence numbers as the index. To remove rows with the default index, pass the row numbers to drop() as labels.

Note that df.drop(-1) does not remove the last row: -1 is not a label in the default index, so the call raises a KeyError. Use df.drop(df.index[-1]) to remove the last row.
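The default-index cases can be sketched as follows, assuming a made-up DataFrame created without custom labels (column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical data with the default index 0..3
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]}
)

no_first = df.drop(0)                 # single row number
no_first_last = df.drop([0, 3])       # list of row numbers
no_first_two = df.drop(range(0, 2))   # range of row numbers

# -1 is not a label in the default index, so drop(-1) raises KeyError
try:
    df.drop(-1)
    minus_one_failed = False
except KeyError:
    minus_one_failed = True

# Look up the last label by position instead
no_last = df.drop(df.index[-1])

print(no_first_two.index.tolist())  # [2, 3]
print(minus_one_failed)             # True
print(no_last.index.tolist())       # [0, 1, 2]
```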

2.5 Remove DataFrame Rows inplace

All the examples above return a copy of the DataFrame with the rows removed. To remove rows in place from the existing DataFrame, use inplace=True. By default the inplace param is False.
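A sketch of the in-place variant on hypothetical data (column names and values are assumptions). Note that the call returns None, so its result should not be assigned back to df.

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# inplace=True modifies df itself and returns None
result = df.drop(["r1", "r2"], inplace=True)

print(result)             # None
print(df.index.tolist())  # ['r3', 'r4']
```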

2.6 Drop Rows by Checking Conditions

Most of the time you will also need to remove DataFrame rows based on a condition on column values. You can do this by keeping only the rows you want, typically with loc[] and a boolean mask, or by passing the index of the matching rows to drop().
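A sketch of two common ways to drop by condition. The data and the fee threshold of 24000 are hypothetical and chosen only for illustration:

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Option 1: keep the rows that do NOT match the removal condition
kept = df.loc[df["fee"] < 24000]

# Option 2: find the labels of the rows to remove and pass them to drop()
to_remove = df[df["fee"] >= 24000].index
dropped = df.drop(to_remove)

print(kept.index.tolist())   # ['r1', 'r4']
print(kept.equals(dropped))  # True
```

The mask form is usually shorter; the drop() form is useful when the labels to remove are needed for something else as well.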

2.7 Drop Rows that has NaN/None/Null Values

When working with analytics you often need to clean up data that contains None or NaN (np.nan) values, which pandas treats as missing (null). df.dropna() removes the rows containing such values from a DataFrame.

With no arguments, it removes every row that has a None or NaN value in any column.
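A minimal sketch of dropna(), assuming a made-up DataFrame in which one row has None and another has NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: r2 has a missing course, r3 has a missing fee
df = pd.DataFrame(
    {"course": ["Spark", None, "Hadoop", "Python"],
     "fee": [20000, 25000, np.nan, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Drop every row that has a missing value in any column
clean = df.dropna()
print(clean.index.tolist())  # ['r1', 'r4']
```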

2.8 Remove Rows by Slicing DataFrame

You can also remove DataFrame rows by slicing. Remember index starts from zero.
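A sketch of slicing by position follows. It uses iloc[] to make the positional intent explicit; the data is hypothetical.

```python
import pandas as pd

# Hypothetical column data; only the row labels r1..r4 match the article
df = pd.DataFrame(
    {"course": ["Spark", "PySpark", "Hadoop", "Python"],
     "fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

no_first_two = df.iloc[2:]  # drop the first two rows
no_last = df.iloc[:-1]      # drop the last row
middle = df.iloc[1:3]       # keep only positions 1 and 2

print(no_first_two.index.tolist())  # ['r3', 'r4']
print(no_last.index.tolist())       # ['r1', 'r2', 'r3']
print(middle.index.tolist())        # ['r2', 'r3']
```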

Conclusion

In this pandas drop rows article you have learned how to drop/remove pandas DataFrame rows using the drop() method. By default drop() deletes rows (axis=0); to delete columns, use either axis=1 or the columns=labels param.


Drop or delete the row in python pandas with conditions

In this section we will learn how to drop or delete a row in python pandas by index, by condition, and by position. Dropping a row in pandas is done with the .drop() function. Let's see an example of each.

  • Delete or Drop rows with condition in python pandas using drop() function.
  • Drop rows by index / position in pandas.
  • Remove or Drop NA rows or missing rows in pandas python.
  • Remove or Drop Rows with Duplicate values in pandas.
  • Drop or remove rows based on multiple conditions pandas

Syntax of drop() function in pandas :

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
  • labels: Single label or list of labels referring to rows (or columns, depending on axis).
  • axis: int or string value; 0 or 'index' for rows, 1 or 'columns' for columns.
  • index or columns: Single label or list. These are an alternative to labels with axis; the two styles cannot be used together.
  • level: Used to specify the level when the data frame has a multi-level index.
  • inplace: Makes changes in the original data frame if True.
  • errors: With errors='ignore', labels from the list that don't exist are ignored and the rest are dropped.

Create Dataframe:

    import pandas as pd
    import numpy as np

    # Create a DataFrame
    d = {
        'Name': ['Alisa','raghu','jodha','jodha','raghu','Cathrine',
                 'Alisa','Bobby','Bobby','Alisa','raghu','Cathrine'],
        'Age': [26,23,23,23,23,24,26,24,22,26,23,24],
        'Score': [85,31,55,55,31,77,85,63,42,85,31,np.nan]
    }
    df = pd.DataFrame(d, columns=['Name','Age','Score'])
    df


Simply drop a row or observation:

Dropping the second and third row of a dataframe is achieved as follows

    # Drop an observation or row
    df.drop([1,2])

The above code drops the second and third rows. Index labels start at 0, so 0 represents the 1st row, 1 represents the 2nd row, and so on.

Drop a row or observation by condition:

We can drop a row when it satisfies a specific condition.

    # Drop a row by condition
    df[df.Name != 'Alisa']

The above code keeps all rows whose name is not Alisa, thereby dropping every row with the name 'Alisa'.

Drop a row or observation by index:

We can drop a row by index as shown below

    # Drop a row by index
    df.drop(df.index[2])

The above code drops the row at index position 2, which is the third row.

Drop the row by position:

Now let’s drop the bottom 3 rows of a dataframe as shown below
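One way to write this is a negative slice. The sketch below reuses the data from the DataFrame created earlier in this section; the variable name trimmed is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created earlier in this section
d = {
    "Name": ["Alisa", "raghu", "jodha", "jodha", "raghu", "Cathrine",
             "Alisa", "Bobby", "Bobby", "Alisa", "raghu", "Cathrine"],
    "Age": [26, 23, 23, 23, 23, 24, 26, 24, 22, 26, 23, 24],
    "Score": [85, 31, 55, 55, 31, 77, 85, 63, 42, 85, 31, np.nan],
}
df = pd.DataFrame(d, columns=["Name", "Age", "Score"])

# Keep everything except the last 3 rows
trimmed = df[:-3]
print(len(df), len(trimmed))  # 12 9
```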

Slicing with df[:-3] selects all the rows except the bottom 3, thereby dropping the bottom 3 rows.

Drop Rows with multiple conditions in pandas:

Now let's drop all the rows where Age is between 20 and 25.

    # Collect the index labels of the matching rows, then drop them in place
    indexAge = df[(df['Age'] >= 20) & (df['Age'] <= 25)].index
    df.drop(indexAge, inplace=True)
    df

Drop Rows with multiple conditions in pandas based on multiple columns:

Remove rows where Name is Bobby or Cathrine, or where Age >= 26. These conditions are specified below.

    # 'Cathrine' is spelled as it appears in the data
    indexAge = df[(df['Name'].isin(['Bobby', 'Cathrine'])) | (df['Age'] >= 26)].index
    df.drop(indexAge, inplace=True)
    df

Drop Duplicate rows of the dataframe in pandas


Now let's simply drop the duplicate rows in pandas as shown below.

    # drop duplicate rows
    df.drop_duplicates()

In the above example, the first occurrence of each duplicate row is kept and subsequent duplicates are deleted.

For further detail on dropping duplicates, refer to our page on Drop duplicate rows in pandas python drop_duplicates().

Drop or Remove Duplicate rows by keeping first and last occurrence:

    # Remove duplicate rows, keeping the last occurrence
    df.drop_duplicates(keep='last')

    # Remove duplicate rows, keeping the first occurrence
    df.drop_duplicates(keep='first')

Drop rows with NA values in pandas python

Remove or drop every row that contains even a single NaN (missing) value.
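A sketch using dropna() on the data from the DataFrame created earlier in this section, where only the last row has a missing Score (the variable name clean is introduced here for illustration):

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created earlier in this section
d = {
    "Name": ["Alisa", "raghu", "jodha", "jodha", "raghu", "Cathrine",
             "Alisa", "Bobby", "Bobby", "Alisa", "raghu", "Cathrine"],
    "Age": [26, 23, 23, 23, 23, 24, 26, 24, 22, 26, 23, 24],
    "Score": [85, 31, 55, 55, 31, 77, 85, 63, 42, 85, 31, np.nan],
}
df = pd.DataFrame(d, columns=["Name", "Age", "Score"])

# Drop rows that have a missing value in any column
clean = df.dropna()
print(len(df), len(clean))  # 12 11
```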

The resulting table contains only the rows that have no NA values.


