Dataframe filtering based on two columns from a different dataframe using pandas

Question

I have 2 Dataframes as:

df1 = pd.DataFrame(data = {'col1' : [1, 2, 3, 4, 5], 'col2' : [10, 11, 12, 13, 14], 'col3' : [8, 9, 3, 9, 7], 'col4' : [3, 5, 9, 4, 8]}) 
df2 = pd.DataFrame(data = {'col1' : [1, 2, 3, 3,5], 'col2' : [10, 10, 14, 12, 14]})

df1

   col1     col2    col3    col4
0     1       10       8       3
1     2       11       9       5
2     3       12       3       9
3     4       13       9       4
4     5       14       7       8

df2

   col1     col2
0     1       10       
1     2       10      
2     3       14       
3     3       12       
4     5       14

I want to filter df1 based on column col1 and col2 of df2, i.e. check if col1 & col2 of df1 is same as col1 & col2 of df2. If there is a match then exclude those rows from df1 and print the remaining. The output for given dataframe should be like this:

  col1  col2  col3  col4
0    4    13     9     4
1    5    14     7     8

Please help me solve this, preferably without using 'merge'

`df1.merge(df2, on=['col1','col2'], how='outer',indicator=True)` and look at `_merge` column for `'left_only'`. — Quang Hoang, Nov 04 '20 at 06:59
I am aware that it can be done using 'merge'. I was looking for alternative approach. The dataframes I am working on, has more than 80 columns each and using 'merge' creates unnecessary columns. — sak, Nov 04 '20 at 07:16
You should have said that in your question. That said, you can merge only those columns you want to retain, e.g. `df1.merge(df2[['col1','col2']], on=['col1','col2'],...`. — Quang Hoang, Nov 04 '20 at 07:18

Dataframe filtering based on two columns from a different dataframe using pandas

0 Answers0