Merging two pandas dataframe that are exactly the same without repeating columns

Asked Feb 05 '19 at 20:58

Active Feb 05 '19 at 20:58

Viewed 40 times

I have two dataframes that are exactly the same:

df1:

df2:

I want to merge them on ID, but I do not want to repeat the values of the P columns in the same row. That is I want the following result:

P1    ID    P2
1     10    2
1     10    3
2     10    1
2     10    3
3     10    1
3     10    2 
2     5     3
3     5     2

Is there a way to do this without doing regular merge and then removing rows with repeated values in P1 and P2?

asked Feb 05 '19 at 20:58

BKS

But why `without doing regular merge and then removing rows with repeated values` – yatu Feb 05 '19 at 21:08
@yatu because the files are so large and I keep running out of memory. I can divide one of the dataframes into subframes and merge each separately and remove repeated values, but I'm wondering if there is a more efficient way – BKS Feb 05 '19 at 21:11
So you want no duplicates of `ID`, `P1`, or `P2`? – Jab Feb 05 '19 at 21:12
I'm sorry but the logic for merging is not clear here. Can you explain why/how the result is like this? – cs95 Feb 05 '19 at 21:19
It looks to me like you want a cross join for each ID, but without rows where P1 == P2. I can't see it any other way, and your explanation leaves a lot to be desired. Help us out please? – cs95 Feb 05 '19 at 21:21
2

A solution to what I _think_ your question is asking would be `df1.merge(df2, on='ID').query('P1 != P2')` – cs95 Feb 05 '19 at 21:21
@Jaba I do not want to have a row where P1 and P2 have equal values. – BKS Feb 05 '19 at 21:23
1

What a beautiful dup target you've created @coldspeed :P. – d_kennetz Feb 05 '19 at 21:34
1

@d_kennetz Thanks! I'd rather answer in the comments and close these basic merge questions as duplicate than leave them open to receive the same, tired answers over and over. – cs95 Feb 05 '19 at 21:41

0 Answers0