Can somebody please explain this result to me? In particular, I don't understand where the NaNs in the result come from. I also don't understand how the join decides which row to match with which row in this case.
import pandas as pd

left_df = pd.DataFrame.from_dict({'unique_l': [0, 1, 2, 3, 4], 'join': ['a', 'a', 'b', 'b', 'c']})
right_df = pd.DataFrame.from_dict({'unique_r': [10, 11, 12, 13, 14], 'join': ['a', 'b', 'b', 'c', 'c']})
  join  unique_l
0    a         0
1    a         1
2    b         2
3    b         3
4    c         4

  join  unique_r
0    a        10
1    b        11
2    b        12
3    c        13
4    c        14
print(left_df.join(right_df, on='join', rsuffix='_r'))
  join  unique_l join_r  unique_r
0    a         0    NaN       NaN
1    a         1    NaN       NaN
2    b         2    NaN       NaN
3    b         3    NaN       NaN
4    c         4    NaN       NaN
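
For comparison, here is a minimal sketch of the two ways the keys can be lined up, assuming a recent pandas version. As far as I can tell from the documentation, `join` with `on=` looks up the caller's column values in the other frame's index (here the default 0..4), while `merge` with `on=` matches column to column. The names `merged` and `joined` are just illustrative.

```python
import pandas as pd

left_df = pd.DataFrame({'unique_l': [0, 1, 2, 3, 4],
                        'join': ['a', 'a', 'b', 'b', 'c']})
right_df = pd.DataFrame({'unique_r': [10, 11, 12, 13, 14],
                         'join': ['a', 'b', 'b', 'c', 'c']})

# Column-to-column match: each left row pairs with every right row
# that has the same value in 'join'.
merged = left_df.merge(right_df, on='join', how='left')

# Same pairing via join: move the right-hand key into the index first,
# so the left 'join' column is looked up against 'a', 'b', 'c'
# instead of the default integer index.
joined = left_df.join(right_df.set_index('join'), on='join')

print(merged)
print(joined)
```

With this data both variants should give 8 rows and no NaNs, because the duplicated keys on each side are paired in every combination.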