0

I'm working on a project that has two inputs of data, and I need to merge those data into only one DataFrame, based on the index of the second DataFrame.

My first DataFrame looks like this:

    user        artist        song      timestamp
0  user_1       artist_1     song_1   2019-02-21 23:45  
1  user_1       artist_2     song_2   2019-02-21 23:49  
2  user_2       artist_2     song_3   2019-02-21 23:52

And my second DataFrame looks like this:

                        artist       song
id                                                                        
9A9xGpfS2eZn3T4Z       artist_1     song_1
ke5EfOpcvTl0dtyR       artist_2     song_2
9YjNEwfS4kX7aHvg       artist_2     song_3

The result I'm looking for and that I'm not being able to obtain is the following:

    user        song_id                timestamp
0  user_1       9A9xGpfS2eZn3T4Z    2019-02-21 23:45  
1  user_1       ke5EfOpcvTl0dtyR    2019-02-21 23:49  
2  user_2       9YjNEwfS4kX7aHvg    2019-02-21 23:52

Is there a way to obtain this result, using pandas builtin functions?

Thanks!

1 Answers1

1

Let df1 be your first DataFrame and df2 your second. Then you can use:

new_df = df1.merge(df2, on=['artist', 'song'], how='left')

Then you can rename and delete as you like.

Carsten
  • 1,959
  • 1
  • 8
  • 22