
I want to create a column with data from another dataframe based on index.

For example, I have the two dataframes below:

import pandas as pd

# df1 has a non-unique index: id 1 appears three times
df1 = {'id': [1, 1, 1, 3, 5, 6, 7, 8, 9, 10], 'name': ['a', 'a', 'a', 'c', 'e', 'f', 'g', 'h', 'i', 'j']}

# df2 has a unique index covering ids 1 through 10
df2 = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'age': [21, 11, 45, 11, 56, 22, 26, 26, 17, 32], 'gender': ['M', 'M', 'f', 'f', 'M', 'f', 'M', 'M', 'f', 'M']}

df1 = pd.DataFrame(df1)
df1.set_index('id', inplace=True)

df2 = pd.DataFrame(df2)
df2.set_index('id', inplace=True)

Using these two dataframes, I want to create a column in df1 called 'gender' and fill it with the data from df2, matched on the index. The 'gender' column in df1 should then look like this:

['M','M','M','f','M','f','M','M','f','M']
stardust
Yun Tae Hwang

3 Answers


That's very easy. Simply do:

df1['gender'] = df2['gender']

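Since both dataframes use id as their index, pandas aligns df2['gender'] on the index of df1 during the assignment, so every row of df1 receives the gender of its matching id. The repeated id 1 in df1 is not a problem as long as the index of df2 is unique.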
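
To make the alignment concrete, here is a minimal sketch that rebuilds the question's data and runs the assignment. The data is copied from the question; building each frame in a single expression is only a convenience for this example.

```python
import pandas as pd

# Same data as in the question: df1 repeats id 1, df2 has unique ids.
df1 = pd.DataFrame(
    {"id": [1, 1, 1, 3, 5, 6, 7, 8, 9, 10],
     "name": ["a", "a", "a", "c", "e", "f", "g", "h", "i", "j"]}
).set_index("id")
df2 = pd.DataFrame(
    {"id": list(range(1, 11)),
     "age": [21, 11, 45, 11, 56, 22, 26, 26, 17, 32],
     "gender": ["M", "M", "f", "f", "M", "f", "M", "M", "f", "M"]}
).set_index("id")

# The Series is aligned on df1's index before it is stored, so the
# value for id 1 is repeated and ids 2 and 4 are simply not used.
df1["gender"] = df2["gender"]

print(df1)
```

Any id in df1 that had no match in df2 would end up as NaN in the new column rather than raising an error.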

Valentino

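Try this. join performs a left join on the index by default and returns a new dataframe, so assign the result back to df1:

df1 = df1.join(df2['gender'])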
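
A small sketch of the join approach on a reduced dataset, showing that the call returns a new dataframe and leaves the original untouched. The names left, right and joined are made up for this example and do not appear in the question.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2, both indexed by id.
left = pd.DataFrame({"name": ["a", "a", "c"]},
                    index=pd.Index([1, 1, 3], name="id"))
right = pd.DataFrame({"age": [21, 11, 45], "gender": ["M", "M", "f"]},
                     index=pd.Index([1, 2, 3], name="id"))

# Joining only the 'gender' Series keeps 'age' out of the result.
joined = left.join(right["gender"])

print(list(left.columns))    # the original frame is unchanged
print(list(joined.columns))  # the new frame carries the extra column
```
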
Mark Wang

You can merge your df1 and df2 on index:

df1.merge(df2, left_index=True, right_index=True)

   name  age gender
id
1     a   21      M
1     a   21      M
1     a   21      M
3     c   45      f
5     e   56      M
6     f   22      f
7     g   26      M
8     h   26      M
9     i   17      f
10    j   32      M

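Note that how defaults to 'inner', so any row of df1 whose id is missing from df2 would be dropped; pass how='left' to keep such rows. The merge also brings in every column of df2, here age as well as gender.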
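
The difference between the two how settings only shows up when an id is missing from the right-hand frame, which is not the case in the question's data. The sketch below therefore uses a made-up id 99 and hypothetical frames named left and right to illustrate it.

```python
import pandas as pd

# Hypothetical data: id 99 exists only in the left frame.
left = pd.DataFrame({"name": ["a", "c", "z"]},
                    index=pd.Index([1, 3, 99], name="id"))
right = pd.DataFrame({"gender": ["M", "M", "f"]},
                     index=pd.Index([1, 2, 3], name="id"))

# Default inner merge: only ids present in both frames survive.
inner = left.merge(right, left_index=True, right_index=True)

# Left merge: every row of the left frame is kept, unmatched ids get NaN.
kept = left.merge(right, left_index=True, right_index=True, how="left")

print(inner)
print(kept)
```

If df1 must keep all of its rows no matter what df2 contains, how='left' is the safer choice.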

zipa