Creating a column in one dataframe from another dataframe doesn't transfer missing rows

Question

I have the following two dataframes:

import pandas as pd

data = {'Name': ['Tom', 'Jack', 'nick', 'juli'], 'marks': [99, 98, 95, 90]}
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])

data = {'salata': ['ntomata', 'tzatziki']}
df2 = pd.DataFrame(data, index=['rank3', 'rank5'])

What I want is to copy the salata column from df2 to df.

df['salata'] = df2['salata']

However, this doesn't copy the missing row rank5 into df.

Update: Thank you for the answers.

What should I use in case the dataframes have different column multiindex levels?

For example:

data = {('Name','Here'): ['Tom', 'Jack', 'nick', 'juli'], ('marks','There'): [99, 98, 95, 90]}
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])

df[('salata','-')] = df2['salata']

jezrael · Accepted Answer · 2020-05-18T11:20:30.697

2

Use DataFrame.combine_first:

# all columns
df = df.combine_first(df2)
# only the columns in the list
# df = df.combine_first(df2[['salata']])
print(df)
       Name  marks    salata
rank1   Tom   99.0       NaN
rank2  Jack   98.0       NaN
rank3  nick   95.0   ntomata
rank4  juli   90.0       NaN
rank5   NaN    NaN  tzatziki
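
For context on why the assignment in the question loses rank5: assigning a Series as a column aligns it to the index that df already has, so labels that exist only in df2 are dropped. The sketch below uses the question's data and shows one possible alternative, extending the index to the union of both indexes before assigning. The names assigned and extended are only illustrative, and reindex still returns a new DataFrame under the hood.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Tom', 'Jack', 'nick', 'juli'], 'marks': [99, 98, 95, 90]},
    index=['rank1', 'rank2', 'rank3', 'rank4'],
)
df2 = pd.DataFrame({'salata': ['ntomata', 'tzatziki']}, index=['rank3', 'rank5'])

# Plain assignment aligns df2['salata'] to df's existing index,
# so the label rank5 (present only in df2) is silently dropped.
assigned = df.copy()
assigned['salata'] = df2['salata']

# Extending the index to the union of both indexes first keeps rank5;
# the new row gets NaN in the columns that only df provides.
extended = df.reindex(df.index.union(df2.index))
extended['salata'] = df2['salata']

print(extended)
```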

EDIT:

If df has MultiIndex columns, first give df2 MultiIndex columns as well, e.g. with MultiIndex.from_product:

df2.columns = pd.MultiIndex.from_product([[''], df2.columns])

df = df.combine_first(df2)
print (df)
                 Name marks
         salata  Here There
rank1       NaN   Tom  99.0
rank2       NaN  Jack  98.0
rank3   ntomata  nick  95.0
rank4       NaN  juli  90.0
rank5  tzatziki   NaN   NaN

Another solution is concat, which joins on the index with an outer join by default:

df = pd.concat([df, df2], axis=1)
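
If the new column should carry the exact key from the question's update, ('salata', '-'), the level order passed to from_product can be swapped. This is a minimal sketch using the question's data; df2_multi, combined and concatenated are illustrative names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {('Name', 'Here'): ['Tom', 'Jack', 'nick', 'juli'],
     ('marks', 'There'): [99, 98, 95, 90]},
    index=['rank1', 'rank2', 'rank3', 'rank4'],
)
df2 = pd.DataFrame({'salata': ['ntomata', 'tzatziki']}, index=['rank3', 'rank5'])

# Work on a copy so the original df2 keeps its flat columns.
df2_multi = df2.copy()
# Existing column names become level 0 and '-' becomes level 1,
# which yields the key ('salata', '-').
df2_multi.columns = pd.MultiIndex.from_product([df2.columns, ['-']])

# Both approaches keep rank5 because they use the union of the row labels.
combined = df.combine_first(df2_multi)
concatenated = pd.concat([df, df2_multi], axis=1)

print(combined)
```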

edited May 18 '20 at 11:20

answered May 18 '20 at 10:11

jezrael


I was hoping to avoid creating another dataframe, but if this is not possible, then I will mark your reply as the accepted answer. However, what is the difference between combine_first and concat? – Thanasis May 18 '20 at 11:16

@Thanasis - concat joins with an outer join by default, so `pd.concat([df, df2], axis=1)` is possible here; `combine_first` is mainly used to fill missing values from another DataFrame. Here both give the same result. – jezrael May 18 '20 at 11:19
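
The two only behave alike while the frames share no column names. A small sketch of where they diverge, using made-up frames (left and right are hypothetical, not from the question) that both have a marks column:

```python
import numpy as np
import pandas as pd

# Hypothetical data: both frames contain a 'marks' column.
left = pd.DataFrame({'marks': [99.0, np.nan]}, index=['rank1', 'rank2'])
right = pd.DataFrame({'marks': [50.0, 98.0, 95.0]}, index=['rank1', 'rank2', 'rank3'])

# combine_first keeps left's values and takes right's values only where
# left is missing (a NaN cell or a row label that left does not have).
filled = left.combine_first(right)

# concat along axis=1 does not merge values: it places the columns side by
# side, so the result has two columns that are both labelled 'marks'.
side_by_side = pd.concat([left, right], axis=1)

print(filled)
print(side_by_side)
```

With overlapping columns, combine_first yields a single marks column, while concat leaves duplicate column labels that usually need renaming afterwards.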

score 0 · Answer 2 · answered May 18 '20 at 10:16

If your indexes are representative of your example, then you can do an outer join:

df = df.join(df2, how='outer')

       Name  marks    salata
rank1   Tom   99.0       NaN
rank2  Jack   98.0       NaN
rank3  nick   95.0   ntomata
rank4  juli   90.0       NaN
rank5   NaN    NaN  tzatziki
