0

I am trying to calculate a column in a dataframe based on another data frame. It is used to calculate seniority bonus of a HR payroll.

the two dataframes are:

df1 = headcount

peopleID    peopleSeniority
1               2
2               6
3               12
4               30

df2 = seniority_bonus

seniority    seniorityBonus
5            500
10           1000
15           2000 
20           3000

and I would like to write a script that return df2 Seniority bonus according to df1 people's seniority. Something that will do: if df1['people seniority'] > df2['seniority'] > , df2['senioritybonus'] like if df1 seniority in years > 5, 500 if df1 seniority in years > 10, 2000 ...

I've tried to use query but it is not working, I do not know how to write a loop that would be able to make the calculation.

Would anyone have an idea ?

1 Answers1

0

It is easier if the bonus always increases every five years. Add a column to df1 then merge with df2 on that column

df1['seniority'] = (df1.peopleSeniority // 5) *5
df1 = df1.merge(df2, on='seniority')

Read more about merge in the docs and this Q&A

Edit

If the increments in years between brackets are not even, nor linear, nor any other algebraic function, then you can use pd.cut() to categorize each person's seniority.

df1['seniority_bonus'] = pd.cut(df.peopleSeniority,
    bins=[0, 5, 8, 11, 15, 21, 30], right=False, labels=[0, 500, 1000, 1500, 2000, 2500])

Notice how labels have one less item than bins. That is because with a 7 integer list you can only build 6 categories (there will not be a category for 30-infinite).

RichieV
  • 4,813
  • 2
  • 7
  • 24
  • if we use the merge, then would you have an idea on how to generate a table that fills the gap between the bonus changes ? so that in the new df2 a 6, 7, 8 9 years seniority is equal to the row with 5 years seniority ? – Guillaume Aug 03 '20 at 11:14
  • The row before that adds a column changing `2, 6, 12, 30` to `0, 5, 10, 30`, thus matching perfectly. – RichieV Aug 03 '20 at 16:32