lambda function with multiple ifs using pandas df

Question

I have a simple df with revenue and cost figures. The cost can be negative in my case.

I want to compute the revenue to cost ratio as follows:

if ((x['cost'] < 0) & (x['rev'] >=0 )):
   x['r_c_ratio'] = (x['rev'] + abs(x['cost'])) / abs(x['cost'])
elif((x['cost'] > 0) & (x['rev'] <=0 )):
   x['r_c_ratio'] = (x['cost'] + abs(x['rev'])) / x['cost']
else:
   x['r_c_ratio'] = x['rev'] / x['cost']

How does one implement this in a lambda function?

df['revenue_cost_ratio'] = df.apply(lambda x:....

According to this link the lambda syntax is:

lambda x: True if x % 2 == 0 else False

This only allows one else condition.

This is unrelated to your question about creating a `lambda` function, but `&` is the bitwise-and operator in Python. You want the `and` operator when you're doing a logical-and (which appears to be what you want here). The two operators do work the same for the Boolean values `True` and `False`, but won't always do the same thing if you have other "truthy" values involved in the operaton (e.g. compare `1 & 2` with `1 and 2`). — Blckknght, Dec 26 '16 at 05:58

piRSquared · Accepted Answer · 2016-12-26T04:27:56.520

pandas vectorized

r = x.rev.div(x.cost)
r.abs().add(r < 0).rename('revenue_cost_ratio')

numpy vectorized
by the way, I'd use this one

r = x.rev.values / x.cost.values
pd.Series(np.abs(r) + (r < 0), x.index, name='revenue_cost_ratio')

if you insist on a lambda

f = lambda x: (x.rev * x.cost < 0) + abs(x.rev / x.cost)
x['revenue_cost_ratio'] = x.apply(f)

Let's look at your 3 cases

Case 1

if ((x['cost'] < 0) & (x['rev'] >=0 )):
   x['r_c_ratio'] = (x['rev'] + abs(x['cost'])) / abs(x['cost'])

when x['cost'] < 0, abs(x['cost']) is just -1 * x['cost'] so this can be simplified to

(x['rev'] - x['cost']) / -x['cost']

or

(x['cost'] - x['rev']) / x['cost']

Case 2

elif((x['cost'] > 0) & (x['rev'] <=0 )):
   x['r_c_ratio'] = (x['cost'] + abs(x['rev'])) / x['cost']

when x['rev'] <= 0, abs(x['rev']) is just -1 * x['rev'] so this can be simplified to

(x['cost'] - x['rev']) / x['cost']

Wow this is the same as case one! But we can reduce this further to

1 - x['rev'] / x['cost']

And when do we use it? Seems only when either x['rev'] or x['cost'] is negative but not both. Well, that only occurs when that ratio is negative.

Case 3

x['rev'] / x['cost']

Again! What luck! This looks a lot like 1 - x['rev'] / x['cost']

So if we pre-calculate x['rev'] / x['cost'], test it for negativity and return it or 1 less it, we are good. Hence the functions in the beginning.

Can't put a bounty on until 2 days. Not necessary but I appreciate the gesture (of course, I can't stop you if you insist :-) — piRSquared, Dec 26 '16 at 04:50

score 1 · Answer 2 · answered Dec 26 '16 at 02:23

Do not use apply. It is extremely slow. Use a nested where.

np.where((x['cost'] < 0) & (x['rev'] >=0 ),  (x['rev'] + abs(x['cost'])) / abs(x['cost']),
    np.where((x['cost'] > 0) & (x['rev'] <=0 ), (x['cost'] + abs(x['rev'])) / x['cost'], 
             x['rev'] / x['cost']))

lambda function with multiple ifs using pandas df

2 Answers2