This should be a very simple question to answer. I have two lines of code. The first one works. The second gives the following error:
SyntaxError: invalid syntax
Here are the two lines of code. The first line (which works fine) counts the rows where off0_on1 == 1. The second one trys to count the rows where off0_on1 == 0.
a1['on1'] = a1.groupby('del_month')['off0_on1'].transform(sum)
a1['off0'] = a1.groupby('del_month')['off0_on1'].transform(lambda x: 1 if x == 0)
Here is the pandas dataframe:
a1 = pd.DataFrame({'del_month':[1,1,1,1,2,2,2,2], 'off0_on1':[0,0,1,1,0,1,1,1]})
Any suggestions to revise the second line of code above?
Edit: Two of the answers have suggested using a map function, which produces the following output. The "on1" column is correct for my purposes; the "off0" column is not correct. For the first "del_month", the "off0" column should have the same results as the "on1" column. For the second "del_month", the "off0" column should be all ones (i.e. 1, 1, 1, 1).
Here's what happens when I use the following map function (see image below):
a1['off0'] = a1.groupby('del_month')['off0_on1'].transform(lambda series: map(lambda x: 1 if x == 0 else 0, series))
Edit 2 Not sure if this clarifies things, but ultimately I want pandas to do what the following SQL code does so easily:
select
del_month
, sum(case when off0_on1 = 1 then 1 else 0 end) as on1
, sum(case when off0_on1 = 0 then 1 else 0 end) as off0
from a1
group by del_month
order by del_month
Edit 3 This new question contains the answer I need. Thanks everyone!