This is my dataframe:

C1 | C2 |... email_id | subject  | sender | recipient
   |    |             | congrats | x      | y
   |    |             | congrats | z      | y
   |    |             | congrats | x      | y
   |    |             | meeting  | x      | y

Desired output:

   C1  | C2 |... email_id  | subject  | sender | recipient
       |    |       0      | congrats | x      | y
       |    |       1      | congrats | z      | y
       |    |       0      | congrats | x      | y
       |    |       2      | meeting  | x      | y

For every unique combination of subject, sender and recipient, I want to assign an email_id. I got the unique triples like this:

df1 = df.drop_duplicates(subset=['sender','recipient','subject'])

To assign the values, this is what I am doing:

sender = df1.sender
recipient = df1.recipient
subject = df1.subject
n = 0
for i in sender:
     df.loc[df["subject"] == str(i) and df["subject"] == str(i) , "email_id"] = n
     n=n+1

This isn't the correct way to go about it. How do I add an "and" condition here? Edit: In short, I want to assign the same number to identical triplets.
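
For reference, here is a minimal sketch of two ways this might be done, assuming pandas is available and the frame has only the three key columns shown (the sample data below is rebuilt from the table above; `keys`, `triples`, `mask` and `email_id_loop` are illustrative names, not from my real code). The first uses `groupby(..., sort=False).ngroup()`, which numbers groups in order of first appearance. The second keeps the loop but combines the boolean masks with `&` and parentheses, since Python's `and` does not work element-wise on Series.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumption:
# the other columns C1, C2, ... are irrelevant to the numbering).
df = pd.DataFrame({
    "subject": ["congrats", "congrats", "congrats", "meeting"],
    "sender": ["x", "z", "x", "x"],
    "recipient": ["y", "y", "y", "y"],
})

keys = ["subject", "sender", "recipient"]

# Option 1: number each unique triple directly.
# sort=False numbers the groups in order of first appearance.
df["email_id"] = df.groupby(keys, sort=False).ngroup()

# Option 2: the original loop, with an element-wise "and".
# Each comparison is wrapped in parentheses and joined with &.
df["email_id_loop"] = -1
triples = df.drop_duplicates(subset=keys)
for n, (subj, snd, rcp) in enumerate(
    zip(triples["subject"], triples["sender"], triples["recipient"])
):
    mask = (
        (df["subject"] == subj)
        & (df["sender"] == snd)
        & (df["recipient"] == rcp)
    )
    df.loc[mask, "email_id_loop"] = n

print(df)
```

Both columns should end up identical on this sample. The `groupby` version avoids the Python-level loop, so it is probably the better fit for a large frame.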
