I need to extract a specific value from a pandas DataFrame column. The data looks like this:

row        my_column
1          artid=delish.recipe.45064;artid=delish_recipe_45064;avb=83.3;role=4;data=list;prf=i
2          ab=px_d_1200;ab=2;ab=t_d_o_1000;artid=delish.recipe.23;artid=delish;role=1;pdf=true
3          dat=_o_1000;artid=delish.recipe.23;ar;role=56;passing=true;points001

The data is not consistent, but the fields are separated by semicolons, and I need to extract role=x. I can split each value on the semicolon and loop through the pieces to fetch the roles, but I was wondering if there is a more elegant way to solve it. Desired output:

row        my_column
1          role=4
2          role=1
3          role=56

Thank you.

Ruben Helsloot
Chique_Code

2 Answers

This should work:

def get_role(x):
    # Split on ';' and return the first field that starts with 'role='
    fields = x.split(';')
    return [f for f in fields if f.startswith('role=')][0]

df['my_column'] = df['my_column'].apply(get_role)
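
If some rows might not contain a role at all, indexing with [0] would raise an IndexError. A minimal sketch of a more tolerant variant follows; returning None for a missing role is an assumption, not something the question specifies, and the second sample string is made up for illustration:

```python
import pandas as pd

def get_role(value):
    """Return the first 'role=...' field of a ';'-separated string, or None."""
    for field in value.split(";"):
        if field.startswith("role="):
            return field
    return None  # assumption: rows without a role yield a missing value

# The first string is taken from the question; the second is hypothetical.
sample = pd.Series([
    "dat=_o_1000;artid=delish.recipe.23;ar;role=56;passing=true;points001",
    "artid=delish;pdf=true",
])
result = sample.apply(get_role)
```
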
IoaTzimas

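You can use str.extract and put the required pattern inside parentheses, which form the capture group that gets returned. A raw string keeps \d from being treated as a string escape.

df['my_column'] = df['my_column'].str.extract(r'(role=\d+)')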

   row  my_column
0    1     role=4
1    2     role=1
2    3    role=56
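
For anyone who wants to reproduce this, here is a self-contained sketch built from the three sample rows in the question. Two details are additions rather than part of the answer above: expand=False, which makes str.extract return a Series for a single capture group, and the (?:^|;) prefix, which is a guess at making the match start at a field boundary so that a key such as "xrole" would not match.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "row": [1, 2, 3],
    "my_column": [
        "artid=delish.recipe.45064;artid=delish_recipe_45064;avb=83.3;role=4;data=list;prf=i",
        "ab=px_d_1200;ab=2;ab=t_d_o_1000;artid=delish.recipe.23;artid=delish;role=1;pdf=true",
        "dat=_o_1000;artid=delish.recipe.23;ar;role=56;passing=true;points001",
    ],
})

# (?:^|;) is non-capturing, so only the 'role=<digits>' group is returned.
# expand=False gives a Series instead of a one-column DataFrame.
roles = df["my_column"].str.extract(r"(?:^|;)(role=\d+)", expand=False)
df["my_column"] = roles
```

Rows with no matching field come back as missing values rather than raising an error.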
David Erickson