2

I want to check if a value from a column in one dataframe exists in a column of second dataframe. If it exists then add that value in a new column of the same row in second dataframe. All the values are string values. Both the dataframes are of different size. Also second dataframe has around 700K records. So the dataframes I have:

DF1

THINGS
book+pen
CAR 
chair
laptop

DF2

Description
I want a new book.
I will pen down this things 
A quick ride in my new car.
Cars are awesome.
My laptop's memory is bad.
Maybe try sitting on that CHAIR.

The output I want that is addition of an 'Updated' column:

Description                        Updated
I want a new book.                 book
I will pen down this things        pen
A quick ride in my new car.        car
Cars are awesome.                  car
My laptop's memory is bad.         laptop
Maybe try sitting on that CHAIR.   chair
Search for that book in my laptop. book+laptop

I have tried the brute method but its taking too long to process. Thanks in advance!

  • If you have range-index you can just merge on the index if not you can `df.reset_index()` `df2.reset_index()` and then just `pd.merge(df, df2, left_index=True, right_index=True)` also for more info on that https://stackoverflow.com/questions/40468069/merge-two-dataframes-by-index – Anna Semjén Dec 08 '20 at 13:49

1 Answers1

1

Please try this.

  1. Used str.split and explode to first get a neat list of strings to match from df1
  2. Created two columns with all lower cases for case insensitive match.
  3. used str.findall to retrieve the matched strings between dfs.
  4. Stripped of the paranthesis and quotes using str.strip

Code:

df1 = df1.assign(THINGS=df1['THINGS'].str.split('+')).explode('THINGS')
df1['THINGS2'] = df1.THINGS.str.lower()
df2['Description2'] =  df2.Description.str.lower()
df2['Updated'] = df2.Description2.str.findall('|'.join(df1.THINGS2))
df2['Updated'] = df2.Updated.astype(str).str.strip(to_strip=r'''[|]|\'''')
del df2['Description2']
print(df2)

Prints:

                        Description Updated
0                I want a new book.    book
1      I will pen down this things      pen
2       A quick ride in my new car.     car
3                 Cars are awesome.     car
4           My laptops hangs a lot.  laptop
5  Maybe try sitting on that CHAIR.   chair
sharathnatraj
  • 1,294
  • 3
  • 11