I have started using Google's Dataprep to cleanse eCommerce product feeds. As I receive data from hundreds of eCommerce stores, I want to make the data consistent and standardise the various spellings of brand names. For example, I have a column 'Vendor' with millions of rows in which Adidas is spelt in many different ways:
adidas
Adidas
Adidas classic
Adidas orginals
adidas originals
adidas skateboarding
Adidas Skateboarding
For my purposes, I want to rename all of these variants to 'adidas'. I looked at the various routines in Dataprep, and the Replace function could do the work. However, it is not a scalable solution, because every variant of every brand would need its own replace step.
Is there a way in Dataprep to keep a master file of brand names, look up each 'Vendor' value against it, and replace the incorrect instances with the correct name? In Excel, a simple VLOOKUP might work, and I am wondering whether an equivalent exists in Dataprep.
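To illustrate the behaviour I am after, here is a rough sketch in plain Python (not Dataprep). The mapping contents and the names BRAND_MASTER and normalize_vendor are made up for this example. In practice the mapping would be loaded from the master file.

```python
# Hypothetical master mapping: lowercased variant -> canonical brand name.
# In practice this would be loaded from a master file, not hard-coded.
BRAND_MASTER = {
    "adidas": "adidas",
    "adidas classic": "adidas",
    "adidas orginals": "adidas",   # misspelling as it appears in the feed
    "adidas originals": "adidas",
    "adidas skateboarding": "adidas",
}


def normalize_vendor(vendor, master=BRAND_MASTER):
    """Return the canonical brand for a vendor value, VLOOKUP-style.

    The lookup key is the value with whitespace collapsed and lowercased.
    Values with no entry in the master mapping are returned unchanged.
    """
    key = " ".join(vendor.split()).lower()
    return master.get(key, vendor)


vendors = [
    "adidas",
    "Adidas",
    "Adidas classic",
    "Adidas orginals",
    "adidas originals",
    "adidas skateboarding",
    "Adidas Skateboarding",
]
cleaned = [normalize_vendor(v) for v in vendors]
```

Here every value in the list maps to 'adidas', while a vendor that is missing from the master mapping is left as it is, so it can be reviewed and added to the mapping later.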
I hope the above makes sense, thank you to those who can help.
Craig