I just started coding in Python and want to build a solution where you would search a string to see if it contains a given set of values.
I've find a similar solution in R which uses the stringr library: Search for a value in a string and if the value exists, print it all by itself in a new column
The following code seems to work but i also want to output the three values that i'm looking for and this solution will only output one value:
#Inserting new column
df.insert(5, "New_Column", np.nan)
#Searching old column
df['New_Column'] = np.where(df['Column_with_text'].str.contains('value1|value2|value3', case=False, na=False), 'value', 'NaN')
------ Edit ------
So i realised i didn't give that good of an explanation, sorry about that.
Below is an example where i match fruit names in a string and depending on if it finds any matches in the string it will print out either true or false in a new column. Here's my question: Instead of printing out true or false i want to print out the name it found in the string eg. apples, oranges etc.
import pandas as pd
import numpy as np
text = [('I want to buy some apples.', 0),
('Oranges are good for the health.', 0),
('John is eating some grapes.', 0),
('This line does not contain any fruit names.', 0),
('I bought 2 blueberries yesterday.', 0)]
labels = ['Text','Random Column']
df = pd.DataFrame.from_records(text, columns=labels)
df.insert(2, "MatchedValues", np.nan)
foods =['apples', 'oranges', 'grapes', 'blueberries']
pattern = '|'.join(foods)
df['MatchedValues'] = df['Text'].str.contains(pattern, case=False)
print(df)
Result
Text Random Column MatchedValues
0 I want to buy some apples. 0 True
1 Oranges are good for the health. 0 True
2 John is eating some grapes. 0 True
3 This line does not contain any fruit names. 0 False
4 I bought 2 blueberries yesterday. 0 True
Wanted result
Text Random Column MatchedValues
0 I want to buy some apples. 0 apples
1 Oranges are good for the health. 0 oranges
2 John is eating some grapes. 0 grapes
3 This line does not contain any fruit names. 0 NaN
4 I bought 2 blueberries yesterday. 0 blueberries