I have sample data that looks like the following (these are two separate rows from a tab-delimited file):
Details
[{'name': 'Irrelevant_Data',
'parentName': 'Irrelevant_Scrape',
'parentId': '2662610',
'id': '2684157'},
{'name': 'Irrelevant_Data',
'parentName': 'Irrelevant_Scrape',
'parentId': '068111',
'id': '291005'}]
[{'name': 'Desired_Data',
'parentName': 'Relevant_Scrape',
'parentId': '6123777',
'id': '31568812'},
{'name': 'Desired_Data2',
'parentName': 'Relevant_Scrape',
'parentId': '6123777',
'id': '2892718'},
{'name': 'Irrelevant',
'parentName': 'Irrelevant_Scrape',
'parentId': '068111',
'id': '8001822'}]
It's stored in a single column of a Pandas DataFrame (let's call the column "Details"). I want to select only those "name" values whose "parentName" in the same entry is "Relevant_Scrape".
I'm familiar with the different data structures in Python and am also somewhat familiar with Pandas, but the combination of the two is throwing me off. When I try to loop through the series, each value comes back as a string rather than a list of dictionaries, which makes extraction much harder.
import ast
import pandas as pd

df = pd.read_csv('dataset.csv', sep='\t')
for row in df['Details']:
    # each cell is read in as a string, so parse it into a list of dicts first
    for item in ast.literal_eval(row):
        if item['parentName'] == 'Relevant_Scrape':
            print(item['name'])
Thank you in advance.
Edit 2: expanded sample
queryName date summary tagging Details
query1 3/31/2016 negative ['Dummy - Dummy'] [{'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2517840', 'id': '2565351'}]
query2 3/26/2016 positive ['Dummy', 'Dummy', 'Dummy'] [{'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2662610', 'id': '2684157'}, {'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2517840', 'id': '2565351'}]
query3 3/26/2016 neutral ['Dummy'] [{'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2662610', 'id': '2684157'}, {'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2517840', 'id': '2565351'}]
query4 3/19/2016 positive ['Dummy', 'Dummy'] [{'name': 'Relevant_Data', 'parentName': 'Relevant_Scrape', 'parentId': '2892458', 'id': '2892601'}, {'name': 'Relevant_Data', 'parentName': 'Relevant_Scrape', 'parentId': '2892458', 'id': '2892718'}, {'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', 'parentId': '2517840', 'id': '2565351'}]
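One possible approach to the expanded sample, sketched under the assumption that every "Details" cell is a valid Python literal (as the rows above suggest): parse the column while reading the file, then collect the matching names into a new column. The names SAMPLE_TSV, relevant_names, and matches are made up for illustration, and only the queryName and Details columns of query1 and query4 are reproduced here.

```python
import ast
import io

import pandas as pd

# Two rows from the expanded sample, cut down to the columns needed here.
# With the real file this would be pd.read_csv('dataset.csv', sep='\t', ...).
SAMPLE_TSV = (
    "queryName\tDetails\n"
    "query1\t[{'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', "
    "'parentId': '2517840', 'id': '2565351'}]\n"
    "query4\t[{'name': 'Relevant_Data', 'parentName': 'Relevant_Scrape', "
    "'parentId': '2892458', 'id': '2892601'}, "
    "{'name': 'Relevant_Data', 'parentName': 'Relevant_Scrape', "
    "'parentId': '2892458', 'id': '2892718'}, "
    "{'name': 'Irrelevant_Data', 'parentName': 'Irrelevant_Scrape', "
    "'parentId': '2517840', 'id': '2565351'}]\n"
)


def relevant_names(details, parent="Relevant_Scrape"):
    """Return the 'name' of every dict whose 'parentName' equals `parent`."""
    return [d["name"] for d in details if d.get("parentName") == parent]


# converters applies ast.literal_eval to each Details cell while reading,
# so the column holds real lists of dicts instead of strings.
df = pd.read_csv(
    io.StringIO(SAMPLE_TSV),
    sep="\t",
    converters={"Details": ast.literal_eval},
)

# One list of matching names per row (empty when nothing matches).
df["relevant_names"] = df["Details"].apply(relevant_names)

# Keep only the rows that contain at least one matching name.
matches = df[df["relevant_names"].map(len) > 0]
print(matches[["queryName", "relevant_names"]])
```

On this cut-down sample only the query4 row should remain, with relevant_names holding its two 'Relevant_Data' entries. ast.literal_eval only accepts Python literals, so it would raise on a malformed cell; if the column were real JSON (double-quoted), json.loads would be the closer fit.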