
I can retrieve the output, but I cannot filter the returned data using Python 3.x.

It returns a list of results in a particular structure, and I only want the htmlSnippet and htmlTitle values from each result.

from googleapiclient.discovery import build
import pprint

my_api_key = "xxx"
my_cse_id = "xxx"


def google_search(search_term, api_key, cse_id, **kwargs):
    service = build("customsearch", "v1", developerKey=api_key)
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    return res['items']


results = google_search(
    'mehkeme', my_api_key, my_cse_id, num=10)

# this is the htmlSpinnets and also htmlTitle
newDict = dict()
# Iterate over all the items in dictionary and filter items which has even keys
for (key, value) in results.items():
    if key == 'htmlSpinnet':
        newDict[key] = value

print('Filtered Dictionary : ')
print(newDict)

# for result in results:
#     pprint.pprint(result)

It raises this error:

/Users/valizadavali/PycharmProjects/webScrape/venv/bin/python /Users/valizadavali/PycharmProjects/webScrape/googleCustomSearch.py
Traceback (most recent call last):
  File "/Users/valizadavali/PycharmProjects/webScrape/googleCustomSearch.py", line 20, in <module>
    for (key, value) in results.items():
AttributeError: 'list' object has no attribute 'items'

Without filtering, each result looks like this. I only need the htmlSnippet and htmlTitle values:

{'cacheId': 'fGQCNF9pc6cJ',
 'displayLink': 'azvision.az',
 'formattedUrl': 'https://azvision.az/.../mehkeme-huquq-sisteminde-islahatlar-derinlesdirilir-- '
                 'ferman--.html',
 'htmlFormattedUrl': 'https://azvision.az/.../mehkeme-huquq-sisteminde-islahatlar-derinlesdirilir-- '
                     'ferman--.html',
 'htmlSnippet': '3 Apr 2019 ... Prezident İlham Əliyev məhkəmə-hüquq '
                'sistemində islahatların dərinləşdirilməsi\n'
                'haqqında fərman imzalayıb.',
 'htmlTitle': 'Məhkəmə-hüquq sistemində islahatlar dərinləşdirilir -',
 'kind': 'customsearch#result',
 'link': 'https://azvision.az/news/174505/mehkeme-huquq-sisteminde-islahatlar-derinlesdirilir--ferman--.html'}

barny
Vali Valizada

1 Answer

You can also change your request to Google to ask specifically for 'htmlSnippet' and 'htmlTitle' only, by using the 'fields' parameter:

...&fields=items(htmlTitle,htmlSnippet)...

That might make the returned results easier to parse?

See this link for more info: fields
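In Python this could look roughly like the sketch below. It assumes the google-api-python-client accepts `fields` as a keyword argument on `cse().list(...)` (it is one of the standard query parameters, but I have not verified it against this exact API). The names `google_search_fields` and `pick_keys` are made up for illustration. Note that `res['items']` is a list of dicts, which is why calling `.items()` on it fails; `pick_keys` shows the client-side filter by looping over that list.

```python
def google_search_fields(search_term, api_key, cse_id, **kwargs):
    """Ask the API for only htmlTitle and htmlSnippet of each result item."""
    # Imported lazily so pick_keys() below works without the client installed.
    from googleapiclient.discovery import build

    service = build("customsearch", "v1", developerKey=api_key)
    res = service.cse().list(
        q=search_term,
        cx=cse_id,
        fields="items(htmlTitle,htmlSnippet)",  # assumed to be passed through as-is
        **kwargs,
    ).execute()
    # 'items' is missing when there are no results, so fall back to an empty list.
    return res.get("items", [])


def pick_keys(items, keys=("htmlTitle", "htmlSnippet")):
    """Client-side alternative: keep only the wanted keys of each result dict."""
    return [{k: item[k] for k in keys if k in item} for item in items]
```

With the original `google_search` from the question, `pick_keys(results)` would give a list with one small dict per result.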

pipja
  • I will try this out. Could you please look at my unanswered question? https://stackoverflow.com/questions/58218685/using-http-request-get-output-of-each-manually-search-automatically – Vali Valizada Oct 09 '19 at 10:25
  • I'm not really well versed in Python, so I can't really help you. I just spotted that you only want 2 properties out of the search results, so I suggested this solution (which we use extensively in production and which works perfectly). – pipja Oct 09 '19 at 23:13
  • Thanks anyway. What about searching multiple keywords? Say we create a search list, and for each keyword it returns the output of that search as a list of dictionaries. Did you get what I mean? – Vali Valizada Oct 10 '19 at 04:32
  • I didn't quite get you there. If you have multiple keywords, then maybe you could run one API query per keyword and then aggregate the results into a dictionary? – pipja Oct 11 '19 at 00:53
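
A minimal sketch of that last suggestion (one query per keyword, aggregated into a dictionary) might look like this. `search_many` and its `search_fn` argument are hypothetical names; `search_fn` stands for any callable that takes a keyword and returns the list of result dicts, for example a small wrapper around the `google_search` function from the question.

```python
def search_many(keywords, search_fn):
    """Run search_fn once per keyword and collect the results by keyword."""
    results = {}
    for keyword in keywords:
        # Treat a missing result (None) as an empty list.
        items = search_fn(keyword) or []
        # Keep only the two fields of interest from every result dict.
        results[keyword] = [
            {
                "htmlTitle": item.get("htmlTitle"),
                "htmlSnippet": item.get("htmlSnippet"),
            }
            for item in items
        ]
    return results
```

It could then be called as `search_many(['mehkeme', 'ferman'], lambda kw: google_search(kw, my_api_key, my_cse_id, num=10))`, where the second keyword is only an example. Each keyword costs one API query, so a long list can use up the daily quota quickly.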