1

I'm having problems getting data from an HTTP response. Unfortunately, the response comes back with '\n' attached to all the key/value pairs, and json.loads says its input must be a str and not "bytes".

I have tried a number of fixes, so my list of imports might look weird or redundant. Any suggestions would be appreciated.

#!/usr/bin/env python3

import urllib.request
from urllib.request import urlopen
import json
import requests

url = "http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL"
response = urlopen(url)
content = response.read()
print(content)

data = json.loads(content)
info = data[0]
print(info)
# Got this far; the goal is to extract the "id" value, "22144"
martineau
Scott Binkley

3 Answers

3

When it comes to making requests in Python, I personally like to use the requests library. I find it easier to use.

import json
import requests

r = requests.get('http://finance.google.com/finance/info?client=ig&q=NASDAQ,AAPL')
json_obj = json.loads(r.text[4:])

print(json_obj[0].get('id'))

The above solution prints: 22144

The response data has a few unnecessary characters at the start, which is why I load only the relevant (JSON) portion of the response: r.text[4:]. Those leading characters are one reason you couldn't load it as JSON initially.
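If the length of that prefix ever changes, a hard-coded slice such as [4:] would break. The following is only a sketch of one alternative: it looks for the first '[' or '{' and parses from there. The helper name parse_prefixed_json and the sample string are made up for illustration; the sample only mimics the shape of the response described above and is not real output from the service.

```python
import json


def parse_prefixed_json(text):
    """Parse JSON that may be preceded by non-JSON characters (hypothetical helper)."""
    # Find the first character that can open a JSON array or object.
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    if not starts:
        raise ValueError("no JSON array or object found in text")
    return json.loads(text[min(starts):])


# Assumed sample: a newline, two slashes and a space before the JSON array.
sample = '\n// [\n{\n"id": "22144"\n,"t" : "AAPL"\n}\n]\n'
print(parse_prefixed_json(sample)[0].get("id"))
```

With requests, this would be called as parse_prefixed_json(r.text) instead of json.loads(r.text[4:]).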

Muntaser Ahmed
1

A bytes object has a decode() method that converts it to a str. Checking the response in the browser, it seems there are some extra characters at the beginning of the string that need to be removed (a line feed character followed by two slashes: '\n//'). To skip the first three characters of the string returned by decode(), we add [3:] after the method call.

data = json.loads(content.decode()[3:])
print(data[0]['id'])

The output is exactly what you expect:

22144
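To see the two steps separately, here is a small sketch that runs without a network call. The bytes literal is an assumed stand-in for what response.read() returned, built from the prefix described above; it is not captured output.

```python
import json

# Assumed stand-in for response.read(): bytes with the '\n//' prefix.
content = b'\n// [\n{\n"id": "22144"\n}\n]\n'

text = content.decode("utf-8")  # bytes -> str
print(repr(text[:3]))           # inspect the three characters being skipped

# The space left after the slashes is harmless: json.loads ignores
# leading whitespace before the first JSON value.
data = json.loads(text[3:])
print(data[0]["id"])
```

Printing repr() of the first few characters is a quick way to confirm how many characters to skip before hard-coding the slice.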
Ivan Georgiev
-1

JSON says it must be a str and not "bytes".

Your content is "bytes", so decode it to a str before passing it to json.loads, as below.

data = json.loads(content.decode())
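A short sketch of what decode() does and does not solve. It assumes the payload carries the '\n//' prefix mentioned in the other answers; both bytes literals are made-up samples, not real responses.

```python
import json

# Assumed sample with the non-JSON prefix in front of the array.
content = b'\n// [\n{\n"id": "22144"\n}\n]\n'

try:
    json.loads(content.decode())
    decoded_ok = True
except json.JSONDecodeError as exc:
    # The input is a str now, but the prefix is still not valid JSON.
    decoded_ok = False
    print("decode() alone is not enough here:", exc)

# For a payload without the prefix, decoding is all that is needed.
plain = b'[{"id": "22144"}]'
print(json.loads(plain.decode())[0]["id"])
```

So decode() takes care of the str/bytes complaint, and any leading non-JSON characters still have to be removed separately.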
lxyscls