I am trying to send JSON requests to scrape an infinite-scroll element on this page. I know some parameters are probably not necessary, but to be sure, I define exactly the same parameters that are sent to the server. My parameters and code are:

import requests
import json
parameters1 = {'ticker':'XOM', 'countryCode':'US',
           'dateTime':'12%3A38+p.m.+Oct.+24%2C+2016', 'docId':'',
           'docType':'806','sequence':'e5a00f51-8821-4fbc-8ac6-e5f64b5eb0f2',
           'messageNumber':'8541','count':'10',
          'channelName':'%2Fnews%2Flatest%2Fcompany%2Fus%2Fxom', 'topic':'',
           '_':'1479888927416' }



parameters2 = {'ticker':'XOM', 'countryCode':'US',
           'dateTime':'12%3A38+p.m.+Oct.+24%2C+2016','docId':'',
           'docType':'806'  ,'sequence':'e5a00f51-8821-4fbc-8ac6-e5f64b5eb0f2', 
           'messageNumber':'8525','count':'10',
           'channelName':'%2Fnews%2Flatest%2Fcompany%2Fus%2Fxom', 'topic':'',
           '_':'1479888927417' }


firstUrl = "http://www.marketwatch.com/news/headline/getheadlines"
html1 = requests.get(firstUrl, params = parameters1)
result1 = (json.loads(html1.text))

html2 = requests.get(firstUrl, params = parameters2)
result2 = (json.loads(html2.text))

Then I check whether the two results are the same:

if result2 == result1:
    print(True)

The answer is always True. I changed many of the parameters and it made no difference. What is wrong with my code or with the procedure I am following?

vtni
mk_sch

1 Answer

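Your problem is that the values you pass are already URL-encoded. requests encodes the params itself, so they end up encoded twice (for example, %2F is sent as %252F). Pass the plain strings instead: use / instead of %2F, a space instead of +, : instead of %3A, and so on. You can decode your strings, for example, on this site.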
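If you would rather not decode by hand, the standard library can do it. The following is a minimal sketch, assuming Python 3; the dictionary name `encoded` is only illustrative, and the two values are copied from the question.

```python
from urllib.parse import unquote_plus

# Values copied from the question; they are already URL-encoded.
encoded = {
    'dateTime': '12%3A38+p.m.+Oct.+24%2C+2016',
    'channelName': '%2Fnews%2Flatest%2Fcompany%2Fus%2Fxom',
}

# unquote_plus turns %XX escapes back into characters and '+' into a space.
decoded = {key: unquote_plus(value) for key, value in encoded.items()}

print(decoded['dateTime'])     # 12:38 p.m. Oct. 24, 2016
print(decoded['channelName'])  # /news/latest/company/us/xom
```

The decoded strings are what should go into the params dictionary.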

import requests
import json

# Plain (decoded) values: requests URL-encodes them when it builds the query string.
parameters1 = {'ticker': 'XOM', 'countryCode': 'US',
               'dateTime': '12:38 p.m. Oct. 24, 2016', 'docId': '',
               'docType': '806', 'sequence': 'e5a00f51-8821-4fbc-8ac6-e5f64b5eb0f2',
               'messageNumber': '8541', 'count': '10',
               'channelName': '/news/latest/company/us/xom', 'topic': '',
               '_': '1479888927416'}

# Same request, but with a different messageNumber (and timestamp).
parameters2 = {'ticker': 'XOM', 'countryCode': 'US',
               'dateTime': '12:38 p.m. Oct. 24, 2016', 'docId': '',
               'docType': '806', 'sequence': 'e5a00f51-8821-4fbc-8ac6-e5f64b5eb0f2',
               'messageNumber': '8525', 'count': '10',
               'channelName': '/news/latest/company/us/xom', 'topic': '',
               '_': '1479888927417'}

firstUrl = "http://www.marketwatch.com/news/headline/getheadlines"
html1 = requests.get(firstUrl, params=parameters1)
result1 = json.loads(html1.text)

html2 = requests.get(firstUrl, params=parameters2)
result2 = json.loads(html2.text)

Then result1 == result2 will be False.
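To see why the original values did not work, it can help to look at the query string that gets built from them. This is a rough sketch that uses the standard library's `urlencode` as a stand-in for the encoding step requests applies to params (an assumption made so the example runs offline); the variable names are illustrative.

```python
from urllib.parse import urlencode

# Value as written in the question: already URL-encoded.
already_encoded = {'channelName': '%2Fnews%2Flatest%2Fcompany%2Fus%2Fxom'}
# Value as written in this answer: plain text.
plain = {'channelName': '/news/latest/company/us/xom'}

# Encoding an already-encoded value escapes each '%' again as '%25'.
double_encoded_query = urlencode(already_encoded)
single_encoded_query = urlencode(plain)

print(double_encoded_query)  # channelName=%252Fnews%252Flatest%252Fcompany%252Fus%252Fxom
print(single_encoded_query)  # channelName=%2Fnews%2Flatest%2Fcompany%2Fus%2Fxom
```

With real requests you can check the same thing by printing html1.url after the call and looking for %25 in the query string.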

vtni