
I'm completely stuck trying to log into this page:

https://login.wyborcza.pl/

I have tried this solution:

import requests

# Fill in your details here to be posted to the login form.
payload = {
    'name': 'username',
    'pass': 'password'
}

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    login_url = "https://login.wyborcza.pl/"
    p = s.post(login_url, data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print p.text

    # the authorised request.
    r = s.get('A protected web page url')
    print r.text
    # etc...

that I found here, but I only get a 400 status.

Thanks for reading.


UPDATE:

Another problem has come up: when I try to fetch this page with requests.get(), I get a message saying that an adblocker is on, and the page content isn't loaded. If I open the page in a browser there's no problem, and all the content loads.

import requests

# Fill in your details here to be posted to the login form.
payload = {
    'username': 'username',
    'password': 'password'
}

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    login_url = "https://login.wyborcza.pl/services/ajax/wyborcza/login"
    p = s.post(login_url, data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    cookiesALL = s.cookies.get_dict()

    s.headers.update({
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9,nb;q=0.8",
        "Cache-Control": "max-age=0",
        "Connection": "keep-alive",
        "Content-Length": "101",
        "Content-Type": "application/x-www-form-urlencoded",
        "Cookie": "SsoSessionPermanent=da7c41fb3ce67a9c36068c8752ecb6f6c595261ec033bef85f5a00a09b992491; _gcl_au=1.1.1603784452.1550874547; __utmc=150475568; __utmz=150475568.1550874547.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _ga=GA1.2.624566896.1550874547; _gid=GA1.2.1373698316.1550874547; _fbp=fb.1.1550874547334.2017607101; __gfp_64b=MOGLe6FatMqipvP6ZL6AdioAq5LZyXL4TZ4CaKZlx8H.U7; customDataLayer_customer=%7B%22state%22%3A%22anonymous%22%2C%22validPeriod%22%3A%22%22%7D; __gads=ID=6024a627e7962b38:T=1550874563:S=ALNI_MY5DVzG-IY0cLZRQFFrv-45kvL9AQ; GazetaPlUser=213A32A242A37k1550874561871; SquidLocalUID=f1b0394447af42427a2985c4; __utma=150475568.624566896.1550874547.1550874547.1550913993.2; __utmb=150475568.0.10.1550913993",
        "Host": "login.wyborcza.pl",
        "Origin": "http://wyborcza.pl",
        "Referer": "http://wyborcza.pl/1,76842,3360710.html",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
    })
    # the authorised request.
    r = s.get('https://wyborcza.pl/1,76842,3360710.html')
    print(r.text)
    # etc...

1 Answer


This script should solve your issue:

import requests

# Fill in your details here to be posted to the login form.
payload = {
    'username': 'username',
    'password': 'password'
}

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    login_url = "https://login.wyborcza.pl/services/ajax/wyborcza/login"
    p = s.post(login_url, data=payload)
    # Collect the cookies set by the login response; SsoSessionPermanent is reused in the Cookie header below.
    cookiesALL = s.cookies.get_dict()

    s.headers.update({
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate",
        "DNT": "1",
        "Connection": "close",
        "Upgrade-Insecure-Requests": "1",
        "Cookie":"SsoSessionPermanent={}; GW_SID=220D44FAAD9071FAC49796195720D348.tomwybo17; ag-rd-params=; __gfp_64b=-TURNEDOFF; customDataLayer_customer=%7B%22state%22%3A%22anonymous%22%2C%22validPeriod%22%3A%22%22%7D; bwGuidv2=b952c7409c97c249520c9e8a; SquidLocalUID=3c9df34214b0a589cf4863b7; wyborczaXYZ=test; test=131A251A253A104k1550911251283; bwVisitId=f5a2c74d1ba13dfde8d36c40".format(cookiesALL['SsoSessionPermanent'])
    })
    # the authorised request.
    r = s.get('https://wyborcza.pl/1,76842,3360710.html')
    print(r.text)
    # etc...

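The problem came from two things in your first attempt: the parameter names in the payload (the form expects 'username' and 'password', not 'name' and 'pass') and the login_url you were sending the POST request to (the login endpoint is https://login.wyborcza.pl/services/ajax/wyborcza/login, not the login page itself).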
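As a minimal sketch of that fix, the helper below posts the credentials under the `username` and `password` field names to the AJAX login endpoint and raises if the server answers with an error status such as the 400 from the question. `login` and `build_payload` are hypothetical names introduced here for illustration, and `session` is assumed to be a `requests.Session` (any object with a compatible `post` method would do).

```python
LOGIN_URL = "https://login.wyborcza.pl/services/ajax/wyborcza/login"


def build_payload(username, password):
    # The field names must match what the login endpoint expects.
    return {"username": username, "password": password}


def login(session, username, password):
    """POST the credentials and raise if the response has an error status."""
    response = session.post(LOGIN_URL, data=build_payload(username, password))
    # requests.Response.raise_for_status() raises on 4xx/5xx responses,
    # so a 400 stops the script here instead of going unnoticed.
    response.raise_for_status()
    return response
```

Called as `login(s, 'username', 'password')` inside the `with requests.Session() as s:` block, this would fail early on a rejected request instead of continuing to the article request. A 2xx status alone does not prove the credentials were accepted, so it is still worth inspecting the response body.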

Hope this helps.

Nazim Kerimbekov
  • Thank you so much, @Fozoro! But another problem has arisen: when I requests.get() this page: http://wyborcza.pl/1,76842,3360710.html I'm told that I should turn off my adblocker, and no content is shown. You can see this even without logging in to the site. You have already helped me a lot, and I don't expect you to spend more time on me. It's only if it's an issue you have come across earlier. Hope you'll have as wonderful a day as you just made mine - thank you! :) – Nicolai Nyströmer Feb 22 '19 at 22:38
  • Really happy it worked! Could you please update your question so I can see more precisely what you are trying to do? Strangely enough, when I visit this link I don't get any adblock message. – Nazim Kerimbekov Feb 22 '19 at 22:50
  • I think this might be due to the absence of a header in your code. I've just edited my code, try it out and keep me updated if it works or not. – Nazim Kerimbekov Feb 22 '19 at 23:23
  • Hmmmm, well I’m off to sleep now. I’ll try doing it tomorrow. I’ll keep you updated! – Nazim Kerimbekov Feb 22 '19 at 23:29
  • Just updated the code now, it should work. Though I'm not sure how flexible this code will be when it comes to visiting other pages on the website. Let me know if you face any issues with it. – Nazim Kerimbekov Feb 23 '19 at 08:48
  • The problem remains. I have updated my code with the header information I get, when I access the log in page. – Nicolai Nyströmer Feb 23 '19 at 10:04
  • Hmm, the header that you are using is completely different from the one I've put in my answer. Could you try running my code? – Nazim Kerimbekov Feb 23 '19 at 10:10
  • I'm sorry - I wasn't very clear about it. I tried to run your updated code, which unfortunately did not work. Gave me the same result as earlier. I updated my code to show the header on my computer, when I entered the login page. Maybe it could bring some clarity to the problem. – Nicolai Nyströmer Feb 23 '19 at 16:07
  • I've updated my code since yesterday. When was the last time you tried my code? Strangely enough, it is working for me. – Nazim Kerimbekov Feb 23 '19 at 16:11
  • I've just tried it again. It doesn't work for me - I still only get a message about AdBlock. Are you able to see the content starting with the title "Wystąpienie Jana Rokity na konwencji PO"? – Nicolai Nyströmer Feb 23 '19 at 16:26
  • Well, I'm getting the full text. Here is a part of the output: `Zastanawiałem się przed rozpoczęciem tej debaty`. When running the code, are you changing the `payload` dictionary? If so, try leaving it as `'username': 'username'` and `'password': 'password'` and let me know what you get. – Nazim Kerimbekov Feb 23 '19 at 16:34
  • Yes, I'm changing the values of the payload dictionary so they use my login credentials. I just ran your code exactly as you posted it, i.e. "username": "username" and "password": "password". The result is the same - it doesn't work. I really want to help you help me, but I'm pretty clueless. – Nicolai Nyströmer Feb 23 '19 at 16:39
  • Sure thing - I'm located in Denmark working on a research project retrieving information from a lot of Polish news articles. adblock occurs 23 times. This is the pastebin: https://pastebin.com/Le2wnE4b – Nicolai Nyströmer Feb 23 '19 at 16:53
  • Ooooh my god, I just found the problem. I forgot that you are using Python 2, which means that it doesn't recognise f-strings (the `f"..."` syntax, which needs Python 3.6 or later). I replaced it with the traditional `.format()`. – Nazim Kerimbekov Feb 23 '19 at 16:54
  • My time to say OMG - it works! I must admit I'm a newcomer in this field - I honestly thought I was using Python 3 but obviously not! Fozoro - you have been more than kind. This is my first question in here, and I'm sincerely overwhelmed by your helpfulness. I will do my best to pay it forward. Now it's time to scrutinize your code in order to fully understand it. Thank you so much! :) – Nicolai Nyströmer Feb 23 '19 at 17:02
  • No worries, really happy we've made it! I'll try adding a few comments to my code to make it more understandable. Wish you all the best with your project. If you encounter any issues with the code (which is quite possible, as I couldn't fully test it without an actual account on the website), feel free to ask. (PS: I could tell that you are on Python 2 from `print r.text`, which is a syntax error on Python 3, where it is written `print(r.text)`.) – Nazim Kerimbekov Feb 23 '19 at 17:07
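
The Python 2 versus Python 3 difference that resolved the thread can be sketched as follows. This is only an illustration under stated assumptions: `cookie_header` is a hypothetical helper and `"example-token"` is a placeholder, not a real cookie value. The point is that `str.format` and a single-argument `print(...)` run on both interpreter versions, whereas an f-string or `print x` ties the script to one of them.

```python
import sys


def cookie_header(sso_value):
    # str.format with "{}" works on Python 2.7 and on Python 3,
    # whereas an f-string is a SyntaxError before Python 3.6.
    return "SsoSessionPermanent={}".format(sso_value)


# print(...) with a single argument behaves the same on Python 2 and 3;
# the statement form `print x` is a SyntaxError on Python 3.
print("Running Python {}.{}".format(sys.version_info[0], sys.version_info[1]))
print(cookie_header("example-token"))  # "example-token" is a placeholder value
```

Running the first `print` line is also a quick way to check which interpreter a script is actually using.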