
I googled my user agent and put it into my program, but no luck:

import requests
from bs4 import BeautifulSoup
URL = 'Servicenow blah blah'
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
}

page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

Very simple code so far.

Ultimately I am trying to log in to this website (or even circumvent that by using a user agent that is already logged in, if that is possible; this is my main question here) and then parse the HTML for a certain element so I can monitor it for changes.

Or, if there is a better, simpler tool for this, I would LOVE to know.

In the HTML that is printed I'm seeing "Your session has expired", etc.

JesseC

1 Answer


First, a user agent is not typically how session data is tracked; it only tells the website which browser and version you are using. Session data is typically kept in your cookies.
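
To make the difference concrete, here is a minimal sketch that builds a request without sending it and prints both headers. The cookie name `sessionid`, its value, and the `example.com` URL are made up for illustration; a real site chooses its own cookie names and sets them in its login response.

```python
import requests

session = requests.Session()

# The User-Agent only describes the client software.
session.headers.update({
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0"
})

# Hypothetical session cookie; normally the server sets this after login.
session.cookies.set("sessionid", "abc123", domain="example.com", path="/")

# Prepare (but do not send) a request to inspect what would go over the wire.
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/page")
)

print(prepared.headers["User-Agent"])  # identifies the browser
print(prepared.headers.get("Cookie"))  # identifies the logged-in session
```

Copying a browser's User-Agent string therefore does not copy its login; the server decides who you are from the Cookie header.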

For the login issue, it sounds like you just need to perform the login request and keep track of the cookies it returns. However, since you said "monitor for changes", I suspect there may be some JavaScript involved down the line ;) I recommend looking into Selenium for this. It's a browser driver, which means it drives a normal browser and takes care of all JavaScript execution and cookie tracking for you!
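
If the page turns out not to need JavaScript, the requests-only route could look roughly like the sketch below. Everything site-specific here is an assumption: `LOGIN_URL`, `PAGE_URL`, and the `username`/`password` form field names are placeholders, and many login forms also expect a hidden token or go through single sign-on, in which case a plain form post like this will not be enough. The helper names `fetch_logged_in_page` and `element_fingerprint` are mine, not part of any library.

```python
import hashlib
from typing import Optional

import requests
from bs4 import BeautifulSoup

# Hypothetical values: take the real URL and form field names from the
# login form's HTML or from the browser's network tab.
LOGIN_URL = "https://example.com/login"
PAGE_URL = "https://example.com/page"


def fetch_logged_in_page(username: str, password: str) -> str:
    """Log in, then fetch PAGE_URL with the cookies the login response set."""
    with requests.Session() as session:
        # The Session stores any cookies set by this response...
        login = session.post(
            LOGIN_URL,
            data={"username": username, "password": password},
            timeout=30,
        )
        login.raise_for_status()
        # ...and sends them back automatically on later requests.
        page = session.get(PAGE_URL, timeout=30)
        page.raise_for_status()
        return page.text


def element_fingerprint(html: str, element_id: str) -> Optional[str]:
    """Return a SHA-256 hex digest of one element's HTML, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id=element_id)
    if element is None:
        return None
    return hashlib.sha256(str(element).encode("utf-8")).hexdigest()
```

To monitor for changes, call `fetch_logged_in_page` on a schedule, pass the result to `element_fingerprint` with the id of the element you care about, and compare the digest with the one from the previous run. Hashing only that element means unrelated changes elsewhere on the page do not trigger a false alarm.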

Rob P
  • Awesome, thank you so much! I have been using Selenium for all my browser automation for a while, but I wanted to try something with a little less overhead if possible, and to switch things up. But Selenium is amazing and I think it will be the best solution. I think you are right :). I actually already have a solution written up! – JesseC May 07 '20 at 11:59
  • No worries! And you are right, requests + BeautifulSoup has way less overhead and is much more performant. Unfortunately, getting around the issues caused by dynamic sites can be a bit trickier. – Rob P May 07 '20 at 12:04