19

I am running very complex python-selenium tests on non-public webpages. In most cases these tests run fine, but sometimes one of these tests fail during the initialization of the webdriver itself.

Hint: This error happens when trying to initialize a webdriver, i.e. when doing something like this:

# Start of the tests
mydriver =  webdriver.Firefox(firefox_profile=profile, log_path=logfile)
# ERROR HAPPENS HERE

# Doing other stuff here
....
# Doing tests here
....
# Doing shutdown here
mydriver.quit()

Here is a full example of such an error:

___________ ERROR at setup of TestSuite.test_synaptic_events_fitting ___________

>   lambda: ihook(item=item, **kwds),
    when=when,
            )

/usr/local/lib/python2.7/dist-packages/flaky/flaky_pytest_plugin.py:273: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
conftest.py:157: in basedriver
    mydriver = firefox.get_driver(*args)
bsp_usecase_tests/tools/firefox.py:44: in get_driver
    driver = webdriver.Firefox(firefox_profile=profile, log_path=logfile)  #### INITIALIZING OF WEBDRIVER HERE
/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py:158: in __init__
    keep_alive=True)
/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py:154: in __init__
    self.start_session(desired_capabilities, browser_profile)
/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py:243: in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py:311: in execute
    self.error_handler.check_response(response)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <selenium.webdriver.remote.errorhandler.ErrorHandler object at 0x7efd3b702f90>
response = {'status': 500, 'value': '{"value":{"error":"unknown error","message":"connection refused","stacktrace":"stack backtra...s::imp::thread::{{impl}}::new::thread_start\n                        at /checkout/src/libstd/sys/unix/thread.rs:84"}}'}

    def check_response(self, response):
        """
            Checks that a JSON response from the WebDriver does not have an error.

            :Args:
             - response - The JSON response from the WebDriver server as a dictionary
               object.

            :Raises: If the response contains an error message.
            """
        status = response.get('status', None)
        if status is None or status == ErrorCode.SUCCESS:
            return
        value = None
        message = response.get("message", "")
        screen = response.get("screen", "")
        stacktrace = None
        if isinstance(status, int):
            value_json = response.get('value', None)
            if value_json and isinstance(value_json, basestring):
                import json
                try:
                    value = json.loads(value_json)
                    if len(value.keys()) == 1:
                        value = value['value']
                    status = value.get('error', None)
                    if status is None:
                        status = value["status"]
                        message = value["value"]
                        if not isinstance(message, basestring):
                            value = message
                            message = message.get('message')
                    else:
                        message = value.get('message', None)
                except ValueError:
                    pass

        exception_class = ErrorInResponseException
        if status in ErrorCode.NO_SUCH_ELEMENT:
            exception_class = NoSuchElementException
        elif status in ErrorCode.NO_SUCH_FRAME:
            exception_class = NoSuchFrameException
        elif status in ErrorCode.NO_SUCH_WINDOW:
            exception_class = NoSuchWindowException
        elif status in ErrorCode.STALE_ELEMENT_REFERENCE:
            exception_class = StaleElementReferenceException
        elif status in ErrorCode.ELEMENT_NOT_VISIBLE:
            exception_class = ElementNotVisibleException
        elif status in ErrorCode.INVALID_ELEMENT_STATE:
            exception_class = InvalidElementStateException
        elif status in ErrorCode.INVALID_SELECTOR \
                or status in ErrorCode.INVALID_XPATH_SELECTOR \
                or status in ErrorCode.INVALID_XPATH_SELECTOR_RETURN_TYPER:
            exception_class = InvalidSelectorException
        elif status in ErrorCode.ELEMENT_IS_NOT_SELECTABLE:
            exception_class = ElementNotSelectableException
        elif status in ErrorCode.ELEMENT_NOT_INTERACTABLE:
            exception_class = ElementNotInteractableException
        elif status in ErrorCode.INVALID_COOKIE_DOMAIN:
            exception_class = InvalidCookieDomainException
        elif status in ErrorCode.UNABLE_TO_SET_COOKIE:
            exception_class = UnableToSetCookieException
        elif status in ErrorCode.TIMEOUT:
            exception_class = TimeoutException
        elif status in ErrorCode.SCRIPT_TIMEOUT:
            exception_class = TimeoutException
        elif status in ErrorCode.UNKNOWN_ERROR:
            exception_class = WebDriverException
        elif status in ErrorCode.UNEXPECTED_ALERT_OPEN:
            exception_class = UnexpectedAlertPresentException
        elif status in ErrorCode.NO_ALERT_OPEN:
            exception_class = NoAlertPresentException
        elif status in ErrorCode.IME_NOT_AVAILABLE:
            exception_class = ImeNotAvailableException
        elif status in ErrorCode.IME_ENGINE_ACTIVATION_FAILED:
            exception_class = ImeActivationFailedException
        elif status in ErrorCode.MOVE_TARGET_OUT_OF_BOUNDS:
            exception_class = MoveTargetOutOfBoundsException
        elif status in ErrorCode.JAVASCRIPT_ERROR:
            exception_class = JavascriptException
        elif status in ErrorCode.SESSION_NOT_CREATED:
            exception_class = SessionNotCreatedException
        elif status in ErrorCode.INVALID_ARGUMENT:
            exception_class = InvalidArgumentException
        elif status in ErrorCode.NO_SUCH_COOKIE:
            exception_class = NoSuchCookieException
        elif status in ErrorCode.UNABLE_TO_CAPTURE_SCREEN:
            exception_class = ScreenshotException
        elif status in ErrorCode.ELEMENT_CLICK_INTERCEPTED:
            exception_class = ElementClickInterceptedException
        elif status in ErrorCode.INSECURE_CERTIFICATE:
            exception_class = InsecureCertificateException
        elif status in ErrorCode.INVALID_COORDINATES:
            exception_class = InvalidCoordinatesException
        elif status in ErrorCode.INVALID_SESSION_ID:
            exception_class = InvalidSessionIdException
        elif status in ErrorCode.UNKNOWN_METHOD:
            exception_class = UnknownMethodException
        else:
            exception_class = WebDriverException
        if value == '' or value is None:
            value = response['value']
        if isinstance(value, basestring):
            if exception_class == ErrorInResponseException:
                raise exception_class(response, value)
            raise exception_class(value)
        if message == "" and 'message' in value:
            message = value['message']

        screen = None
        if 'screen' in value:
            screen = value['screen']

        stacktrace = None
        if 'stackTrace' in value and value['stackTrace']:
            stacktrace = []
            try:
                for frame in value['stackTrace']:
                    line = self._value_or_default(frame, 'lineNumber', '')
                    file = self._value_or_default(frame, 'fileName', '<anonymous>')
                    if line:
                        file = "%s:%s" % (file, line)
                    meth = self._value_or_default(frame, 'methodName', '<anonymous>')
                    if 'className' in frame:
                        meth = "%s.%s" % (frame['className'], meth)
                    msg = "    at %s (%s)"
                    msg = msg % (meth, file)
                    stacktrace.append(msg)
            except TypeError:
                pass
        if exception_class == ErrorInResponseException:
            raise exception_class(response, message)
        elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
            raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
>       raise exception_class(message, screen, stacktrace)
E       WebDriverException: Message: connection refused

/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py:237: WebDriverException

These tests are running as part of a jenkins plan inside a docker container, to ensure the exact same environment all the time. Here is a list of the used packages and their versions:

  • python 2.7.12
  • pytest 3.6.1
  • selenium 3.8.0
  • geckodriver 0.19.1
  • firefox 62.0
  • flaky 3.4.0

The error roughly appears in about 1% of all tests. There are about 15 different tests, and the error seem to appear randomly (i.e. not always the same test).

Is this a bug in firefox/selenium/geckodriver? And is there a way to fix this?

The following code snippet is not some code I am using! It is just an idea of how to fix my problem described above. Is this maybe a good way to solve my original problem, or not?

while counter<5:
    try:
        webdriver = webdriver.Firefox(firefox_profile=profile, log_path=logfile) 
        break
    except WebDriverException:
        counter +=1

Is there a better way to do this?

Alex
  • 34,021
  • 64
  • 178
  • 371
  • Try updating Selenium (3.14.0) and geckodriver (0.22.0). There where a lot of changes, including in the initial connection. – Guy Sep 26 '18 at 09:00
  • Not involve any asynchronous or mulitprocessing right? And have you `close()` and `quit()` the webdriver once you finished one test? – MT-FreeHK Sep 27 '18 at 04:31
  • Yes; I do not involve any multiprocessing and I use 'quit' only... – Alex Sep 27 '18 at 05:38
  • @Alex Can you update the question with the complete error stack trace please? – DebanjanB Sep 27 '18 at 09:40
  • Use `quit` only is right for firefox (just find it). By the way, I can't reproduce the problem by simply send tons of requests to google.com, are you sure the problem is not from your testing page? – MT-FreeHK Sep 27 '18 at 10:13
  • I have posted the complete stack trace (as far as I know), and I have not loaded any webpage at this moment. I am just trying to get the webdriver!!! – Alex Sep 27 '18 at 12:12
  • I still can't reproduce your issue with older version of geckodriver(0.19.1).... That strange, that's the test file I used for, can you take a look if any difference except I used `multiprocessing`. https://gist.github.com/MichaelKHTai/b1607b4371fb85d10027cd5e956aee86 – MT-FreeHK Sep 28 '18 at 01:35
  • Me neither, it happens in rare cases only. In like 95% of all cases (the EXACT same test with the EXACT(?) same environment) it works fine!!! – Alex Sep 28 '18 at 06:40
  • @Alex , and what about `geckodriver.log`? Can you share the log with exception? – Andersson Sep 28 '18 at 11:49
  • @Alex I can't be certain if your case is the same but I've had similar issues with selenium and firefox/chrome in the past. After a lot of digging (which included tinkering with the selenium package itself) it turned out that the multiple selenium browser instances were eating up all my server's RAM and CPU util. If you are in fact running multiple instances at a time, it would explain why this only shows up in rare cases. I suggest that you monitor your system's/server's RAM while your code is running to confirm if this is the problem. – Keegan Oct 03 '18 at 05:42

1 Answers1

10

This error message...

{'status': 500, 'value': '{"value":{"error":"unknown error","message":"connection refused","stacktrace":"stack backtra...s::imp::thread::{{impl}}::new::thread_start\n at /checkout/src/libstd/sys/unix/thread.rs:84"}}'}

...implies that the GeckoDriver was unable to initiate/spawn a new WebBrowsing Session i.e. Firefox Browser session.

In a comment within the discussion DELETE '/session/{session id}' no longer working @andreastt mentions that:

geckodriver is implicitly ending the (previous) session when the last window closes. If driver.quit() is called as the subsequent command it will fail because the session has already been implicitly deleted.

In these cases GeckoDriver should detect that the session has been closed implicitly after driver.close() or ignore the response from driver.quit() in case the session has already been closed.

In such cases the following trace logs are generated:

1505753594121   webdriver::server   DEBUG   Last window was closed, deleting session
1505753594121   webdriver::server   DEBUG   Deleting session
1505753594121   geckodriver::marionette DEBUG   Stopping browser process
1505753594364   webdriver::server   DEBUG   <- 200 OK {"value": []}
1505753594523   webdriver::server   DEBUG   -> DELETE /session/a8312282-af00-4931-94d4-0d401abf01c9 
1505753594524   webdriver::server   DEBUG   <- 500 Internal Server Error {"value":{"error":"session not created","message":"Tried to run command without establishing a connection","stacktrace":"stack backtrace:\n   0:           0x4f388c - backtrace::backtrace::trace::h736111741fa0878e\n   1:           0x4f38c2 - backtrace::capture::Backtrace::new::h63b8a5c0787510c9\n   2:           0x442c61 - webdriver::error::WebDriverError::new::hc4fe6a1ced4e57dd\n   3:           0x42a926 - <webdriver::server::Dispatcher<T, U>>::run::hba9181b5aacf8f04\n   4:           0x402c59 - std::sys_common::backtrace::__rust_begin_short_backtrace::h19de262639927233\n   5:           0x40c065 - std::panicking::try::do_call::h6c1659fc4d01af51\n   6:           0x5e38ec - panic_unwind::__rust_maybe_catch_panic\n                        at /checkout/src/libpanic_unwind/lib.rs:98\n   7:           0x420d32 - <F as alloc::boxed::FnBox<A>>::call_box::h953e5f59694972c5\n   8:           0x5dc00b - alloc::boxed::{{impl}}::call_once<(),()>\n                        at /checkout/src/liballoc/boxed.rs:661\n                         - std::sys_common::thread::start_thread\n                        at /checkout/src/libstd/sys_common/thread.rs:21\n                         - std::sys::imp::thread::{{impl}}::new::thread_start\n                        at /checkout/src/libstd/sys/unix/thread.rs:84"}}
1505753594533   webdriver::server   DEBUG   -> DELETE /session/a8312282-af00-4931-94d4-0d401abf01c9 
1505753594542   webdriver::server   DEBUG   <- 500 Internal Server Error {"value":{"error":"session not created","message":"Tried to run command without establishing a connection","stacktrace":"stack backtrace:\n   0:           0x4f388c - backtrace::backtrace::trace::h736111741fa0878e\n   1:           0x4f38c2 - backtrace::capture::Backtrace::new::h63b8a5c0787510c9\n   2:           0x442c61 - webdriver::error::WebDriverError::new::hc4fe6a1ced4e57dd\n   3:           0x42a926 - <webdriver::server::Dispatcher<T, U>>::run::hba9181b5aacf8f04\n   4:           0x402c59 - std::sys_common::backtrace::__rust_begin_short_backtrace::h19de262639927233\n   5:           0x40c065 - std::panicking::try::do_call::h6c1659fc4d01af51\n   6:           0x5e38ec - panic_unwind::__rust_maybe_catch_panic\n                        at /checkout/src/libpanic_unwind/lib.rs:98\n   7:           0x420d32 - <F as alloc::boxed::FnBox<A>>::call_box::h953e5f59694972c5\n   8:           0x5dc00b - alloc::boxed::{{impl}}::call_once<(),()>\n                        at /checkout/src/liballoc/boxed.rs:661\n                         - std::sys_common::thread::start_thread\n                        at /checkout/src/libstd/sys_common/thread.rs:21\n                         - std::sys::imp::thread::{{impl}}::new::thread_start\n                        at /checkout/src/libstd/sys/unix/thread.rs:84"}}
1505753594549   webdriver::server   DEBUG   -> GET /shutdown 
1505753594551   webdriver::server DEBUG <- 404 Not Found {"value":{"error":"unknown command","message":"GET /shutdown did not match a known command","stacktrace":"stack backtrace:\n 0: 0x4f388c - backtrace::backtrace::trace::h736111741fa0878e\n 1: 0x4f38c2 - backtrace::capture::Backtrace::new::h63b8a5c0787510c9\n 2: 0x442d88 - webdriver::error::WebDriverError::new::hea6d4dbf778b2b24\n 3: 0x43c65f - <webdriver::server::HttpHandler<U> as hyper::server::Handler>::handle::hd03629bd67672697\n 4: 0x403a04 - std::sys_common::backtrace::__rust_begin_short_backtrace::h32e6ff325c0d7f46\n 5: 0x40c036 - std::panicking::try::do_call::h5f902dc1eea01ffe\n 6: 0x5e38ec - panic_unwind::__rust_maybe_catch_panic\n at /checkout/src/libpanic_unwind/lib.rs:98\n 7: 0x4209a2 - <F as alloc::boxed::FnBox<A>>::call_box::h032bafb4b576d1cd\n 8: 0x5dc00b - alloc::boxed::{{impl}}::call_once<(),()>\n 

Though the error codes for the error you are seeing is 'status': 500 and the error sample I have provided is 404 Not Found, apparently looks different, the core reason is similar as:

"message":"connection refused" 

due to:

imp::thread::{{impl}}::new::thread_start

from:

/checkout/src/libstd/sys/unix/thread.rs:84

From another perspective, while you use GeckoDriver, Selenium and Firefox ensure that the binaries are compatible as follows:

Geckodriver Releases


Analysis

There had been significant changes in the geckodriver binary since the availability of geckodriver 0.19.1. A couple of the changes are:

  • GeckoDriver v0.22.0 (2018-09-15):
    • The HTTP status codes used for [script timeout] and [timeout] errors has changed from Request Timeout (408) to Internal Server Error (500) in order to not break HTTP/1.1 Keep-Alive support, as HTTP clients interpret the old status code to mean they should duplicate the request.
    • The HTTP/1.1 Keep-Alive timeout for persistent connections has been increased to 90 seconds.
    • An [invalid session ID] error is now returned when there is no active session.
    • The handshake when geckodriver connects to Marionette has been hardened by killing the Firefox process if it fails.
    • The handshake read timeout has been reduced to 10 seconds instead of waiting forever.
  • GeckoDriver v0.21.0 (2018-06-15):
    • WebDriver commands that do not have a return value now correctly return {value: null} instead of an empty dictionary.
    • Force use of IPv4 network stack.
    • On certain system configurations, where localhost resolves to an IPv6 address, geckodriver would attempt to connect to Firefox on the wrong IP stack, causing the connection attempt to time out after 60 seconds. We now ensure that geckodriver uses IPv4 consistently to both connect to Firefox and for allocating a free port.

  • GeckoDriver v0.20.1 (2018-04-06):
    • Avoid attempting to kill Firefox process that has stopped.
    • With the change to allow Firefox enough time to shut down in 0.20.0, geckodriver started unconditionally killing the process to reap its exit status. This caused geckodriver to inaccurately report a successful Firefox shutdown as a failure.

  • GeckoDriver v0.20.0 (2018-03-08):
    • Backtraces from geckodriver no longer substitute for missing Marionette stacktraces.
    • The Firefox process is now given ample time to shut down, allowing enough time for the Firefox shutdown hang monitor to kick in.
    • Firefox has an integrated background monitor that observes long-running threads during shutdown. These threads will be killed after 63 seconds in the event of a hang. To allow Firefox to shut down these threads on its own, geckodriver has to wait that time and some additional seconds.


Solution

  • Upgrade Selenium to current levels Version 3.14.0.
  • Upgrade GeckoDriver to GeckoDriver v0.22.0 level.
  • Upgrade Firefox version to Firefox v62.0.2 levels.
  • If your base Web Client version is too old, then uninstall it through Revo Uninstaller and install a recent GA and released version of Web Client.
  • Always invoke driver.quit() within tearDown(){} method to close & destroy the WebDriver and Web Client instances gracefully.
  • Execute your Test as a non-root user.

Update

As per your question update of-coarse you can induce a loop for multiple trials to initialize the selenium webdriver instance as follows:

  • Ensure that there is no dangling instances of geckodriver by invoking taskkill command (WindowsOS specific) as follows:

    os.system("taskkill /f /im geckodriver.exe /T") 
    
  • Ensure that there is no dangling instances of geckodriver by invoking kill() command (Cross platform) as follows:

    from selenium import webdriver
    import psutil
    from selenium.common.exceptions import WebDriverException
    
    for counter in range(5):
        try:
            webdriver = webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe') 
            print("WebDriver and WebBrowser initialized ...")
            break
        except WebDriverException:
            #Cross platform
            PROCNAME = "geckodriver"
            for proc in psutil.process_iter():
                # check whether the process name matches
                if proc.name() == PROCNAME:
                    proc.kill()        
            print("Retrying ...")
    print("Out of loop ...")
    
DebanjanB
  • 118,661
  • 30
  • 168
  • 217
  • @Alex It's pretty clear from your code trials that you are using a loop as in `while counter<5` to initiate the _webdriver_ process which warrants a **clean** `quit()` – DebanjanB Sep 27 '18 at 12:54
  • No. That loop code snippet is just a *suggestion* of a possible solution. I am asking if that loop code snippet is a good way to fix my original problem... – Alex Sep 27 '18 at 13:01
  • @Alex I don't have a _Test Environment_ matching to your requirement with _selenium 3.8.0_, _geckodriver 0.19.1_ and _firefox 62.0_. So the effort was to construct an answer based on the minimal logs and code trials you have provided. – DebanjanB Sep 27 '18 at 13:16
  • Yes. I did already update the selenium version and the geckodriver number, but it is hard to say that it works now (or better), as this error only appears in 1% of all cases or even more rarely... – Alex Sep 27 '18 at 13:26
  • @Alex Checkout my answer update and let me know the status. – DebanjanB Oct 01 '18 at 08:15
  • Thanks for the update, I will try it out, but it might take a while (~1 week) until I know that the error rate is well below 1% or 5% ... – Alex Oct 01 '18 at 08:21
  • I am still getting some errors at the very basic level. This time: "connection refused" – Alex Oct 11 '18 at 07:37
  • 1
    running quit() before opening new connections solved my issue working with RSelenium – tim.farkas Aug 13 '20 at 17:21