
I am building a tool in Go that needs to make a very large number of simultaneous HTTP requests to many different servers. My initial prototype in Python had no problem doing a few hundred simultaneous requests.

However, I have found that in Go, once the number of simultaneous requests exceeds ~30-40, some of them almost always fail with errors like `Get http://www.google.com: dial tcp 216.58.205.228:80: i/o timeout`.

I've tested on macOS and openSUSE, on different hardware, in different networks, and with different domain lists; changing the DNS server as described in other Stack Overflow answers does not help either.

The interesting thing is that the failed requests do not even produce a packet, as can be seen when checking with Wireshark.

Is there anything I am doing wrong, or is this a bug in Go?

Minimal reproducible program below:

package main

import (
    "fmt"
    "net/http"
    "sync"
)

func main() {
    domains := []string{/* large domain list here, eg from https://moz.com/top500 */}

    limiter := make(chan string, 50) // Limits simultaneous requests

    wg := sync.WaitGroup{} // Needed to not prematurely exit before all requests have been finished

    for i, domain := range domains {
        wg.Add(1)
        limiter <- domain

        go func(i int, domain string) {
            defer func() { <-limiter }()
            defer wg.Done()

            resp, err := http.Get("http://" + domain)
            if err != nil {
                fmt.Printf("%d %s failed: %s\n", i, domain, err)
                return
            }
            defer resp.Body.Close() // close the body to avoid leaking connections and file descriptors

            fmt.Printf("%d %s: %s\n", i, domain, resp.Status)
        }(i, domain)
    }

    wg.Wait()
}

Two particular errors occur: a net.DNSError that does not make any sense, and a nondescript poll.TimeoutError:

&url.Error{Op:"Get", URL:"http://harvard.edu", Err:(*net.OpError)(0xc00022a460)}
&net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*net.DNSError)(0xc000aca200)}
&net.DNSError{Err:"no such host", Name:"harvard.edu", Server:"", IsTimeout:false, IsTemporary:false}

&url.Error{Op:"Get", URL:"http://latimes.com", Err:(*net.OpError)(0xc000d92730)}
&net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:net.Addr(nil), Err:(*poll.TimeoutError)(0x14779a0)}
&poll.TimeoutError{}

Update:

Running the requests with a separate http.Client as well as a custom http.Transport and net.Dialer makes no difference, as can be seen when running the code from this playground.

Neverbolt
  • You are making all requests with http.DefaultClient. What happens when you distribute the requests over a few independent HTTP clients? Perhaps the connection pool is limited to some number of connections. – Peter Aug 25 '18 at 14:05
  • I reworked your code (https://play.golang.org/p/HnKdFG5roj-) and yes, I also find some results rather suspicious. Not sure why it would not resolve web.mit.edu / fda.gov / geocities.jp / clickbank.net. However, IMHO it is not related to the concurrency rate. – mh-cbon Aug 25 '18 at 15:25
  • Also found this along the road: `2018/08/25 17:24:53 Unsolicited response received on idle HTTP channel starting with "HTTP/1.0 408 Request Time-out\r\nServer: AkamaiGHost\r\nMime-Version: 1.0\r\nDate: Sat, 25 Aug 2018 15:24:53 GMT\r\nContent-Type: text/html\r\nContent-Length: 218\r\nExpires: Sat, 25 Aug 2018 15:24:53 GMT\r\n\r\n…Request Timeout…The server timed out while waiting for the browser's request.…Reference #2.3ff90a17.1535210693.0…"; err=` – mh-cbon Aug 25 '18 at 15:26
  • @Peter see the update, it does not make a difference – Neverbolt Aug 26 '18 at 14:51
  • @mh-cbon Have you tried lowering the concurrency? With ~5-10 concurrent requests it runs without problems. – Neverbolt Aug 26 '18 at 14:53
  • 1
    yes, it is very similar to my previous tests, 40 failures or so. Still some i dont quiet because dig resolves them. Even `googleusercontent.com` constantly fails. See also https://github.com/golang/go/issues/18588. I ran it on 1.10, i have not took time to switch to 1.11 yet, might worth the test. – mh-cbon Aug 26 '18 at 16:52
  • @mh-cbon that issue seems pretty much like what is happening here, thank you – Neverbolt Aug 27 '18 at 12:39
  • @Neverbolt Hey did you solve your problem? – CriticalRebel Feb 27 '20 at 17:55
  • @CriticalRebel No, I did not. I reduced the number of parallel requests and chose to work with multiple instances of the same program. This seems to point to the open file limit mentioned in the issue mh-cbon linked, which is not yet resolved from a Go standard library standpoint. – Neverbolt Mar 27 '20 at 09:57
  • @Neverbolt, there is a good chance that the DNS server is causing your bottleneck. Google [explicitly states](https://developers.google.com/speed/public-dns/docs/security#rate_limit) it will alter the queries per second per client if it thinks something odd is going on. I cannot imagine it is the only DNS provider that has this defensive measure built in. A way to test this is overriding the default [DNS Resolver](https://koraygocmen.medium.com/custom-dns-resolver-for-the-default-http-client-in-go-a1420db38a5d) to use a cache like [here](https://stackoverflow.com/a/40252460/1987437). – Liam Kelly May 25 '21 at 13:42
  • @LiamKelly As I said in the question, taking a python client to do the very same thing did not result in any performance issues, so I don't think that the DNS server is the bottleneck, as both were using the same server. – Neverbolt May 26 '21 at 14:06
  • @Neverbolt there is a good chance that the Python code is just slower given the GIL. Surprised that there is no DNS tool to measure QPS. Seems pretty straightforward to do in `gopacket`, but probably even more useful to implement via `ebpf`. – Liam Kelly May 27 '21 at 13:27

0 Answers