Do bots actually set the referer url?

Question

From here and here, I was under the impression that bots don't set the referer URL.

But I just discovered otherwise, unless the situations are different. We have this JavaScript call:

<script>aCallToToWebApiEndAndUpdateDom(params)</script>

On the API end, we create a user session to log views, and along with it we log the userAgent and urlReferrer. Oh boy, I just found a record with the following:

url referer: the actual page visited
user agent : Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Am I missing a point or two? Is this normal behavior? And if I want to log only human visits, is it a matter of detecting bots explicitly instead of checking for an empty referer?

Normal users can block their referrer too. Or can inject some malicious stuff, you cannot rely on that at all. — emix, Mar 29 '18 at 08:08
Most of the request headers can be set to almost anything by the bot, you can never really rely on any information in the headers, unless you correlate it with something server side :) — scagood, Mar 29 '18 at 08:34
@scagood how would you correlate the headers server side? userAgent for example. — waitforit, Mar 29 '18 at 08:42
I mean things like the [Authorization Header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization) as you can actively check that's valid — scagood, Mar 29 '18 at 08:43
I'm not sure I get that. Here I'm talking about all visits (authenticated or not); in short, authentication is not a requirement here. — waitforit, Mar 29 '18 at 08:51
I'm just saying you cannot trust anything in the request directly. :D — scagood, Mar 29 '18 at 08:57
I would imagine that bots can do whatever they choose. There might be some convention, but nothing which says it's guaranteed. — ADyson, Mar 29 '18 at 09:04
A bot has no requirement to adhere to standards or best practices beyond those that allow it to function. Some may be advanced enough to interpret JavaScript. Others may be so simplified they leave out headers altogether. Any attempt to differentiate client types should be treated as a rough guide only, because the only source of information you have to base your numbers on can be forged. — Gary Ott, Mar 29 '18 at 09:09
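
To make the point in the comments concrete, here is a minimal sketch of filtering on the User-Agent instead of on an empty referer. It is only a rough guide: `isLikelyBot`, `shouldLogAsHumanVisit`, and the pattern list are hypothetical names and values chosen for illustration, not part of the original API, and both headers can be forged by the client.

```javascript
// Hypothetical helper: guess whether a request comes from a bot.
// The pattern list is illustrative, not exhaustive, and the
// User-Agent header can be set to anything by the client.
const BOT_PATTERN = /bot|crawler|spider|slurp/i;

function isLikelyBot(userAgent) {
  // Assumption: a missing or empty User-Agent is treated as a bot.
  if (!userAgent) return true;
  return BOT_PATTERN.test(userAgent);
}

// Hypothetical decision function for the view log.
// It deliberately ignores the Referer header: humans may send none,
// and bots that execute JavaScript may send the real page URL.
function shouldLogAsHumanVisit(headers) {
  return !isLikelyBot(headers["user-agent"]);
}
```

With the record from the question, `shouldLogAsHumanVisit` would return `false` even though the referer is populated, because the decision depends only on the User-Agent string. A well-behaved crawler such as Googlebot identifies itself this way; a bot that spoofs a browser User-Agent would still get through.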

0 Answers