532

Sometimes the spaces get URL encoded to the + sign, some other times to %20. What is the difference and why should this happen?

the Tin Man
Muhammad Hewedy
  • 12
    possible duplicate of [URL encoding the space character: + or %20?](http://stackoverflow.com/questions/1634271/url-encoding-the-space-character-or-20) – Cole Johnson Jun 06 '14 at 15:24

5 Answers

513

+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:

http://www.example.com/path/foo+bar/path?query+name=query+value

In this URL, the parameter name is query name with a space and the value is query value with a space, but the folder name in the path is literally foo+bar, not foo bar.

%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately, it's not what urlencode does in PHP (rawurlencode is safer).

See also: HTML 4.01 Specification, application/x-www-form-urlencoded
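
As a rough illustration of the distinction above, here is a minimal Python sketch using the standard library's `urllib.parse`. The sample strings are made up for the example; `quote` stands in for generic percent-encoding and `quote_plus` for form-style encoding.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "foo bar+baz"

# Generic percent-encoding: usable in any part of a URL.
# safe="" means no reserved character is left unescaped.
generic = quote(text, safe="")
print(generic)  # foo%20bar%2Bbaz

# application/x-www-form-urlencoded style: space becomes '+',
# and a literal '+' must therefore become %2B.
form = quote_plus(text)
print(form)  # foo+bar%2Bbaz

# Decoding a path segment: '+' is just a plus sign.
path_segment = unquote("foo+bar")
print(path_segment)  # foo+bar

# Decoding a form-encoded query value: '+' means a space.
query_value = unquote_plus("query+value")
print(query_value)  # query value
```

Note that the `%20` form decodes correctly with either function, which is why it is the safer choice when the context is unknown.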

Roman Snitko
bobince
61

So, the answers here are all a bit incomplete. The use of a '%20' to encode a space in URLs is explicitly defined in RFC3986, which defines how a URI is built. There is no mention in this specification of using a '+' for encoding spaces - if you go solely by this specification, a space must be encoded as '%20'.

The mention of using '+' for encoding spaces comes from the various incarnations of the HTML specification - specifically in the section describing content type 'application/x-www-form-urlencoded'. This is used for posting form data.

Now, the HTML 2.0 Specification (RFC1866) explicitly said, in section 8.2.2, that the Query part of a GET request's URL string should be encoded as 'application/x-www-form-urlencoded'. This, in theory, suggests that it's legal to use a '+' in the URL in the query string (after the '?').

But... does it really? Remember, HTML is itself a content specification, and URLs with query strings can be used with content other than HTML. Further, while the later versions of the HTML spec continue to define '+' as legal in 'application/x-www-form-urlencoded' content, they completely omit the part saying that GET request query strings are defined as that type. There is, in fact, no mention whatsoever about the query string encoding in anything after the HTML 2.0 spec.

Which leaves us with the question - is it valid? Certainly there's a LOT of legacy code which supports '+' in query strings, and a lot of code which generates it as well. So odds are good you won't break if you use '+'. (And, in fact, I did all the research on this recently because I discovered a major site which failed to accept '%20' in a GET query as a space. They actually failed to decode ANY percent encoded character. So the service you're using may be relevant as well.)

But from a pure reading of the specifications, without the language from the HTML 2.0 specification carried over into later versions, URLs are covered entirely by RFC3986, which means spaces ought to be converted to '%20'. And definitely that should be the case if you are requesting anything other than an HTML document.

zgwortz
55

http://www.example.com/some/path/to/resource?param1=value1

The part before the question mark must use percent-encoding (so %20 for a space); after the question mark you can use either %20 or + for a space. If you need a literal + after the question mark, use %2B.
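
A small Python sketch of that rule, assuming a made-up URL with extra parameters added for illustration: the path is decoded with plain percent-decoding, while the query is parsed as form data, where `+` counts as a space.

```python
from urllib.parse import urlsplit, unquote, parse_qs

# Hypothetical URL: the path contains both %20 and a literal '+',
# the query mixes '+', %20 and %2B.
url = ("http://www.example.com/some/path/a%20b+c"
       "?param1=value+1&param2=x%20y&param3=1%2B1")

parts = urlsplit(url)

# Path: only percent-escapes are decoded, '+' is kept as is.
path = unquote(parts.path)
print(path)  # /some/path/a b+c

# Query: parse_qs treats '+' as a space and decodes percent-escapes,
# so %2B is the way to carry a literal plus sign.
query = parse_qs(parts.query)
print(query)  # {'param1': ['value 1'], 'param2': ['x y'], 'param3': ['1+1']}
```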

cerberos
  • 6
    @DaveVandenEynde Why not? – cerberos Jun 24 '14 at 11:53
  • 10
    because it's wrong. It's part of the old application/x-www-form-urlencoded media type that doesn't apply to URLs. Also, `decodeURIComponent` doesn't decode it. – Dave Van den Eynde Jun 24 '14 at 12:03
  • 1
    @DaveVandenEynde "Within the query string, the plus sign is reserved as shorthand notation for a space." http://www.w3.org/Addressing/URL/uri-spec.html#z5. I don't know if this has been deprecated however browsers will have to support plus as a space forever as not doing so would break existing links that use that encoding. – cerberos Jun 24 '14 at 14:03
  • 3
    Yeah it's probably copied over from RFC 1630 and never really was a standard. http://tools.ietf.org/html/rfc3986 is the standard (updated again for IPv6 or something). Sure browsers still "support" it but what does that mean? It's either the server or client code that reads the query string and decodes it, not the browser. The browser simply passes it back and forth, and since the `+` is a *reserved character* it will be preserved by the browser. – Dave Van den Eynde Jun 24 '14 at 14:22
  • 19
    Google uses +'s for spaces in its search URLs (https://www.google.com/#q=perl+equivalent+to+php+urlencode+spaces+as+%2B). – Justin Jun 27 '14 at 16:57
  • 2
    FYI: Rails also encodes spaces as `+` by default (`{ foo: 'bar bar' }.to_query` => `foo=bar+bar`) – wrtsprt Nov 09 '15 at 15:00
  • 2
    @DaveVandenEynde (or anyone who might know) I tend to agree with you - particularly based on an issue I'm dealing with at present - that the plus sign is `part of the old application/x-www-form-urlencoded media type that doesn't apply to URLs`. But is it known why even in the latest Java (8 as of now) in the class `java.net.URLEncoder` _`The space character " " is converted into a plus sign "+"`_ ? And are there other cases where "high rep" software like the Java language enforce anti-standards **instead of** the actual standard (not browsers, as they support + but also the actual standard) ? – SantiBailors May 10 '16 at 15:05
  • 2
    "+" is better, because the URL query is more readable. %20 is just gibberish to regular people compared to + – Mārtiņš Briedis Oct 13 '16 at 18:50
  • @MārtiņšBriedis I agree the URL becomes more readable with `+`. I suppose that's how Google thinks too. I just had a look and they don't URL encode `:` either: https://www.google.se/#q=google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces — I suppose that, too, is also to make the URL a bit more readable. `:` is already in URLs (`http:...`) so probably fairly safe — most other stuff they seem to URL encode though. – KajMagnus Oct 13 '16 at 20:52
  • One potentially overlooked point is that forms posted from (every?) browser encode spaces as pluses. So no matter your preference, the modern browser is choosing to use this. – Brady Moritz May 16 '18 at 18:28
  • 1
    "+" is also 2 bytes shorter than "%20". This could be significant if the URL is being optimized. – Neil Monroe Jun 11 '18 at 15:16
  • The URL encoding definition allows using the + sign for spaces. If you use a regex to decode it, you should simply escape the + in the regex, like this: \+ . You don't need to use the percent-encoded plus sign (%2B), etc. – Znik Nov 13 '18 at 14:02
9

It's better to always encode spaces as %20, not as "+".

It was RFC-1866 (HTML 2.0 specification), which specified that space characters should be encoded as "+" in "application/x-www-form-urlencoded" content-type key-value pairs. (see paragraph 8.2.1. subparagraph 1.). This way of encoding form data is also given in later HTML specifications, look for relevant paragraphs about application/x-www-form-urlencoded.

Here is an example of such a string in URL where RFC-1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, only after "?", spaces can be replaced by pluses, according to RFC-1866. In other cases, spaces should be encoded to %20. But since it's hard to determine the context, it's the best practice to never encode spaces as "+".

I would recommend percent-encoding all characters except the "unreserved" ones defined in RFC-3986, section 2.3:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
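
One way to read that recommendation is the following Python sketch. The helper name `percent_encode` is hypothetical, and the choice to encode the text as UTF-8 before escaping is an assumption (it is the common convention, but the answer does not state it).

```python
import string

# RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def percent_encode(text: str) -> str:
    """Percent-encode every byte that is not an unreserved character.

    Hypothetical helper for illustration; assumes UTF-8 for non-ASCII text.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Two uppercase hex digits, as RFC 3986 recommends.
            out.append(f"%{byte:02X}")
    return "".join(out)

print(percent_encode("foo bar+baz~"))  # foo%20bar%2Bbaz~
print(percent_encode("a/b?c=d"))       # a%2Fb%3Fc%3Dd
```

Because this escapes reserved characters such as `/` and `?` too, it is meant for encoding a single component (a path segment or a query value), not a whole URL.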
Maxim Masiutin
2

What's the difference: See other answers.

When to use + instead of %20? Use + if, for some reason, you want to make the URL query string (?.....) or hash fragment (#....) more readable. Example: You can actually read this:

https://www.google.se/#q=google+doesn%27t+encode+:+and+uses+%2B+instead+of+spaces (%2B = +)

But the following is a lot harder to read: (at least to me)

https://www.google.se/#q=google%20doesn%27t%20oops%20:%20%20this%20text%20%2B%20is%20different%20spaces

I would think + is unlikely to break anything, since Google uses + (see the 1st link above) and they've probably thought about this. I'm going to use + myself, simply because it's readable and Google thinks it's OK.

KajMagnus
  • 8
    I say the "readability" argument is the best defense for '+'. The "google does it" argument is fallacious https://en.wikipedia.org/wiki/Argument_from_authority – FlipMcF Jan 24 '17 at 17:57
  • 2
    @FlipMcF The fallacious argument-from-authority Wikipedia page is about "when an authority is cited on a topic _outside their area of expertise_ or when the authority cited is _not a true expert_" — I think, however, that computers, HTTP and URL encoding is stuff _within_ Google's area of expertise. – KajMagnus Mar 20 '17 at 08:17
  • 3
    @FlipMcF Citing google's behavior, in this case, is a valid argument to using "+" in URLs. It's not that google is an authority, but that google is probably the biggest internet company and if they do something in some way, it is highly unlikely that browsers will one day decide to stop supporting that practice. Also, google chrome is one of the browsers with highest share, and they will support whatever google wants to. All in all, I'd say that no one using "+" instead of "%20" will have a hard time because of that in the foreseeable future. – jdferreira May 13 '17 at 10:29
  • I would love to continue this argument somewhere else, where there is an appeal to popularity to refuse to acknowledge an appeal to authority. At least we can all agree on one thing: '+' is superior to '%20' – FlipMcF May 24 '17 at 16:32
  • @KajMagnus "*I think [that] URL encoding is stuff within Google's area of expertise.*" I trust they have experts on the topic, but just because Google Search does it, does not mean it's valid. They can have a URL like `google.se/?your?query?here` and it would be invalid, but what do they care as long as their servers interpret it correctly and return your search results? It would be more interesting to see how they encode outgoing requests (like in fetching RSS), but even then, it's not a valid argument. Google is huge: no way that the url experts look at every place url encoding is used. – Luc Sep 05 '19 at 09:02
  • @Luc: about *"but what do they care as long as their servers interpret it correctly and return your search results?"* — I'm thinking Google has taken care to ensure other external software works properly with links *to* Google. Meaning, Google likely does things correctly, follows the standards. — Google wants to be easy to link to? Doesn't want to make other external software confused & with broken links to Google's search results, because then Google would get a bit less traffic, and Bing etc would have a small advantage, in being better to link to. – KajMagnus Sep 05 '19 at 20:29
  • 3
    Actually the URL with %20 is a lot easier to read because (desktop) browsers show the decoded URL at the bottom of the window if you move the mouse cursor over the link. Plus signs are displayed unchanged. – Martin Oct 10 '19 at 17:40