Various answers on SO mention that you should escape the ampersand, greater-than, and less-than symbols. Even &ndash; and &mdash; should be escaped, as far as I understood.

Source: Do I really need to encode '&' as '&amp;'? Check out the answers in there!

Can anyone show me exactly how security can be breached, or how cookie stealing can happen, if I do not escape the symbols I have mentioned? It does not make sense to me that people can hack websites because of this.

Mahesh
  • Please downvote by mentioning the reason :( – Mahesh Nov 24 '15 at 05:49
  • I didn't downvote you ... but what the heck are you talking about? Could you please cite a reference link that says "you should escape ampersand"? To give us some context in which that statement was made? – paulsm4 Nov 24 '15 at 05:50
  • haha, alright! That's what I have thought "what the heck". Let me put a link in there – Mahesh Nov 24 '15 at 05:51
  • Read the answers, they gave generic replies on why one should escape. They said cookie stealing and all .. can you check – Mahesh Nov 24 '15 at 05:56
  • *"Read the answers, they gave generic replies..."* - @Mahesh: Perhaps you did not read that thread thoroughly enough. Did you see this answer - http://stackoverflow.com/a/3493425/1355315 ? – Abhitalks Nov 24 '15 at 06:04
  • This is fundamentally not a security issue. It's a matter of writing validating HTML, and avoiding possible anomalies in how your HTML behaves. The link you yourself cite provides all the information you could possibly want about the issue. – torazaburo Nov 24 '15 at 06:05
  • @abhi I'm talking about this part: "which is a huge problem for user-submitted data, which could very well lead to HTML and script injection, cookie stealing and other exploits." It's in the answer you have mentioned – Mahesh Nov 24 '15 at 06:08
  • @Mahesh: That line is prefixed with *"you might also not be escaping tag delimiters..."*. That is out of context from ampersand thingy. – Abhitalks Nov 24 '15 at 06:12
  • oh alright! Thanks for the info – Mahesh Nov 24 '15 at 06:14
  • @Mahesh: Also regarding the security on script injection, that answer talks about other things. For example, you could get data which contains ` – Abhitalks Nov 24 '15 at 06:17

1 Answer

If your question is "should I always use &amp; (and never a bare &)?", then yes.

If for no other reason than "good style".

Here's why:

HTML comes from SGML, and SGML/HTML have a notion of "entities", which are delimited in SGML text by "&" .. ";".

The ampersand character & must itself be written as an entity (&amp;), to differentiate it from the start of an entity. So must the HTML brackets < and > (&lt; and &gt; respectively). And so on.

Other HTML entities are simply defined for "convenience", such as &copy; or &euro;.
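
To make the ambiguity concrete, here is a minimal sketch in Python. It uses the standard library's `html` module as a stand-in for an HTML parser (browsers apply their own rules, especially inside attribute values), and the sample strings and the `escape_text` helper are made up for illustration.

```python
import html

def escape_text(text: str) -> str:
    """Escape &, < and > so the text can be placed in HTML content.

    quote=False leaves " and ' untouched; that is enough for text
    between tags, though not for attribute values.
    """
    return html.escape(text, quote=False)

# A bare "&" followed by a known entity name is read as that entity,
# even without the closing ";".
ambiguous = "Fish &copy Chips"
decoded_ambiguous = html.unescape(ambiguous)
print(decoded_ambiguous)   # the "&copy" has turned into a copyright sign

# Escaping the ampersand first removes the ambiguity.
escaped = escape_text(ambiguous)
print(escaped)             # Fish &amp;copy Chips
print(html.unescape(escaped) == ambiguous)   # True: the text round-trips

# The tag delimiters are handled the same way.
markup = escape_text("AT&T sells <b>phones</b> & more")
print(markup)
```

Escaping every literal ampersand means you never have to check whether the text after it happens to match an entity name.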

Here is a complete list of W3C-conforming, HTML5 entities:

PS:

As torazaburo noted above, this is "fundamentally not a security issue". It's merely the way HTML works ;)
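
The comments separate the ampersand question from a different case: user-submitted data whose tag delimiters are not escaped. A rough sketch of that case, assuming a hypothetical `render_comment` helper that drops user text into a page (neither the helper nor the payload comes from the linked thread):

```python
import html

def render_comment(user_text: str, escape: bool = True) -> str:
    """Wrap user-submitted text in a paragraph element.

    With escape=False the text is inserted verbatim, so any markup
    in it becomes part of the page.
    """
    body = html.escape(user_text) if escape else user_text
    return f"<p>{body}</p>"

# Hypothetical user input containing a script element.
payload = "<script>alert(1)</script>"

unsafe = render_comment(payload, escape=False)
safe = render_comment(payload)

print(unsafe)  # <p><script>alert(1)</script></p>
print(safe)    # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

In the first result a browser would treat the script element as real markup and run it. In the second it shows the characters as plain text. That is the injection scenario the comments refer to, and it concerns < and > in untrusted data, not a stray & in your own markup.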

paulsm4