
My coworkers and I disagree on whether we should HTML-encode user input before saving it to the DB, or save it exactly as entered.

I also found various answers which say that the DB should store the plain (!) input.

Why? Because the DB should know that the length of what the user typed is 1 for < and not 4 for &lt;

HTML encoding should be done only on output.
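To make the length argument concrete, here is a minimal sketch using Python's standard html module. The sample string and the render_comment helper are made up for illustration and are not taken from any real application.

```python
import html

raw = "a < b"                # what the user typed
encoded = html.escape(raw)   # what the DB would hold if we encoded on input

print(len(raw), len(encoded))  # 5 8

def render_comment(stored: str) -> str:
    """Hypothetical output step: encode only when building the HTML."""
    return f"<p>{html.escape(stored)}</p>"

print(render_comment(raw))  # <p>a &lt; b</p>
```

With the raw value stored, length checks, searches, and non-HTML outputs all see the text the user actually entered.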

But:

Having said that, I see that Stack Overflow does not follow this rule.

When I write a question here on SO that contains a plain <, the preview pane (obviously) shows it as &lt;. But when I submit the question, the content is sent as JSON with HTML encoding applied!

JSON.stringify does not do HTML encoding.

So if I type text containing a plain < into the input and submit it, I see (via Fiddler) that the request actually sends the HTML-encoded value.

Question:

As you can see, I'm a bit confused. Common logic says the DB should store whatever the user typed, 1:1.

Sanitization should be done on output.

Royi Namir

1 Answer


You must encode input sent to the server, because otherwise the anti-Cross-Site-Scripting (XSS) protection on the server will block the entire request. However, you decode this input before saving it to the DB.

In other words, what you see in the POST isn't necessarily what is saved to the database.
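A rough sketch of that decode-before-save step, assuming a Python backend and an in-memory SQLite table. The save_post function and the posts table are hypothetical names; this is not how Stack Overflow or ASP.NET actually implement it.

```python
import html
import sqlite3

def save_post(conn: sqlite3.Connection, posted_value: str) -> str:
    """Decode the HTML-encoded request value, then store the plain text."""
    plain = html.unescape(posted_value)
    # Parameterized query: the value is passed as data, never spliced into SQL.
    conn.execute("INSERT INTO posts (body) VALUES (?)", (plain,))
    return plain

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (body TEXT)")

save_post(conn, "1 &lt; 2")  # what travels in the POST
stored, length = conn.execute(
    "SELECT body, length(body) FROM posts"
).fetchone()
print(stored, length)  # 1 < 2 5
```

The request carried the 8-character encoded form, while the row holds the 5-character text the user typed.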

Craig Stuntz
  • Craig, what is the problem (with all due respect to the XSS library) with sending the input 1:1 to the DB and then, on output, sanitizing and encoding it? I mean, why decode again on the server just because of this layer? – Royi Namir Mar 11 '14 at 12:20
  • Allowing unencoded requests to the server is flat-out dangerous. It's worth the extra steps for the (relatively infrequent!) times you want to allow markup in submissions to ensure that those fields and those fields only contain markup. Additionally, you don't usually allow any HTML at all in the POST. Usually it's only a carefully limited whitelist of tags. So you'll be encoding / decoding anyway. – Craig Stuntz Mar 11 '14 at 17:39
  • It's important to recognize that not all servers have anti-XSS features, and also important to recognize that if you were to rely upon client-side encoding, an attacker could use Fiddler to simply send an unencoded attack string to the server. – EricLaw Mar 11 '14 at 20:01
  • @ericlaw, I didn't suggest relying on client-side encoding! I agree that would not be safe. Rather, I said you should block unencoded requests at the server. – Craig Stuntz Mar 12 '14 at 01:17
  • I still don't see what the problem is with unencoded requests at the server (when inserting into the DB). The DB should store data. Not personal data, or readable data, or safe data, but DATA. When outputting, though, I agree that I should "make this data harmless when displayed". – Royi Namir Mar 12 '14 at 08:03
  • Royi, the problem is that to allow an unencoded POST you need to subvert the anti CSS protection which is on by default in IIS/ASP.NET. Most of the time, you want this. Typically you allow markup in only a couple of fields on the page. It is far safer to keep the Anti CSS feature turned on for the request and decode only the fields where markup is allowed (checking a tag whitelist at the same time) than to turn off the Anti CSS feature for the whole request. – Craig Stuntz Mar 12 '14 at 12:34
  • If you encode the input on the client, you can't reliably encode the input on the server, because it will end up with double-encoded entities. – EricLaw Mar 12 '14 at 15:20
  • Yes, I meant XSS. I disagree with @EricLaw's comment above, though. As soon as you allow any form of markup whatsoever, it is your responsibility to track encoding and decoding. So the fix for double encoding is trivial: Don't do it. The path I recommend for the tiny minority of fields where markup is permitted is: Encode markup field on client. Decode that field only on server and check whitelist. Store unencoded data in DB. Encode all fields for subsequent display. Don't disable Anti-XSS – Craig Stuntz Mar 12 '14 at 15:46
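The path described in the last comment (decode the markup field on the server, check a tag whitelist, store unencoded, encode for display) could be sketched roughly as below. This is only an illustration in Python: ALLOWED_TAGS, TagCollector, decode_markup_field and encode_for_display are hypothetical names, rejecting every attribute is an assumption made to keep the sketch short, and a real application would more likely rely on a vetted sanitizer library.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "code"}  # hypothetical whitelist

class TagCollector(HTMLParser):
    """Records every start tag and whether any tag carries attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: set[str] = set()
        self.has_attributes = False

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)  # HTMLParser lowercases tag names
        if attrs:
            self.has_attributes = True

def decode_markup_field(posted_value: str) -> str:
    """Decode one markup-enabled field and enforce the whitelist."""
    markup = html.unescape(posted_value)
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    disallowed = collector.tags - ALLOWED_TAGS
    if disallowed or collector.has_attributes:
        raise ValueError(f"markup not allowed: {sorted(disallowed)}")
    return markup  # this unencoded value is what goes into the DB

def encode_for_display(stored: str) -> str:
    """Encode a stored plain-text field when writing it into a page."""
    return html.escape(stored)

print(decode_markup_field("&lt;b&gt;hi&lt;/b&gt;"))  # <b>hi</b>
print(encode_for_display("1 < 2"))                  # 1 &lt; 2
# Encoding an already-encoded value produces the double-encoded entity
# mentioned in the comments:
print(html.escape(html.escape("<")))                # &amp;lt;
```

Each value is decoded exactly once on the way in and encoded exactly once on the way out, which is what avoids the double-encoding problem.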