I've done input sanitizing before to prevent HTML injection on the client and SQL injection on the back-end. I've now reached a point where I store data that is not for any client in particular; it's a generic data store, but I am still responsible for any malicious data consumed from it. A bit unfair if you ask me, but that's life.
I went to the OWASP site and found roughly what I was expecting: nothing. In my head, sanitizing has to target a given grammar/interpreter, and a generic sanitizer isn't viable: you'll either leave holes in the validation, or you'll turn input that is valid in one language into invalid input as the cost of sanitizing for another.
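To make the "one grammar per sanitizer" point concrete, here is a minimal sketch using only Python's standard library. The function names (`for_html`, `for_json`, `store_and_load`) and the sample value are made up for illustration; the idea is that the same raw value needs a different treatment for each interpreter that will eventually read it.

```python
import html
import json
import sqlite3

# Sample payload that is hostile to more than one interpreter at once.
raw = "Robert'); DROP TABLE students;-- <script>alert(1)</script>"


def for_html(value: str) -> str:
    """Encode for an HTML text context."""
    return html.escape(value)


def for_json(value: str) -> str:
    """Encode as a JSON string literal."""
    return json.dumps(value)


def store_and_load(value: str) -> str:
    """Store via a parameterized query, so the SQL parser never sees the value as code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (body TEXT)")
    conn.execute("INSERT INTO items (body) VALUES (?)", (value,))
    (loaded,) = conn.execute("SELECT body FROM items").fetchone()
    conn.close()
    return loaded
```

Each function is safe only for its own target: the HTML-encoded string is not a valid JSON literal, and the JSON literal is not safe to drop into HTML, while the parameterized query stores the value unchanged.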
So I thought about the following options:
Bundle sanitizing
Just run the main sanitizers you can think of, for example HTML and SQL. My concern is that the first sanitizer could corrupt the input for the second, or vice versa.
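A small sketch of how bundling can go wrong, assuming Python's `html.escape` for the HTML step and a deliberately naive, hypothetical `escape_sql_literal` helper (quote doubling) for the SQL step. It is not a recommended way to escape SQL; it is only there to show that the result depends on the order of the sanitizers and that the stored value no longer matches the input.

```python
import html


def escape_sql_literal(value: str) -> str:
    """Hypothetical, naive SQL escaping: double every single quote."""
    return value.replace("'", "''")


original = "O'Reilly & Sons"

# HTML first: the quote becomes an entity, so the SQL step finds nothing to escape.
html_then_sql = escape_sql_literal(html.escape(original))
# -> "O&#x27;Reilly &amp; Sons"

# SQL first: the doubled quote is then turned into two HTML entities.
sql_then_html = html.escape(escape_sql_literal(original))
# -> "O&#x27;&#x27;Reilly &amp; Sons"
```

The two orders give different stored values, and a consumer that only undoes the HTML encoding of the second one gets `O''Reilly & Sons` rather than the original text.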
Whitelisting
At some point this would cause a conflict, if a given language requires a character that I don't allow for security reasons.
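The whitelist conflict can be sketched like this, assuming a hypothetical allowed set of letters, digits, spaces and a few punctuation marks. The pattern and the `is_allowed` name are illustrative only, not a recommended whitelist.

```python
import re

# Hypothetical whitelist: letters, digits, space, and a few "safe" punctuation marks.
ALLOWED = re.compile(r"[A-Za-z0-9 .,_-]*")


def is_allowed(value: str) -> bool:
    """Return True only if every character of the value is in the whitelist."""
    return ALLOWED.fullmatch(value) is not None


samples = [
    "plain text 123",      # accepted
    "O'Reilly",            # rejected: legitimate name containing a single quote
    '{"key": "value"}',    # rejected: JSON needs braces, quotes and colons
    "a < b",               # rejected: legitimate text containing an angle bracket
]

results = {sample: is_allowed(sample) for sample in samples}
```

Every rejected sample is harmless for at least one consumer, so widening the whitelist to accept it reopens the hole for another consumer.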
Any thoughts? Is this just a trade-off in the end?