The android.net.Uri documentation states:

"In the interest of performance, this class performs little to no validation. Behavior is undefined for invalid input. This class is very forgiving--in the face of invalid input, it will return garbage rather than throw an exception unless otherwise specified."

The class provides a normalizeScheme() method, whose documentation states:

"This method does not validate bad URI's, or 'fix' poorly formatted URI's - so do not use it for input validation. A Uri will always be returned, even if the Uri is badly formatted to begin with and a scheme component cannot be found."

This makes clear what the class does not do, but it gives no guidance on how to validate a Uri when you need to. Searching Stack Overflow and Google did not turn up anything useful either.

So, if you are receiving Uri data from user input, how should you go about validating it?

nbonbon
  • In general, you don't. I mean, you can try [one of a zillion regex patterns](https://stackoverflow.com/q/161738/115145), most of which have some flaws. Or, you try using it and provide an error response to the user if it doesn't work. – CommonsWare Mar 24 '20 at 19:02
  • @CommonsWare The reason for the question is that a static analysis tool is flagging this as a security concern, since the user could supply paths that present security issues. The problem with just "trying to use it" is that doing so could let an attacker in. – nbonbon Mar 24 '20 at 19:52
  • "Seeing as the user could supply paths that could present security issues" -- none of that has anything to do with "validating a `Uri`", IMHO. Validation usually refers to whether it is syntactically correct. For example, with email addresses, we can validate whether `asdfasd.asdfasd` is an email address based on (flaky) regular expressions. We only know if the address works if we try sending it email. In your case, your concern is not whether the `Uri` is valid, but whether the content returned by your use of that `Uri` is safe to use. That's a significantly different problem (again IMHO). – CommonsWare Mar 24 '20 at 19:57
  • I can see the confusion caused by my choice of wording. The problem still remains, though. What needs to be solved is how to 'normalize' (as java.net.URI calls it) an android.net.Uri, that is, how to deal with characters that could allow path traversal to files the user should not have access to. – nbonbon Mar 24 '20 at 20:10
  • This is the issue I am trying to resolve as reported by Coverity: "Path manipulation vulnerabilities can be addressed by proper input validation. Blacklisting characters that allow unsafe path traversal can improve the safety of the input, but the recommended approach is to whitelist the set of expected characters. This should exclude absolute paths and upward directory traversal." So there is some overlap between the verbiage of validation/normalization/etc. – nbonbon Mar 24 '20 at 20:10
  • "So characters that could allow for path traversal of files that the user should not have access to" -- IMHO, that's the server's job, not the client's. The client has no idea what the user can and cannot access on a server. Regardless, if you want to normalize or canonicalize the URL, `java normalize url` turns up lots of stuff, such as [this](https://stackoverflow.com/q/2993649/115145) and [this](https://github.com/shekhargulati/urlcleaner). – CommonsWare Mar 24 '20 at 20:15
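
The two concerns raised in the comments (syntactic validity, and the path-traversal issue Coverity reports) can be illustrated separately. The following is only a minimal sketch on plain `java.net.URI` and `java.nio.file.Path`, not an `android.net.Uri` API. The class name `UriChecks`, its methods, and the scheme whitelist are hypothetical names chosen for illustration. It assumes the user-supplied path is meant to stay inside a known base directory, and it does not resolve symbolic links.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; none of these names come from the Android SDK.
final class UriChecks {
    // Assumed whitelist of schemes; adjust to what the app actually expects.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https", "file");

    /** Returns the normalized URI, or empty if the input is malformed or uses an unexpected scheme. */
    public static Optional<URI> parseStrict(String input) {
        try {
            // Unlike android.net.Uri, the java.net.URI constructor throws on invalid syntax.
            URI uri = new URI(input).normalize(); // normalize() removes "." and ".." path segments
            String scheme = uri.getScheme();
            if (scheme == null || !ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))) {
                return Optional.empty();
            }
            return Optional.of(uri);
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    /** Resolves a relative user path under baseDir; rejects absolute paths and ".." escapes. */
    public static Optional<Path> resolveInside(Path baseDir, String userPath) {
        try {
            if (Paths.get(userPath).isAbsolute()) {
                return Optional.empty();
            }
            Path base = baseDir.toAbsolutePath().normalize();
            Path candidate = base.resolve(userPath).normalize();
            // Path.startsWith compares whole path components, not raw string prefixes.
            return candidate.startsWith(base) ? Optional.of(candidate) : Optional.empty();
        } catch (InvalidPathException e) {
            return Optional.empty();
        }
    }
}
```

With this sketch, `UriChecks.parseStrict("https://example.com/a/../b")` yields `https://example.com/b`, while `UriChecks.resolveInside(Paths.get("/srv/files"), "../etc/passwd")` is rejected. As noted in the comments, a client-side check like this does not replace access control on the server.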

0 Answers