Base64 as method of sanitizing user input for Mysql

Question

I would assume that the only way to guarantee that a user input is mysql safe is to use either a whitelist where you only allow mysql safe input, or by encoding the input into something that will always return mysql safe. Base64 seams to me like one candidate for such an encoding, along with mysql_escape_string... Is there any reason besides user readability that one should not use Base64 for input sanitation? Is it considered bad practice? I also have a sense that all the filtering of user data to ensure proper safety could have been easier to achieve if the problem was handled by mysql rather then relying on the individual programmer to ensure that the data is safe... For example by using individual functions rather than a query based string, so that the command and user input remains separate throughout the call...

Like: mysql_query("SELECT * FROM Users WHERE username=","$userinput");

Alternatively one could use encoding:

$input = encode($_POST['UserInput']);
mysql_query("SELECT * FROM Users WHERE username='$input'");

No, don't use base64 encoding as a way to close SQL Injection vulnerabilities. It's not a problem that needs to be solved. The earliest commercial relational databases (DB2, Oracle, Sybase) provided _prepared statements_ with _bind variables_. And MySQL has provided the "escape string" function since way back, version 3.x at least. The real problem is developers ignoring SQL injection vulnerabilities in the code they write "oh, this is just for development", and continue promoting bad coding practices. Best practice is prepared statements with bind variables. Next best is real_escape_string. — spencer7593, Jun 12 '14 at 04:06

score 4 · Accepted Answer · edited May 23 '17 at 11:49

Don't "sanitize" input as a means to prevent SQL Injection⁽¹⁾ - use placeholders (or proper escaping), always. Be consistent. Be safe. The problem is already solved.

This case will be "safe" due to the limited domain of the base64_encode function. However..

Is there any reason besides user readability that one should not use Base64 for input sanitation? Is it considered bad practice?

It is bad practice and storing base64-encoded values (such that the shown query may work²) carries several negative implications as it changes the information stored: it destroys the ordering of values, makes the information not trivially-searchable, requires an additional "encode/decode" step, and even consumes more space - ouch!

Thus, while there may be specific cases to base64-encode data, this approach is not well-suited as a means to mitigate SQL Injection.

.. proper safety could have been easier to achieve if the problem was handled by mysql rather then relying on the individual programmer to ensure that the data is safe..

The problem is due to accessing SQL via a text protocol where the query command/shape and values are intermixed. The use of correct escaping techniques (e.g. mysql_real_escape_string) fixes this by ensuring that the information is escaped so that the SQL text is parsed as intended - however, unlike a base64-encode step it does not actually change the information supplied!

For example by using individual functions rather than a query based string, so that the command and user input remains separate throughout the call...

This is exactly what placeholders provide! Placeholders are the universally correct approach and should be encouraged. Placeholders allow the query and values to be sent to the database separately when supported by the library/database; and are emulated by escaping otherwise. Correct placeholder usage eliminates SQL Injection and the need for user code to intermix values into SQL command text, which can also make queries easier to write and maintain.

To prevent "individual programmers" from writing terrible queries, the solution is to prevent ad-hoc queries from being scattered around in code: collect the data-access operations into a Data Access Layer (DAL) (possibly in conjunction with an ORM) and only expose relevant actions, ensuring proper use of SQL within the DAL. In simpler projects the DAL is also a suitable location to centrally manage the business rules for sanitation¹ and other validation logic.

¹More accurately:

Sanitize values for business rules; this should prevent "bad information", such as a username that is too short, contains restricted characters, or otherwise fails to meet business requirements.
Use placeholders to prevent SQL Injection. This is strictly related to transfering the data to SQL and has no bearing on the information therein.

² While MySQL 5.6.1 adds FROM_BASE64, such that the encoding might simply be used in the SQL command text, this still adds an additional explicit decode step and complicates the query when using such an encoding scheme. This base64 approach is simply not necessary, as there are already proven techniques to prevent SQL injection, and was not proposed in the initial question.

I agree. Base64 is far from an ideal option. However, escaping data is still a form of "non-invasive" encoding to keep user input from mixing with the actual query. If mysql required query data and user data to be separate by for example supplying them in separate arguments in the query function, or any other isolated variable, this would make the query safer by nature. Yes, the queries would have to be structured very differently to allow for multiple separate arguments in one function, in stead of one large string... I do however agree that if used properly, sql injection is already solved. — Daniel Valland, Jun 12 '14 at 13:27
@DanielValland I disagree, as I've explained in the 3rd paragraph - using base64 *changes* the information so the database is now filled with junk. If not using placeholders (the best solution), the correct alternative is to use a function specifically designed to encode the values per SQL string literal rules (i.e. mysql_real_escape_string). Such a function prevents SQL injection when used consistently and avoids all of the negative issues associated with the base64 proposal as it does *not* change the information after the SQL is parsed. — user2864740, Jun 12 '14 at 16:56
@DanielValland While using `"SELECT * FROM Users WHERE username = FROM_BASE64('$base64Input')"` would avoid the issues associated with *storing* the base64-encoded data it is an unneeded non-standard approach that complicates the queries and is neither safer nor advantageous compared to using a proper escaping (i.e. mysql_real_escape_string) function which *is* a standard, albeit legacy, technique for this task. But alas, just use mysqli/PDO and *placeholders*. Placeholders solve the problem of SQL Injection cleanly and grants the wish of not muddling the query shape and data. — user2864740, Jun 12 '14 at 17:03

score 2 · Answer 2 · answered Jun 12 '14 at 03:24

2

If you don't want to use a prepared statement, as simple method is to use real_escape_string

This a standard method for protecting mysql from user input. There's no need to reinvent the wheel.

answered Jun 12 '14 at 03:24

David Fairbanks

572
5
17

score 2 · Answer 3 · answered Jun 12 '14 at 04:54

No, don't use base64 encoding as a way to close SQL Injection vulnerabilities. Firstly, SQL Injection is not a problem that needs to be solved; it's been solved already.

It was already solved in the earliest releases of commercial relational databases (e.g. DB2, Oracle, Sybase) All of those provided interfaces that supported prepared statements using placeholders for bind values.

And MySQL has provided the "real escape string" function since aleast since at least version 3.x, fifteen years ago (back in 2000.)

The real problem is developers ignoring SQL injection vulnerabilities in the code they write. When these vulnerabilities are pointed out, the most common response seems to be "Oh, this code is just for learning. I don't need to be concerned with that right now." And they learn by crafting code that is vulnerable, and they adopt these bad coding practices. They share their newly gained know-how, and examples of bad coding practices proliferate, and new programmers learn from those bad examples, and adopt the same bad coding practices.

The best practice to avoid SQL Injection vulnerabilities is to use prepared statements with bind values. That does EXACTLY what you propose in your question, separating the static SQL text which is known to be "safe", from the potentially "unsafe" values supplied. That's the right way to write the code.

(This discussion doesn't address the issue of inserting potentially unsafe values into the database, and how bad code retrieves those values and handles them as if they were "safe", rather than handling them as if they were potentially "unsafe".)

You'd be quite right to point out that earlier versions of MySQL didn't support real "server side" prepared statements; the client library emulated support for prepared statements. The client library accepted the prepared statements and the bind values, and bundled those together into SQL text it sent to the server, but at least the framework was there in the interface. The MySQL client library took care of "escaping" the bind values to make them safe for inclusion in the SQL text, but that responsibility was handled by the client library, not by the caller.)

Some MySQL interfaces (in particular, the deprecated and ever popular PHP mysql interface) does not provide support for prepared statements and bind variables. But it does provide the next best thing, access to the mysql_real_escape_string function. The documentation explains that this function should be used, but you wouldn't know it's use was at all important, based on the overwhelming volume of code examples that don't use it.

If you have to use the PHP mysql_ interface, then do it right, use the mysql_real_escape_string function BEFORE including potentially unsafe strings within SQL text.

$sqltext = "SELECT * FROM Users WHERE username= '" . mysql_real_escape_string($userinput) . "'"; 
mysql_query($sqltext);

If you don't have to use the deprecated mysql interface, then use mysqli or PDO. Both of those interface support prepared statements and bind variables. (Not that you would ever know that these actually provide support for these features, based on the sheer volume of code examples that entirely ignore these features, and blithely continue the same bad coding practices that plague the examples of mysql interface. (Some developers seem to think that mysqli and PDO "fix" SQL injection vulnerabilities, without understanding that its the code that they write that has the SQL Injection vulnerabilities, not the interface they use.

It would be unfair to single out PHP and MySQL as the sole source of SQL injection vulnerabilities. These same vulnerabilities are prevalent in numerous examples of vulnerable code, written using Microsoft Visual Basic, Visual C++, C#, et al.

Bottom line, base64 encoding is NOT the solution to SQL Injection vulnerabilities. (There may be some good reasons to use base64 encoding, but decidedly, avoiding SQL Injection is NOT one of those reasons.)

dkellner · Answer 4 · 2019-11-17T00:45:57.980

Yes, it will do the job

Technically, you could do this:

    $name = "Joe Smith";
    $b64_name = base64_encode($name);
    $sql = "select * from users where username=from_base64('$b64_name');";

...and it would work. Quite to the point, actually: you prevented the string to break out of its place. You convert it, and mysql converts it back. It's a minimalistic way to protect stuff without changing the whole DB access infrastructure. Oldschool? Sure. Comes in handy? Well, yes, sometimes it does. I work a lot with legacy code and in certain situations this method saves a lot of plumbing.

But you have better options

We're not in the old days anymore. PDO can bind values for real, it's fast, it makes your code cleaner, it's convenient and easy to get used to. (Just remember to always turn off ATTR_EMULATE_PREPARES. Silly how it's not the default.)

Concatenating queries was never really the way, we just did it because there was nothing else around. So just go with PDO and keep this dirty trick for cases where you have nothing else in reach.

Base64 as method of sanitizing user input for Mysql

4 Answers4

Yes, it will do the job

But you have better options

Linked