0

Passwords are commonly encoded with MD5 on web sites. I'm considering encoding user names as file names with MD5 too. I'd use PHP on a Linux-based server. Are there any drawbacks to encrypting a file name to MD5 with PHP, besides the names being indistinguishable without decryption?

<?php
if(isset($_POST['register'])){
    $username = md5($_POST['username']);
    $email = htmlentities($_POST['email'], ENT_QUOTES|ENT_XML1);
    $password = $_POST['password'];
    $c_password = $_POST['c_password'];

    $xml = new SimpleXMLElement('<user></user>');
    $xml->addChild('password', md5($password));
    $xml->addChild('email', $email);
    $xml->asXML('users/'.$username . '.xml');

    header('Location: validate.php');
    die;
}
?>
Jarrett Mattson
  • 1
    Yes, there are: [MD5 is broken.](http://cryptocrats.com/crypto/md5-the-hash-algorithm-is-now-broken/) If you want industry-level hashing, use the SHA family of hash functions. Also, this is not quite an encryption method - in theory, the data is irrecoverable from the hash. If you want true encryption, consider using RSA. –  Nov 13 '12 at 22:09
  • 2
    encryption != hashing. See also http://stackoverflow.com/q/549/427545, http://security.stackexchange.com/q/12009/2630 – Lekensteyn Nov 13 '12 at 22:10
  • It appears you are trying to use hashing to obfuscate whose credentials are stored in which file. That does _not_ add any security to your site. Anyone can do the same using a username and the md5 hash algorithm. – arkascha Nov 13 '12 at 22:15
  • Not too worried about encrypting, just seemed like a pretty safe transform for a user name to file name. Sorry about using the term encrypting so loosely. – Jarrett Mattson Nov 13 '12 at 22:15
  • I studied those Chinese works on md5 collisions deeply when writing my magister diploma and can say that they have little practical correlation with the resources available to a genuine hacker. If your system doesn't need to be HIGHLY secure, use md5 freely. And yes, it will make a good filename; you can make it unique with a salt if needed. – Vyacheslav Voronchuk Nov 13 '12 at 22:19
  • 1
    Do not listen to Vyacheslav. MD5 is easily crackable by any script kiddie out there. It takes a trivial amount of time on a desktop computer, plus there are searchable hash/rainbow table databases online. – Jonathan Amend Nov 13 '12 at 22:23
  • 1
    Paranoics :) Ok, give me a source for the hash like that: 5b31bbe8670cc968ff9e63088120614a if it's "trivial". – Vyacheslav Voronchuk Nov 13 '12 at 22:27
  • 3
    The question is what the heck is the point of doing it? Unless you're planning to let people use question marks and asterisks in their names, passing it through MD5 gains you nothing. – cleong Nov 13 '12 at 22:37
  • 1
    @cleong: That's exactly it, I'm trying to extend the applicable range to anything. – Jarrett Mattson Nov 13 '12 at 22:41
  • 1
    Well, good luck trying to distinguish user [4 spaces] from user [6 spaces]. LOL. – cleong Nov 13 '12 at 22:56
  • @cleong: Yea no kidding. It's usually accompanied by an email address, a practice I think I'll keep. – Jarrett Mattson Nov 13 '12 at 23:27
  • @ the two -1ers and 5 that voted to close... Why is there an accepted answer? Maybe you didn't even read the question or title. – Jarrett Mattson Nov 14 '12 at 18:21

3 Answers

2

As H2C03 mentions, MD5 is broken (see his link in the comments.) There are also the following factors to consider:

  1. Anyone who can crack the username can crack the password, and vice versa, so you've gained nothing
  2. This will make writing a lot of user management queries and code a complete nightmare
  3. Hashing passwords is valuable because there is (ideally) no way to get the plaintext back out. To make your site work, you'll need to include code that decrypts the usernames to plaintext, and a hacker who is already on your system will, if he's any good, simply add some code to your decryption routine to divert the plaintext passwords as people use your site.

Executive summary: A lot of extra work, dubious benefit even while using modern encryption algorithms.
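Since the thread agrees that plain MD5 is a poor choice for the password itself, here is a minimal sketch of the usual alternative in PHP, the built-in `password_hash()` and `password_verify()` functions. It assumes PHP 7 or later; the helper names `hashPasswordForStorage` and `isPasswordValid` are hypothetical and only illustrate the idea.

```php
<?php
// Hypothetical helpers: the names are illustrative, not from the original post.

// Hash a password for storage. PASSWORD_DEFAULT lets PHP choose its current
// default algorithm, and a random salt is generated automatically.
function hashPasswordForStorage(string $password): string
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// Check a login attempt against the stored hash.
function isPasswordValid(string $password, string $storedHash): bool
{
    return password_verify($password, $storedHash);
}

$stored = hashPasswordForStorage('correct horse battery staple');
var_dump(isPasswordValid('correct horse battery staple', $stored)); // bool(true)
var_dump(isPasswordValid('wrong guess', $stored));                  // bool(false)
```

The stored string carries the algorithm and salt with it, so it could be written into the `password` element in place of `md5($password)`.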

Winfield Trail
  • That's fine and dandy, but does it make a good file name? – Jarrett Mattson Nov 13 '12 at 22:20
  • Sort of? If you need files not to be traceable back to the individual users though it's still much better to hash or randomize **every** part of the filename, to prevent trivial snooping from being able to absolutely identify the owner of the file via URL. – Winfield Trail Nov 13 '12 at 22:47
  • 1
    To clarify, I'm suggesting that you do `sha1($username.$OriginalFilename).$FileExtension` rather than `sha1($username).$OriginalFilename.$FileExtension`. Much more secure. You could also use a hashed directory name but with the filename untouched, which has the benefit of preserving the filename upon download. A few of the big players in the media/archive download site scene do this. – Winfield Trail Nov 13 '12 at 22:48
1

Any Linux filesystem you're using can accept any character in a filename except the directory separator (/) and the NUL byte. So why not either replace any / characters with something else or, better yet, reject any attempt to register with a username that contains a / (and probably any nonprintable character)? "Oh, but what about collisions?" If you're using a hashing algorithm, you're not eliminating the possibility of collisions; you're just reducing it while adding useless computational complexity. To generate a unique identifier, either use an incrementing value (like Unix does with "User IDs") or generate a unique ID, for example with uniqid (http://php.net/manual/en/function.uniqid.php), and store that mapping in a database.

Maintaining a mapping of usernames to IDs is what everyone else does for a reason. :)
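A minimal sketch of that approach, assuming PHP 7 or later. The array `$userIds` is a stand-in for the database table, and the function names `isUsernameAllowed` and `idForUsername` are hypothetical. It uses `random_bytes()` for the ID rather than `uniqid()`, which is one possible choice among several.

```php
<?php
// Hypothetical sketch: $userIds stands in for a database table that maps
// usernames to IDs.

// Reject empty usernames and any containing "/" or control characters
// (including NUL).
function isUsernameAllowed(string $username): bool
{
    return $username !== ''
        && preg_match('/[\/\x00-\x1F\x7F]/', $username) !== 1;
}

// Return the existing ID for a username, or assign a new random one.
function idForUsername(array &$userIds, string $username): string
{
    if (!isUsernameAllowed($username)) {
        throw new InvalidArgumentException('Username contains forbidden characters');
    }
    if (!isset($userIds[$username])) {
        // 16 random bytes written as 32 hex characters: always a safe file name.
        $userIds[$username] = bin2hex(random_bytes(16));
    }
    return $userIds[$username];
}

$userIds = [];
$id = idForUsername($userIds, 'J. Mattson?*');
echo 'users/' . $id . '.xml', PHP_EOL;
```

Because the file name comes from the ID and not from the username, question marks, asterisks, and spaces in the username never reach the filesystem.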

dannysauer
  • I think I'll do that. I'll accept their email addresses as user names too while I'm at it. For some reason email autocompletes for some of them. This could help to combat same email, multiple users during registration. – Jarrett Mattson Nov 14 '12 at 00:15
0

It looks like you're only using md5 to map a user name to a file name. Nothing wrong with that, it's a common one-way hash algorithm.

I wouldn't use it to encrypt the password though.
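The mapping itself can be sketched as below. This is a minimal illustration under stated assumptions: the helper name `userFilePath` and the default `users` directory are hypothetical (the directory mirrors the path used in the question), and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: maps any username string to a fixed-length file path.
function userFilePath(string $username, string $dir = 'users'): string
{
    // md5() returns 32 lowercase hex characters, so the result never
    // contains "/", "?", "*" or other characters awkward in file names.
    return $dir . '/' . md5($username) . '.xml';
}

echo userFilePath('user    '), PHP_EOL;   // four trailing spaces
echo userFilePath('user      '), PHP_EOL; // six trailing spaces: a different file
```

Usernames that differ only in whitespace map to different files, and the hash cannot be turned back into the name, so the readable username has to be stored somewhere else if it is needed later.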

Brad Koch
  • Cool, just need a safe transposition, that won't break anything. +1 and answered if you've got a better one, or nobody else does. – Jarrett Mattson Nov 13 '12 at 22:36
  • The big question everyone is asking right now is why? What is the goal you are trying to achieve by doing this? – Brad Koch Nov 13 '12 at 22:39
  • Extending the range of applicable file names that double as user names. – Jarrett Mattson Nov 13 '12 at 22:44
  • If your only goal is to map any username string to a clean filename, there is no need for a two-way function. Just put the username in the file as one of the fields. – Brad Koch Nov 13 '12 at 22:48
  • Actually I would for user management. Found the drawback perhaps. – Jarrett Mattson Nov 13 '12 at 22:51
  • Sounds like you're having trouble separating your concerns. You want 1) a storage engine capable of retrieving user data based on a string of random characters and 2) the ability to perform a "user management" task. You could just use a database table; no need for this obfuscation. – Brad Koch Nov 13 '12 at 22:57
  • Yes 1 and 2; users love random characters and I'm going to need all the help I can muster to keep them happy. Thought about the database table, but a two way system seems easier and more efficient. – Jarrett Mattson Nov 13 '12 at 23:08
  • A two-way system is definitely less efficient than a database lookup, because a hashing calculation has to be performed every time. It's also less secure, simply by virtue of being reversible. – Winfield Trail Nov 13 '12 at 23:15
  • I'm testing using UTF-8 encoding, but that's going to break any moment without a lot of work. – Jarrett Mattson Nov 13 '12 at 23:30