
I am in the process of making a web application. It allows you to upload a .txt or .log file (IIS Logs for example).

I currently check whether a file is .txt or .log by looking at its extension. I don't like this, because anyone can rename virus.exe to virus.txt and it will upload.

How can I verify if it really is a text file? I am sure this is a common problem, but I can't seem to find any good solutions.

Jamie Rees
  • You can check the MIME type of the file. – user1666620 Aug 14 '15 at 16:26
  • You can check the first 256 bytes of the file. See [previous answer](http://stackoverflow.com/questions/58510/using-net-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature). – James Harcourt Aug 14 '15 at 16:26
  • @user1666620: But you can't trust it. – SLaks Aug 14 '15 at 16:27
  • @SLaks you can't trust the file extension either. – user1666620 Aug 14 '15 at 16:27
  • @user1666620 That's why I am asking this question. – Jamie Rees Aug 14 '15 at 16:27
  • Open it as a text file and see if it validates according to what you are expecting. There are some [fuzzy ways](http://stackoverflow.com/questions/4520184/how-to-detect-the-character-encoding-of-a-text-file) you can check the encoding of the file as well. – ElGavilan Aug 14 '15 at 16:28
  • @ElGavilan I would rather detect it a bit earlier than when I need to use it. – Jamie Rees Aug 14 '15 at 16:29
  • virus.exe to virus.txt is the smaller part of the problem. virus.vbs to virus.txt is worse, since vbs files **are** textual and not binary to begin with. – Zohar Peled Aug 14 '15 at 16:31
  • Check it when it is uploaded not when you need to use it. – freedomn-m Aug 14 '15 at 16:33
  • Maybe I'm missing something here, but if you change virus.exe to virus.txt you would have nothing to worry about anyway, because it will no longer be an executable. – Stephen Brickner Aug 14 '15 at 17:04
  • Correct, but that's not the point. I want to only be able to upload text files – Jamie Rees Aug 14 '15 at 17:14
  • Apart from checking the extension, I would suggest removing the execute permission from the file, which prevents it from executing even if it is an .exe. – harishr Aug 15 '15 at 08:32

1 Answer


As far as I know there is no perfect solution to this.

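You can read the first few bytes of the file and make an educated guess at the file type from them. The answers to this Stack Overflow question cover the approach:

[Using .NET, how can you find the mime type of a file based on the file signature not the extension](http://stackoverflow.com/questions/58510/using-net-how-can-you-find-the-mime-type-of-a-file-based-on-the-file-signature)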
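
To make the "educated guess" idea concrete, here is a minimal sketch of one possible heuristic. It is written in Python only for brevity, and the same logic should port to .NET. The names `looks_like_text`, `file_looks_like_text`, and `SAMPLE_SIZE` are hypothetical, and the sketch assumes uploads are ASCII or UTF-8, so a UTF-16 file would be rejected. It is a guess, not a guarantee.

```python
import codecs

# Assumption: sample size mirrors the "first 256 bytes" suggestion in the comments.
SAMPLE_SIZE = 256

# Control bytes that commonly appear in plain text: tab, LF, CR, form feed.
ALLOWED_CONTROLS = frozenset(b"\t\n\r\f")


def looks_like_text(sample: bytes) -> bool:
    """Guess whether the leading bytes of an upload are ASCII/UTF-8 text."""
    # Other control bytes (including NUL) are rare in text but common in
    # binaries such as executables, so treat any of them as a rejection.
    if any(b < 0x20 and b not in ALLOWED_CONTROLS for b in sample):
        return False
    try:
        # final=False tolerates a multi-byte character that was cut off
        # at the end of the sample, but still rejects invalid sequences.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_text(path: str) -> bool:
    """Apply the heuristic to the first SAMPLE_SIZE bytes of a file on disk."""
    with open(path, "rb") as f:
        return looks_like_text(f.read(SAMPLE_SIZE))
```

Note that this only filters out binary content. A script such as virus.vbs is genuinely text and would pass, so for IIS logs it may still be worth validating that the content matches the expected log format.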

Jonathan Carroll