There are plenty of discussions around that question across the Internet (example 1, example 2), but I didn't find any answers that address .NET Core.

So, does anyone know the proper way to check a file's format signature, in order to prevent a user from uploading a masqueraded file?

  • What does not work for you in .NET Core for the proposed approaches? – Adriano Repetti May 18 '20 at 09:47
  • Presumably, I cannot use Urlmon.dll or the Windows registry because I host my APIs on different platforms. So the only solution left is manually comparing the first bytes with known examples. That looks like a legacy approach, and I am trying to find some software that does it more accurately. – Artem Beziazychnyi May 18 '20 at 09:59
  • Unless you're looking for an external library (which is something I'd avoid if you're dealing with just a few file formats), comparing a few bytes from the input is a perfectly fine approach (and what libraries do anyway). If you have a long list of file types to support (or you need a more advanced heuristic, which is often NOT the case), then you might also consider using `FindMimeFromData()` (where supported) and `file --mime-type` elsewhere (should work fine on Linux and macOS/FreeBSD). – Adriano Repetti May 18 '20 at 10:11
  • There is also `libgio`, but I'm unsure about its support outside Linux (and I'd tend to consider it overkill if you _just_ need to detect the file type). Note that `FindMimeFromData()` is pretty rudimentary. – Adriano Repetti May 18 '20 at 10:18

2 Answers

1

I didn't find any library, NuGet package, or class specific to .NET Core that might make our lives easier. So I returned to the common approach, which is comparing the file's header bytes with known signatures.

    // Requires: using System.Collections.Generic; using System.Linq;

    // Leading bytes ("magic numbers") expected for each accepted content type.
    private readonly Dictionary<string, byte[]> _mimeTypes = new Dictionary<string, byte[]>
    {
        {"image/jpeg", new byte[] {255, 216, 255}},
        {"image/jpg", new byte[] {255, 216, 255}},
        {"image/pjpeg", new byte[] {255, 216, 255}},
        {"image/apng", new byte[] {137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82}},
        {"image/png", new byte[] {137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82}},
        {"image/bmp", new byte[] {66, 77}},
        {"image/gif", new byte[] {71, 73, 70, 56}},
    };

    // Returns true only if the content type is known and the file starts with its signature.
    private bool ValidateMimeType(byte[] file, string contentType)
    {
        // An unknown content type is rejected instead of causing a NullReferenceException.
        if (file == null || contentType == null || !_mimeTypes.TryGetValue(contentType, out var signature))
        {
            return false;
        }

        return file.Take(signature.Length).SequenceEqual(signature);
    }
  • do you have byte array information for HEIC and HVEIC? Thx in advance – Arthur Melo Sep 03 '20 at 02:51
  • @ArthurMelo - see my answer below to find that info yourself, or see my answer here: https://stackoverflow.com/questions/51018272/heic-file-signature/65241885#65241885 – Sha Dec 10 '20 at 20:33
1

Microsoft has an excellent article on file uploads which is a highly recommended read - see: https://docs.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads?view=aspnetcore-5.0

The article covers topics such as checking file signatures, allowed extensions, matching extension with file signatures, renaming files, storing files and more.

In that article, they also explain how to check file signatures, which is very similar to the answer you already provided. However, it goes a bit deeper, because some file types can have multiple signatures:

FileHelpers.cs (full file here: https://github.com/dotnet/AspNetCore.Docs/blob/master/aspnetcore/mvc/models/file-uploads/samples/3.x/SampleApp/Utilities/FileHelpers.cs):

    private static readonly Dictionary<string, List<byte[]>> _fileSignature = new Dictionary<string, List<byte[]>>
    {
        { ".gif", new List<byte[]> { new byte[] { 0x47, 0x49, 0x46, 0x38 } } },
        { ".png", new List<byte[]> { new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } } },
        { ".jpeg", new List<byte[]>
            {
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 },
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE2 },
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE3 },
            }
        },
        { ".jpg", new List<byte[]>
            {
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 },
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE1 },
                new byte[] { 0xFF, 0xD8, 0xFF, 0xE8 },
            }
        },
        { ".zip", new List<byte[]> 
            {
                new byte[] { 0x50, 0x4B, 0x03, 0x04 }, 
                new byte[] { 0x50, 0x4B, 0x4C, 0x49, 0x54, 0x45 },
                new byte[] { 0x50, 0x4B, 0x53, 0x70, 0x58 },
                new byte[] { 0x50, 0x4B, 0x05, 0x06 },
                new byte[] { 0x50, 0x4B, 0x07, 0x08 },
                new byte[] { 0x57, 0x69, 0x6E, 0x5A, 0x69, 0x70 },
            }
        },
    };

            // File signature check
            // --------------------
            // With the file signatures provided in the _fileSignature
            // dictionary, the following code tests the input content's
            // file signature.
            var signatures = _fileSignature[ext];
            var headerBytes = reader.ReadBytes(signatures.Max(m => m.Length));

            return signatures.Any(signature => 
                headerBytes.Take(signature.Length).SequenceEqual(signature));
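
The excerpt above depends on `ext` and `reader`, which are defined elsewhere in FileHelpers.cs. As a minimal self-contained sketch of the same check, the following could be used; the names `SignatureCheck` and `HasKnownSignature` are hypothetical and not part of the Microsoft sample, and only two extensions are listed for brevity:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of the Microsoft sample.
public static class SignatureCheck
{
    // Extension -> list of accepted leading byte sequences (case-insensitive keys).
    private static readonly Dictionary<string, List<byte[]>> Signatures =
        new Dictionary<string, List<byte[]>>(StringComparer.OrdinalIgnoreCase)
        {
            { ".gif", new List<byte[]> { new byte[] { 0x47, 0x49, 0x46, 0x38 } } },
            { ".png", new List<byte[]> { new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } } },
        };

    // Returns true if the stream starts with one of the signatures
    // registered for the extension of fileName; false for unknown extensions.
    public static bool HasKnownSignature(Stream content, string fileName)
    {
        var ext = Path.GetExtension(fileName);
        if (string.IsNullOrEmpty(ext) || !Signatures.TryGetValue(ext, out var signatures))
        {
            return false;
        }

        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var reader = new BinaryReader(content, Encoding.UTF8, leaveOpen: true))
        {
            // Read only as many bytes as the longest candidate signature needs.
            var headerBytes = reader.ReadBytes(signatures.Max(s => s.Length));

            return signatures.Any(s => headerBytes.Take(s.Length).SequenceEqual(s));
        }
    }
}
```

Note that the check advances the stream position, so a caller that goes on to save the upload would need to rewind a seekable stream (for example `content.Position = 0`) first.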

For more signatures - please see https://www.filesignatures.net/ and https://www.garykessler.net/library/file_sigs.html

Also, you can easily check the signature of any file yourself using a hexadecimal file viewer. E.g. in Windows 10, using PowerShell you can simply write the following command in a PowerShell prompt (https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/format-hex?view=powershell-7.1):

PS> format-hex c:\myfile.gif

[Image: format-hex output for a GIF example file]

Translated to C# that gives:

    new byte[] { 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 } // 'GIF89a'
    new byte[] { 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 } // 'GIF87a'
    new byte[] { 0x47, 0x49, 0x46, 0x38 } // 'GIF8' (matches both signatures above by checking only the first four bytes)
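
Where PowerShell is not available, the leading bytes can also be dumped from C# itself. This is a minimal sketch; `HeaderDump` and `FirstBytesAsHex` are hypothetical names chosen for illustration:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration.
public static class HeaderDump
{
    // Returns up to `count` leading bytes of the stream as space-separated hex pairs.
    public static string FirstBytesAsHex(Stream stream, int count = 16)
    {
        var buffer = new byte[count];
        int read = 0;
        int n;

        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (read < count && (n = stream.Read(buffer, read, count - read)) > 0)
        {
            read += n;
        }

        return string.Join(" ", buffer.Take(read).Select(b => b.ToString("X2")));
    }
}
```

For example, `HeaderDump.FirstBytesAsHex(File.OpenRead(@"c:\myfile.gif"), 6)` would return `47 49 46 38 39 61` for a file that starts with 'GIF89a'.
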
Sha