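I have a FileField serializer for uploaded base64 audio. I noticed that the base64 string does not start with data:****, so the MIME type is not declared in the payload. How do I determine the MIME type and file extension of the uploaded file? In the code below, i_need_the_file_extension_mimetype() is a placeholder for the part I am missing.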

class AudioField(serializers.FileField):
    def to_internal_value(self, data):
        if isinstance(data, basestring):
            data = re.sub(r"^data\:.+base64\,(.+)$", r"\1", data)

            # Try to base64 decode the data url.
            try:
                decoded = base64.b64decode(data)
            except TypeError:
                raise serializers.ValidationError(_('Not a valid file'))

            file_name, file_ext, mime_type = self.i_need_the_file_extension_mimetype(decoded)


            data = ContentFile(decoded, name=file_name)

            return super(AudioField, self).to_internal_value(data)
Aneesh R S
Paullo
  • See [this post](https://stackoverflow.com/questions/43580/how-to-find-the-mime-type-of-a-file-in-python) for some suggestions. – Ralf Apr 18 '18 at 23:40
  • Thanks @Ralf, however that does not apply to my case. In that post the MIME type is already known; in my case the MIME type is unknown. – Paullo Apr 19 '18 at 00:26
  • I believe some external libraries mentioned in those answers, like `python-magic` for example, can be used to determine the file-type and mime-type of your uploaded files. – Ralf Apr 19 '18 at 00:33
  • What do you need the MIME type for? Are you trying to determine if the file contains valid audio data? – Blender Apr 19 '18 at 01:23
  • Duplicate of https://stackoverflow.com/questions/34287819/python-can-someone-guess-the-type-of-a-file-only-by-its-base64-encoding – Anup Yadav Apr 19 '18 at 05:21
  • @AnupYadav there's a big difference between image and audio file processing. This question is about base64 audio file processing, not images. – Paullo Apr 19 '18 at 08:50
  • @Ralf it seems I am making headway with python-magic I will post my solution later – Paullo Apr 19 '18 at 08:50
  • Please mention that in your question not in comments, second it really doesn't matter whether you use audio or image, that is applicable in either case. – Anup Yadav Apr 19 '18 at 09:09
  • @AnupYadav "uploaded base64 audio" is an English phrase that does NOT need any further interpretation for anyone who understands written English and knows what s/he is doing. Once again, an image processing solution can never be the same as an audio one. – Paullo Apr 19 '18 at 09:33
  • Ok, maybe the type does matter. – Anup Yadav Apr 19 '18 at 09:36

1 Answer

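I finally got this sorted with python-magic, thanks to @Ralf for the pointer. magic.from_buffer() sniffs the MIME type from the decoded bytes, and mimetypes.guess_extension() maps that MIME type to a file extension.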

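import base64
import mimetypes
import re
import uuid

import magic  # provided by python-magic
from django.conf import settings
from django.core.files.base import ContentFile
from django.utils.translation import gettext_lazy as _
from rest_framework import serializers


class AudioField(serializers.FileField):
    def to_internal_value(self, data):
        # Check to see if it's a base64 encoded file.
        if isinstance(data, str):  # use basestring on Python 2
            # Strip out the data header if it exists.
            data = re.sub(r"^data\:.+base64\,(.+)$", r"\1", data)

            try:
                decoded = base64.b64decode(data)
            except (TypeError, ValueError):
                # Python 2 raises TypeError; Python 3 raises binascii.Error (a ValueError).
                raise serializers.ValidationError(_('Not a valid file'))

            # Sniff the MIME type from the decoded bytes, then map it to an extension.
            mime_type = magic.from_buffer(decoded, mime=True)
            file_ext = mimetypes.guess_extension(mime_type)

            # guess_extension() returns None for unknown types, so check that first.
            if not file_ext or file_ext[1:] not in settings.VOICE_VALID_FILE_EXTENSIONS:
                raise serializers.ValidationError(_('Invalid file type.'))

            # Replace the base64 string with a file object named after a random UUID.
            file_name = "{}{}".format(uuid.uuid4(), file_ext)
            data = ContentFile(decoded, name=file_name)

        # Regular (non-base64) uploads fall through to the default handling.
        return super(AudioField, self).to_internal_value(data)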

Required: pip install python-magic (it is a wrapper around the libmagic C library, so libmagic must also be installed on the system)
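
If installing libmagic is not an option, the same idea (look at the first bytes of the decoded payload instead of relying on a data: header) can be sketched with the standard library only. This is a rough illustration, not a replacement for python-magic: the function name sniff_audio_type and the signature table are my own, the table covers only a few common containers, and MP3 files without an ID3 tag are not recognised.

```python
import base64
import re

# Hypothetical, non-exhaustive table: (leading magic bytes, MIME type, extension).
_AUDIO_SIGNATURES = [
    (b"OggS", "audio/ogg", ".ogg"),    # Ogg page capture pattern
    (b"fLaC", "audio/flac", ".flac"),  # FLAC stream marker
    (b"ID3", "audio/mpeg", ".mp3"),    # MP3 that starts with an ID3v2 tag
]

# Optional "data:<mime>;base64," header in front of the payload.
_DATA_URL_PREFIX = re.compile(r"^data:[^,]*;base64,")


def sniff_audio_type(payload):
    """Return (mime_type, extension) for a base64 string, or (None, None)."""
    # Drop the data URL header if present, then decode to raw bytes.
    raw = base64.b64decode(_DATA_URL_PREFIX.sub("", payload))

    # WAV is a RIFF container: "RIFF", a 4-byte size, then "WAVE".
    if raw[:4] == b"RIFF" and raw[8:12] == b"WAVE":
        return "audio/x-wav", ".wav"

    for magic_bytes, mime_type, ext in _AUDIO_SIGNATURES:
        if raw.startswith(magic_bytes):
            return mime_type, ext
    return None, None


# Usage: a fake WAV header, with and without the data URL prefix.
sample = base64.b64encode(b"RIFF\x24\x00\x00\x00WAVEfmt ").decode("ascii")
print(sniff_audio_type(sample))                             # ('audio/x-wav', '.wav')
print(sniff_audio_type("data:audio/wav;base64," + sample))  # ('audio/x-wav', '.wav')
```

The returned extension could be checked against settings.VOICE_VALID_FILE_EXTENSIONS in the same way as in the field above; anything the table does not recognise comes back as (None, None) and should be rejected.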

Paullo