
I was looking at the documentation, and in the examples section I don't see how to create a UUID based on file contents. Google did not help me either.

I've tried this:

>>> import uuid
>>> data = open('/media/emmc/DCIM/100ABC06/00059.JPG','rb')
>>> contents = data.read()
>>> len(contents)
9155
>>> uuid = uuid.UUID(contents)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/uuid.py", line 134, in __init__
ValueError: badly formed hexadecimal UUID string

Also this:

>>> uuid = uuid.UUID(str(contents))
>>> uuid = uuid.UUID(contents.decode('ascii'))
>>> uuid = uuid.UUID(contents.decode('utf8'))

Please help me understand how to generate a UUID based on File contents in Python 2.7.

PhilBot
    The documents you link tell you how to use the function. They don't say you can convert a whole file into a UUID. Probably what you want is a [hash](http://stackoverflow.com/questions/1131220/get-md5-hash-of-big-files-in-python). – Peter Wood Jun 02 '15 at 13:52

2 Answers


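When you pass a string as the first argument to uuid.UUID(), it must contain exactly 32 hexadecimal digits. Raw bytes go in the bytes argument instead and must be exactly 16 bytes long, so the 9155-byte contents of a file cannot be passed in directly.

From the docs at https://docs.python.org/2/library/uuid.html: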

Create a UUID from either a string of 32 hexadecimal digits, a string of 16 bytes as the bytes argument, a string of 16 bytes in little-endian order as the bytes_le argument, a tuple of six integers (32-bit time_low, 16-bit time_mid, 16-bit time_hi_version, 8-bit clock_seq_hi_variant, 8-bit clock_seq_low, 48-bit node) as the fields argument, or a single 128-bit integer as the int argument. When a string of hex digits is given, curly braces, hyphens, and a URN prefix are all optional. For example, these expressions all yield the same UUID:

UUID('{12345678-1234-5678-1234-567812345678}')
UUID('12345678123456781234567812345678')
UUID('urn:uuid:12345678-1234-5678-1234-567812345678')
UUID(bytes='\x12\x34\x56\x78'*4)
UUID(bytes_le='\x78\x56\x34\x12\x34\x12\x78\x56' +
              '\x12\x34\x56\x78\x12\x34\x56\x78')
UUID(fields=(0x12345678, 0x1234, 0x5678, 0x12, 0x34, 0x567812345678))
UUID(int=0x12345678123456781234567812345678)
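
Since the bytes argument needs exactly 16 bytes, one way to get from arbitrary file contents to a UUID is to hash the contents down to 16 bytes first. The following is only a sketch under that assumption: uuid_from_bytes and uuid_from_file are hypothetical helper names, and MD5 is chosen here simply because its digest happens to be 16 bytes long. The result is deterministic for a given file, but it does not set the version or variant bits of a standard UUID.

```python
import hashlib
import uuid


def uuid_from_bytes(contents):
    """Hypothetical helper: derive a deterministic UUID from raw bytes."""
    # An MD5 digest is exactly 16 bytes, the size the bytes argument requires.
    digest = hashlib.md5(contents).digest()
    return uuid.UUID(bytes=digest)


def uuid_from_file(path):
    """Hypothetical helper: read a file in binary mode and derive its UUID."""
    with open(path, 'rb') as f:
        return uuid_from_bytes(f.read())


print(uuid_from_bytes(b'example contents'))
```

Two files with identical contents give the same UUID, and the call works the same way on Python 2.7 and Python 3.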
bpgergo

If you want a fingerprint of a file's contents, you probably don't need a UUID. Instead, use hashlib with MD5, SHA-1, SHA-256, or any other supported algorithm to hash the file.
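
A minimal sketch of that approach, assuming SHA-256 and a 64 KiB read size (both arbitrary choices). hash_stream and hash_file are hypothetical helper names, not part of hashlib. Reading in chunks keeps memory use flat even for large files.

```python
import hashlib
import io


def hash_stream(stream, algorithm='sha256', chunk_size=65536):
    """Hypothetical helper: hex digest of a binary file-like object."""
    hasher = hashlib.new(algorithm)
    # Read fixed-size chunks until read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        hasher.update(chunk)
    return hasher.hexdigest()


def hash_file(path, algorithm='sha256'):
    """Hypothetical helper: hex digest of the file at the given path."""
    with open(path, 'rb') as f:
        return hash_stream(f, algorithm)


# In-memory demo; for a real file, call hash_file with its path.
print(hash_stream(io.BytesIO(b'example file contents')))
```

Passing 'md5' or 'sha1' as the algorithm selects a different hash without changing the rest of the code.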

Antwane