I have a large file which I will upload in chunks of ~4MB each using Python, and the file could be many gigabytes. I would like to calculate, as efficiently as possible, an MD5 digest for each chunk as well as one for the entire file. I understand how to compute MD5 from the hashlib reference docs and from other Stack Overflow questions about efficiently hashing large files.
The easiest solution I see is to keep one hashlib.md5() instance per chunk plus one for the full file. However, that means effectively running the MD5 algorithm over every byte twice. I can shave off one chunk's worth of work by calling copy() on the first chunk's hash object and using the copy as the running whole-file hash, but beyond the first chunk I don't see how to avoid the duplicate hashing.
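For reference, here is a minimal sketch of that approach, including the copy() trick; `upload_chunk()` is a hypothetical stand-in for whatever actually sends a chunk:

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024  # ~4MB per chunk

    def upload_file(path):
        total_md5 = None
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                chunk_md5 = hashlib.md5(chunk)  # first pass over this chunk's bytes
                if total_md5 is None:
                    # Seed the whole-file hash from the first chunk's state;
                    # this saves one redundant pass, but only for the first chunk.
                    total_md5 = chunk_md5.copy()
                else:
                    total_md5.update(chunk)     # second pass over the same bytes
                upload_chunk(chunk, chunk_md5.hexdigest())  # hypothetical upload call
        return total_md5.hexdigest() if total_md5 else hashlib.md5().hexdigest()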
Is there a better way in Python to combine the per-chunk MD5 values into a single MD5 for the full file?