
I'm using the `base64.b64encode` function in Python to convert my images to their Base64 representation. I noticed that while the Base64 spec says the encoded data takes about 1.33 times (4/3) the memory of the original data, I observe that it takes almost twice the original size, if not more.

Can anyone explain this behavior? Is memory fragmentation happening during Base64 encoding?
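For reference, the 4/3 figure is easy to confirm at the level of `len()`; this is a minimal sketch using random bytes as a stand-in for real image data:

```python
import base64
import os

# Random payload standing in for image data (size chosen arbitrarily).
raw = os.urandom(1_000_000)
encoded = base64.b64encode(raw)

# Base64 maps every 3 input bytes to 4 output characters, so the
# payload grows by a factor of 4/3 ~= 1.33.
print(len(encoded) / len(raw))   # -> ~1.333
```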

tsar2512
  • How do you "observe" that? Did you use [sys.getsizeof](https://docs.python.org/3/library/sys.html#sys.getsizeof)? – Mike Scotty Apr 03 '18 at 12:33
  • What quantities are you comparing exactly? 1.3 is only the ratio between the length of the encoded `bytes` object (the output of `b64encode`) and the length of the given `bytes` object (the input of `b64encode`). – jdehesa Apr 03 '18 at 12:37
  • @MikeScotty I used `memory_profiler`, a Python package, which I believe internally uses `sys.getsizeof` – tsar2512 Apr 03 '18 at 12:37
  • @jdehesa Given that both are byte strings, shouldn't memory usage have the same ratio? – tsar2512 Apr 03 '18 at 12:38
  • @tsar2512 No, not necessarily, more about it here: [What is the difference between len() and sys.getsizeof() methods in python?](https://stackoverflow.com/questions/17574076/what-is-the-difference-between-len-and-sys-getsizeof-methods-in-python) (in that case it's about Python 2 `str`, which is pretty much like Python 3 `bytes`). – jdehesa Apr 03 '18 at 12:45
  • The only reason it would not have the same ratio is if the memory is not contiguous, which is what I want to understand. – tsar2512 Apr 03 '18 at 12:45
  • Why not provide a [mcve] for your observation yourself? `sys.getsizeof(b"")`, `sys.getsizeof(b"abc")`, and `sys.getsizeof(base64.b64encode(b"abc"))` return `33`, `36`, and `37` on my system. Twice the memory usage would mean the third value would be 39, but it's not. – Mike Scotty Apr 03 '18 at 12:46
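A short sketch illustrating the distinction raised in the comments above: `sys.getsizeof` includes a fixed per-object overhead for `bytes` objects (33 bytes on a typical 64-bit CPython), which dominates for tiny inputs but becomes negligible for large ones, where the ratio converges to 4/3. The 100 kB payload here is an arbitrary choice:

```python
import base64
import sys

# Reproducing the comparison from the comment above, plus a larger input.
for data in (b"abc", b"x" * 100_000):   # 100 kB payload chosen arbitrarily
    enc = base64.b64encode(data)
    ratio = sys.getsizeof(enc) / sys.getsizeof(data)
    print(f"{len(data):>7} -> {len(enc):>7} payload bytes, "
          f"getsizeof ratio = {ratio:.3f}")
```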

0 Answers