5

I have an object which needs to be "tagged" with 0-3 strings (out of a set of 20-some possibilities); these values are all unique and order doesn't matter. The only operation that needs to be done on the tags is checking if a particular one is present or not (specific_value in self.tags).

However, there's an enormous number of these objects in memory at once, to the point that it pushes the limits of my old computer's RAM. So saving a few bytes can add up.

With so few tags on each object, I doubt the lookup time is going to matter much. But: is there a memory difference between using a tuple and a frozenset here? Is there any other real reason to use one over the other?

Draconis
  • 2
    Don't overlook the option of buying more RAM. – user2357112 supports Monica Jul 07 '19 at 04:50
  • You might be able to save some overhead by inverting things - have 3x global dicts (tag1Map, tag2Map, tag3Map) that maps from the object (identity) to the tag of that type (if present). This will help most if the tags are sparse...(you're saving the overhead that comes with creating a collection for each object) – eddiewould Jul 07 '19 at 04:50
  • @eddiewould Excellent suggestion! But I was unclear: each object has only three tags at most, but there are about twenty possible tags to choose from. I'll edit the question. – Draconis Jul 07 '19 at 04:54
  • @user2357112 That's probably a better solution in the long run… – Draconis Jul 07 '19 at 04:54
  • 1
    @Draconis I think my suggestion will still work. However, if you're limited to a set of ~20 possible tags, you could consider some kind of bitmap (flags) approach, i.e. instead of storing the strings, just store the presence/absence of a given flag as one of the "bits" of a 32-bit integer. You'd then have a (one-off) mapping somewhere else from the flag values to the actual strings. – eddiewould Jul 07 '19 at 04:55
  • @eddiewould Oh, of course, if there are fewer than 32 possibilities I can store them in a single integer. (Maybe more if Python uses 64-bit integers, I haven't checked.) Feel free to post that as an answer! – Draconis Jul 07 '19 at 04:56

4 Answers

6

Tuples are very compact. Sets are based on hash tables, and depend on having "empty" slots to make hash collisions less likely.

For a recent enough version of CPython, sys._debugmallocstats() displays lots of potentially interesting info. Here under a 64-bit Python 3.7.3:

>>> from sys import _debugmallocstats as d
>>> tups = [tuple("abc") for i in range(1000000)]

tuple("abc") creates a tuple of 3 1-character strings, ('a', 'b', 'c'). Here I'll edit out almost all the output:

>>> d()
Small block threshold = 512, in 64 size classes.

class   size   num pools   blocks in use  avail blocks
-----   ----   ---------   -------------  ------------
...
    8     72       17941         1004692             4

Since we created a million tuples, it's a very good bet that the size class using 1004692 blocks is the one we want ;-) Each of the blocks consumes 72 bytes.

Switching to frozensets instead, the output shows that those consume 224 bytes each, a bit over 3x more:

>>> tups = [frozenset(t) for t in tups]
>>> d()
Small block threshold = 512, in 64 size classes.

class   size   num pools   blocks in use  avail blocks
-----   ----   ---------   -------------  ------------
...
   27    224       55561         1000092             6

In this particular case, the other answer you got happens to give the same results:

>>> import sys
>>> sys.getsizeof(tuple("abc"))
72
>>> sys.getsizeof(frozenset(tuple("abc")))
224

While that's often true, it's not always so, because an object may require allocating more bytes than it actually needs, to satisfy HW alignment requirements. getsizeof() doesn't know anything about that, but _debugmallocstats() shows the number of bytes Python's small-object allocator actually needs to use.

For example,

>>> sys.getsizeof("a")
50

On a 32-bit box, 52 bytes actually need to be used, to provide 4-byte alignment. On a 64-bit box, 8-byte alignment is currently required, so 56 bytes need to be used. Under Python 3.8 (not yet released), on a 64-bit box 16-byte alignment is required, and 64 bytes will need to be used.
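
The rounding described above is plain "round up to the next multiple of the alignment" arithmetic. A minimal sketch (round_up is a hypothetical helper written for illustration, not a CPython function):

```python
def round_up(nbytes, alignment):
    """Round nbytes up to the next multiple of alignment."""
    # Ceiling division using integer arithmetic, then scale back up.
    return -(-nbytes // alignment) * alignment


# 50 is the sys.getsizeof("a") result quoted above.
for alignment in (4, 8, 16):
    print(alignment, round_up(50, alignment))
# prints:
# 4 52
# 8 56
# 16 64
```

This only models the alignment padding; it says nothing about any other overhead the allocator may add.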

But ignoring all that, a tuple will always need less memory than any form of set with the same number of elements - and even less than a list with the same number of elements.
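
A quick way to check that ordering on your own interpreter is to compare the three container types directly. This is only a sketch: the tag strings are placeholders, the exact numbers vary with CPython version and platform, and sys.getsizeof reports the container alone, without the strings it refers to or any alignment padding.

```python
import sys

# Placeholder tags; any three strings behave the same way here.
tags = ("foo", "bar", "baz")

sizes = {
    "tuple": sys.getsizeof(tuple(tags)),
    "list": sys.getsizeof(list(tags)),
    "frozenset": sys.getsizeof(frozenset(tags)),
}

for name, size in sizes.items():
    print(f"{name:10} {size} bytes")
```

On CPython the tuple should come out smallest and the frozenset largest.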

Boris
Tim Peters
  • This is so. dang. cool. My only hangup is that the `_` in `_debugmallocstats()` indicates it's semi-public. Should we be concerned? – Charles Landau Jul 07 '19 at 05:26
  • 2
    Concerned about what? ;-) The underscore is appropriate because CPython's small-object allocator is, of course, specific to CPython - it's an implementation detail, not something that the language itself defines. – Tim Peters Jul 07 '19 at 05:32
  • I guess Python OOP has trained me to look at methods beginning with `_` as (at least) semi-private. I think I confused that with the implementation detail you referenced, unless I'm missing something – Charles Landau Jul 07 '19 at 05:43
  • 1
    a closer look at the docs (as usual) clarifies the point https://docs.python.org/3/library/sys.html#sys._debugmallocstats – Charles Landau Jul 07 '19 at 05:54
4

sys.getsizeof seems like the stdlib option you want... but I feel queasy about your whole use case

import sys
t = ("foo", "bar", "baz")
f = frozenset(("foo","bar","baz"))
print(sys.getsizeof(t))
print(sys.getsizeof(f))

https://docs.python.org/3.7/library/sys.html#sys.getsizeof

"All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific."

...So don't get comfy with this solution

EDIT: Obviously @TimPeters's answer is more correct...

Charles Landau
2

If you're trying to save memory, consider

  • Trading off some elegance for some memory savings by extracting the data structure of which tags are present into an external (singleton) data structure
  • Using a "flags" (bitmap) type approach, where each tag is mapped to a bit of a 32-bit integer. Then all you need is a (singleton) dict mapping from the object (identity) to a 32-bit integer (flags). If no flags are present, no entry in the dictionary.
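
A rough sketch of how the two ideas combine, under some assumptions: the tag names below are made up, flags_by_key is a hypothetical module-level dict, and "key" stands for whatever stable identifier you have for each object (id(obj) only works while the object stays alive).

```python
# Hypothetical tag names; the real set of ~20 tags would go here.
TAG_NAMES = ("urgent", "archived", "draft", "shared")

# One-off mapping from each tag string to its own bit.
TAG_BIT = {name: 1 << i for i, name in enumerate(TAG_NAMES)}

# Single external dict: object key -> integer holding the flag bits.
# Objects with no tags simply have no entry.
flags_by_key = {}


def add_tag(key, tag):
    """Set the bit for tag on the object identified by key."""
    flags_by_key[key] = flags_by_key.get(key, 0) | TAG_BIT[tag]


def has_tag(key, tag):
    """Return True if the bit for tag is set for key."""
    return bool(flags_by_key.get(key, 0) & TAG_BIT[tag])


def tags_of(key):
    """Recover the tag strings for key, in TAG_NAMES order."""
    flags = flags_by_key.get(key, 0)
    return [name for name in TAG_NAMES if flags & TAG_BIT[name]]


add_tag(1, "urgent")
add_tag(1, "draft")
print(has_tag(1, "urgent"), has_tag(1, "archived"), tags_of(1))
```

Python integers are arbitrary precision, so nothing breaks if the tag set grows past 32 or 64 entries, although larger integers take more memory. Whether this beats a per-object tuple depends on how sparse the tags are, so it is worth measuring on real data.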
eddiewould
1

It is possible to reduce memory further by replacing the tuple with a type from the recordclass library:

>>> from recordclass import make_arrayclass
>>> Triple = make_arrayclass("Triple", 3)
>>> from sys import getsizeof as sizeof
>>> sizeof(Triple("ab","cd","ef"))
40
>>> sizeof(("ab","cd","ef"))
64

The difference is equal to sizeof(PyGC_Head) + sizeof(Py_ssize_t).

P.S.: The numbers were measured on 64-bit Python 3.8.

intellimath