
I am working with Python code that calls into C wrappers. The C code is very buggy (and I have no alternative): it segfaults when a Python object managed in C goes out of scope, so I have to keep a reference to every object I create.

Is there a good way to write an ergonomic "unique" decorator, so that each decorated class has only one instance per set of constructor arguments? For example:

@unique
class Test:
    cls_val = 0
    def __init__(self, val):
        self.val = val

a = Test(1)
b = Test(1)

assert a is b

c = Test(2)
d = Test(2)

assert c is not b and c is not a
assert c is d

I've made this decorator, but it prevents any @unique-decorated class from being used as a base class (constructing an instance of the derived class calls the __new__ of the decorator).

def unique(unique_cls):
    class Unique:
        instances = {}
        unique_class = unique_cls
        def __new__(cls, *args, **kwargs):
            # Key on the ordered positional arguments and on keyword names
            # *and* values; frozenset(args) would treat (1, 2) and (2, 1) as
            # equal, and frozenset(kwargs) would ignore the keyword values.
            key = (Unique.unique_class, args, frozenset(kwargs.items()))
            if key not in Unique.instances:
                Unique.instances[key] = Unique.unique_class(*args, **kwargs)
            return Unique.instances[key]
        def __getattr__(self, name):
            # Overloaded to get class attributes working for decorated classes
            return object.__getattribute__(Unique.unique_class, name)
    return Unique
  • I think you should have a look at the metaclass implementation of singletons here: https://stackoverflow.com/q/6760685/5079316. It can be adapted to cache instances based on arguments as well. – Olivier Melançon Feb 14 '20 at 19:44
  • The singleton pattern means there's just one instance of a class; here, there should just be one instance of a class for a given argument. Essentially, the OP wants `__new__` to be memoized. – chepner Feb 14 '20 at 19:45
  • @chepner I agree, but the singleton pattern basically caches an instance with the class as the key. Adding the arguments to the key will work. This also lets you customize the behaviour of the `lru_cache` approach suggested below with regard to the order of `kwargs`, for example by sorting them before caching. – Olivier Melançon Feb 14 '20 at 19:50
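To make the suggestion in these comments concrete, here is a rough sketch of a metaclass that caches instances per class and per argument set, with keyword arguments sorted by name before they go into the key. The name `UniqueMeta` and the `_instances` attribute are made up for this illustration and do not come from the linked question; it also assumes every argument is hashable.

```python
class UniqueMeta(type):
    """Hypothetical metaclass: one instance per (class, args, sorted kwargs)."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Each class (including each subclass) gets its own cache.
        # The cache also keeps every instance alive.
        cls._instances = {}

    def __call__(cls, *args, **kwargs):
        # Sorting the keyword items makes the key independent of keyword order.
        key = (args, tuple(sorted(kwargs.items())))
        try:
            return cls._instances[key]
        except KeyError:
            # Only on a cache miss do __new__ and __init__ actually run.
            instance = super().__call__(*args, **kwargs)
            cls._instances[key] = instance
            return instance


class Test(metaclass=UniqueMeta):
    cls_val = 0

    def __init__(self, val):
        self.val = val


class SubTest(Test):
    pass


class Pair(metaclass=UniqueMeta):
    def __init__(self, a, b):
        self.a, self.b = a, b
```

With this sketch `Test` stays an ordinary class, so class attributes and subclassing keep working. A positional call and a keyword call such as `Test(1)` and `Test(val=1)` would still produce different keys.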

1 Answer

You can use functools.lru_cache with no limit on the cache size to store every instance. The first time you call the constructor with a given set of arguments, you get a new instance, which is stored in the cache. Whenever you call the constructor with the same arguments again, you get the cached instance back. This also means the cache always holds a reference to every object, so none of them goes out of scope.

from functools import lru_cache

@lru_cache(maxsize=None)
class Test:
    def __init__(self, val):
        self.val = val

Demonstration:

>>> a = Test(1)
>>> a2 = Test(1)
>>> a is a2
True
>>> b = Test(2)
>>> b2 = Test(2)
>>> b is b2
True
>>> a is b
False
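One thing worth checking with this version is what the name Test refers to after decoration. The sketch below suggests that it is the cache wrapper rather than a class, and that the original class is still reachable through the `__wrapped__` attribute that functools sets on the wrapper. The variable names are only for illustration.

```python
import inspect
from functools import lru_cache


@lru_cache(maxsize=None)
class Test:
    def __init__(self, val):
        self.val = val


# After decoration, the name Test is bound to the lru_cache wrapper.
wrapper_is_class = inspect.isclass(Test)

# functools records the decorated object on the wrapper as __wrapped__.
original_class = Test.__wrapped__
original_is_class = inspect.isclass(original_class)

# Instances are still instances of the original class.
instance = Test(1)
instance_ok = isinstance(instance, original_class)

print(wrapper_is_class, original_is_class, instance_ok)  # False True True
```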

If you need to be able to subclass Test (or really, do anything with Test itself except create instances), then you can override __new__ and apply the decorator there. This works because cls is an argument to __new__, so the cache will distinguish between different instances by their class as well as by their __init__ arguments.

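from functools import lru_cache

class Test:
    @lru_cache(maxsize=None)
    def __new__(cls, *args, **kwargs):
        # cls is part of the cache key, so each subclass gets its own instances
        return object.__new__(cls)
    def __init__(self, val):
        self.val = val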

Demonstration:

>>> Test(1) is Test(1)
True
>>> Test(1) is Test(2)
False
>>> class SubTest(Test): pass
... 
>>> Test(1) is SubTest(1)
False
>>> SubTest(1) is SubTest(1)
True
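A caveat with caching in `__new__`: Python still calls `__init__` on whatever `__new__` returns, so a cache hit runs `__init__` again on the existing object. That is harmless when `__init__` only reassigns the same values, but it matters if initialisation is expensive or has side effects (for instance, creating a resource in the C layer). The sketch below shows the effect and one possible guard; the `init_calls` counter and the `GuardedTest` class are hypothetical names added only for this illustration.

```python
from functools import lru_cache


class Test:
    init_calls = 0  # hypothetical counter, only here to make the effect visible

    @lru_cache(maxsize=None)
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, val):
        Test.init_calls += 1
        self.val = val


class GuardedTest:
    init_calls = 0

    @lru_cache(maxsize=None)
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, val):
        if "val" in self.__dict__:
            # Cache hit: the object was already initialised, so do nothing.
            return
        GuardedTest.init_calls += 1
        self.val = val


a, b = Test(1), Test(1)
g, h = GuardedTest(1), GuardedTest(1)

print(a is b, Test.init_calls)         # True 2
print(g is h, GuardedTest.init_calls)  # True 1
```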
kaya3
  • I will just point out that when you have keyword arguments, their order can matter for the caching. – Olivier Melançon Feb 14 '20 at 19:47
  • That appears to be an (unfortunate) side effect of dicts remembering their insertion order as of Python 3.7. (I suspect it's true in CPython 3.6 as well, if I weren't too lazy to check.) – chepner Feb 14 '20 at 19:50
  • That problem also applies before Python 3.6, because e.g. `Test(1)` and `Test(val=1)` are different calls as far as `lru_cache` is concerned. I would recommend making the arguments positional-only, like `def __init__(self, val, /):` if this is a concern. – kaya3 Feb 14 '20 at 19:53
  • Even if order is not remembered, the order of insertion can affect how entries are laid out in the underlying hash table (in case of collisions). So this is to be expected in all versions of Python, but for different reasons in different versions. – Olivier Melançon Feb 14 '20 at 19:55
  • Whoa, `lru_cache` takes classes? I can't even find that in its documentation. How does that work then? Does `lru_cache` have extra code for classes, or are classes somehow functions and this works for any function decorator? – Kelly Bundy Feb 14 '20 at 20:09
  • @HeapOverflow Python classes are callable, and decorators tend to work on anything callable (unless the specific decorator does something other than call the thing it's decorating). But because the decorator returns a function, the first example won't look much like a class, so it can't be subclassed, you can't call static/class methods, and so on. So the first example is really a hack. – kaya3 Feb 14 '20 at 20:10
  • Ah, ok. Callables. Maybe its documentation should say that then. Though maybe it shouldn't, if it's a hack with such a downside :-) – Kelly Bundy Feb 14 '20 at 20:17
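To illustrate the point about call spelling raised in the comments above, here is a small sketch, assuming CPython and Python 3.8+ (for the `/` syntax). `Loose` and `Strict` are hypothetical names; `Strict` uses the positional-only signature suggested in the comments, so the keyword spelling is rejected instead of silently creating a second instance.

```python
from functools import lru_cache


class Loose:
    @lru_cache(maxsize=None)
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, val):
        self.val = val


class Strict:
    @lru_cache(maxsize=None)
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, val, /):  # positional-only parameter (Python 3.8+)
        self.val = val


# Same spelling of the call: one cache entry, one object.
same_spelling = Loose(1) is Loose(1)

# Positional vs keyword spelling: lru_cache builds different keys in CPython.
mixed_spelling = Loose(1) is Loose(val=1)

# With a positional-only __init__, the keyword spelling fails loudly.
try:
    Strict(val=1)
    keyword_rejected = False
except TypeError:
    keyword_rejected = True

print(same_spelling, mixed_spelling, keyword_rejected)  # True False True
```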