31

I've been reading up on Python 3.7's dataclass as an alternative to namedtuples (what I typically use when having to group data in a structure). I was wondering if dataclass is compatible with the property decorator to define getter and setter functions for the data elements of the dataclass. If so, is this described somewhere? Or are there examples available?

martineau
GertVdE

12 Answers

31

It sure does work:

from dataclasses import dataclass

@dataclass
class Test:
    _name: str="schbell"

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, v: str) -> None:
        self._name = v

t = Test()
print(t.name) # schbell
t.name = "flirp"
print(t.name) # flirp
print(t) # Test(_name='flirp')

In fact, why should it not? In the end, what you get is just a good old class, which is itself an instance of type:

print(type(t)) # <class '__main__.Test'>
print(type(Test)) # <class 'type'>

Maybe that's why properties are not mentioned specifically anywhere. However, the abstract of PEP 557 mentions the general usability of well-known Python class features:

Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.
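As a small illustration of that quote, here is a sketch of a read-only computed property on a dataclass. The Rectangle class and its field names are made up for this example; they are not from the PEP. Because area has no annotation, the dataclass machinery does not treat it as a field.

```python
from dataclasses import dataclass, asdict


@dataclass
class Rectangle:  # hypothetical example class
    width: float
    height: float

    @property
    def area(self) -> float:
        # Computed on access. It is not a dataclass field, so it does not
        # appear in __init__, __repr__ or asdict().
        return self.width * self.height


r = Rectangle(2.0, 3.0)
print(r.area)      # 6.0
print(r)           # Rectangle(width=2.0, height=3.0)
print(asdict(r))   # {'width': 2.0, 'height': 3.0}
```

Since no setter is defined, assigning to r.area raises AttributeError, which is the usual behaviour of a getter-only property.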

shmee
  • 13
    I guess I kinda wish that dataclasses would allow for a property to override the getting or setting without having to name fields with a leading underscore. Part of the data class sugar is the initialization which would mean that you'd end up with `Test(_name='foo')` -- that means that your interface would differ from your creation. This is a small price but still, there is so little difference between dataclasses and named tuples that this would be something else useful (that differentiates it more and hence, gives it more purpose). – Marc Sep 05 '18 at 15:03
  • 1
    @Marc They do! Use classic getters and setters and call the setter function in the init instead of assigning directly. `def set_booking_ref(self, value:str): self._booking_ref = value.strip()` ... `booking_ref = property(get_booking_ref, set_booking_ref)` ... `def __init__(self, booking_ref :str): self.set_booking_ref(booking_ref)`. Not sure how you would do this with `@property` decorator. – Alan Sep 20 '18 at 22:12
  • 14
    @Marc I had the same concern. [here](https://blog.florimondmanca.com/reconciling-dataclasses-and-properties-in-python) is a good explanation of how to solve this problem. – JorenV Jan 10 '19 at 13:29
  • @JorenV thank you for the explanation. That is the best way IMHO to do it currently. I *still* wish dataclasses even did that dance for you (but I can settle for this) - it is more explicit after all. – Marc Jan 10 '19 at 18:18
  • @JorenV, Thank you for the link to that explanation. I read through it and tried implementing it myself and then started to wonder why I was going through all this trouble when I could just keep a regular class instead of dataclass and avoid all of this. – JasonArg123 Jun 06 '19 at 17:02
  • @JorenV, you should consider creating an answer with your comment, as it's a great solution but is somewhat buried in the comments here. – Dan Coates Apr 12 '20 at 21:29
  • @DanCoates, thanks for pointing it out. I just created a proper answer. – JorenV Apr 13 '20 at 16:08
6

TWO VERSIONS THAT SUPPORT DEFAULT VALUES

Most published approaches don't provide a readable way to set a default value for the property, which is quite an important part of dataclass. Here are two possible ways to do that.

The first way is based on the approach referenced by @JorenV. It defines the default value in _name = field() and utilises the observation that if no initial value is specified, then the setter is passed the property object itself:

from dataclasses import dataclass, field


@dataclass
class Test:
    name: str
    _name: str = field(init=False, repr=False, default='baz')

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value: str) -> None:
        if type(value) is property:
            # initial value not specified, use default
            value = Test._name
        self._name = value


def main():
    obj = Test(name='foo')
    print(obj)                  # displays: Test(name='foo')

    obj = Test()
    obj.name = 'bar'
    print(obj)                  # displays: Test(name='bar')

    obj = Test()
    print(obj)                  # displays: Test(name='baz')


if __name__ == '__main__':
    main()

The second way is based on the same approach as @Conchylicultor: bypassing the dataclass machinery by overwriting the field outside the class definition.

Personally I think this way is cleaner and more readable than the first because it follows the normal dataclass idiom to define the default value and requires no 'magic' in the setter.

Even so I'd prefer everything to be self-contained... perhaps some clever person can find a way to incorporate the field update in dataclass.__post_init__() or similar?

from dataclasses import dataclass


@dataclass
class Test:
    name: str = 'foo'

    @property
    def _name(self):
        return self._my_str_rev[::-1]

    @_name.setter
    def _name(self, value):
        self._my_str_rev = value[::-1]


# --- has to be called at module level ---
Test.name = Test._name


def main():

    obj = Test()
    print(obj)                      # displays: Test(name='foo')

    obj = Test()
    obj.name = 'baz'
    print(obj)                      # displays: Test(name='baz')

    obj = Test(name='bar')
    print(obj)                      # displays: Test(name='bar')


if __name__ == '__main__':
    main()
Martin CR
  • As someone pointed out on another thread, if you find yourself going to this much trouble then it's probably better to just use a normal class... – Martin CR May 23 '20 at 08:10
4

Here's what I did to define the field as a property in __post_init__. This is a total hack, but it works with dataclasses dict-based initialization and even with marshmallow_dataclasses.

from dataclasses import dataclass, field, asdict


@dataclass
class Test:
    name: str = "schbell"
    _name: str = field(init=False, repr=False)

    def __post_init__(self):
        # Just so that we don't create the property a second time.
        if not isinstance(getattr(Test, "name", False), property):
            self._name = self.name
            Test.name = property(Test._get_name, Test._set_name)

    def _get_name(self):
        return self._name

    def _set_name(self, val):
        self._name = val


if __name__ == "__main__":
    t1 = Test()
    print(t1)
    print(t1.name)
    t1.name = "not-schbell"
    print(asdict(t1))

    t2 = Test("llebhcs")
    print(t2)
    print(t2.name)
    print(asdict(t2))

This would print:

Test(name='schbell')
schbell
{'name': 'not-schbell', '_name': 'not-schbell'}
Test(name='llebhcs')
llebhcs
{'name': 'llebhcs', '_name': 'llebhcs'}

I actually started off from this blog post mentioned somewhere in this SO, but ran into the issue that the dataclass field was being set to type property because the decorator is applied to the class. That is,

@dataclass
class Test:
    name: str = field(default='something')
    _name: str = field(init=False, repr=False)

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, val):
        self._name = val

would make the default of name a property object rather than a str. So, when no value is passed, the setter actually receives the property object as its argument instead of the field default.
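To see the effect for yourself, here is a minimal sketch that inspects the generated field. The class name Broken is my own; fields() is the standard helper from the dataclasses module.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Broken:  # hypothetical name, mirrors the snippet above
    name: str = field(default='something')
    _name: str = field(init=False, repr=False)

    # This rebinding replaces the field() object above before @dataclass
    # ever sees it, so the property itself becomes the default value.
    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, val):
        self._name = val


name_field = fields(Broken)[0]
print(type(name_field.default))   # <class 'property'>
print(type(Broken().name))        # <class 'property'>
print(Broken('explicit').name)    # explicit
```

The intended default 'something' is lost entirely, which is why the other answers either test for a property in the setter or attach the property after the class has been created.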

3

Some wrapping could be good:

#         DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE 
#                     Version 2, December 2004 
# 
#  Copyright (C) 2020 Xu Siyuan <inqb@protonmail.com> 
# 
#  Everyone is permitted to copy and distribute verbatim or modified 
#  copies of this license document, and changing it is allowed as long 
#  as the name is changed. 
# 
#             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE 
#    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 
# 
#   0. You just DO WHAT THE FUCK YOU WANT TO.

from dataclasses import dataclass, field

MISSING = object()
__all__ = ['property_field', 'property_dataclass']


class property_field:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None, **kwargs):
        self.field = field(**kwargs)
        self.property = property(fget, fset, fdel, doc)

    def getter(self, fget):
        self.property = self.property.getter(fget)
        return self

    def setter(self, fset):
        self.property = self.property.setter(fset)
        return self

    def deleter(self, fdel):
        self.property = self.property.deleter(fdel)
        return self


def property_dataclass(cls=MISSING, /, **kwargs):
    if cls is MISSING:
        return lambda cls: property_dataclass(cls, **kwargs)
    remembers = {}
    for k in dir(cls):
        if isinstance(getattr(cls, k), property_field):
            remembers[k] = getattr(cls, k).property
            setattr(cls, k, getattr(cls, k).field)
    result = dataclass(**kwargs)(cls)
    for k, p in remembers.items():
        setattr(result, k, p)
    return result

You can use it like this:

@property_dataclass
class B:
    x: int = property_field(default_factory=int)

    @x.getter
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
InQβ
3

An @property is typically used to store a seemingly public argument (e.g. name) into a private attribute (e.g. _name) through getters and setters, while dataclasses generate the __init__() method for you. The problem is that this generated __init__() method should interface through the public argument name, while internally setting the private attribute _name. This is not done automatically by dataclasses.

In order to have the same interface (through name) for setting values and creation of the object, the following strategy can be used (Based on this blogpost, which also provides more explanation):

from dataclasses import dataclass, field

@dataclass
class Test:
    name: str
    _name: str = field(init=False, repr=False)

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, name: str) -> None:
        self._name = name

This can now be used as one would expect from a dataclass with a data member name:

my_test = Test(name='foo')
my_test.name = 'bar'
print(my_test.name)  # bar

The above implementation does the following things:

  • The name class member will be used as the public interface, but it actually does not really store anything
  • The _name class member stores the actual content. The assignment with field(init=False, repr=False) makes sure that the @dataclass decorator ignores it when constructing the __init__() and __repr__() methods.
  • The getter/setter for name actually returns/sets the content of _name
  • The initializer generated through the @dataclass will use the setter that we just defined. It will not initialize _name explicitly, because we told it not to do so.
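To check the last point, here is a sketch where the setter visibly transforms its input. Stripping whitespace is my own addition for demonstration purposes, and the class name Stripped is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Stripped:  # hypothetical variant of Test
    name: str
    _name: str = field(init=False, repr=False)

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, name: str) -> None:
        # Runs for plain assignment and for the generated __init__ alike.
        self._name = name.strip()


s = Stripped(name='  foo  ')
print(s)            # Stripped(name='foo')
s.name = ' bar'
print(s.name)       # bar
```

Note that this sketch expects name to always be passed; calling Stripped() without arguments would hand the property object to the setter, as discussed in the other answers.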
JorenV
  • This is the best answer IMHO but lacks the (important) ability to set default values for properties that aren't specified when the class is instantiated. See my answer for a tweak to allow that. – Martin CR Apr 28 '20 at 13:12
  • Note that mypy will complain about the double definition of `name`! No runtime errors though. – gmagno Aug 07 '20 at 22:27
2

Currently, the best way I have found is to override the dataclass fields with properties in a separate child class.

from dataclasses import dataclass, field

@dataclass
class _A:
    x: int = 0

class A(_A):
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int):
        self._x = value

The class behaves like a regular dataclass and correctly defines __repr__ and __init__ in terms of the public name (A(x=4) instead of A(_x=4)). The drawback is that the properties cannot be read-only, because the generated __init__ assigns through the setter.
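A self-contained sketch of the same pattern with a validating setter, to show that construction and assignment both pass through it. The names _Base and Checked and the non-negative rule are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class _Base:  # hypothetical names throughout
    x: int = 0


class Checked(_Base):
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int) -> None:
        # The inherited __init__ does `self.x = x`, so it lands here too.
        if value < 0:
            raise ValueError("x must not be negative")
        self._x = value


print(Checked(x=4))   # Checked(x=4)
print(Checked())      # Checked(x=0)

try:
    Checked(x=-1)
except ValueError as exc:
    print(exc)        # x must not be negative
```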

This blog post tries to overwrite the wheels dataclass attribute with a property of the same name. However, the @property overwrites the field's default, which leads to unexpected behavior.

from dataclasses import dataclass, field

@dataclass
class A:

    x: int

    # same as: `x = property(x)  # Overwrite any field() info`
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int):
        self._x = value

A()  # `A(x=<property object at 0x7f0cf64e5fb0>)`   Oops

print(A.__dataclass_fields__)  # {'x': Field(name='x',type=<class 'int'>,default=<property object at 0x>,init=True,repr=True}

One way to solve this while avoiding inheritance is to overwrite the field outside the class definition, after the dataclass decorator has been applied.

@dataclass
class A:
  x: int

def x_getter(self):
  return self._x

def x_setter(self, value):
  self._x = value

A.x = property(x_getter)
A.x = A.x.setter(x_setter)

print(A(x=1))
print(A())  # TypeError: missing 1 required positional argument: 'x'

It should probably be possible to do this automatically by creating some custom metaclass and setting some field(metadata={'setter': _x_setter, 'getter': _x_getter}).
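A rough sketch of that idea, using a plain class decorator instead of a metaclass (my substitution, since a decorator can run after @dataclass has finished). The helper name install_properties and the 'getter'/'setter' metadata keys are assumptions of this example, not part of the dataclasses API.

```python
from dataclasses import dataclass, field, fields


def install_properties(cls):
    """Hypothetical helper: turn fields carrying accessor metadata into properties."""
    for f in fields(cls):
        getter = f.metadata.get('getter')
        setter = f.metadata.get('setter')
        if getter is not None:
            # Overwrite the class attribute only after @dataclass has
            # already recorded the real default in the generated __init__.
            setattr(cls, f.name, property(getter, setter))
    return cls


def _x_getter(self):
    return self._x


def _x_setter(self, value):
    self._x = int(value)


@install_properties
@dataclass
class A:
    x: int = field(default=0,
                   metadata={'getter': _x_getter, 'setter': _x_setter})


print(A())        # A(x=0)
print(A(x='5'))   # A(x=5)
```

Decorator order matters here: install_properties has to be the outer decorator so that it runs on the finished dataclass.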

Conchylicultor
  • For your first approach, it seems also possible to make it inside-out. Defining `_A` with getter and setter while `@dataclass` the outer `A(_A)`. – InQβ Nov 25 '19 at 13:38
2

Here's another way which allows you to have fields without a leading underscore:

from dataclasses import dataclass


@dataclass
class Person:
    name: str = property

    @name
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value) -> None:
        self._name = value

    def __post_init__(self) -> None:
        if isinstance(self.name, property):
            self.name = 'Default'

The result is:

print(Person().name)  # Prints: 'Default'
print(Person('Joel').name)  # Prints: 'Joel'
print(repr(Person('Jane')))  # Prints: Person(name='Jane')
PaperNick
2

A solution with minimal additional code and no hidden variables is to override the __setattr__ method to do any checks on the field:

from dataclasses import dataclass


@dataclass
class Test:
    x: int = 1

    def __setattr__(self, prop, val):
        # Validate x before delegating to the normal attribute machinery.
        if prop == "x":
            self._check_x(val)
        super().__setattr__(prop, val)

    @staticmethod
    def _check_x(x):
        if x <= 0:
            raise ValueError("x must be greater than zero")
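A short usage sketch, assuming the check is meant to reject zero and negative values. The class is repeated under the made-up name Positive so the snippet runs on its own.

```python
from dataclasses import dataclass


@dataclass
class Positive:  # hypothetical stand-in for the class above
    x: int = 1

    def __setattr__(self, prop, val):
        if prop == "x" and val <= 0:
            raise ValueError("x must be greater than zero")
        super().__setattr__(prop, val)


p = Positive()
p.x = 5
print(p)              # Positive(x=5)

# The generated __init__ assigns with `self.x = x`, so construction is
# validated as well as later assignment.
for bad in (lambda: Positive(x=0), lambda: setattr(p, "x", -1)):
    try:
        bad()
    except ValueError as exc:
        print(exc)    # x must be greater than zero
```

One limitation to keep in mind: this only hooks writes, so it cannot compute or transform values on read the way a property getter can.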
teebr
0

Building on the ideas above, I created a class decorator function resolve_abc_prop that creates a new class containing the getter and setter functions as suggested by @shmee.

def resolve_abc_prop(cls):
    def gen_abstract_properties():
        """ search for abstract properties in super classes """

        for class_obj in cls.__mro__:
            for key, value in class_obj.__dict__.items():
                if isinstance(value, property) and value.__isabstractmethod__:
                    yield key, value

    abstract_prop = dict(gen_abstract_properties())

    def gen_get_set_properties():
        """ for each matching data and abstract property pair, 
            create a getter and setter method """

        for class_obj in cls.__mro__:
            if '__dataclass_fields__' in class_obj.__dict__:
                for key, value in class_obj.__dict__['__dataclass_fields__'].items():
                    if key in abstract_prop:
                        def get_func(self, key=key):
                            return getattr(self, f'__{key}')

                        def set_func(self, val, key=key):
                            return setattr(self, f'__{key}', val)

                        yield key, property(get_func, set_func)

    get_set_properties = dict(gen_get_set_properties())

    new_cls = type(
        cls.__name__,
        cls.__mro__,
        {**cls.__dict__, **get_set_properties},
    )

    return new_cls

Here we define a data class AData and a mixin AOpMixin implementing operations on the data.

from dataclasses import dataclass, field, replace
from abc import ABC, abstractmethod


class AOpMixin(ABC):
    @property
    @abstractmethod
    def x(self) -> int:
        ...

    def __add__(self, val):
        return replace(self, x=self.x + val)

Finally, the decorator resolve_abc_prop is then used to create a new class with the data from AData and the operations from AOpMixin.

@resolve_abc_prop
@dataclass
class A(AOpMixin):
    x: int

A(x=4) + 2   # A(x=6)

EDIT #1: I created a python package that makes it possible to overwrite abstract properties with a dataclass: dataclass-abc

0

A very thorough post about data classes and properties can be found here. The TL;DR version below avoids the ugly cases where you have to call MyClass(_my_var=2) and get strange __repr__ output:

from dataclasses import field, dataclass

@dataclass
class Vehicle:

    wheels: int
    _wheels: int = field(init=False, repr=False)

    def __init__(self, wheels: int):
        self._wheels = wheels

    @property
    def wheels(self) -> int:
        return self._wheels

    @wheels.setter
    def wheels(self, wheels: int):
        self._wheels = wheels
bluesummers
  • 1
    You neither need nor want to create an instance attribute named `wheels`. If you want `__init__` to initialize `_wheels` via the setter, use `wheels = InitVar[int]`, then use `__post_init__` to set `self.wheels = wheels`. – chepner Mar 23 '20 at 16:05
0

After trying different suggestions from this thread I've come up with a slightly modified version of @Samsara Apathika's answer. In short: I removed the "underscore" field variable from the __init__ (so it is available for internal use, but not seen by asdict() or by __dataclass_fields__).

from dataclasses import dataclass, InitVar, field, asdict

@dataclass
class D:
    a: float = 10.                # Normal attribute with a default value
    b: InitVar[float] = 20.       # init-only attribute with a default value
    c: float = field(init=False)  # an attribute that will be defined in __post_init__
    
    def __post_init__(self, b):
        if not isinstance(getattr(D, "a", False), property):
            print('setting `a` to property')
            self._a = self.a
            D.a = property(D._get_a, D._set_a)
        
        print('setting `c`')
        self.c = self.a + b
        self.d = 50.
    
    def _get_a(self):
        print('in the getter')
        return self._a
    
    def _set_a(self, val):
        print('in the setter')
        self._a = val


if __name__ == "__main__":
    d1 = D()
    print(asdict(d1))
    print('\n')
    d2 = D()
    print(asdict(d2))

Gives:

setting `a` to property
setting `c`
in the getter
in the getter
{'a': 10.0, 'c': 30.0}


in the setter
setting `c`
in the getter
in the getter
{'a': 10.0, 'c': 30.0}
Roman Zhuravlev
0

This method of using properties in dataclasses also works with asdict and is simpler too. Why? Fields that are typed with ClassVar are ignored by the dataclass, but we can still use them in our properties.

from dataclasses import dataclass
from typing import ClassVar


@dataclass
class SomeData:
    uid: str
    _uid: ClassVar[str]

    @property
    def uid(self) -> str:
        return self._uid

    @uid.setter
    def uid(self, uid: str) -> None:
        self._uid = uid
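A quick check of the asdict claim, as a sketch. The class is repeated under the made-up name SomeRecord so the snippet is self-contained.

```python
from dataclasses import dataclass, asdict, fields
from typing import ClassVar


@dataclass
class SomeRecord:  # hypothetical copy of the class above
    uid: str
    _uid: ClassVar[str]  # ClassVar annotations are skipped by @dataclass

    @property
    def uid(self) -> str:
        return self._uid

    @uid.setter
    def uid(self, uid: str) -> None:
        # Stores on the instance despite the ClassVar annotation.
        self._uid = uid


rec = SomeRecord(uid='abc')
print([f.name for f in fields(SomeRecord)])   # ['uid']
print(asdict(rec))                            # {'uid': 'abc'}
print(rec)                                    # SomeRecord(uid='abc')
```

As with the other same-name approaches, uid should always be passed explicitly, since the property object ends up as the field's default. A type checker may also object to assigning a ClassVar through an instance.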
Remolten