107

I don't know what the __setstate__ and __getstate__ methods do, so help me with a simple example.

shahjapan
zjm1126

4 Answers

85

Here's a very simple example for Python that should supplement the pickle docs.

import pickle

class Foo(object):
    def __init__(self, val=2):
        self.val = val
    def __getstate__(self):
        print("I'm being pickled")
        self.val *= 2
        return self.__dict__
    def __setstate__(self, d):
        print("I'm being unpickled with these values: " + repr(d))
        self.__dict__ = d
        self.val *= 3

f = Foo()
f_data = pickle.dumps(f)
f_new = pickle.loads(f_data)
Mike T
BrainCore
To supplement this answer: it prints "I'm being pickled", then "I'm being unpickled with these values: {'val': 4}", and f_new.val is 12. – timidpueo Mar 14 '19 at 14:24
51

Minimal example

Whatever __getstate__ returns is passed to __setstate__. It does not need to be a dict.

Whatever __getstate__ returns must be picklable, e.g. made up of basic built-ins like int, str, list.

import pickle

class C(object):
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return self.i
    def __setstate__(self, i):
        self.i = i
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1
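
The picklability requirement above is the usual reason to write these methods at all. Here is a minimal sketch, assuming a made-up Counter class (not from the original answer) that holds a threading.Lock, which pickle cannot serialize. __getstate__ drops the lock from a copy of the state and __setstate__ creates a fresh one:

```python
import pickle
import threading

class Counter(object):
    """Hypothetical class, used only for illustration."""
    def __init__(self, count=0):
        self.count = count
        self.lock = threading.Lock()  # locks cannot be pickled

    def __getstate__(self):
        # Work on a copy so the live object keeps its lock.
        state = self.__dict__.copy()
        del state['lock']
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # The lock was not pickled, so create a new one.
        self.lock = threading.Lock()

c = Counter(5)
c2 = pickle.loads(pickle.dumps(c, -1))
assert c2.count == 5
assert c2.lock is not c.lock
```

Without the two methods, pickle.dumps(c) should fail with a TypeError about the lock object.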

Default __setstate__

The default __setstate__ takes a dict.

self.__dict__ is a good choice, as in https://stackoverflow.com/a/1939384/895245, but we can construct one ourselves to better see what is going on:

class C(object):
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return {'i': self.i}
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1

Default __getstate__

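Analogous to __setstate__: when __getstate__ is not defined, pickle uses the instance __dict__ as the state, so a custom __setstate__ receives a dict.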

class C(object):
    def __init__(self, i):
        self.i = i
    def __setstate__(self, d):
        self.i = d['i']
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1

__slots__ objects don't have __dict__

If a class defines __slots__ (and no base class provides __dict__), then its instances do not have __dict__.
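
A quick way to check this, as a small sketch with two throwaway classes (the names WithDict and WithSlots are made up for illustration):

```python
class WithDict(object):
    def __init__(self, i):
        self.i = i

class WithSlots(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i

assert hasattr(WithDict(1), '__dict__')
assert not hasattr(WithSlots(1), '__dict__')

# A slotted instance also rejects attributes that are not declared slots.
try:
    WithSlots(1).j = 2
    rejected = False
except AttributeError:
    rejected = True
assert rejected
```

This is why returning self.__dict__ from __getstate__ is not an option for such classes.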

If you are going to implement both get and setstate, the default-ish way is:

class C(object):
    # A tuple, so that iterating over __slots__ yields names, not characters
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return { slot: getattr(self, slot) for slot in self.__slots__ }
    def __setstate__(self, d):
        for slot in d:
            setattr(self, slot, d[slot])
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1

__slots__ default get and set expect a tuple

If you want to reuse the default __getstate__ or __setstate__, you will have to pass tuples around as:

class C(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return (None, { slot: getattr(self, slot) for slot in self.__slots__ })
assert pickle.loads(pickle.dumps(C(1), -1)).i == 1

I'm not sure what this is for.
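
As far as I can tell, the tuple is (instance __dict__ state, dict of slot values), and the first item is None when there is no __dict__ to save. The sketch below inspects the state that the default machinery produces for protocol 2. Calling __reduce_ex__ directly is only done here for inspection; the class names are made up:

```python
class Slotted(object):
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i

class Mixed(Slotted):
    # No __slots__ here, so instances also get a __dict__.
    pass

# Element 2 of the reduce tuple is the state that would be pickled.
slotted_state = Slotted(1).__reduce_ex__(2)[2]
assert slotted_state == (None, {'i': 1})

m = Mixed(1)
m.j = 2
mixed_state = m.__reduce_ex__(2)[2]
assert mixed_state == ({'j': 2}, {'i': 1})
```

So the two-item form lets one state value carry both regular attributes and slot values.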

Inheritance

First see that pickling works by default:

class C(object):
    def __init__(self, i):
        self.i = i
class D(C):
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2

Inheritance custom __getstate__

Without __slots__ it is easy: the instance __dict__ of a D object already holds the attributes set by both C.__init__ and D.__init__, so we don't need to touch C at all:

class C(object):
    def __init__(self, i):
        self.i = i
class D(C):
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
    def __getstate__(self):
        return self.__dict__
    def __setstate__(self, d):
        self.__dict__ = d
d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2

Inheritance and __slots__

With __slots__, we need to forward to the base class, and can pass tuples around:

class C(object):
    # Tuples, so that iterating over __slots__ yields names, not characters
    __slots__ = ('i',)
    def __init__(self, i):
        self.i = i
    def __getstate__(self):
        return { slot: getattr(self, slot) for slot in C.__slots__ }
    def __setstate__(self, d):
        for slot in d:
            setattr(self, slot, d[slot])

class D(C):
    __slots__ = ('j',)
    def __init__(self, i, j):
        super(D, self).__init__(i)
        self.j = j
    def __getstate__(self):
        # (state of the base class, state of this class)
        return (
            C.__getstate__(self),
            { slot: getattr(self, slot) for slot in D.__slots__ }
        )
    def __setstate__(self, ds):
        C.__setstate__(self, ds[0])
        d = ds[1]
        for slot in d:
            setattr(self, slot, d[slot])

d = pickle.loads(pickle.dumps(D(1, 2), -1))
assert d.i == 1
assert d.j == 2

Unfortunately, it is not possible to reuse the default __getstate__ and __setstate__ of the base class (see https://groups.google.com/forum/#!topic/python-ideas/QkvOwa1-pHQ), so we are forced to define them.

Tested on Python 2.7.12. GitHub upstream.

13

These methods are used for controlling how objects are pickled and unpickled by the pickle module. This is usually handled automatically, so unless you need to override how a class is pickled or unpickled you shouldn't need to worry about it.

Pär Wieslander
1

A clarification to @BrainCore's answer. In practice, you probably won't want to modify self inside __getstate__. Instead construct a new object that will get pickled, leaving the original unchanged for further use. Here's what that would look like:

import pickle

class Foo:
    def __init__(self, x:int=2, y:int=3):
        self.x = x
        self.y = y
        self.z = x*y

    def __getstate__(self):
        # Create a copy of __dict__ to modify values and return;
        # you could also construct a new dict (or other object) manually
        out = self.__dict__.copy()
        out["x"] *= 3
        out["y"] *= 10
        # You can remove attributes, but note they will not get set with
        # some default value in __setstate__ automatically; you would need
        # to write a custom __setstate__ method yourself; this might be
        # useful if you have unpicklable objects that need removing, or perhaps
        # an external resource that can be reloaded in __setstate__ instead of
        # pickling inside the stream
        del out["z"]
        return out

    # The default __setstate__ will update Foo's __dict__;
    # so no need for a custom implementation here if __getstate__ returns a dict;
    # Be aware that __init__ is not called by default; Foo.__new__ gets called,
    # and the empty object is modified by __setstate__

f = Foo()
f_str = pickle.dumps(f)
f2 = pickle.loads(f_str)

print("Pre-pickle:", f.x, f.y, hasattr(f,"z"))
print("Post-pickle:", f2.x, f2.y, hasattr(f2,"z"))
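
If the removed attribute should come back after unpickling, a custom __setstate__ can rebuild it. This is a small sketch along the lines of the comments above, assuming z is always derived from x and y (the class name Bar is made up so it does not clash with Foo):

```python
import pickle

class Bar:
    def __init__(self, x=2, y=3):
        self.x = x
        self.y = y
        self.z = x * y  # derived value, not worth storing in the pickle

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["z"]
        return state

    def __setstate__(self, state):
        # __init__ is not called on unpickling, so restore the
        # attributes and recompute z here.
        self.__dict__.update(state)
        self.z = self.x * self.y

b = Bar(4, 5)
b2 = pickle.loads(pickle.dumps(b))
assert (b2.x, b2.y, b2.z) == (4, 5, 20)
assert b.z == 20  # the original object is left unchanged
```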
Azmisov