Is there an introspection method that reliably obtains the underlying data structure of an object instance and is unaffected by any customizations?

In Python 3 an object's low-level implementation can be deeply obscured: attribute lookup can be customized, and even the __dict__ and __slots__ attributes may not give the full picture, as they are writable. dir() is explicitly meant to show "interesting" attributes rather than the actual attributes, and even the inspect module doesn't seem to provide such functionality.

Not a duplicate. This question has been flagged as a duplicate of "Is there a built-in function to print all the current properties and values of an object?". However, that question only covers the standard ways of introspecting classes, which are explicitly listed here as unreliable at a lower level.

As an example consider the following script with an intentionally obscured class.

import inspect

actual_members = None  # <- For showing the actual contents later.

class ObscuredClass:
    def __init__(self):
        global actual_members
        actual_members = dict()
        self.__dict__ = actual_members
        self.actual_field = "actual_value"
    def __getattribute__(self, name):
        if name == "__dict__":
            return { "fake_field": "fake value - shown in __dict__" }
        else:
            return "fake_value - shown in inspect.getmembers()"

obj = ObscuredClass()
print(f"{actual_members          = }")
print(f"{dir(obj)                = }")
print(f"{obj.__dict__            = }")
print(f"{inspect.getmembers(obj) = }")

which produces the output

actual_members          = {'actual_field': 'actual_value'}
dir(obj)                = ['fake_field']
obj.__dict__            = {'fake_field': 'fake value - shown in __dict__'}
inspect.getmembers(obj) = [('fake_field', 'fake_value - shown in inspect.getmembers()')]
kdb
  • "Standard"? Even `inspect.getmembers` doesn't work? – user202729 Feb 04 '21 at 14:00
  • @user202729 I added an example to demonstrate. Even `inspect.getmembers` relies on the object not intentionally hiding its internals. – kdb Feb 04 '21 at 15:05
  • Idea: something similar to `object.__getattribute__` can be used, as suggested in https://stackoverflow.com/questions/371753/how-do-i-implement-getattribute-without-an-infinite-recursion-error . Of course it doesn't work for objects implemented in C. (That said, even though many Python classes override `__getattr__`, I haven't seen many that override `__getattribute__`.) – user202729 Feb 04 '21 at 15:14
  • @user202729 That seems to work. – kdb Feb 04 '21 at 15:30
  • You might want to post it as an answer rather than editing it into the question. – user202729 Feb 04 '21 at 15:39
  • @user202729 You don't want the reputation points? :) – kdb Feb 04 '21 at 16:47
  • ... I might consider writing an answer tomorrow. – user202729 Feb 04 '21 at 16:48
  • `object.__getattribute__` will work around overridden `__getattribute__` methods, but nothing else - it won't show the actual instance data in the face of missing or shadowed `__dict__` or slot descriptors, and it won't tell you what names you should be looking for. It also won't tell you about non-attribute data, like list elements or dict entries if an object is a list or a dict. – user2357112 supports Monica Feb 04 '21 at 17:10
  • It's an important tool to be aware of - overridden `__getattribute__` methods are way more common than messing with the `__dict__` and slot descriptors - but not a fully general solution, even for objects implemented in Python. – user2357112 supports Monica Feb 04 '21 at 17:14

2 Answers

There's nothing completely general, particularly for objects implemented in C. Python types just don't store enough instance layout metadata for a general solution. That said, gc.get_referents is pretty reliable even in the face of really weird Python-level modifications, including deleted or shadowed slot descriptors and a deleted or shadowed __dict__ descriptor.

gc.get_referents will give all references an object reports to the garbage collection system. It won't tell you why an object had a particular reference, though - it won't tell you that one dict was __dict__ and one dict was an unrelated slot that happened to have a dict in it.

For example:

import gc

class Foo:
    __slots__ = ('__dict__', 'a', 'b')
    __dict__ = None
    def __init__(self):
        self.x = 1
        self.a = 2
        self.b = 3

x = Foo()
del Foo.a
del Foo.b

print(gc.get_referents(x))

for name in '__dict__', 'x', 'a', 'b':
    try:
        print(name, object.__getattribute__(x, name))
    except AttributeError:
        print('object.__getattribute__ could not look up', name)

This prints

[2, 3, {'x': 1}, <class '__main__.Foo'>]
__dict__ None
x 1
object.__getattribute__ could not look up a
object.__getattribute__ could not look up b

gc.get_referents manages to retrieve the real instance dict and the a and b slots, even when the relevant descriptors are all missing. Unfortunately, it gives no information about the meaning of any references it retrieves.

object.__getattribute__ fails to retrieve the instance dict or the a or b slots. It does manage to find x, because it doesn't rely on the __dict__ descriptor to find the instance dict when retrieving other attributes, but you need to already know x is a name you should look for - object.__getattribute__ can't discover what names you should look for on this object.
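
Applied to the ObscuredClass from the question, the same approach can be sketched as follows. This is only an illustration: the helper name referent_dicts and the module-level real_dict (kept just so the result can be checked) are made up for this example. It relies on the instance dict having been assigned explicitly, as in the question; depending on the CPython version, an object whose dict was never assigned or accessed may report its attribute values directly instead of a dict.

```python
import gc

real_dict = {}  # hypothetical: kept only so the result can be checked afterwards


class ObscuredClass:
    """Same idea as the question's class: every normal lookup is faked."""

    def __init__(self):
        self.__dict__ = real_dict
        self.actual_field = "actual_value"

    def __getattribute__(self, name):
        if name == "__dict__":
            return {"fake_field": "fake value - shown in __dict__"}
        return "fake_value - shown in inspect.getmembers()"


def referent_dicts(obj):
    """Return every dict that obj reports to the garbage collector.

    Hypothetical helper: it cannot tell whether a dict is the instance
    dict or just a value stored in a slot.
    """
    return [ref for ref in gc.get_referents(obj) if isinstance(ref, dict)]


obj = ObscuredClass()
found = referent_dicts(obj)
print(found)
```

The list should contain the real instance dict even though obj.__dict__ still returns the fake one, but nothing in the result marks it as the instance dict.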

user2357112 supports Monica
  • It is interesting for ad hoc analysis of a misbehaving object, but it doesn't say anything about where the objects are stored. For a list, it returns the items of the list; for an object with `__dict__` data fields, it returns various things, including `__dict__`, but with nothing ensuring that the dictionary *is* the `__dict__` field. For classes with `__slots__` it returns the plain values, with no relation to the fields. – kdb Feb 04 '21 at 08:30
  • That's a weird thing to do. Also related https://stackoverflow.com/questions/4912499/using-python-descriptors-with-slots (which shows how something like that could in theory be done) – user202729 Feb 06 '21 at 08:14

user202729 suggested in a comment using object.__getattribute__(obj, "field").

Building a custom function around this (using object.__getattribute__(obj, "__dict__") and object.__getattribute__(obj, "__slots__")) seems viable for my original intent of dumping internal data from sparsely documented code.

However, it has also been pointed out that things might be obfuscated even further, and that this approach does not work for classes implemented in C.
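
A minimal sketch of what such a function might look like is shown below. The names raw_dump and Obscured are hypothetical and chosen only for this example. The sketch assumes that the __dict__ and slot descriptors on the class are still intact, and it does not handle name-mangled slot names, so it only gets around an overridden __getattribute__.

```python
def raw_dump(obj):
    """Collect instance data while bypassing an overridden __getattribute__.

    Hypothetical helper: assumes the __dict__ and slot descriptors have not
    been deleted or shadowed on the class.
    """
    result = {}

    # Instance dict, if the object has one.
    try:
        instance_dict = object.__getattribute__(obj, "__dict__")
    except AttributeError:
        instance_dict = None  # e.g. a purely slotted class
    if isinstance(instance_dict, dict):
        result.update(instance_dict)

    # Slots declared anywhere in the class hierarchy.
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)  # a single slot may be given as a plain string
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            try:
                result[name] = object.__getattribute__(obj, name)
            except AttributeError:
                pass  # slot declared but never assigned
    return result


class Obscured:
    __slots__ = ("a", "__dict__")

    def __init__(self):
        self.a = 1
        self.b = 2

    def __getattribute__(self, name):
        return "fake"


dump = raw_dump(Obscured())
print(dump)
```

Here type(obj) is used instead of obj.__class__, because the latter would go through the overridden __getattribute__.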

kdb