7

Here's a simple class created declaratively:

class Person:
    def say_hello(self):
        print("hello")

And here's a similar class, but it was defined by invoking the metaclass manually:

def say_hello(self):
    print("sayolala")

say_hello.__qualname__ = 'Person.say_hello'

TalentedPerson = type('Person', (), {'say_hello': say_hello})

I'm interested to know whether they are indistinguishable. Is it possible to detect such a difference from the class object itself?

>>> def was_defined_declaratively(cls):
...     # dragons
...
>>> was_defined_declaratively(Person)
True
>>> was_defined_declaratively(TalentedPerson)
False
wim
  • AFAIK the only difference between the two is that the `class` statement calls the hidden `__build_class__` function (which then calls the metaclass to construct the class). That's only true for CPython though, and as far as I can tell `__build_class__` doesn't do anything special to the class anyway. So I'm 99.99% sure it's not possible to detect a difference, even if you rely on CPython implementation details. – Aran-Fey Nov 05 '18 at 17:21
  • The `class` statement appears to call `__builtins__.type`, not whatever the global name `type` currently refers to. If you can patch `type` early enough, you could add your own backdoor to make them distinguishable. (I base this on setting `type = int` before trying either approach; the `class` statement was unaffected, while the call to `type` obviously failed. I'm also not sure if this is a CPython implementation detail.) – chepner Nov 05 '18 at 17:26

3 Answers

5

This should not matter, at all. Even if we dig for more attributes that differ, it should be possible to inject these attributes into the dynamically created class.

Now, even without the source file around (from which things like inspect.getsource can work, but see below), class body statements should have a corresponding "code" object that is run at some point. The dynamically created class won't have a code body. However, if you call types.new_class instead of type(...), its exec_body callback lets you run a custom code object as the body of the dynamic class as well. So, as for my first statement: it should be possible to render both classes indistinguishable.
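As a minimal sketch of that last point: `types.new_class` takes an `exec_body` callable that receives the class namespace, and that callable can run any code object in it. The names `body_source`, `run_body` and `DynamicPerson` are made up for illustration, and the body is compiled from a string here only to keep the example self-contained.

```python
import types

# Hypothetical class body, compiled from a string for illustration only.
body_source = '''
def say_hello(self):
    print("hello")
'''
body_code = compile(body_source, "<dynamic body>", "exec")

def run_body(namespace):
    # Execute the compiled body with the class namespace as its locals,
    # so the names it defines become class attributes.
    exec(body_code, globals(), namespace)

DynamicPerson = types.new_class("Person", (), exec_body=run_body)
```

Note that `say_hello.__qualname__` is plain `say_hello` here, so it would still need patching as in the question.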

As for locating the code object without relying on the source file (which, other than by inspect.getsource, can be reached through a method's .__code__ attribute, which records co_filename and co_firstlineno; I suppose one would have to parse the file and locate the class statement above the co_firstlineno):

And yes, there it is: given a module, you can use module.__loader__.get_code('full.path.tomodule'), which returns a code object. This object has a co_consts attribute, a sequence of all the constants compiled in that module. Among those are the code objects for the class bodies themselves. And these have the line number, and the code objects for the nested declared methods as well.
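To make that layout concrete, here is a small sketch that uses `compile` on a source string as a stand-in for `loader.get_code` (an assumption made only so the example runs without a module file on disk). The variable names are illustrative.

```python
import types

source = '''
class Person:
    def say_hello(self):
        print("hello")
'''
# Stand-in for module.__loader__.get_code(...): a module-level code object.
module_code = compile(source, "<example>", "exec")

# Code objects found directly among the module constants: here, the class body.
class_bodies = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
person_body = class_bodies[0]

# Code objects nested in the class body: the methods declared in it.
method_codes = [c for c in person_body.co_consts if isinstance(c, types.CodeType)]

print(person_body.co_name, person_body.co_firstlineno)
print([m.co_name for m in method_codes])
```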

So, a naive implementation could be:

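import sys, types

def was_defined_declarative(cls):
    module_name = cls.__module__
    module = sys.modules[module_name]
    # Code object for the whole module, as compiled from its source
    module_code = module.__loader__.get_code(module_name)
    # A `class` statement leaves a code object for its body among the module's constants
    return any(
        code_obj.co_name == cls.__name__
        for code_obj in module_code.co_consts
        if isinstance(code_obj, types.CodeType)
    )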

This works for simple cases. If you have to check whether the class body is inside another function, or nested inside another class body, you have to search recursively through the .co_consts attribute of every code object in the file. The same applies if you find it safer to check attributes beyond cls.__name__ to assert that you got the right class.
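A recursive search of that kind could look roughly like the sketch below. The helper names `iter_code_objects` and `has_body_named` are hypothetical, and matching on `co_name` alone is an assumption that also matches functions with the same name, so this is a starting point rather than a reliable check.

```python
import types

def iter_code_objects(code):
    """Yield every code object nested inside `code`, depth first."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield const
            yield from iter_code_objects(const)

def has_body_named(module_code, name):
    # True if any nested code object (class body or function) has this name.
    return any(c.co_name == name for c in iter_code_objects(module_code))

# Illustration: a class declared inside a function is not a direct
# constant of the module, but the recursive walk still finds it.
nested_source = '''
def factory():
    class Inner:
        pass
    return Inner
'''
nested_code = compile(nested_source, "<example>", "exec")
found_inner = has_body_named(nested_code, "Inner")
found_missing = has_body_named(nested_code, "Missing")
```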

And again, while this will work for "well behaved" classes, it is possible to dynamically create all these attributes if needed. That would ultimately require one to replace the code object for a module in sys.modules, though, so it starts to get a little more cumbersome than simply providing a __qualname__ to the methods.

Update: This version compares all strings defined inside all methods of the candidate class. This will work with the given example classes. More accuracy can be achieved by comparing other class members, such as class attributes, and other method attributes, such as variable names and possibly even bytecode. (For some reason, the code objects for methods in the module's code object and in the class body are different instances, though code objects should be immutable.)

I will leave the implementation above, which only compares the class names, as it should be better for understanding what is going on.

def was_defined_declarative(cls):
    module_name = cls.__module__
    module = sys.modules[module_name]
    module_code = module.__loader__.get_code(module_name)
    # A list (not a set) keeps methods in definition order, so the comparison below is stable
    cls_methods = [obj for obj in cls.__dict__.values() if isinstance(obj, types.FunctionType)]
    cls_meth_strings = [string for method in cls_methods for string in method.__code__.co_consts if isinstance(string, str)]

    for candidate_code_obj in module_code.co_consts:
        if not isinstance(candidate_code_obj, types.CodeType):
            continue
        if candidate_code_obj.co_name != cls.__name__:
            continue
        # String constants of every method declared in this candidate class body
        candidate_meth_strings = [string for method_code in candidate_code_obj.co_consts if isinstance(method_code, types.CodeType) for string in method_code.co_consts if isinstance(string, str)]
        if candidate_meth_strings == cls_meth_strings:
            return True
    return False
jsbueno
  • I get `TypeError: 'SourceFileLoader' object is not callable` on `imp.__loader__`. – wim Nov 08 '18 at 17:56
  • Sorry, it is incorrect: the call should only be on the `.get_code` method. I am fixing it. – jsbueno Nov 08 '18 at 18:06
  • Now I get `ImportError: loader for imp cannot handle ` – wim Nov 08 '18 at 18:09
  • Wow. As `get_code` requires the fully qualified module name, I thought one could use any `__loader__` object, but I tested it here: it has to be called on the module you are interested in getting the code from. I had used `imp` just because I thought it would make sense. More changes ahead. There it is, try it now. – jsbueno Nov 08 '18 at 18:18
  • It's not able to correctly distinguish `Person` and `TalentedPerson` in the question, but I can see the idea now and can see how it could work in less pathological cases. – wim Nov 08 '18 at 18:30
  • The failure in this case was just because `TalentedPerson.__name__` is "Person", and the code I wrote just compares the `__name__` attribute. It can be made to work even in this case, if it compares attributes other than `__name__`. – jsbueno Nov 08 '18 at 18:44
  • I guess the inverse of my question was "can `type` be used to exactly replicate any declarative defined class" (and the answer seems to be, yes, `type` can do anything `class` can do). – wim Nov 08 '18 at 18:50
  • Not `type`, because it won't attach a code object to the module to be used on the class. But `types.new_class` can, as it provides the one mechanism `type` does not provide: one can pass a code object to be run as the class body. – jsbueno Nov 08 '18 at 18:52
  • *It can be made to work even in this case, if it compares attributes other than `__name__`* ...how is that? What other attributes are there that could be used to distinguish? – wim Nov 08 '18 at 18:56
  • "what other attributes": the class code object has a `co_consts` attribute, which holds the code objects of its methods and the attribute names as strings. Those could be compared to attributes in the `cls` itself. If the point is to improve detection beyond `__name__`, it can be done. But if the point is whether it is possible to fake these as well: yes, it is possible to fake it beyond the point of detection. – jsbueno Nov 08 '18 at 19:05
  • OK! If you can update it to work with the `TalentedPerson` / `Person` example given, I will accept this answer. – wim Nov 08 '18 at 19:11
2

It is not possible to detect such a difference at runtime with Python. You can check the files with a third-party app, but not from within the language, since no matter how you define your classes, they are reduced to objects which the interpreter knows how to manage.

Everything else is syntactic sugar, and it is dealt with at the preprocessing step of the operations on the text.

Metaprogramming as a whole is a technique that lets you get close to the work of the compiler/interpreter, revealing some of the type traits and giving you the freedom to work on the type with code.

Petar Velev
  • As you can see in my answer, it is possible to detect the difference - the main thing being that classes created with a `class` statement have to have an associated "code" object somewhere. – jsbueno Nov 10 '18 at 21:22
2

It is possible — somewhat.

inspect.getsource(TalentedPerson) will fail with an OSError, whereas it will succeed with Person. This only works though if you don't have a class of that name in the file where it was defined:

If your file consists of both of these definitions, and TalentedPerson also believes it is Person, then inspect.getsource will simply find Person's definition.

Obviously this relies on the source code still being around and findable by inspect. It won't work with compiled-only code or in the REPL, it can be tricked, and it is sort of cheating. The actual code objects don't differ AFAIK.
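A rough way to try this out without touching your own files is to write a throwaway module to a temporary directory and import it. Everything named here (`people_demo`, `source_available`, and the class name "Talented") is made up for the sketch. The dynamic class deliberately does not reuse the name "Person", to stay clear of the caveat above.

```python
import importlib.util
import inspect
import pathlib
import sys
import tempfile
import textwrap

demo_source = textwrap.dedent('''
    class Person:
        def say_hello(self):
            print("hello")

    def say_hello(self):
        print("sayolala")

    TalentedPerson = type("Talented", (), {"say_hello": say_hello})
''')

def source_available(cls):
    # inspect.getsource raises OSError when it cannot locate a class statement.
    try:
        inspect.getsource(cls)
    except (OSError, TypeError):
        return False
    return True

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "people_demo.py"
    path.write_text(demo_source)
    spec = importlib.util.spec_from_file_location("people_demo", path)
    module = importlib.util.module_from_spec(spec)
    sys.modules["people_demo"] = module  # inspect looks the module up by name
    spec.loader.exec_module(module)
    # Checked while the source file still exists on disk.
    results = (source_available(module.Person),
               source_available(module.TalentedPerson))

print(results)
```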

L3viathan
  • 23,792
  • 2
  • 46
  • 64