How to replace existing Python class methods of (or otherwise extend) RepoSurgeon by means of 'exec' and 'eval'?

Question

The documentation (at the time of this writing) on the topic is scarce. How can I extend reposurgeon functionality if the use of macros (define) isn't sufficient for my purposes?

The only clue it gives is this:

The code has full access to all internal data structures. Functions defined are accessible to later eval calls.

But what does that even mean?

We also learn that:

Typically this will be a call to a function defined by a previous exec. The variables _repository and _selection will have the obvious values. Note that _selection will be a list of integers, not objects.

score 2 · Answer 1 · edited May 23 '17 at 12:15

Preliminary note

I will use italic inline code (like this) to denote Python code and "normal" inline code (like this) to denote RepoSurgeon commands. For code blocks, an introductory description should provide the context, i.e. whether it's a RepoSurgeon command or Python code.

This writeup discusses version 3.10 of RepoSurgeon, which is the latest as of this writing.

Intro

RepoSurgeon is written in Python and explicitly allows you to execfile() other Python code within it. The syntax of the RepoSurgeon command for this is:

exec </path/to/python-source.py

This much we can gather from the documentation.

We can use this from within a lift script or on the RepoSurgeon prompt.

Where does your code end up?

As already pointed out in this Q&A you need to observe the rules imposed by the surrounding code when running in the context of RepoSurgeon. In particular your Python code is going to be executed within the context of the __main__.RepoSurgeon instance, so this is the first thing to keep in mind.

You will also always have to give a selection with eval. It doesn't appear to be legitimate to omit the selection and expect an implied "all selected", as with list or other built-in commands, although you can arguably leverage exec to change that behavior, as we'll see in a bit.

Also make sure to use eval myfunc() and not eval myfunc. Obviously myfunc is a valid Python expression, but it merely evaluates to the function object, so don't expect it to do anything. You'll have to call the function. Everything after eval is handed straight to Python's eval().
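The difference is plain Python behavior and can be reproduced outside RepoSurgeon. Here is a minimal sketch, assuming only that the text after eval ends up in Python's eval() as described above. The namespace dict and the calls list are illustrative stand-ins, not anything RepoSurgeon provides:

```python
# Hypothetical stand-in for the namespace a previous exec populated.
namespace = {}
calls = []

def myfunc():
    calls.append("called")
    return "Hello world!"

namespace["myfunc"] = myfunc

# "eval myfunc": the expression evaluates to the function object; nothing runs.
result_ref = eval("myfunc", namespace)

# "eval myfunc()": the call expression actually invokes the function.
result_call = eval("myfunc()", namespace)
```

After both evaluations, calls holds exactly one entry, which shows that only the form with parentheses ran the function body.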

While execfile() (exec as a RepoSurgeon command) runs, you can abuse the context you're running in and reference self, which is the instance of __main__.RepoSurgeon mentioned above. More about this later.

A first trivial example

Consider the following Python code that introduces a new unbound function myfunc:

def myfunc():
    print("Hello world!")

and the following command issued at the RepoSurgeon prompt:

exec </path/to/your/python-code.py

followed by:

=O eval myfunc()

This will yield the expected output:

Hello world!

You may want to use a different selection from mine, though. Whichever suits your needs.

Note: In any case even an empty selection will still result in your Python code being called! For example the selection =I in my loaded repo is empty, but I will still see the output as produced above. It's simply important to give any selection to have your code invoked.

Exploring the context in which our Python code runs

With the above trivial example we can check whether it works. Now on to explore what we can access besides _selection and _repository mentioned in the documentation.

Changing the function myfunc to:

def myfunc():
    from pprint import pprint
    pprint(globals())
    pprint(locals())

should give us a feeling for what we're dealing with.

After the change (and saving it ;)) simply re-run:

exec </path/to/your/python-code.py

followed by:

=O eval myfunc()

You should now see a dump of the contents of globals() and locals().

You will notice that even in the context of the eval you can still access self (part of the globals() in this case). That's pretty useful.

As I mentioned before, you can also modify the instance of __main__.RepoSurgeon within which your code runs (more about this below).
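Why a function defined during exec can still see self later is ordinary Python scoping: a function remembers the namespace it was defined in as its globals. The following is a simplified model of that mechanism, not RepoSurgeon's actual implementation. FakeSurgeon, the namespace dict and the prompt attribute are all hypothetical:

```python
# Hypothetical stand-in for the interpreter object; not RepoSurgeon's real class.
class FakeSurgeon(object):
    prompt = "fake% "

surgeon = FakeSurgeon()

# The namespace handed to the exec'd file; 'self' is placed in it on purpose.
namespace = {"self": surgeon}

source = '''
def myfunc():
    # 'self' is neither a parameter nor a local; it is looked up in the
    # globals of the namespace this function was defined in.
    return self.prompt
'''

# Rough equivalent of execfile(): run the source inside the namespace.
exec(source, namespace)

# A later eval against the same namespace finds myfunc, and myfunc finds self.
prompt_seen = eval("myfunc()", namespace)
sees_self = "self" in namespace["myfunc"].__globals__
```

If the model holds, this also explains why self shows up in the globals() dump rather than in locals().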

In order to see all methods etc, use dir(self) in your function (or at the top-level when exec-ing the Python code file).

So simply add this line to myfunc:

pprint(dir(self))

making it:

def myfunc():
    from pprint import pprint
    pprint(globals())
    pprint(locals())
    pprint(dir(self))  # dir() only returns the list, so print it explicitly

After invoking the exec and eval commands again (on Linux, recall them as you would in the shell, using cursor Up), you should now see most of the functions listed that you'd also be able to find in the RepoSurgeon code.
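The full attribute list is long, so it can help to filter it. Here is a small sketch that assumes the interpreter class follows the do_* naming convention of Python's cmd.Cmd (precmd is a cmd.Cmd hook, but treat the convention as an assumption here). FakeSurgeon and command_names are hypothetical names used only for illustration:

```python
import cmd

# Hypothetical stand-in; the real class has far more commands.
class FakeSurgeon(cmd.Cmd):
    def do_list(self, line):
        pass

    def do_inspect(self, line):
        pass

def command_names(obj):
    """Return the command names implied by callable do_* attributes, sorted."""
    return sorted(name[3:] for name in dir(obj)
                  if name.startswith("do_") and callable(getattr(obj, name)))

names = command_names(FakeSurgeon())
```

Inside an exec'd file the same helper could be called as pprint(command_names(self)) to get an overview of the available commands.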

Note: simply re-running RepoSurgeon's exec command followed by another eval myfunc() will now add the output of the attributes of __main__.RepoSurgeon.

While all of this is cool so far and should give you a feeling for how to run your own Python code in RepoSurgeon, you can also replace existing __main__.RepoSurgeon methods. Read on.

Hooking into RepoSurgeon and replacing functionality

With the access to self comes the power to add functionality and to modify existing functionality.

RepoSurgeon.precmd looks like a worthy candidate for this. It's the method that gets called prior to running the actual command and performs a syntax check as well as setting the selection set that is so vital in many RepoSurgeon commands.

What we need is the prototype of precmd. Here it is:

def precmd(self, line):

What was the trick again for replacing a method? Alex Martelli's answer here leads the way ...

We can simply use this as our (complete) Python file to exec:

if self:
    if 'orig_precmd' not in self.__dict__:
        setattr(self, 'orig_precmd', self.precmd) # save original precmd (only once)
    def myprecmd(self, line):
        print("[pre-precmd] '%s'" % line)
        # delegate to the original implementation saved above
        return self.orig_precmd(line)
    # bind myprecmd to this instance and install it as precmd
    setattr(self, 'precmd', myprecmd.__get__(self, self.__class__))

The if self: is merely to scope our code.
The check for 'orig_precmd' ensures we're not overwriting the value of this attribute again upon subsequent calls to exec.
myprecmd(self, line): contains our version of __main__.RepoSurgeon.precmd. The awesome new functionality it adds is to parrot the command that was entered.
Last but not least, the second setattr() overrides precmd on the running __main__.RepoSurgeon instance with our version, which __get__() has bound to that instance. Any subsequent call RepoSurgeon makes to self.precmd() will go through our "hook" now.
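The same technique can be tried in isolation before pointing it at a live RepoSurgeon session. This is a minimal sketch on a stand-in class. FakeSurgeon, install_hook and the seen list are hypothetical, and the stand-in's precmd merely strips whitespace instead of doing what the real one does:

```python
import cmd

class FakeSurgeon(cmd.Cmd):
    """Hypothetical stand-in for the interpreter class."""
    def precmd(self, line):
        return line.strip()

seen = []

def install_hook(obj):
    # Save the original bound method once, so repeated installs don't chain.
    if 'orig_precmd' not in obj.__dict__:
        obj.orig_precmd = obj.precmd

    def myprecmd(self, line):
        seen.append(line)
        return self.orig_precmd(line)

    # __get__ turns the plain function into a method bound to this instance.
    obj.precmd = myprecmd.__get__(obj, obj.__class__)

a = FakeSurgeon()
b = FakeSurgeon()
install_hook(a)
install_hook(a)  # a second install is harmless: orig_precmd is kept
result_a = a.precmd("  =O list ")
result_b = b.precmd("  =O list ")
```

Only the instance a is hooked. b still uses the class's precmd, which shows that the override lives on the instance and not on the class. Installing twice records the line only once because the saved original is never overwritten.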

Remember, we are overriding internal code of RepoSurgeon, so tread carefully and don't do anything silly. The code is very readable, albeit a whopping 10k LoC.

Next time you issue any command, you should have it parroted back at you. Consider the following (RepoSurgeon prompt plus excerpt from output):

reposurgeon% =O list
[pre-precmd] '=O list'

=O list is the command I entered and [pre-precmd] '=O list' the output it yields (followed by the actual output, since I am calling the original implementation of __main__.RepoSurgeon.precmd in my version).

Conclusion

The RepoSurgeon commands exec and eval provide a powerful means to override existing functionality and add new functionality in RepoSurgeon.

The hook example is a superset of "simply" extending RepoSurgeon by using eval with a previously exec'd function. It allows us to sneak code into the guts of RepoSurgeon and bend it to our will wherever it has shortcomings (of which I have only found a handful so far).

Kudos to ESR for this design decision. There is no need for a plugin framework this way.