You may want to try SikuliX. This is an image-based framework that can be used with Python (via Jython), Ruby (via JRuby) and Java (via the optional Java API).
It is very useful for automating on-screen behavior such as opening Photoshop, clicking regions of the screen, or typing on the keyboard. For example:
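// Using the Java API; the idea is the same in Python
import org.sikuli.script.Pattern;
import org.sikuli.script.Screen;
Screen screen = new Screen();
// Click the region matching the image, then type the text into it
screen.type(new Pattern("some-image.png"), "keyboard");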
In Python:
def changed(event):
    print "something changed in ", event.region
    for ch in event.changes:
        ch.highlight() # highlight all changes
    sleep(1)
    for ch in event.changes:
        ch.highlight() # turn off the highlights

with selectRegion("select a region to observe") as r:
    # any change in r larger than 50 pixels would trigger the changed function
    onChange(50, changed)
    observe(background=True)

wait(30) # another way to observe for 30 seconds
r.stopObserver()
It is quite a bit of work, but it allows you to create very robust scripts that perform your desired actions. You may also pipe console output back to your Python script via subprocess in order to change your script's behavior based on the environment.
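A rough sketch of the subprocess idea, in plain Python with nothing SikuliX-specific. The helper names `run_and_capture` and `choose_action`, the `READY` status line, and the "click"/"wait" actions are all made up for illustration; a child Python process stands in for whatever console tool you would actually call:

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return its standard output as stripped text."""
    output = subprocess.check_output(command)
    return output.decode("utf-8").strip()


def choose_action(console_output):
    """Pick a (hypothetical) script action based on the captured output."""
    if "ready" in console_output.lower():
        return "click"
    return "wait"


# Stand-in for a real console tool: a child process that prints a status line.
status = run_and_capture([sys.executable, "-c", "print('READY')"])
action = choose_action(status)
print(action)
```

In a real script you would replace the stand-in command with your own tool and map the returned action onto whatever screen interaction you need.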
The rest is limited only by your imagination.
Note: Not EVERYTHING has to be done with SikuliX; in fact, I wouldn't recommend doing everything with it. Use it just for the things that require specific behavior on your screen.
If you are strictly on Ubuntu, you may also want to look at Xpresser.
Update
I have since worked with AutoIt and PyAutoIt and genuinely think they are suitable tools for what you wish to achieve, as they can be very effective with certain applications.