I'd like a class that provides an interface to some analysis results, where the source of the results is determined at runtime. The very first time a result is calculated, it must come from the original data file. This calculation takes a long time, so I'd like to store the result in a result file for quick access in the future (I plan to use msgpack for this). But if the result is still in memory, I'd like to use that version, so I don't need to keep re-reading the result file.
The program loops over many instances of this class, so memory usage could become significant. For this reason, I think it would be best to have a global cache of fixed size (or with a time limit). Unfortunately, which results will be needed isn't known in advance, so I can't handle this with a preprocessing step; I need the ability to compute a new result from the original file when needed.
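For reference, a minimal sketch of the kind of fixed-size global cache I have in mind, using only the standard library (the class name and size limit are just placeholders):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal fixed-size cache: evicts the least recently used entry."""
    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __contains__(self, key):
        return key in self._data

    def __getitem__(self, key):
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop the oldest entry
```

A single instance of something like this would be shared between all the result classes.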
So I imagine the class will look something like:
class MyClass(object):
    def __init__(self, orig_file, result_file, cache):
        self.orig_file = orig_file
        self.result_file = result_file
        self.cache = cache

    def get_result(self, name):
        if name in self.cache:
            return self.cache[name]
        elif name in self.result_file:
            result = self.result_file[name]
        else:  # compute for the first time
            result = self.orig_file.calculate(name)
            self.result_file[name] = result  # persist for future runs
        self.cache[name] = result  # keep in memory
        return result
Initial searches returned so many potential routes that I became overwhelmed. I'm considering some kind of persistent memoization like this, but I need the persistence to be specific to a class instance. Then I found this persistent lazy caching dictionary, which looks very nice (though I don't need lazy evaluation), but it's not clear how to work the original file into this.
Are there any tools which go some way towards answering this question? And is it reasonable to attempt to achieve this with decorators? (e.g. decorate methods that would do this source search)
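To make the decorator idea concrete, here is a rough sketch of what I imagine (the decorator name is made up, and I'm assuming `result_file` and `cache` both support dict-style access):

```python
import functools

def cached_result(method):
    """Hypothetical decorator: check the in-memory cache, then the result
    file, and only fall back to the decorated (expensive) calculation."""
    @functools.wraps(method)
    def wrapper(self, name):
        if name in self.cache:
            return self.cache[name]
        if name in self.result_file:
            value = self.result_file[name]
        else:
            value = method(self, name)      # expensive computation
            self.result_file[name] = value  # persist for future runs
        self.cache[name] = value            # keep in memory
        return value
    return wrapper

class MyClass(object):
    def __init__(self, orig_file, result_file, cache):
        self.orig_file = orig_file
        self.result_file = result_file
        self.cache = cache

    @cached_result
    def get_result(self, name):
        return self.orig_file.calculate(name)
```

The appeal is that `get_result` then only contains the expensive calculation, and the source-search logic lives in one place and could be reused on other methods.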