3

Suppose I want PERL-like autovivication in Python, i.e.:

>>> d = Autovivifier()
>>> d = ['nested']['key']['value']=10
>>> d
{'nested': {'key': {'value': 10}}}

There are a couple of dominant ways to do that:

  1. Use a recursive default dict
  2. Use a __missing__ hook to return the nested structure

OK -- easy.

Now suppose I want to return a default value from a dict with a missing key. Once again, few way to do that:

  1. For a non-nested path, you can use a __missing__ hook
  2. try/except block wrapping the access to potentially missing key path
  3. Use {}.get(key, default) (does not easily work with a nested dict) i.e., There is no version of autoviv.get(['nested']['key']['no key of this value'], default)

The two goals seem in irreconcilable conflict (based on me trying to work this out the last couple hours.)

Here is the question:

Suppose I want to have an Autovivifying dict that 1) creates the nested structure for d['arbitrary']['nested']['path']; AND 2) returns a default value from a non-existing arbitrary nesting without wrapping that in try/except?

Here are the issues:

  1. The call of d['nested']['key']['no key of this value'] is equivalent to (d['nested'])['key']['no key of this value']. Overiding __getitem__ does not work without returning an object that ALSO overrides __getitem__.
  2. Both the methods for creating an Autovivifier will create a dict entry if you test that path for existence. i.e., I do not want if d['p1']['sp2']['etc.'] to create that whole path if you just test it with the if.

How can I provide a dict in Python that will:

  1. Create an access path of the type d['p1']['p2'][etc]=val (Autovivication);
  2. NOT create that same path if you test for existence;
  3. Return a default value (like {}.get(key, default)) without wrapping in try/except
  4. I do not need the FULL set of dict operations. Really only d=['nested']['key']['value']=val and d['nested']['key']['no key of this value'] is equal to a default value. I would prefer that testing d['nested']['key']['no key of this value'] does not create it, but would accept that.

?

Community
  • 1
  • 1
dawg
  • 80,841
  • 17
  • 117
  • 187
  • what do you need this for? perhaps optional nested dicts are not the best data structure? – Eevee Jun 30 '14 at 04:19
  • Can you explain your point #3? Are you saying you want to be able to do that in a nested way? If so, how? What is the actual usage with `get` that you want to be able to do? – BrenBarn Jun 30 '14 at 04:24
  • @Eevee: Hierarchical data, such as `animals={'amphibian':{'Bufoides':['Mawblang toad','Khasi Hills toad'], more amphibians...}, 'Insects':{'Ants':{'Acanthognathus'}[list of Acanthognathus]}}` where different entries may nest deeper than others... Also -- curiosity! – dawg Jun 30 '14 at 04:29
  • @BrenBarn: By point 3, do you mean `Return a default value (like {}.get(key, default)) without wrapping in try/except`? If so, I have not seen a rational case where I could do `val=autoviv.get(['nested']['key']['no key of this value'], default)` {}.get does not support recursion... – dawg Jun 30 '14 at 04:34
  • @dawg: There is no way you are going to get that exact syntax to work, because you can't specify indexing with `[]` using `get`. You could possibly get an approximation using something like `get(['nested', 'key', 'no key'], default)` where you specify the sequence of keys. – BrenBarn Jun 30 '14 at 04:41
  • @BrenBarn: I don't specifically need `{}.get()` to work; I am using that as an analogy. – dawg Jun 30 '14 at 04:48
  • It would be helpful if you could specify the actual set of operations you want to be able to do on this object. Like, do you need to be able to do `d['one']['two']['three']` and have it return a default value without creating intermediate objects, or do you just need to be able to call `d.get(['one', 'two', 'three'], default)`? – BrenBarn Jun 30 '14 at 05:36

3 Answers3

3

To create a recursive tree of dictionaries, use defaultdict with a trick:

from collections import defaultdict

tree = lambda: defaultdict(tree)

Then you can create your x with x = tree().

above from @BrenBarn -- defaultdict of defaultdict, nested

Community
  • 1
  • 1
johntellsall
  • 11,853
  • 3
  • 37
  • 32
  • Yes, I linked to a similar answer in my question. This works for `tree[1][2][3]=4` But once you instantiate `tree`, now try a non existing recursive path to test for existence, like `if tree['does']['not']['exist']` -- it returns an empty default dict factory AND creates the path you tested for existence. – dawg Jun 30 '14 at 04:33
2

Don't do this. It could be solved much more easily by just writing a class that has the operations you want, and even in Perl it's not a universally-appraised feature.

But, well, it is possible, with a custom autoviv class. You'd need a __getitem__ that returns an empty autoviv dict but doesn't store it. The new autoviv dict would remember the autoviv dict and key that created it, then insert itself into its parent only when a "real" value is stored in it.

Since an empty dict tests as falsey, you could then test for existence Perl-style, without ever actually creating the intermediate dicts.

But I'm not going to write the code out, because I'm pretty sure this is a terrible idea.

Eevee
  • 43,129
  • 10
  • 82
  • 119
  • `It could be solved much more easily by just writing a class that has the operations you want...` I am open to writing a class; I just am not seeing the class to write. I have tried a subclass of `dict` and a subclass of `defaultdict` I think that this idiom is common in Perl -- at least when that was my default language a few years ago... – dawg Jun 30 '14 at 04:45
  • 2
    consider that perhaps your keys should be the _paths themselves_, e.g. `'one', 'two', 'three'`, rather than having the paths split up across myriad separate dicts. – Eevee Jun 30 '14 at 04:51
  • 1
    @Eevee: Using tuples is a known solution. However it really sucks with large data sets and deep trees, effectively you are storing the entire depth of the tree for each leaf. With deep trees you can be multiplying the amount of data by a large constant (proportional to the depth), and even shallow trees may require you to use non-intuitive coding to compress the paths. – Rich Farmbrough Feb 22 '21 at 21:30
1

While it does not precisely match the dictionary protocol in Python, you could achieve reasonable results by implementing your own auto-vivification dictionary that uses variable getitem arguments. Something like (2.x):

class ExampleVivifier(object):
    """ Small example class to show how to use varargs in __getitem__. """

    def __getitem__(self, *args):
        print args

Example usage would be:

>>> v = ExampleVivifier()
>>> v["nested", "dictionary", "path"]
(('nested', 'dictionary', 'path'),)

You can fill in the blanks to see how you can achieve your desired behaviour here.

Brett Lempereur
  • 805
  • 5
  • 11