373

Is there a way to have a defaultdict(defaultdict(int)) in order to make the following code work?

for x in stuff:
    d[x.a][x.b] += x.c_int

d needs to be built ad-hoc, depending on x.a and x.b elements.

I could use:

for x in stuff:
    d[x.a,x.b] += x.c_int

but then I wouldn't be able to use:

d.keys()
d[x.a].keys()
codeforester
Jonathan
  • 6
    See similar question [_What is the best way to implement nested dictionaries in Python?_](http://stackoverflow.com/questions/635483/what-is-the-best-way-to-implement-nested-dictionaries-in-python). There's also some possibly useful information in Wikipedia's article on [_Autovivification_](https://en.wikipedia.org/wiki/Autovivification#Python). – martineau Jan 20 '14 at 20:08

6 Answers

670

Yes, like this:

defaultdict(lambda: defaultdict(int))

The argument of a defaultdict (in this case lambda: defaultdict(int)) is called when you try to access a key that doesn't exist. Its return value is stored as the new value for that key, so in our case the value of d[Key_doesnt_exist] will be a defaultdict(int).

If you then access a missing key on this inner defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it returns 0, which is the return value of the inner defaultdict's argument, i.e. int().
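
As a minimal sketch of the question's loop, assuming a hypothetical `Item` record type and made-up sample data in place of the real `stuff`:

```python
from collections import defaultdict, namedtuple

# Hypothetical record type and sample data standing in for the question's `stuff`.
Item = namedtuple("Item", ["a", "b", "c_int"])
stuff = [Item("x", "p", 1), Item("x", "q", 2), Item("y", "p", 3), Item("x", "p", 4)]

d = defaultdict(lambda: defaultdict(int))
for x in stuff:
    d[x.a][x.b] += x.c_int  # missing outer and inner keys are created on first access

outer_keys = sorted(d.keys())       # first-level keys
inner_keys = sorted(d["x"].keys())  # second-level keys under "x"
total_xp = d["x"]["p"]              # 1 + 4
```

Note that merely reading d["z"] also inserts "z" into d; use `"z" in d` or d.get("z") to check for a key without creating it.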

fedorqui 'SO stop harming'
mouad
  • 8
    it works great! could you explain the rational behind this syntax? – Jonathan Oct 12 '11 at 08:25
  • 43
    @Jonathan: Yes sure, the argument of a `defaultdict` (in this case is `lambda : defaultdict(int)`) will be called when you try to access a key that don't exist and the return value of it will be set as the new value of this key which mean in our case the value of `d[Key_dont_exist]` will be `defaultdict(int)`, and if you try to access a key from this last defaultdict i.e. `d[Key_dont_exist][Key_dont_exist]` it will return 0 which is the return value of the argument of the last ``defaultdict`` i.e. `int()`, Hope this was helpful. – mouad Oct 12 '11 at 14:25
  • 26
    The argument to `defaultdict` should be a function. `defaultdict(int)` is a dictionary, while `lambda: defaultdict(int)` is function that returns a dictionary. – has2k1 Sep 22 '12 at 04:45
  • 31
    @has2k1 That is incorrect. The argument to defaultdict needs to be a callable. A lambda is a callable. – Niels Bom Jan 22 '13 at 15:11
  • 1
    this will work for only 2nd hierarchy, if you'll try to run: `d['a']['b']['c']` - it will again throw an error for a missing Key ... – Ricky Levi Mar 24 '19 at 09:08
  • 2
    @RickyLevi, if you want to have that working you can just say: `defaultdict(lambda: defaultdict(lambda: defaultdict(int)))` – darophi Mar 26 '19 at 15:21
53

The parameter to the defaultdict constructor is a callable that is invoked to build the value for each missing key. So let's use a lambda!

>>> from collections import defaultdict
>>> d = defaultdict(lambda : defaultdict(int))
>>> print(d[0])
defaultdict(<class 'int'>, {})
>>> print(d[0]["x"])
0

Since Python 2.7, there's an even better solution using Counter:

>>> from collections import Counter
>>> c = Counter()
>>> c["goodbye"]+=1
>>> c["and thank you"]=42
>>> c["for the fish"]-=5
>>> c
Counter({'and thank you': 42, 'goodbye': 1, 'for the fish': -5})

Some bonus features

>>> c.most_common()[:2]
[('and thank you', 42), ('goodbye', 1)]

For more information, see PyMOTW - Collections - Container data types and Python Documentation - collections.
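
A flat Counter does not nest by itself. As the comments below point out, it can be combined with defaultdict to handle the two-level case from the question. A minimal sketch, again assuming a hypothetical `Item` type and sample data:

```python
from collections import Counter, defaultdict, namedtuple

# Hypothetical record type and sample data standing in for the question's `stuff`.
Item = namedtuple("Item", ["a", "b", "c_int"])
stuff = [Item("x", "p", 1), Item("x", "q", 2), Item("y", "p", 3), Item("x", "p", 4)]

# Pass the Counter class itself (a callable), not an instance like Counter().
d = defaultdict(Counter)
for x in stuff:
    d[x.a][x.b] += x.c_int

top_for_x = d["x"].most_common(1)  # highest total under "x"
```

Each inner value is then a Counter, so methods such as most_common() are available per first-level key.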

Sparky
yanjost
  • 6
    Just to complete the circle here, you would want to use `d = defaultdict(lambda : Counter())` rather than `d = defaultdict(lambda : defaultdict(int))` to specifically address the problem as originally posed. – gumption Jun 20 '14 at 18:50
  • 4
    @gumption you can just use `d = defaultdict(Counter())` no need for a lambda in this case – Deb Aug 04 '17 at 08:53
  • 4
    @Deb you have a slight error- remove the inner parentheses so you pass a callable instead of a `Counter` object. That is: `d = defaultdict(Counter)` – Dillon Davis Aug 11 '18 at 07:00
32

I find it slightly more elegant to use partial:

from collections import defaultdict
import functools

dd_int = functools.partial(defaultdict, int)
d = defaultdict(dd_int)

Of course, for lookups this behaves the same as the lambda version.
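
One practical difference that may matter in some programs, and that goes beyond what this answer claims: a partial built from importable callables can be pickled, whereas a lambda cannot. A small sketch, with variable names chosen only for illustration:

```python
import pickle
from collections import defaultdict
from functools import partial

with_partial = defaultdict(partial(defaultdict, int))
with_lambda = defaultdict(lambda: defaultdict(int))
with_partial["a"]["b"] += 1
with_lambda["a"]["b"] += 1

# Dict equality ignores the factory, so the two compare equal.
same_contents = with_partial == with_lambda

# The partial-based dict round-trips through pickle, factory included.
restored = pickle.loads(pickle.dumps(with_partial))
restored["new"]["key"] += 1

# The lambda-based dict cannot be pickled because the lambda has no importable name.
try:
    pickle.dumps(with_lambda)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_pickles = False
```

This is relevant mainly when the dictionary has to be sent to another process or saved to disk.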

Katriel
  • 1
    Partial is also better than lambda here because it can be applied recursively :) see my answer below for a generic nested defaultdict factory method. – Campi Jan 19 '19 at 12:01
  • @Campi you don't need partial for recursive applications, AFAICT – Clément Nov 25 '19 at 20:00
22

Previous answers have addressed how to make a two-level or n-level defaultdict. In some cases you want one with unlimited depth:

from collections import defaultdict

def ddict():
    return defaultdict(ddict)

Usage:

>>> d = ddict()
>>> d[1]['a'][True] = 0.5
>>> d[1]['b'] = 3
>>> import pprint; pprint.pprint(d)
defaultdict(<function ddict at 0x7fcac68bf048>,
            {1: defaultdict(<function ddict at 0x7fcac68bf048>,
                            {'a': defaultdict(<function ddict at 0x7fcac68bf048>,
                                              {True: 0.5}),
                             'b': 3})})
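
Two things are worth keeping in mind with this pattern: the repr is noisy, and reading a missing path creates it. A minimal sketch, where `to_plain` is a hypothetical helper and not part of the standard library:

```python
from collections import defaultdict

def ddict():
    return defaultdict(ddict)

def to_plain(node):
    """Recursively convert nested defaultdicts to regular dicts (hypothetical helper)."""
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node

d = ddict()
d[1]["a"][True] = 0.5
d[1]["b"] = 3
plain = to_plain(d)  # ordinary nested dicts, easier to print or compare

# Reading a path that was never assigned still creates every level of it.
_ = d[2]["never"]["assigned"]
created_by_read = 2 in d
```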
Clément
12

For reference, a generic nested defaultdict factory function can be implemented like this:

from collections import defaultdict
from functools import partial
from itertools import repeat


def nested_defaultdict(default_factory, depth=1):
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()

The depth is the number of nested defaultdict levels created before the type given by default_factory is used. For example:

my_dict = nested_defaultdict(list, 3)
my_dict['a']['b']['c'].append('e')
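
To make the meaning of depth concrete, here is a sketch that restates the factory (using range instead of repeat, which is equivalent here) and compares the default depth with depth=3, mirroring the case raised in the comments below:

```python
from collections import defaultdict
from functools import partial

def nested_defaultdict(default_factory, depth=1):
    # Equivalent to the factory above, restated so this snippet runs on its own.
    result = partial(defaultdict, default_factory)
    for _ in range(depth - 1):
        result = partial(defaultdict, result)
    return result()

# depth=1: the values one level down are plain dicts, so deeper keys are not auto-created.
shallow = nested_defaultdict(dict)
try:
    shallow["a"]["b"]["c"] = 1
    shallow_ok = True
except KeyError:
    shallow_ok = False

# depth=3: three defaultdict levels, then a plain dict that accepts the final assignment.
deep = nested_defaultdict(dict, 3)
deep["a"]["b"]["c"]["d"] = "e"
```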
Campi
  • Can you give a usage example? Not working the way I expected this to. `ndd = nested_defaultdict(dict) .... ndd['a']['b']['c']['d'] = 'e'` throws `KeyError: 'b'` – David Marx Feb 14 '19 at 20:14
  • Hey David, you need to define the depth of your dictionary, which is 3 in your example (since you defined the default_factory to be a dictionary too). `nested_defaultdict(dict, 3)` will work for you. – Campi Feb 16 '19 at 10:01
  • This was super helpful, thanks! One thing I noticed is that this creates a default_dict at `depth=0`, which may not always be desired if the depth is unknown at the time of calling. Easily fixable by adding a line `if not depth: return default_factory()`, at the top of the function, though there's probably a more elegant solution. – Brendan Oct 30 '19 at 23:44
7

Others have correctly answered your question of how to get the following to work:

for x in stuff:
    d[x.a][x.b] += x.c_int

An alternative would be to use tuples for keys:

d = defaultdict(int)
for x in stuff:
    d[x.a,x.b] += x.c_int
    # ^^^^^^^ tuple key

The nice thing about this approach is that it is simple and easy to extend. If you need a mapping three levels deep, just use a three-item tuple for the key.
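
The trade-off is that there is no direct equivalent of d[x.a].keys(); the entries for one first-level key have to be collected by scanning every key. A minimal sketch, assuming a hypothetical `Item` type and sample data:

```python
from collections import defaultdict, namedtuple

# Hypothetical record type and sample data standing in for the question's `stuff`.
Item = namedtuple("Item", ["a", "b", "c_int"])
stuff = [Item("x", "p", 1), Item("x", "q", 2), Item("y", "p", 3), Item("x", "p", 4)]

d = defaultdict(int)
for x in stuff:
    d[x.a, x.b] += x.c_int

# Recover the "row" for a == "x" by scanning all keys (linear in len(d)).
row_x = {b: value for (a, b), value in d.items() if a == "x"}

# Distinct first-level keys, also obtained by a full scan.
first_level = sorted({a for a, _ in d})
```

If these per-key lookups are frequent, the nested defaultdict from the other answers is likely the better fit.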

Steven Rumbalski
  • 4
    This solution means it isn't simple to get all of d[x.a], as you need to introspect every key to see if it has x.a as the first element of the tuple. – Matthew Schinckel Feb 18 '11 at 03:02
  • 5
    If you wanted nesting 3 levels deep, then just define it as 3 levels: d = defaultdict(lambda: defaultdict( lambda: defaultdict(int))) – Matthew Schinckel Feb 18 '11 at 03:03