How to create complex dictionary structure in Python?

Question

I'm trying to read in data from a file and create a nested dictionary of dictionaries. There is a similar question here, but I can't figure out how to adapt a solution to my particular problem. I would be very grateful if someone could explain a solution to my problem.

Basically, I have a file that looks like this:

A    'abc'    12    0.001
B    'tex'    34    0.002  
B    'tex'    78    0.005
E    'yet'    88    0.090
A    'abc'    22    0.120

I need to create a complex dictionary that looks like this:

complete_dict = {A:{'abc':[[12, 0.001], [22, 0.120]]}, 
                 B:{'tex':[[34, 0.002], [78, 0.005]]}, 
                 E:{'yet':[[88, 0.090]]}}

I can create the inner dictionary, but I can't figure out how to create the outer dictionary. Here is my code for the inner dictionary:

with open('data.txt', mode="r") as data_file:
    fieldnames = ('character', 'string', 'value1', 'value2')
    reader = csv.DictReader(data_file, fieldnames=fieldnames, delimiter="\t")
    inner_dict = {}
    for row in reader:
        values = [int(row['value1']), float(row['value2'])] 
        string = row['string'] 
        if string in inner_dict:
            inner_dict[string].append(values)
        else:
            inner_dict[string] = values

Could someone explain how to create the outer dictionary? The only idea I have is to read the file and create the inner dictionary, then reread the file to create the outer dictionary. Surely there must be an easier way? Thanks in advance for the help!

retracile · Accepted Answer · 2011-10-14T15:11:46.443

Is this what you're looking to accomplish?

import csv
import pprint

with open('data.txt', mode="r") as data_file:
    fieldnames = ('character', 'string', 'value1', 'value2')
    reader = csv.DictReader(data_file, fieldnames=fieldnames, delimiter="\t")

    complete_dict = {}
    for row in reader:
        char_dict = complete_dict.setdefault(row['character'], {})
        values_list = char_dict.setdefault(row['string'], [])
        values = [int(row['value1']), float(row['value2'])] 
        values_list.append(values)

pprint.pprint(complete_dict)

Note that in your example you have 'value2' where you want 'value1'. Also, this appears to include the single quotes around the strings as part of the string, so you may need to clean that up.
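One possible way to handle the stray single quotes is to let the csv module consume them by passing quotechar="'". The sketch below is only an illustration under assumptions: the inline SAMPLE string stands in for data.txt, and build_nested is a hypothetical helper name, not something from the question.

```python
import csv
import io

# Inline sample standing in for data.txt (tab-separated, as in the question).
SAMPLE = (
    "A\t'abc'\t12\t0.001\n"
    "B\t'tex'\t34\t0.002\n"
    "B\t'tex'\t78\t0.005\n"
    "E\t'yet'\t88\t0.090\n"
    "A\t'abc'\t22\t0.120\n"
)


def build_nested(lines):
    """Group rows as {character: {string: [[value1, value2], ...]}}."""
    fieldnames = ('character', 'string', 'value1', 'value2')
    # quotechar="'" tells the csv reader to treat 'abc' as a quoted field,
    # so the quotes are not kept as part of the key.
    reader = csv.DictReader(lines, fieldnames=fieldnames,
                            delimiter="\t", quotechar="'")
    complete_dict = {}
    for row in reader:
        char_dict = complete_dict.setdefault(row['character'], {})
        values_list = char_dict.setdefault(row['string'], [])
        values_list.append([int(row['value1']), float(row['value2'])])
    return complete_dict


result = build_nested(io.StringIO(SAMPLE))
print(result)
```

If you would rather not change the reader options, calling row['string'].strip("'") before using it as a key should have a similar effect.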

Fantastic! Thanks for the help, this made my day! – drbunsen, Oct 14 '11 at 17:42

score 2 · Answer 2 · answered Oct 14 '11 at 15:44

Given:

$ cat data.txt
A   'abc'   12  0.001
B   'tex'   34  0.002
B   'tex'   78  0.005
E   'yet'   88  0.090
A   'abc'   22  0.120

This:

import csv

d={}
with open('data.txt', mode="r") as data_file:
    fieldnames = ('character', 'string', 'value1', 'value2')
    reader = csv.DictReader(data_file, fieldnames=fieldnames, delimiter="\t")
    for row in reader:
        c=row['character']
        values = [int(row['value1']), float(row['value2'])] 
        s = row['string']
        if c not in d: d[c]={}
        if s not in d[c]: d[c][s] = []
        d[c][s].append(values)

print(d)

Produces:

{'A': {"'abc'": [[12, 0.001], [22, 0.12]]}, 
 'B': {"'tex'": [[34, 0.002], [78, 0.005]]}, 
 'E': {"'yet'": [[88, 0.09]]}}

Steven Rumbalski · Answer 3 · 2011-10-14T15:56:45.467

2

Use a defaultdict.

import csv
from collections import defaultdict

complete_dict = defaultdict(lambda: defaultdict(list))

with open('data.txt', mode="r", newline="") as data_file:
    reader = csv.reader(data_file, delimiter="\t")
    for c, s, v1, v2 in reader:
        # convert to numbers to match the structure asked for in the question
        complete_dict[c][s].append([int(v1), float(v2)])
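To make the defaultdict behaviour a bit more concrete, here is a small sketch that skips the file and appends a few hand-written values taken from the question's sample data. The name plain is my own; converting back to ordinary dicts is optional and shown only as one way to get cleaner printing.

```python
from collections import defaultdict

complete_dict = defaultdict(lambda: defaultdict(list))

# Missing keys are created on first access, so no existence checks are needed.
complete_dict['A']['abc'].append([12, 0.001])
complete_dict['A']['abc'].append([22, 0.120])
complete_dict['E']['yet'].append([88, 0.090])

# Convert to ordinary dicts, e.g. for cleaner printing, or so that looking up
# an unknown key raises KeyError again instead of silently adding an entry.
plain = {c: dict(inner) for c, inner in complete_dict.items()}
print(plain)
```

One thing to keep in mind with this design: merely reading complete_dict['Z'] inserts an empty inner dictionary for 'Z', which is why converting to plain dicts once loading is finished can be worthwhile.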

edited Oct 14 '11 at 15:56

answered Oct 14 '11 at 15:49

score 0 · Answer 4 · answered Apr 07 '20 at 07:52

using setdefault:

with open('data.txt', mode="r") as data_file:
    fieldnames = ('character', 'string', 'value1', 'value2')
    reader = csv.DictReader(data_file, fieldnames=fieldnames, delimiter="\t")

    result = {}
    for row in reader:
        result.setdefault(row['character'], {}).setdefault(row['string'], []).append([int(row['value1']), float(row['value2'])])

print(result)

score 0 · Answer 5 · answered Oct 14 '11 at 15:10

If you read the file into a variable called s, the following might work:

d = {}
for l in s.splitlines():  # splitlines() avoids an empty final entry
    character, string, val1, val2 = l.split('\t')
    if character not in d:  # dict.has_key() was removed in Python 3
        d[character] = { string: [] }
    d[character][string].append([val1, val2])

This assumes that each character always appears with the same string, which wasn't explicitly specified in your question. If a character shows up with a second, different string, the append raises a KeyError.
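If that assumption does not hold for your data, a variant along these lines might be safer. It is only a sketch: the two-line s below is made-up input in which 'A' appears with two different strings, and the values are left as strings, as in the loop above.

```python
# Hypothetical input: the character A appears with two different strings.
s = "A\t'abc'\t12\t0.001\nA\t'xyz'\t22\t0.120"

d = {}
for l in s.splitlines():
    character, string, val1, val2 = l.split('\t')
    # setdefault at both levels, so a new string under an already-seen
    # character gets its own list instead of raising KeyError.
    d.setdefault(character, {}).setdefault(string, []).append([val1, val2])

print(d)
```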

score 0 · Answer 6 · answered Oct 14 '11 at 15:25

Here's how I would do it. Not much shorter than yours. This keeps only one copy of the data in memory and reads the file one line at a time.

import ast

result = {}
with open('data.txt', 'r') as f:  # closes the file even if parsing fails
  rows = (line.split('\t') for line in f)  # lazy: one line at a time
  for key1, key2, val1, val2 in rows:
    key2 = ast.literal_eval(key2)  # parses the quoted string without running code
    if key1 not in result:
      result[key1] = {}
    if key2 not in result[key1]:
      result[key1][key2] = []
    result[key1][key2].append([int(val1), float(val2)])
