-2

I have the CSV file shown below. I need to convert it to a dictionary of dictionaries using Python.

 userId movieId rating
1         16    4
1         24    1.5
2         32    4
2         47    4
2         50    4
3        110    4
3        150    3
3        161    4
3        165    3

The output should look like this:

dataset={'1':{'16':4,'24':1.5},
         '2':{'32':4,'47':4,'50':4},
         '3':{'110':4,'150':3,'161':4,'165':3}}

Please let me know how to do this. Thanks in advance.

akhil s

3 Answers

2

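You are looking for nested dictionaries. One way to get them is to implement Perl's autovivification feature in Python (the detailed description is given here). Here is a minimal working example (MWE). Note that the values are stored as strings, exactly as csv.reader returns them.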

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

def main():
    d = AutoVivification()
    filename = 'test.csv'
    with open(filename, 'r') as f:
        reader = csv.reader(f, delimiter=',')
        next(reader)        # skip the header
        for row in reader:
            d[row[0]][row[1]] = row[2]

    print(d)
    #{'1': {'24': '1.5', '16': '4'}, '3': {'150': '3', '110': '4', '165': '3', '161': '4'}, '2': {'32': '4', '50': '4', '47': '4'}}

if __name__ == '__main__':
    main()

The content of test.csv:

userId,movieId,rating
1,16,4
1,24,1.5
2,32,4
2,47,4
2,50,4
3,110,4
3,150,3
3,161,4
3,165,3
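
As a shorter variant of the same idea, here is a sketch that relies on the standard library's collections.defaultdict, which creates the inner dictionary on first access. It also converts the rating strings to numbers so that the result matches the output requested in the question. The names to_number and load_ratings are made up for this illustration, and the sample data is inlined with io.StringIO instead of being read from test.csv.

```python
import csv
import io
from collections import defaultdict

# Sample data from the question, inlined so the sketch runs without a file.
SAMPLE = """userId,movieId,rating
1,16,4
1,24,1.5
2,32,4
2,47,4
2,50,4
3,110,4
3,150,3
3,161,4
3,165,3
"""

def to_number(text):
    """Hypothetical helper: return an int when possible, otherwise a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

def load_ratings(lines):
    """Hypothetical helper: build {userId: {movieId: rating}} from CSV lines."""
    nested = defaultdict(dict)      # a missing user gets an empty inner dict
    reader = csv.reader(lines)
    next(reader)                    # skip the header row
    for user_id, movie_id, rating in reader:
        nested[user_id][movie_id] = to_number(rating)
    return dict(nested)             # convert back to a plain dict

dataset = load_ratings(io.StringIO(SAMPLE))
print(dataset)
```

To read from disk instead, an open file object can be passed to load_ratings in place of the io.StringIO wrapper.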
SparkAndShine
0
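import numpy as np

# Load every column as text: dtype=int would fail on ratings such as 1.5.
col1, col2, col3 = np.loadtxt('test2.csv', delimiter=',', skiprows=1, unpack=True, dtype=str)

dataset = {}

for a, b, c in zip(col1, col2, col3):
    if str(a) in dataset:
        # Known user: add the movie to the existing inner dictionary.
        dataset[str(a)][str(b)] = str(c)
    else:
        # First rating for this user: create the inner dictionary.
        dataset[str(a)] = {str(b): str(c)}
print(dataset)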

This should do it. The example file in the question looks like a TSV (tab-separated values) file. If so, remove the delimiter argument from my example; np.loadtxt then splits on any whitespace.
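
If the file really is whitespace-separated as it is displayed in the question, a sketch without NumPy could simply split each line on whitespace. The function name parse_whitespace_table is hypothetical, the data is inlined rather than read from disk, and storing the ratings as floats is an assumption about what is wanted.

```python
# Whitespace-separated sample, laid out as in the question.
SAMPLE = """userId movieId rating
1         16    4
1         24    1.5
2         32    4
2         47    4
"""

def parse_whitespace_table(text):
    """Hypothetical helper: build {userId: {movieId: rating}} from text."""
    dataset = {}
    lines = text.strip().splitlines()
    for line in lines[1:]:              # skip the header row
        if not line.strip():
            continue                    # ignore blank lines
        user_id, movie_id, rating = line.split()
        # setdefault creates the inner dict the first time a user appears.
        dataset.setdefault(user_id, {})[movie_id] = float(rating)
    return dataset

dataset = parse_whitespace_table(SAMPLE)
print(dataset)
```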

Jannis
0
import csv
dataset = dict()
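# Text mode with newline="" is what the csv module expects in Python 3 ("rb" only works in Python 2).
with open("file_name", newline="") as csv_file: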
    data = csv.DictReader(csv_file)
    for row in data:
        old_data = dataset.get(row["userId"], None)

        if old_data is None:
            # Key by the row's user id, not the literal string "userId".
            dataset[row["userId"]] = {row["movieId"]: row["rating"]}
        else:
            old_data[row["movieId"]] = row["rating"]
            dataset[row["userId"]] = old_data