
I have three pieces of data that I need to nest in a dictionary using Python:

  1. ID (can be repeated)
  2. Date (of action - can be repeated)
  3. Actions (lists of words that are associated with an ID and a date)

Example of data (tab-separated):

UID     DATE        ACTIONS
abc123  12/25/2016  break, pullover
abc123  12/25/2016  stop
abc123  10/15/2015  break, pullover, turn
def456  6/14/2015   turn, wash, skid
def456  11/24/2016  stop, wash, pullover, break
ghi789  2/12/2015   pullover, stop

CODE - revised with @moogle comments

from collections import defaultdict

date = ['12/25/16','12/25/16','10/15/2015','6/14/2015','11/24/2016','2/12/2015']
uid = ['abc123','abc123', 'abc123','def456', 'def456', 'ghi789']
action = [['break', 'pullover'],['stop'],['break','pullover','turn'],['turn','wash','skid'],['stop','wash','pullover','break'],['pullover','stop']]

d = defaultdict(list)
for uid, date, action in zip(uid, date, action):
    d[uid].append((date,action))

print(dict(d))

DESIRED OUTPUT: The desired output is a nested dictionary of lists, where the parent key is the ID and the parent value is nested dictionaries in which the nested key is the date and the nested value is a list of lists (actions).

current actual output

{'ghi789': [('2/12/2015', ['pullover', 'stop'])], 'def456': [('6/14/2015', ['turn', 'wash', 'skid']), ('11/24/2016', ['stop', 'wash', 'pullover', 'break'])], 'abc123': [('12/25/16', ['break', 'pullover']), ('12/25/16', ['stop']), ('10/15/2015', ['break', 'pullover', 'turn'])]}

**desired output**
{'abc123':[{'12/25/2016':[['break', 'pullover'],['stop']]}, {'10/15/2015':[['break','pullover','turn']]}],'def456':[{'6/14/2015':[['turn','wash','skid'],['stop','wash','pullover','break']},'ghi789':{'2/12/2015':[['pullover','stop']]}]}

I tried to obtain the above output with the above code, which I adapted from HERE and looked up HERE. However, I keep getting errors. I think it has to do with the fact that I am trying to nest a list of lists in the value, and I am unsure what direction to go in to fix it.

owwoow14
  • `for id, date, action in zip(id, date, action):`... You overwrite your list variables with the for loop iterators... – OneCricketeer Jan 31 '17 at 16:06
  • Please don't do this... Use objects! – ospahiu Jan 31 '17 at 16:06
  • What is your actual output now? What error message are you getting? – Fruitspunchsamurai Jan 31 '17 at 16:06
  • @Fruitspunchsamurai the output I am getting right now is superficial, I copy it here: "TypeError: list indices must be integers, not tuple" – owwoow14 Jan 31 '17 at 16:11
  • You forgot a comma between two lists, which is why you get the TypeError. Also, you shouldn't use `id` as a variable name. – moogle Jan 31 '17 at 16:12
  • @moogle, that takes care of the error, but does not give OP the desired output as he specified. – Fruitspunchsamurai Jan 31 '17 at 16:16
  • @moogle You are right: 1. I edited the question to rename the variable `id` -> `uid`, and 2. with the comma, I no longer get the error, but the output is still not the desired one. I fixed the code and put the current output in the question. – owwoow14 Jan 31 '17 at 16:17

2 Answers

I think that an object based approach is much better for these data.

You can do something like:

class Event:
    def __init__(self, ID, date, actions):
        self.ID=ID
        self.date=date
        self.actions=actions

    def __repr__(self):
        return 'ID: {} date: {} actions: {}'.format(self.ID, self.date, self.actions)    

Then create a list of objects like so:

 >>> objs=[Event(id_, d, actions) for id_, d, actions in zip(uid, date, action)]
 >>> objs
 [ID: abc123 date: 12/25/16 actions: ['break', 'pullover'], ID: abc123 date: 12/25/16 actions: ['stop'], ID: abc123 date: 10/15/2015 actions: ['break', 'pullover', 'turn'], ID: def456 date: 6/14/2015 actions: ['turn', 'wash', 'skid'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop']]

Then that list of actions / events can be sorted, analyzed, saved as you wish.

Sort by date:

>>> sorted(objs, key=lambda o: o.date)
[ID: abc123 date: 10/15/2015 actions: ['break', 'pullover', 'turn'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: abc123 date: 12/25/16 actions: ['break', 'pullover'], ID: abc123 date: 12/25/16 actions: ['stop'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop'], ID: def456 date: 6/14/2015 actions: ['turn', 'wash', 'skid']]
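
Note that the sort above compares the date strings character by character, which is why 6/14/2015 ends up last. A minimal sketch of a chronological sort, assuming every date is month/day/year with either a two-digit or a four-digit year (`parse_date` is a hypothetical helper introduced here, not part of the answer):

```python
from datetime import datetime

def parse_date(text):
    """Hypothetical helper: parse m/d/YYYY, falling back to m/d/YY."""
    for fmt in ('%m/%d/%Y', '%m/%d/%y'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError('unrecognised date: {!r}'.format(text))

# Same date strings as in the question.
dates = ['12/25/16', '12/25/16', '10/15/2015', '6/14/2015', '11/24/2016', '2/12/2015']

# Sort by the parsed datetime instead of by the raw string.
chronological = sorted(dates, key=parse_date)
print(chronological)
```

With the Event objects, the same key could be used as `sorted(objs, key=lambda o: parse_date(o.date))`.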

By event:

>>> [o for o in objs if 'stop' in o.actions]
[ID: abc123 date: 12/25/16 actions: ['stop'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop']]

And then creating a dict similar to what you want (even though that example is not a legal Python dict....) is fairly obvious:

di={o.ID:[] for o in objs}
for user in di:
    di[user].append({o.date:o.actions for o in objs if o.ID==user})

>>> di
{'ghi789': [{'2/12/2015': ['pullover', 'stop']}], 'def456': [{'6/14/2015': ['turn', 'wash', 'skid'], '11/24/2016': ['stop', 'wash', 'pullover', 'break']}], 'abc123': [{'10/15/2015': ['break', 'pullover', 'turn'], '12/25/16': ['stop']}]}
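
One caveat: the dict comprehension above keeps only one entry per date, so for abc123 the ['break', 'pullover'] entry on 12/25/16 is overwritten by ['stop']. A minimal sketch that keeps every action list, assuming a namedtuple as a stand-in for the Event class so the snippet runs on its own (`grouped` is a name chosen here for illustration):

```python
from collections import namedtuple

# Stand-in for the Event class above (assumption: same three attributes).
Event = namedtuple('Event', ['ID', 'date', 'actions'])

objs = [
    Event('abc123', '12/25/16', ['break', 'pullover']),
    Event('abc123', '12/25/16', ['stop']),
    Event('abc123', '10/15/2015', ['break', 'pullover', 'turn']),
    Event('def456', '6/14/2015', ['turn', 'wash', 'skid']),
    Event('def456', '11/24/2016', ['stop', 'wash', 'pullover', 'break']),
    Event('ghi789', '2/12/2015', ['pullover', 'stop']),
]

grouped = {}
for o in objs:
    # setdefault creates the inner dict / list the first time a key is seen,
    # so repeated dates accumulate instead of overwriting each other.
    grouped.setdefault(o.ID, {}).setdefault(o.date, []).append(o.actions)

print(grouped['abc123']['12/25/16'])
```

Here each ID maps directly to a dict of dates, and each date maps to a list of action lists.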
dawg

Your code works for me, if I add the missing comma in the 'action' list...

I get the output:

{'ghi789': [('2/12/2015', ['pullover', 'stop'])], 'def456': [('6/14/2015', ['turn', 'wash', 'skid']), ('11/24/2016', ['stop', 'wash', 'pullover', 'break'])], 'abc123': [('12/25/16', ['break', 'pullover']), ('12/25/16', ['stop']), ('10/15/2015', ['break', 'pullover', 'turn'])]}
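
For context on the missing comma: a list followed directly by a second bracketed group is parsed as an indexing operation, so Python tries to index the first list with the contents of the brackets. A small sketch that reproduces the kind of TypeError reported in the comments (the variable names are illustrative only):

```python
first = ['break', 'pullover']

try:
    # Intended: [first, ['turn', 'wash']]
    # Without the comma this reads as first['turn', 'wash'],
    # i.e. indexing a list with the tuple ('turn', 'wash').
    broken = [first ['turn', 'wash']]
except TypeError as exc:
    message = str(exc)
    print(message)  # the message says list indices cannot be a tuple
```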

What about the following solution:

from collections import defaultdict

date = ['12/25/16','12/25/16','10/15/2015','6/14/2015','11/24/2016','2/12/2015']
uid = ['abc123','abc123', 'abc123','def456', 'def456', 'ghi789']
action = [['break', 'pullover'],['stop'],['break','pullover','turn'],['turn','wash','skid'],['stop','wash','pullover','break'],['pullover','stop']]

d = defaultdict(dict)
# Distinct loop names so the input lists are not overwritten
for u, dt, acts in zip(uid, date, action):
    d[u].setdefault(dt,[]).append(acts)

print(dict(d))

output:

{'ghi789': {'2/12/2015': [['pullover', 'stop']]}, 'def456': {'6/14/2015': [['turn', 'wash', 'skid']], '11/24/2016': [['stop', 'wash', 'pullover', 'break']]}, 'abc123': {'10/15/2015': [['break', 'pullover', 'turn']], '12/25/16': [['break', 'pullover'], ['stop']]}}
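
The same grouping can also be written with a nested defaultdict, which removes the need for setdefault. This is only a sketch of an alternative; the loop names (`u`, `day`, `acts`) and the final conversion to plain dicts are choices made here, not part of the answer:

```python
from collections import defaultdict

date = ['12/25/16', '12/25/16', '10/15/2015', '6/14/2015', '11/24/2016', '2/12/2015']
uid = ['abc123', 'abc123', 'abc123', 'def456', 'def456', 'ghi789']
action = [['break', 'pullover'], ['stop'], ['break', 'pullover', 'turn'],
          ['turn', 'wash', 'skid'], ['stop', 'wash', 'pullover', 'break'],
          ['pullover', 'stop']]

# Outer key: uid. Inner key: date. Inner value: list of action lists.
nested = defaultdict(lambda: defaultdict(list))
for u, day, acts in zip(uid, date, action):
    nested[u][day].append(acts)

# Convert to plain dicts so the result prints and compares like a normal dict.
result = {u: dict(by_date) for u, by_date in nested.items()}
print(result)
```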
blattmra