I've got the following code:

    from collections import defaultdict

    # Group the <s> elements by language
    x_sents = defaultdict(list)

    for sent in book_xml.findall('.//s'):
        s_lang = sent.get('lang')
        if s_lang in PARSED_LANGS:
            x_sents[s_lang].append(sent)

    for lang, sents in x_sents.items():
        for sent_index, sent in enumerate(sents):
            parsed_s = parsed[lang][sent_index]
            words = sent.findall('w')
            for index, pars_inf in enumerate(parsed_s):
                # A head of '0' presumably marks the root, which has no head word
                if pars_inf[0] != '0':
                    head_id = words[int(pars_inf[0])-1].get('id')
                    words[index].attrib['head'] = head_id
                    words[index].attrib['deprel'] = pars_inf[1]

My problem is that I sometimes get the error IndexError: list index out of range on the line parsed_s = parsed[lang][sent_index]. How can I check that the list entry exists, and what should I use in its place if the index does not exist?

ForceBru
John

1 Answer

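I would change the line parsed_s = parsed[lang][sent_index] to something like the following. Note that the index is compared against the list's length, because in on a list tests for a matching value, not for a valid position:

    parsed_s = parsed[lang][sent_index] if lang in parsed and sent_index < len(parsed[lang]) else None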

This sets parsed_s to the value at that index if it exists and to None if it does not. Because parsed_s can now be None, check for that and skip the sentence (for example with continue) before looping over it. Despite the comment on your question, you should always validate input. Never make assumptions about data coming from outside your codebase: validate its structure and, where applicable, its type. Even for data produced inside your codebase, validating it is good practice, if only to keep your co-workers honest. In case you're not familiar with the ternary operator (Python's conditional expression, value_if_true if condition else value_if_false), check out this answer.
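To make the check concrete, here is a minimal sketch of a bounds-checked lookup combined with skipping sentences that have no parse. The helper name safe_get, the skipped list, and the sample data are all hypothetical stand-ins and not part of your code; plain strings take the place of your XML sentence elements.

```python
def safe_get(seq, index, default=None):
    """Return seq[index] if index is a valid non-negative position, else default."""
    if 0 <= index < len(seq):
        return seq[index]
    return default


# Hypothetical stand-in data: 'en' has two sentences but only one parse,
# and 'de' has a sentence but no parses at all.
parsed = {'en': [[('2', 'nsubj'), ('0', 'root')]]}
x_sents = {'en': ['sent-0', 'sent-1'], 'de': ['sent-0']}

skipped = []  # (lang, sent_index) pairs that had no parse
for lang, sents in x_sents.items():
    # .get() covers a language that is missing from parsed entirely
    lang_parses = parsed.get(lang, [])
    for sent_index, sent in enumerate(sents):
        parsed_s = safe_get(lang_parses, sent_index)
        if parsed_s is None:
            # No parse for this sentence: record it and move on
            skipped.append((lang, sent_index))
            continue
        for index, pars_inf in enumerate(parsed_s):
            pass  # annotate the words of sent here
```

Recording the skipped pairs instead of silently dropping them makes it easier to find out afterwards why the parser output and the XML disagree on the number of sentences.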

Allie Fitter