2

I am new to Python and i've got two solutions for my problem, but i am wondering if this is the right way to do it or are there smarter solutions?

I've got a list with objects and a second list with dicts in it. Both contains names. I want to compare if a name is contained in both lists.

The class for the lists with objects

class Person():
def __init__(self, name):
    self.name = name

Create both lists for the example

i = 0
while i < 100:
    person_list.append(Person("test" + str(i)))
    i += 1

search_list = list()
i = 300
while i > 0:
    search_list.append({"name": "test" + str(i)})
    i -= 1

Option a seems very inefficient for me, because it iterates through the whole person_list in every iteration of search_list

for search in search_list:
    if [p for p in person_list if p.name == search["name"]]:
        print(search["name"] + " found")
    else:
        print(search["name"] + " not found")

I don't want to delete something from the person_list to make the iterations shorter, because i'll reuse it later.

Option b create a new list which only contains the names before searching in search in this list

for search in search_list:
    if search["name"] in names:
        print(search["name"] + " found")
        names.remove(search["name"])
    else:
        print(search["name"] + " not found")
LuckyDev
  • 33
  • 5

1 Answers1

3

Both your approaches have to scan the entire pair of lists every time (that’s what x in some_list inside a loop over person_list does).

What I’d do is traverse the list of names first, building a dict of names to indices.

from collections import defaultdict

names_to_indices = defaultdict(list)

for index, search in enumerate(search_list):
    names_to_indices[search["name"]].append(index)

If you don’t care about the indices, you can just use a set instead:

distinct_names = set(map(lambda search: search["name"], search_list))

This is one pass over the search list and requires non-constant space.

You can then iterate over your list of Persons, probing the dict or set for names. This will generally be a constant-time operation. So, for the price of space proportional to the number of distinct keys in sort_list (if you use a set; otherwise it’s proportional to the number of entries, since you’d be storing the indices) you can now perform your check in an amount of time proportional to the number of entries n in person_list (rather than in an amount of time proportional to n * m).

crcvd
  • 924
  • 1
  • 5
  • 19
  • And because i've got instant access to the keys of the list i dont have to loop through the whole list, right? Thank you for explenation, escspally that my both options do the same. – LuckyDev Mar 27 '21 at 21:15
  • Yep: a check like `key in some_dict` or `key in some_set` will generally be a constant time operation. [You can read more about that here](https://stackoverflow.com/questions/4553624/hashmap-get-put-complexity) and (for Python-specific context) [here](https://stackoverflow.com/questions/17539367/python-dictionary-keys-in-complexity). – crcvd Mar 27 '21 at 21:19
  • 1
    This is a common technique, but I couldn't find a good duplicate. – Karl Knechtel Mar 27 '21 at 21:27