For my situation, I define "elegant" as 1) having constant O(1) time complexity for checking whether an item exists and 2) storing only the items, nothing more.

For example, if I use a list

num_list = []
for num in range(10): # Dummy operation to fill the container.
    num_list.append(num)
if 1 in num_list:
    print("Number exists!")

The "in" operation takes O(n) time on a list, according to [Link].

In order to achieve constant checking time, I may employ a dictionary

num_dict = {}
for num in range(10): # Dummy operation to fill the container.
    num_dict[num] = True
if 1 in num_dict:
    print("Number exists!")

In the case of a dictionary, the operation "in" costs O(1) time according to [Link], but additional O(n) storage is required to store dummy values. Therefore, both implementations/containers seem inelegant.

What would be a better implementation/container to achieve constant O(1) time for checking whether an item exists while storing only the items? How can I keep the resource requirements to a bare minimum?

Sean
  • What you probably want is a set. It still has O(n) storage requirements, but you don't have to store the `True` values for each value. Otherwise, if you don't want O(n) storage, you could use something like a Bloom filter, but it has other trade-offs as it is a probabilistic data structure. – Francisco C Oct 18 '16 at 19:39
  • I think you're looking for sets. – skrrgwasme Oct 18 '16 at 19:41
  • You guys are correct, a set is exactly what I needed. I should've read my own link more thoroughly. Would you recommend I remove the question since it is so basic? To Francisco: would you mind submitting an answer? I will accept it, but perhaps mods will delete my question later. If so, I apologize in advance. – Sean Oct 18 '16 at 19:47

2 Answers

The solution here is to use a set, which doesn't require you to save a dummy value for each item.
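As a minimal sketch, here is the question's example rewritten with a set. The name `num_set` is made up for illustration and mirrors `num_list` and `num_dict` from the question.

```python
# Build a set holding the same ten numbers used in the question.
num_set = set()
for num in range(10):  # Dummy operation to fill the container.
    num_set.add(num)

# Membership testing on a set is O(1) on average, and only the items are stored.
if 1 in num_set:
    print("Number exists!")
```

The same container can also be built in one step with `set(range(10))`.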

Francisco C

Normally you can't optimise both space and time together. One thing you can do is find out more about the range of the data (here, the min and max values of num) and the size of the data (here, the number of times the loop runs, i.e. 10). Then you have two options:

  1. If the range is limited, go for the dictionary method (or even use the array-index method).
  2. If the size is limited, go for the list method.

If you choose the right method, you will probably achieve constant time and space for a large sample.
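One possible reading of the "array index method" is a flag array indexed by the value itself. The sketch below assumes the items are non-negative integers with a known upper bound; `MAX_VALUE`, `present`, and `contains` are hypothetical names chosen for illustration.

```python
MAX_VALUE = 9  # Assumed upper bound of the data (hypothetical).

# One byte per possible value; every flag starts at zero (absent).
present = bytearray(MAX_VALUE + 1)

for num in range(0, 10, 2):  # Dummy operation: mark the even numbers.
    present[num] = 1


def contains(value):
    """Return True if value was marked; out-of-range values count as absent."""
    return 0 <= value <= MAX_VALUE and present[value] == 1


if contains(4):
    print("Number exists!")
```

The lookup is a single index operation, but the storage grows with the range of possible values rather than with the number of items actually stored, so it only pays off when the range is small.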

EDIT: Set. It is a hash table, implemented very similarly to the Python dict, with some optimizations that take advantage of the fact that the values are always null (in a set, we only care about the keys). Set operations do require iteration over at least one of the operand tables (both in the case of union). Iteration isn't any cheaper than for any other collection (O(n)), but membership testing is O(1) on average.
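A short sketch of the operations mentioned above, using two small example sets (`first` and `second` are illustrative names, not from the original post):

```python
first = {1, 2, 3}
second = {3, 4}

# Membership testing: a hash lookup, O(1) on average.
has_two = 2 in first

# Union has to visit the items of both operands.
combined = first | second

# Intersection has to iterate over at least one operand.
shared = first & second

print(has_two, sorted(combined), sorted(shared))
```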

Hrishikesh Goyal