I have a list of codes coming from a CSV file:

import csv

file_path = 'c:\\temp\\list.csv'
csvfile = open(file_path, 'rb')
reader = csv.reader(csvfile, delimiter=';')
rr = []
for sor in reader:
    if sor[1][0] == '1':
        rr.append(sor)
print type(rr)
<type 'list'>

set (rr)
Traceback (most recent call last):
  File "<pyshell#85>", line 1, in <module>
    set (rr)
TypeError: unhashable type: 'list'

If I do the very same with another list coming from a database, it works fine:

cur.execute('select code from mytable')
res = cur.fetchall()
res1 = []
res1.append(x[0] for x in res)
print type(res1)
<type 'list'>
set(res1)
set(['13561255', '11120088'])

What is the difference between rr and res1, given that both are of type list?

Actually, I'm looking for records in the database that don't exist in the CSV file, by doing

result = list(set(res1) - set(rr))

How can I achieve this (maybe in a more optimal/faster way)?

alecxe
Gabor

2 Answers

Every sor is a CSV row, that is, a list of row "cell" values, so rr becomes a list of lists. Lists cannot be items of a set because lists are not hashable.

res1 on the other hand is a list of string values. Strings are hashable.


Here is an example that demonstrates the difference:

>>> l1 = [["1", "2", "3"]]
>>> l2 = ["1", "2", "3"]
>>> 
>>> set(l1)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    set(l1)
TypeError: unhashable type: 'list'
>>> set(l2)
set(['1', '3', '2'])
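
To connect this back to the set difference in the question, here is a minimal Python 3 sketch. It assumes the code lives in the second CSV column (the question filters on sor[1], but the actual layout is not shown), and it uses made-up in-memory data (csv_text, res) in place of the real file and cursor.

```python
import csv
import io

# Hypothetical stand-in for the CSV file; the real column layout is an assumption.
csv_text = "a;13561255\nb;11120088\nc;29990000\n"
reader = csv.reader(io.StringIO(csv_text), delimiter=";")

# Collect only the code string from each row (not the whole row, which is a list),
# keeping rows whose code starts with '1' as in the question.
csv_codes = {row[1] for row in reader if row[1].startswith("1")}

# Hypothetical stand-in for cur.fetchall(): a list of 1-tuples.
res = [("13561255",), ("11120088",), ("14440000",)]
db_codes = {x[0] for x in res}

# Codes present in the database but absent from the CSV.
missing = db_codes - csv_codes
print(sorted(missing))
```

Building both sides directly as sets of strings avoids the intermediate lists. If whole rows need to be compared instead of one column, converting each row with tuple(row) makes it hashable.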
alecxe
  • You're right. My mistake. I should have done exactly the same thing as I've done with the result of the database request: res1.append(x[0] for x in res) – Gabor Jul 12 '16 at 16:24

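Here is a snippet that may be helpful. I used 'extend' instead of 'append' when pulling from a file. This is not specific to reading files: 'extend' adds each token to the list individually, whereas 'append' would add the whole list of tokens as a single item, producing a list of lists that cannot be turned into a set.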

base = ['dunk', 'slump', 'monk']

with open('dummytext.txt', 'r') as c_h_f:
        c_h = []
        for line in c_h_f:
                line = line.strip()
                tokens = line.split(',')
                c_h.extend(tokens)

s1 = set(base)
s2 = set(c_h)

print(s1)
print(s2)

Output...
{'slump', 'monk', 'dunk'}
{' lump', 'bunk', ' slump', ' monk', ' funk', ' junk', ' stump', ' dunk'}
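
The difference between append and extend can be seen in isolation with a short sketch. The data here (tokens, res) is made up for illustration. The last part shows a related pitfall: passing a generator expression to append stores one generator object rather than the values it would yield.

```python
tokens = ["dunk", "slump"]

appended = []
appended.append(tokens)   # one item: the list itself -> a list of lists
extended = []
extended.extend(tokens)   # two items: the individual strings

try:
    set(appended)
    append_error = None
except TypeError as exc:
    # Lists are unhashable, so a list of lists cannot become a set.
    append_error = str(exc)

extended_set = set(extended)  # works: strings are hashable

# Hypothetical database rows, shaped like the result of fetchall().
res = [("13561255",), ("11120088",)]

gen_appended = []
gen_appended.append(x[0] for x in res)  # appends ONE generator object
codes = [x[0] for x in res]             # list comprehension: the actual strings

print(len(appended), len(extended), len(gen_appended), codes)
```

A list comprehension, or extend with the same generator expression, is what yields a flat list of strings that set() accepts.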
K-Tides