"Series objects are mutable and cannot be hashed" error

Question

I am trying to get the following script to work. The input file consists of 3 columns: gene association type, gene name, and disease name.

cols = ['Gene type', 'Gene name', 'Disorder name']
no_headers = pd.read_csv('orphanet_infoneeded.csv', sep=',',header=None,names=cols)

gene_type = no_headers.iloc[1:,[0]]
gene_name = no_headers.iloc[1:,[1]]
disease_name = no_headers.iloc[1:,[2]]

query = 'Disease-causing germline mutation(s) in' ###add query as required

orph_dict = {}

for x in gene_name:
    if gene_name[x] in orph_dict:
        if gene_type[x] == query:
            orph_dict[gene_name[x]]=+ 1
        else:
            pass
    else:
        orph_dict[gene_name[x]] = 0

I keep getting an error that says:

Series objects are mutable and cannot be hashed

Any help would be dearly appreciated!

show us the full traceback so we can see the line on which the error is being thrown. my guess is it's `orph_dict[gene_name[x]] = 0`. the traceback would also show us the class of error being thrown. — dbliss, Apr 17 '15 at 13:50

score 32 · Answer 1 · edited May 23 '17 at 12:18

In short: gene_name[x] is a mutable object, so it cannot be hashed. To use an object as a key in a dictionary, Python needs its hash value, and that's why you get the error.

Further explanation:

Mutable objects are objects whose value can be changed. For example, a list is mutable, since you can append to it. An int is immutable, because you can't change it. When you do:

a = 5
a = 3

You don't change the value of the object 5. The second line binds the name a to a different object, 3.

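Python's built-in mutable containers (list, dict, set) cannot be hashed, and a pandas Series refuses to be hashed for the same reason. See this answer.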

To solve your problem, you should use immutable objects as keys in your dictionary. For example: tuple, string, int.
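As a rough illustration of the points above, here is a minimal sketch using only built-in types. The helper is_hashable and the sample value 'GENE_A' are made up for this example and do not come from the question's data.

```python
# Rebinding a name: the int object 5 is not modified, the name moves to 3.
a = 5
first_id = id(a)
a = 3
rebound = id(a) != first_id

# Mutating an object: the list stays the same object while its contents change.
items = [1, 2]
list_id = id(items)
items.append(3)
same_list = id(items) == list_id


def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds, i.e. obj can be a dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


results = {
    "int": is_hashable(5),
    "str": is_hashable("GENE_A"),
    "tuple": is_hashable(("GENE_A", 1)),
    "list": is_hashable([1, 2]),
    "dict": is_hashable({"a": 1}),
}
print(rebound, same_list, results)
```

Note that a tuple is only hashable when everything inside it is hashable too, so a tuple containing a list would still fail as a dictionary key.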

score 12 · Answer 2 · answered Apr 17 '15 at 14:36

gene_name = no_headers.iloc[1:,[1]]

This creates a DataFrame because you passed a list of columns (single, but still a list). When you later do this:

gene_name[x]

you now have a Series object: iterating over a DataFrame yields its column labels, so x is 'Gene name' and gene_name[x] is that whole column. You can't hash the Series.
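The difference can be checked on a small frame. This is only a sketch: the DataFrame below is made up for illustration and stands in for the question's CSV.

```python
import pandas as pd

# Made-up data, for illustration only
df = pd.DataFrame({'Gene type': ['t1', 't2'], 'Gene name': ['GENE_A', 'GENE_B']})

as_frame = df.iloc[:, [1]]    # list of column positions -> DataFrame
as_series = df.iloc[:, 1]     # scalar column position   -> Series

labels = list(as_frame)       # iterating a DataFrame yields its column labels
column = as_frame[labels[0]]  # selecting by that label returns a Series

try:
    {}[column] = 0            # using a Series as a dict key needs its hash
    raised = False
except TypeError:
    raised = True             # pandas refuses to hash a Series

print(type(as_frame).__name__, type(as_series).__name__, labels, raised)
```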

The solution is to create Series from the start.

gene_type = no_headers.iloc[1:,0]
gene_name = no_headers.iloc[1:,1]
disease_name = no_headers.iloc[1:,2]

Also, where you have orph_dict[gene_name[x]] =+ 1, I'm guessing that's a typo and you really mean orph_dict[gene_name[x]] += 1 to increment the counter. As written, =+ 1 simply assigns the value +1 every time.
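Putting both fixes together, a minimal sketch might look like the following. It assumes the goal is to count, per gene, the rows whose type equals query; the rows are made up and stand in for orphanet_infoneeded.csv. One caveat: iterating over a Series yields its values, not its index labels, so gene_name[x] would then be a label lookup. The sketch sidesteps that by walking the two columns together with zip.

```python
import pandas as pd

cols = ['Gene type', 'Gene name', 'Disorder name']

# Made-up rows for illustration; the first row mimics a header line read as data,
# which is why the original code slices with iloc[1:].
no_headers = pd.DataFrame(
    [
        ['Gene type', 'Gene name', 'Disorder name'],
        ['Disease-causing germline mutation(s) in', 'GENE_A', 'Disorder 1'],
        ['Disease-causing germline mutation(s) in', 'GENE_A', 'Disorder 2'],
        ['Some other association type', 'GENE_B', 'Disorder 3'],
    ],
    columns=cols,
)

# Scalar column positions give Series rather than one-column DataFrames
gene_type = no_headers.iloc[1:, 0]
gene_name = no_headers.iloc[1:, 1]

query = 'Disease-causing germline mutation(s) in'

orph_dict = {}
for g_type, g_name in zip(gene_type, gene_name):
    orph_dict.setdefault(g_name, 0)   # g_name is a plain str, so it is hashable
    if g_type == query:
        orph_dict[g_name] += 1        # += increments; =+ would assign +1

print(orph_dict)
```

Unlike the loop in the question, this version also counts the first matching row for each gene. If the original off-by-one behaviour was intended, the setdefault line would need adjusting.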

answered Apr 17 '15 at 14:36

jkitchen

How could I apply this technique of creating the Series from the start when I am splitting into a training and testing dataset? `X_train, X_test, y_train, y_test = train_test_split(training_feature_set, training_feature_label, test_size = 0.1, random_state=42)` @jkitchen – Alvis May 03 '17 at 11:07

@Alvis, if your function returns DataFrames, you can still select individual items from those. Read the [docs for indexing](http://pandas.pydata.org/pandas-docs/stable/indexing.html). `.loc` or `.iloc` are probably what you want. – jkitchen May 04 '17 at 16:37

Thank you @jkitchen I'll check out the documentation :-) – Alvis May 05 '17 at 09:10
