9

Just when I thought I was getting the hang of Python and Pandas, another seemingly simple issue crops up. I want to add tuples to specific cells of a pandas dataframe. These tuples need to be calculated on-the-fly based on the contents of other cells in the dataframe - in other words, I can't easily calculate all tuples in advance and add them as a single array.

As an example, I define a dataframe with some data and add a couple of empty columns:

import pandas as pd
import numpy as np
tempDF = pd.DataFrame({'miscdata': [1.2,3.2,4.1,2.3,3.3,2.5,4.3,2.5,2.2,4.2]})
tempDF['newValue'] = np.nan
tempDF['newTuple'] = np.nan

I can loop over each cell of the 'newValue' column and assign a numeric value without problems:

anyOldValue = 3.5
for i in range(10):
    tempDF.ix[(i,'newValue')] = anyOldValue

print tempDF

However, if I try to add a tuple I get an error message:

anyOldTuple = (2.3,4.5)
for i in range(10):
    tempDF.ix[(i,'newTuple')] = anyOldTuple

print tempDF

I've received several error messages including:

ValueError: Must have equal len keys and value when setting with an ndarray

…and…

ValueError: setting an array element with a sequence.

I'm sure I've seen data frames with tuples (or lists) in the cells - haven't I? Any suggestions how to get this code working would be much appreciated.

user1718097

3 Answers

10

You can use set_value:

tempDF.set_value(i,'newTuple', anyOldTuple)

Also make sure that the column is not a float column, for example:

tempDF['newTuple'] = 's' # or set the dtype

otherwise you will get an error.
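A minimal sketch of the dtype point, assuming a recent pandas version in which .at is used in place of set_value. The shortened data and the formula for the tuple are made up for illustration:

```python
import pandas as pd

tempDF = pd.DataFrame({'miscdata': [1.2, 3.2, 4.1, 2.3]})

# Create the column with object dtype so each cell can hold any Python object.
# Initializing it with np.nan would make it a float column instead.
tempDF['newTuple'] = pd.Series([None] * len(tempDF), index=tempDF.index, dtype=object)

for i in tempDF.index:
    x = tempDF.at[i, 'miscdata']
    # Hypothetical on-the-fly calculation based on another cell in the same row
    tempDF.at[i, 'newTuple'] = (x, x * 2)

print(tempDF)
```

Each cell of 'newTuple' then holds a whole tuple rather than being unpacked across several cells.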

elyase
  • That did the trick - thank you! Your second comment was very important to avoid error message. – user1718097 Jan 15 '15 at 07:55
  • Always use get_value() and set_value() instead of ix and others, when possible +1 for that :) http://stackoverflow.com/questions/13842088/set-value-for-particular-cell-in-pandas-dataframe and http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.get_value.html – JimLohse Oct 31 '16 at 03:14
  • And more detail on set_value() vs. df[value][value] for MultiIndexing etc http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy – JimLohse Oct 31 '16 at 03:16
  • Thanks! Although in the meantime, set_value is deprecated and we should use `tempDF.at[i, 'newTuple'] = anyOldTuple` instead: https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.set_value.html – Alexander Engelhardt Oct 09 '18 at 08:36
  • None of these work for me. I get a vague `ValueError: Must have equal len keys and value when setting with an iterable`. – Zizzipupp Jul 14 '20 at 14:54
1

set_value is deprecated (and was removed in pandas 1.0).

You can use .at[] or .iat[] instead,

e.g. some_df.at[idx, col_name] = any_tuple
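As a rough illustration of the difference between the two accessors: .at takes row and column labels, while .iat takes integer positions. The frame, the labels and the 'pair' column below are assumptions made up for the example:

```python
import pandas as pd

some_df = pd.DataFrame({'x': [1, 2, 3]}, index=['r0', 'r1', 'r2'])
# Object-dtype column, so that a cell can store a tuple
some_df['pair'] = pd.Series([None] * len(some_df), index=some_df.index, dtype=object)

# .at: label-based access (row label, column label)
some_df.at['r1', 'pair'] = (2.3, 4.5)

# .iat: position-based access (row position, column position)
col_pos = some_df.columns.get_loc('pair')
some_df.iat[2, col_pos] = (6.7, 8.9)

print(some_df)
```

Passing a column name to .iat raises an error, so look up the position first (here with columns.get_loc) or use .at.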

J.Melody
1

As J.Melody pointed out, .at[] and .iat[] can be used to assign a tuple to a cell, if the dtype of the column is object.

Minimal example:

df initialized as:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8

df containing tuple:
   a       b  c
0  0  (1, 2)  2
1  3       4  5
2  6       7  8

Code:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape((3,3)), columns=list('abc'), dtype=object)
print('df initialized as:', df, sep='\n')
df.at[0,'b'] = (1,2)
print()
print('df containing tuple:', df, sep='\n')

Note:

If you omit dtype=object, you end up with

ValueError: setting an array element with a sequence.
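If the frame already exists with a numeric column, one possible workaround (a sketch, not the only option) is to convert just that column to object dtype before assigning:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape((3, 3)), columns=list('abc'))

# Convert only column 'b' to object dtype; 'a' and 'c' stay integer columns
df['b'] = df['b'].astype(object)
df.at[0, 'b'] = (1, 2)

print(df)
print(df.dtypes)
```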
Markus Dutschke