I would like to assign a single tuple to a boolean-indexed slice of my dataframe, like this:
>>> import pandas as pd
>>> mydataframe = pd.DataFrame([1,2,3,4,5],columns=['colname'])
>>> mydataframe.loc[mydataframe['colname']>2,'colname'] = (1,2)
Desired output:
>>> mydataframe
colname
0 1
1 2
2 (1, 2)
3 (1, 2)
4 (1, 2)
However, instead of assigning the tuple to each element, pandas tries to assign each element of the tuple to an element in the slice, and errors out because the shapes don't match.
Actual output:
ValueError: shape mismatch: value array of shape (2,) could not be broadcast
to indexing result of shape (3,)
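The shape mismatch can be reproduced with NumPy alone. The snippet below is only an analogy, not pandas' internal code path: it assumes the tuple is converted to a length-2 array and then has to fit three selected rows.

```python
import numpy as np

# Assumption: the tuple is treated as an array of values, not as one scalar.
value = np.asarray((1, 2))
print(value.shape)  # the tuple has become a length-2 array

# Three rows are selected by the boolean mask, so the target shape is (3,).
try:
    np.broadcast_to(value, (3,))
    broadcast_ok = True
except ValueError as err:
    # A length-2 array cannot be stretched to length 3.
    broadcast_ok = False
    print("ValueError:", err)
```

A length-1 array would broadcast and a length-3 array would be assigned element by element, which is why a tuple is never stored as a single cell value this way.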
I've tried using the set_value function and get the same behavior:
>>> mydataframe.set_value(mydataframe['colname']>2,'colname', (1,2))
ValueError: shape mismatch: value array of shape (2,) could not be broadcast
to indexing result of shape (3,)
The answers to this related question work for assigning to a single element of the dataframe, but not to a slice: Add a tuple to a specific cell of a pandas dataframe
Is there a way to do this assignment without resorting to looping over the elements in the slice?
Edit: I also tried the following as per EdChum's answer and it still isn't behaving as expected:
>>> import numpy as np
>>> mydataframe = pd.DataFrame([1,2,3,4,5],columns=['colname'])
>>> assignment_series = pd.Series([(1,2,3)]*np.sum(mydataframe['colname']>2))
>>> assignment_series
0 (1, 2, 3)
1 (1, 2, 3)
2 (1, 2, 3)
dtype: object
>>> mydataframe.loc[mydataframe['colname']>2,'colname'] = assignment_series
>>> mydataframe
colname
0 1
1 2
2 (1, 2, 3)
3 NaN
4 NaN
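The NaN values look like the result of index alignment: the short series is labelled 0, 1, 2, while the selected rows are labelled 2, 3, 4, so only label 2 matches. A minimal sketch of that alignment, using `reindex` as a stand-in for what `.loc` assignment appears to do with a Series on the right-hand side:

```python
import pandas as pd

# Same series as above: three tuples with the default index 0, 1, 2.
assignment_series = pd.Series([(1, 2, 3)] * 3)

# Labels of the rows selected by mydataframe['colname'] > 2.
selected_labels = pd.Index([2, 3, 4])

# Align the series to the selected labels; labels missing from it become NaN.
aligned = assignment_series.reindex(selected_labels)
print(aligned)
```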
Edit2: Sorry, I misunderstood EdChum's answer. The previous edit is not what he suggested: assignment_series should be the same length as mydataframe, not the length of mydataframe.loc[mydataframe['colname']>2,'colname'] as I did above. See EdChum's answer below.
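For reference, here is a rough sketch of the full-length approach as I understand it. The `mask` name and the explicit cast to `object` are my own additions, not necessarily part of the answer; the cast is there only so the integer column can hold tuples without a dtype change during assignment.

```python
import pandas as pd

mydataframe = pd.DataFrame([1, 2, 3, 4, 5], columns=['colname'])
mask = mydataframe['colname'] > 2  # hypothetical helper name

# One tuple per row of the whole frame, sharing the frame's index,
# so every selected label finds a matching entry during alignment.
assignment_series = pd.Series([(1, 2, 3)] * len(mydataframe),
                              index=mydataframe.index)

# Assumption: cast to object first so the column can store tuples.
mydataframe['colname'] = mydataframe['colname'].astype(object)

# Only the masked rows are overwritten; the rest keep their values.
mydataframe.loc[mask, 'colname'] = assignment_series
print(mydataframe)
```

The series holds more tuples than are actually used, but that keeps the assignment free of an explicit Python loop over the slice.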