I have a dictionary of dataframes with keys in the form "Sample_X_####celsius". Each dataframe is structured like this:

displacement force
values values
values values
values values

I also have an additional dataframe with three columns structured like this:

label width thickness
sample_1_200celsius 11 222
sample_1_300celsius 12 223

How can I combine each row of the additional dataframe with the dataframe in the dictionary whose key matches that row's "label" entry? My goal looks like this:

key: sample_1_200celsius

displacement force label width thickness
values values sample_1_200celsius 11 222
values values
values values
smci
aria39948
  • `pd.concat([df1, df2])`. But are you sure you want a `concat` (so `label, width, thickness` are only non-empty on the first row), rather than a join that broadcasts to all rows? What is your intent, and what do you want to do with the output df? – smci Apr 11 '21 at 02:02
  • Possible duplicate, see [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – smci Apr 11 '21 at 02:03
  • I don't quite understand what is *a dictionary of dataframes with keys in the form "Sample_X_####celsius"*. – Ynjxsjmh Apr 11 '21 at 02:17
  • @smci I can't concatenate because I have a ton of different files (300+). I need a way for only one row from the additional dataframe to be "matched" to each file in the dictionary whose key is the same as the label of that row. – aria39948 Apr 11 '21 at 02:25
  • @Ynjxsjmh What I was trying to say there is that the keys match the entries in the "label" column of the additional dataframe. – aria39948 Apr 11 '21 at 02:26
  • @aria39948 In the last dataframe you posted, what are the values of the `label`, `width`, and `thickness` columns in the last two rows? – Ynjxsjmh Apr 11 '21 at 02:41
  • @Ynjxsjmh No value there-- I want to take only one row from the 2nd dataframe and add it to each instance of the first dataframe. So for the sample_1_200celsius first dataframe, it only gets the row of the second dataframe that contains "sample_1_200celsius". – aria39948 Apr 11 '21 at 04:16
  • You can do the concatenate/join if you do them at read-time. Or you could do it in batches of say 10. Really you need to go back and show us the read or load commands that imported these dfs. – smci Apr 11 '21 at 20:06

1 Answer

If the dictionary and the additional dataframe are named dictionary and df, you can concat() each dictionary dataframe with the matching row of df:

for key in dictionary.keys():
    dfs = [dictionary[key], df.loc[df.label == key].reset_index(drop=True)]
    dictionary[key] = pd.concat(dfs, axis=1)
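
The reset_index(drop=True) call matters because pd.concat(..., axis=1) aligns rows by index label rather than by position. A minimal sketch of the difference, using hypothetical toy frames (data and meta, with made-up labels 'a' and 'b') that are not part of the question:

```python
import pandas as pd

# Hypothetical toy data, only to illustrate index alignment.
data = pd.DataFrame({'displacement': [1, 2, 3], 'force': [2, 4, 6]})
meta = pd.DataFrame({'label': ['a', 'b'], 'width': [11, 12]})

# The selected row keeps its original index label, which is 1 here.
row = meta.loc[meta.label == 'b']

# Without reset_index, the metadata is placed on the row labelled 1.
misaligned = pd.concat([data, row], axis=1)

# With reset_index(drop=True), the row is relabelled 0 and lands on the first row.
aligned = pd.concat([data, row.reset_index(drop=True)], axis=1)

print(misaligned)
print(aligned)
```

So without the reset, every sample other than the first one in df would get its metadata on some row other than the first.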

Full example

Given dictionary with dataframes for sample_1_200celsius and sample_1_300celsius:

import pandas as pd

# one small dataframe per sample, keyed by the sample label
df200 = pd.DataFrame({'displacement': [1,2,3], 'values': [2,4,6]})
df300 = pd.DataFrame({'displacement': [30,20,10], 'values': [60,40,20]})
dictionary = {'sample_1_200celsius': df200, 'sample_1_300celsius': df300}

And additional dataframe df:

df = pd.DataFrame({'label': ['sample_1_200celsius','sample_1_300celsius'], 'width': [11,12], 'thickness': [222,223]})

#                  label  width  thickness
# 0  sample_1_200celsius     11        222
# 1  sample_1_300celsius     12        223

You can concat() each dictionary dataframe with the corresponding row from df, selected with a boolean mask in loc and re-indexed so that it lines up with the first row:

for key in dictionary.keys():
    dfs = [dictionary[key], df.loc[df.label == key].reset_index(drop=True)]
    dictionary[key] = pd.concat(dfs, axis=1)

The dictionary values are then updated as desired (width and thickness become floats because the unmatched rows hold NaN):

dictionary['sample_1_200celsius']

#    displacement  values                label  width  thickness
# 0             1       2  sample_1_200celsius   11.0      222.0
# 1             2       4                  NaN    NaN        NaN
# 2             3       6                  NaN    NaN        NaN

dictionary['sample_1_300celsius']

#    displacement  values                label  width  thickness
# 0            30      60  sample_1_300celsius   12.0      223.0
# 1            20      40                  NaN    NaN        NaN
# 2            10      20                  NaN    NaN        NaN
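
If the metadata should instead appear on every row (the "join with broadcasting" raised in the comments), one possible variation is to look up each key's row and assign its values as scalars. This is only a sketch under some assumptions: labels in df are unique, the second column is named force as in the question, and the names meta and broadcast are introduced here for illustration.

```python
import pandas as pd

df200 = pd.DataFrame({'displacement': [1, 2, 3], 'force': [2, 4, 6]})
df300 = pd.DataFrame({'displacement': [30, 20, 10], 'force': [60, 40, 20]})
dictionary = {'sample_1_200celsius': df200, 'sample_1_300celsius': df300}

df = pd.DataFrame({'label': ['sample_1_200celsius', 'sample_1_300celsius'],
                   'width': [11, 12], 'thickness': [222, 223]})

# Index the metadata by label so each key can be looked up directly
# (assumes every label occurs at most once).
meta = df.set_index('label')

broadcast = {}
for key, frame in dictionary.items():
    if key not in meta.index:
        # No metadata for this key: keep the dataframe unchanged.
        broadcast[key] = frame
        continue
    # Scalars passed to assign() are repeated on every row.
    broadcast[key] = frame.assign(label=key, **meta.loc[key].to_dict())

print(broadcast['sample_1_200celsius'])
```

Because no NaN is introduced, width and thickness keep their integer dtype in this version.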
tdy