
Is there a way to improve my key column selection on two data frames? Below is what I've coded, and the results match my expectations.

I listed `key_columns` for reference, as I used it to drop duplicates instead of indicating the columns one by one.

Is there any way to shorten the selection inside the for loop or write it in a more Pythonic way? It would be a great help, as I will need to select more data than just QTY. Thank you.

key_columns = ['COMPANY', 'PLANT', 'MATERIAL', 'DATE']
dfunique = dfunique.drop_duplicates(key_columns)

# For each unique key combination, look up the matching 'BEFORE' quantity in df
for i, dfkey in dfunique.iterrows():
    try:
        dfunique.loc[i, 'BEFORE_QTY'] = df[df.COMPANY == dfkey.COMPANY] \
            [df.PLANT == dfkey.PLANT] \
            [df.MATERIAL == dfkey.MATERIAL] \
            [df.DATE == dfkey.DATE] \
            [df.ROW_TYPE == 'BEFORE'].QTY.item()
    except ValueError:
        # .item() raises ValueError when the selection is not exactly one row
        dfunique.loc[i, 'BEFORE_QTY'] = 0
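
For reference, here is a shorter version of the same lookup that I sketched with one combined boolean mask (I am not sure this is the best direction, and it assumes the same `df` and `dfunique` columns as above):

    # Same lookup as above, but with one combined mask instead of chained selections
    for i, dfkey in dfunique.iterrows():
        mask = (
            (df.COMPANY == dfkey.COMPANY)
            & (df.PLANT == dfkey.PLANT)
            & (df.MATERIAL == dfkey.MATERIAL)
            & (df.DATE == dfkey.DATE)
            & (df.ROW_TYPE == 'BEFORE')
        )
        qty = df.loc[mask, 'QTY']
        # Fall back to 0 when there is not exactly one matching 'BEFORE' row
        dfunique.loc[i, 'BEFORE_QTY'] = qty.item() if len(qty) == 1 else 0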

Sample data: Hello, my output expectation looks like the screenshot below; the logic is correct, I am just concerned about the selection. A question on the MERGE suggestion: should I break my table into two, BEFORE and AFTER, and just merge them?

[screenshot: sample data and expected output]
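
Here is a rough sketch of what I mean by splitting into BEFORE and AFTER and merging, on made-up sample data (column names as in my table above; `AFTER_QTY` is just an assumed name):

    import pandas as pd

    # Made-up sample data with the same columns as my raw table
    key_columns = ['COMPANY', 'PLANT', 'MATERIAL', 'DATE']
    df = pd.DataFrame({
        'COMPANY':  ['C1', 'C1'],
        'PLANT':    ['P1', 'P1'],
        'MATERIAL': ['M1', 'M1'],
        'DATE':     ['2021-04-01', '2021-04-01'],
        'ROW_TYPE': ['BEFORE', 'AFTER'],
        'QTY':      [10, 7],
    })

    # Split the raw table by ROW_TYPE and rename QTY on each side
    before = (df[df.ROW_TYPE == 'BEFORE']
              .rename(columns={'QTY': 'BEFORE_QTY'})[key_columns + ['BEFORE_QTY']])
    after = (df[df.ROW_TYPE == 'AFTER']
             .rename(columns={'QTY': 'AFTER_QTY'})[key_columns + ['AFTER_QTY']])

    # Outer join keeps keys that appear on only one side; missing quantities become 0
    result = (before.merge(after, on=key_columns, how='outer')
                    .fillna({'BEFORE_QTY': 0, 'AFTER_QTY': 0}))
    print(result)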

chuam2
  • would you provide some sample data for these two dataframes? – Ian Apr 11 '21 at 15:35
  • Is there something missing on the right side of assignment `df[df.company == dfkey.company] \ [df.PLANT == dfkey.PLANT] ....` – Joe Ferndz Apr 11 '21 at 18:05
  • did you try to do a merge of df and dfunique on the keys to get records you wanted? left or right join as per your needs may be a better fit than iterating through the whole dataframe – Joe Ferndz Apr 11 '21 at 18:07
  • Read more about merge join concat [here](https://stackoverflow.com/questions/53645882/pandas-merging-101) and [here](https://pandas.pydata.org/docs/user_guide/merging.html) – Joe Ferndz Apr 11 '21 at 18:08
  • For transform use `pivot` – jezrael Apr 12 '21 at 05:53
  • Thank you for that suggestion, Joe. I will check merge, but would that be efficient? I'm thinking of cutting the raw table into two, one filtered BEFORE and one filtered AFTER, then merging afterwards. I will update this thread once solved – chuam2 Apr 12 '21 at 06:04
  • Thank you, experts! The correct solution for my requirement is in the link above, question 10: `df.pivot_table(index=key_columns, columns='ROW_TYPE', values='QTY').reset_index()` – chuam2 Apr 12 '21 at 14:39
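
A minimal sketch of the `pivot_table` one-liner from the last comment, reusing the made-up `df` and `key_columns` from the merge sketch above:

    # Pivot QTY into one column per ROW_TYPE value (BEFORE / AFTER), one row per key
    wide = df.pivot_table(index=key_columns, columns='ROW_TYPE', values='QTY').reset_index()

    # Optional: match the BEFORE_QTY naming used in the loop version
    wide = wide.rename(columns={'BEFORE': 'BEFORE_QTY', 'AFTER': 'AFTER_QTY'})
    wide.columns.name = None
    print(wide)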

0 Answers