Is there a way to improve my key column selection on 2 data frames? Below data is what I've coded and the result expectations were correct.
I listed the key_columns for reference as I have used it to drop duplicates instead of indicating the columns one by one.
I need help if is there is any way for my selection inside the FOR loop to shorten or be written in pythonic way? It would be a great help as I would need to select more data other than QTY. Thank you.
key_columns = ['COMPANY', 'PLANT', 'MATERIAL', 'DATE']
dfunique = dfunique.drop_duplicates(key_columns)
# convert all into a new table
for i, dfkey in dfunique.iterrows():
try:
dfunique.loc[i, 'BEFORE_QTY'] = df[df.COMPANY == dfkey.COMPANY] \
[df.PLANT == dfkey.PLANT] \
[df.MATERIAL == dfkey.MATERIAL] \
[df.DATE == dfkey.DATE] \
[df.ROW_TYPE == 'BEFORE'].QTY.item()
except ValueError:
dfunique.loc[i, 'BEFORE_QTY'] = 0
Sample data Hello, My output expectation looks like this, the logic is correct, I am just concern about the selection. Question on the MERGE suggestion, should I break my table into 2? BEFORE and AFTER and just merge it?