I can't sum fee by country, currency and product ID across dfJANUARY and dfFEBRUARY. Python reports 'array is too big'.
The .txt file loaded as dfJANUARY is 35.6 MB.
The .txt file loaded as dfFEBRUARY is 36.3 MB.
In[1]: dfJANUARY
Out[1]:
   Country       PRODUCT ID  currency  fee
0  Arab Emirate  COCA COLA   USD       1000
1  Arab Emirate  COCA COLA   USD       1000
2  Arab Emirate  COCA COLA   USD       1009
86212 rows × 6 columns (rows not shown here also include Country: America; PRODUCT ID: Fanta; currency: SGD)
In[2]: dfFEBRUARY
Out[2]:
   Country       PRODUCT ID  currency  fee
0  Arab Emirate  COCA COLA   USD       2000
1  Arab Emirate  COCA COLA   USD       2000
2  Arab Emirate  COCA COLA   USD       2000
86212 rows × 6 columns (rows not shown here also include Country: America; PRODUCT ID: Fanta; currency: SGD)
I tried the following code, but it fails:
df = pd.merge(dfJANUARY,dfFEBRUARY, on = "fee", how = "inner")
* When I run the merge, I get this error:
ValueError: array is too big arr.size * arr.dtype.itemsize
# compute the total fee
TOTAL = dfJANUARY["fee"] + dfFEBRUARY["fee"]
# add it as a new column named "TOTAL"
df["TOTAL"] = TOTAL
# build the pivot table
gdf = df.pivot_table(index=["PRODUCT ID", "Country", "currency"], values="TOTAL", aggfunc="sum", fill_value=0)
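A note on the merge step: joining on "fee" pairs every January row with every February row that has the same fee value, so repeated fees multiply the row count. The sketch below shows the effect on made-up data (the names `jan` and `feb` and all values are illustrative only, not taken from the real files); it may explain why the real merge runs out of memory.

```python
import pandas as pd

# Toy data with hypothetical values: three rows per month, all with the same fee
jan = pd.DataFrame({
    "Country": ["Arab Emirate"] * 3,
    "PRODUCT ID": ["COCA COLA"] * 3,
    "currency": ["USD"] * 3,
    "fee": [1000, 1000, 1000],
})
feb = pd.DataFrame({
    "Country": ["Arab Emirate"] * 3,
    "PRODUCT ID": ["COCA COLA"] * 3,
    "currency": ["USD"] * 3,
    "fee": [1000, 1000, 1000],
})

# Each January row matches each February row with an equal fee (many-to-many join)
merged = pd.merge(jan, feb, on="fee", how="inner")

# Compare the input sizes with the size of the merged result
print(len(jan), len(feb), len(merged))
```

With 86212 rows per month and many repeated fee values, the same multiplication could plausibly produce a result far larger than either input.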
What I expect is to sum the fee by currency, product ID and country, so that I get a TOTAL for each combination.
Can you help me?
**Expected output**
dfEXPECT
                                        TOTAL
   Country       PRODUCT ID  currency
0  Arab Emirate  COCA COLA   USD        10000
                             SGD        15000
1  Arab Emirate  Fanta       USD        20000
                             SGD        30000
2  America       COCA COLA   USD        90000
                             SGD        95000
3  America       Fanta       USD        80000
                             SGD        75000
86212 rows × 6 columns
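For reference, one possible way to get a table of this shape is to stack the two months row-wise and group, rather than merge. This is only a sketch under assumptions: the small frames below stand in for the real dfJANUARY and dfFEBRUARY, their values are made up, and the column names are taken from the output shown earlier.

```python
import pandas as pd

# Toy stand-ins for the real frames (values are made up for illustration)
dfJANUARY = pd.DataFrame({
    "Country": ["Arab Emirate", "Arab Emirate", "America"],
    "PRODUCT ID": ["COCA COLA", "COCA COLA", "Fanta"],
    "currency": ["USD", "USD", "SGD"],
    "fee": [1000, 1000, 500],
})
dfFEBRUARY = pd.DataFrame({
    "Country": ["Arab Emirate", "America", "America"],
    "PRODUCT ID": ["COCA COLA", "Fanta", "Fanta"],
    "currency": ["USD", "SGD", "SGD"],
    "fee": [2000, 700, 300],
})

# Stack the two months on top of each other instead of joining them
both = pd.concat([dfJANUARY, dfFEBRUARY], ignore_index=True)

# Sum fee within each (Country, PRODUCT ID, currency) group
dfEXPECT = (
    both.groupby(["Country", "PRODUCT ID", "currency"])["fee"]
    .sum()
    .to_frame("TOTAL")
)
print(dfEXPECT)
```

Because `pd.concat` only appends rows, the intermediate frame has as many rows as the two months combined, so it should stay well within memory for files of this size.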