I have data in the following format:
date fruit
======================
1-jan-14 orange
3-jan-14 orange
1-may-14 orange
2-may-14 apple
3-may-14 apple
1-aug-14 pineapple
2-aug-14 apple
I want to add a column which will represent: most purchased fruit in past 6 months. So, for the above example:
date fruit most_purchased_6_months
=======================================
1-jan-14 orange orange
3-jan-14 orange orange
1-may-14 orange orange
2-may-14 apple orange
3-may-14 apple orange
1-aug-14 pineapple apple
2-aug-14 apple apple
I have tried using analytical functions but the best I have reached is showing the number of times the same fruit (from the same row) has been purchased in past 6 months using "count(*) over (partition by fruit order by date range 180 preceding)".
My main issue is that using a self-join would easily solve my issue but as this is a table with about 20million rows of data it takes a few hours to run on our database. Would appreciate any direction and/or help!