I have the following query, where I retrieve the number of sales, and the average price of those sales for each day, for a particular item.
SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price
FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC
I also have a for-loop that does a separate query for each day to get the median price (let's assume the number of results are even):
SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>"
ORDER BY prices.price ASC
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW>
As you can imagine, this is very slow, as in some cases hundreds of queries must be done on a large table (the sales table has a few hundred million rows).
How do you rewrite the first SQL query so that it also calculates the median of prices.price
, similar to AVG(prices.price)
? I've looked at answers such as this one but can't wrap my head around how to adapt it for my specific scenario.
I've spent hours trying to accomplish this, but my SQL knowledge simply isn't good enough. Any help would be greatly appreciated!
root@ns525077:~# mysql -V
mysql Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using EditLine wrapper
Table schemas:
CREATE TABLE `prices` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) unsigned NOT NULL,
`price` decimal(8,2) NOT NULL,
`net_price` decimal(8,2) NOT NULL,
`source` tinyint(4) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `prices_ibfk_1` (`item_id`),
CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`price_id` int(11) unsigned DEFAULT NULL,
`item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
UNIQUE KEY `item_key` (`item_key`),
KEY `price_id` (`price_id`),
KEY `created_at` (`created_at`),
KEY `price_id__created_at__IX` (`price_id`,`created_at`),
CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Example of output from my first query: