Group by doesn't give me the newest group

Question

I was trying to get the distinct result from my table, and people said I should use the group by. This worked half way... I now get the distinct result but the result is not the newest thread... my table contains status on apartments from several buildings. The apartments can be found many times since it's a history table... I need to make a select that retrieves the distinct apartments with the current status.

ID    Building Apartment_id  Status
1     1        1             1
2     1        1             2
3     2        2             3
4     2        4             2
5     2        3             2
6     2        5             1
7     2        6             1

I'm currently working with:

SELECT * FROM `ib30_history` GROUP BY apartment_id, building ORDER BY id DESC

What's the desired output, given the sample data? Speaking of, [code](http://sscce.org/) (as SQL statements) is a more useful way to present sample data, as it can be [tested directly](http://tinyurl.com/so-hints). — outis, Oct 14 '11 at 20:05
Well as I said the table contains a lot of apartment history. The status changes over time... free... taken... free... and so on... this is great for an historic overview... But I need to be able to select the apartments and get only the newest status... so if there is like 100 different apartments and the history table has like 1000 history lines I need to get the 100 apartments with the newest status :) Hope that cleared it up... — Ronnie Jespersen, Oct 14 '11 at 20:18
@Ronnie: what counts as "newest"? The sample data doesn't have a date field, and ID isn't sufficient. Please include the sample results in your question, as asked. — outis, Oct 14 '11 at 20:27
@outis: Sorry about that... I do have a timestamp filed called time to.. but it dosent make sence to use it since the highest ID counts at the newest status... So we have a unique key consisting of "Building"/"appartment_id"/"id" and I need it where the id is higher than the others... — Ronnie Jespersen, Oct 14 '11 at 20:31
Dup of [mysql: select max(score) doesn't return the relevant row data. how to solve the isssue?](http://stackoverflow.com/questions/2081211/mysql-select-maxscore-doesnt-return-the-relevant-row-data-how-to-solve-the-i), [Control which row is returned by a group by](http://stackoverflow.com/questions/537223/mysql-control-which-row-is-returned-by-a-group-by), [choose which result to return on GROUP BY row](http://stackoverflow.com/questions/4099127/choose-which-result-to-return-on-group-by-row) — outis, Oct 14 '11 at 20:45

score 0 · Answer 1 · answered Oct 14 '11 at 20:14

0

 select h.apartment_id, h.status
  from history h 
       join (select apartment_id, max(status) status
               from history
              group by apartment_id) recent 
       on h.apartment_id = recent.apartment_id 
      and h.status = recent.status

answered Oct 14 '11 at 20:14

dispake

3,183
2
16
21

I'm afraid this dosen't help... it return more than one line pr. apartment... I need the list to only contain 1 line pr. apartment like distinct – Ronnie Jespersen Oct 14 '11 at 20:24
I just suggested an edit to this query to correct its results. Personally, I find this derived table use of subqueries vastly more readable than correlated subqueries, because of the separation of "what is the result of this subquery" and "how does this subquery relate to the rest of the query". – Simon Oct 14 '11 at 21:51

score 0 · Accepted Answer · edited Oct 14 '11 at 20:18

0

SELECT 
  Building
  , Appartment_id
  , Status 
FROM ib30_history a
WHERE id = ( SELECT MAX(id) FROM ib30_history b 
             WHERE b.Building = a.Building AND b.Appartment_id = a.Appartment_id)

edited Oct 14 '11 at 20:18

Johan

71,222
23
174
298

answered Oct 14 '11 at 20:15

0xCAFEBABE

5,280
5
30
54

This one looks promising... its slow but it seems to give me the right result - ill get back to you after some more testing – Ronnie Jespersen Oct 14 '11 at 20:29
Yes, subqueries can be difficult for execution engines to properly streamline. I prefer subqueries because most of the time they're much more comprehensible for code readers (code is read more often than written). However, sometimes the differences are huge, and the subquery has to go, but you could just check the runtime and maybe an EXPLAIN SELECT over my suggestion. – 0xCAFEBABE Oct 14 '11 at 20:32
Thx for your answer.. it check out.. your sql gives me the results I was looking for! :) any thoughts on how this will scale when I get more history rows? – Ronnie Jespersen Oct 14 '11 at 20:51
Not great. I'd suggest to solve this structurally. I'd probably create a condensation table using a trigger mechanism or a separate write after my addition to the history table. That has the advantage of determining the status in a defined time, rather than scale running time with the number of records in the history table. – 0xCAFEBABE Oct 14 '11 at 20:54
Thanks a lot... i'll consider making another table :) – Ronnie Jespersen Oct 14 '11 at 21:31

Group by doesn't give me the newest group

2 Answers2

Linked