3

I am trying to query a list of meetings from the most recent semester, where semester is determined by two fields (year, semester). Here's a basic outline of the schema:

Otherfields    Year    Semester
meeting1       2014    1
meeting2       2014    1
meeting3       2013    2
... etc ...

As the max should be considered for the Year first, and then the Semester, my results should look like this:

Otherfields    Year    Semester
meeting1       2014    1
meeting2       2014    1

Unfortunately simply using the MAX() function on each column separately will try to find Year=2014, Semester=2, which is incorrect. I tried a couple approaches using nested subqueries and inner joins but couldn't quite get something to work. What is the most straightforward approach to solving this?

guhou
  • 1,702
  • 12
  • 31
  • `mysql` OR `oracle`??? – NoobEditor Apr 20 '14 at 05:57
  • Personally, I'd create a year/semester table, essentially similar to a regular Calendar table (or maybe as fields in one). Then you just stick the year/semester id in this table. This _does_ require an extra join for queries (the fact that the year/semester is _probably_ sequential is an implementation detail that shouldn't be relied upon), but should be a fast one. Plus, then it's easy to give semester start/end dates, etc. – Clockwork-Muse Apr 20 '14 at 06:41

6 Answers6

2

Using a window function:

SELECT Year, Semester, RANK() OVER(ORDER BY Year DESC, Semester DESC) R
FROM your_table;

R will be a column containing the "rank" of the couple (Year, Semester). You can then use this column as a filter, for instance :

WITH TT AS (
  SELECT Year, Semester, RANK() OVER(ORDER BY Year DESC, Semester DESC) R
  FROM your_table
)
SELECT ...
FROM TT
WHERE R = 1;

If you don't want gaps between ranks, you can use dense_rank instead of rank.

This answer assumes you use a RDBMS who is advanced enough to offer window functions (i.e. not MySQL)

Vincent Savard
  • 30,767
  • 10
  • 61
  • 70
  • I'm guessing this is the most performant way to run this query in Oracle. Out of curiosity, is the RANK() function Oracle-specific, or is it portable between different DBMS? (eg MSSQL, PostgreSQL) – guhou Apr 21 '14 at 08:49
  • There should be an equivalent in any DBMS featuring window functions, but I used PostgreSQL to write this particular query. – Vincent Savard Apr 21 '14 at 19:30
1

I wouldn't be surprised if there's a more effecient way to do this (and avoid the duplicate subquery), but this will get you the answer you want:

SELECT * FROM table WHERE Year = 
    (SELECT MAX(Year) FROM table)
AND Semester =
    (SELECT MAX(Semester) FROM table WHERE Year =
        (SELECT MAX(Year) FROM table))
musical_coder
  • 3,870
  • 3
  • 13
  • 18
  • Yes - you could modify the double-nested one to use as a subquery table-reference, and include it as a join. Although Oracle has better operations available, that would work in other dbs (like MySQL). – Clockwork-Muse Apr 20 '14 at 06:54
1

Here's Postgres:

with table2 as /*virtual temporary table*/
(
    select *, year::text || semester as yearsemester
    from table
)
select Otherfields, year, semester
from table2
where (Otherfields, yearsemester) in
(
    select Otherfields, max(yearsemester)
    from table2
    group by Otherfields
)
Emery Lapinski
  • 1,342
  • 17
  • 22
  • For dbs without better means, combining the relevant fields is a viable option, although I would have structured the rest of the query rather differently. And would have checked the performance difference of using a numeric type instead of a string. – Clockwork-Muse Apr 20 '14 at 07:00
1

I've been overthinking this, there's a much simpler way to get this:

SELECT Meeting.year, Meeting.semester, Meeting.otherFields
FROM Meeting
JOIN (SELECT year, semester
      FROM Meeting
      WHERE ROWNUM = 1
      ORDER BY year DESC, semester DESC) MostRecent
  ON MostRecent.year = Meeting.year
     AND MostRecent.semester = Meeting.semester

(and working Fiddle)

Note that variations of this should work for pretty much all dbs (anything that supports a limiting clause in a subquery); here's the MySQL version, for example:

SELECT Meeting.year, Meeting.semester, Meeting.otherFields
FROM Meeting
JOIN (SELECT year, semester
      FROM Meeting
      ORDER BY year DESC, semester DESC
      LIMIT 1) MostRecent
  ON MostRecent.year = Meeting.year
     AND MostRecent.semester = Meeting.semester

(...and working fiddle)

Given some of the data in this answer this should be performant for Oracle, and I suspect other dbs as well (given the shortcuts the optimizer is allowed to take). This should be able to replace the use of things like ROW_NUMBER() in most instances where no partitioning clause is provided (no window).

Community
  • 1
  • 1
Clockwork-Muse
  • 12,036
  • 6
  • 26
  • 44
0

why don't you simply use ORDER BY???

that way, it would be easier to handle and less messy!! :)

SELECT * FROM table
Where Year = (Select Max(Year) from table) /* optional clause to select only 2014*/
Order by Semester ASC, Year DESC,  Otherfields; /*numericaly lowest sem first. in case of sem clash, sort by descending year first      */   

EDIT

In case, you need limited results from 2014, use Limit clause ( for mysql )

SELECT * FROM table
Where Year = (Select Max(Year) from table) 
Order by Semester ASC, Year DESC,  Otherfields
LIMIT 10;

It will order first, then get the Limit - 10, so u get your limited result set!

This will fetch output like :

Otherfields    Year    Semester
meeting1       2014    1
meeting2       2014    1
meeting1       2013    1
meeting2       2013    2
NoobEditor
  • 14,239
  • 13
  • 66
  • 102
  • This does not eliminate the rows where the meeting wasn't last semester. – Vincent Savard Apr 20 '14 at 06:10
  • didnt get u...can u please rephrase?? – NoobEditor Apr 20 '14 at 06:25
  • Your query would return all the rows from the table while the original poster only wants those from the last semester. – Vincent Savard Apr 20 '14 at 06:32
  • yaa i agree, thats why i later added, `where` clause, also, to limit the rows returned, OP can use `limit`, but he/she hasnt mentioned what DB is in use!! – NoobEditor Apr 20 '14 at 06:34
  • Uh, no. For one thing, you're `ORDER`ing in the wrong direction for the semester. What happens if there are more than 10 meetings? – Clockwork-Muse Apr 20 '14 at 06:39
  • @Clockwork-Muse : first `2014` will be gathered, then, starting from 1, `semester`s will be ordered for 2014, in case, some sems clash, `meetings` col would be ordered....do i answer your question? – NoobEditor Apr 20 '14 at 06:42
  • You've misunderstood the OP's needs - he wants the _most recent_ semester, which will presumable **not** be `1`. `LIMIT` probably isn't useful, as the OP needs _all_ the meetings from the current semester - what happens if there are more than 10 meetings in a semester? And if there are less, he doesn't want meetings from other semesters... You need a better way to restrict the results. – Clockwork-Muse Apr 20 '14 at 06:51
  • @Clockwork-Muse : right...for that, you first need to get most recent year, *2014 in this case*, then for all `2014`, arrange the sems....thats what OP wants and so does my query....besides, OP aint responding...so i guess s/he left for the day already!! :) – NoobEditor Apr 20 '14 at 08:05
  • -1 - Your queries aren't guaranteed to get all results, and getting any results at all will still require massaging application-side, which is not what the OP needs (nor is it even necessary). And if you get those results out of your second query, something is very wrong with the db. – Clockwork-Muse Apr 20 '14 at 08:26
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/51047/discussion-between-noobeditor-and-clockwork-muse) – NoobEditor Apr 20 '14 at 09:12
  • *Your queries aren't guaranteed to get all results* - remove `Limit` and u have *all* result set.... – NoobEditor Apr 20 '14 at 09:42
0

Answering my own question here:

This query was run in a stored procedure, so I went ahead and found the maximum year/semester in separate queries before the rest of the query. This is most likely inefficient and inelegant, but it is also the most understandable method- I don't need to worry about other members of my team getting confused by it. I'll leave this question here since it's generally applicable to many other situations, and there appear to be some good answers providing alternative approaches.

-- Find the most recent year.
SELECT MAX(year) INTO max_year FROM meeting;

-- Find the most recent semester in the year.
SELECT MAX(semester) INTO max_semester FROM meeting WHERE year = max_year;

-- Open a ref cursor for meetings in most recent year/semester.
OPEN meeting_list FOR
    SELECT otherfields, year, semester
    FROM   meeting
    WHERE  year = max_year
    AND    semester = max_semester;
guhou
  • 1,702
  • 12
  • 31