18

I have a table with an ID column and another column with a number. One ID can have multiple numbers. For example

ID | Number
 1 |  25
 1 |  26
 1 |  30
 1 |  24
 2 |  4
 2 |  8
 2 |  5

Now based of this data, in a new table, I want to have this

ID | Low | High 
1  |  24 |  26
1  |  30 |  30
2  |  4  |  5
2  |  8  |  8

As you can see, I want to merge any data where the numbers are consecutive, like 24, 25, 26. So now the low was 24, the high was 26, and then 30 is still a separate range. I am dealing with large amounts of data, so I would prefer to not use a cursor for performance sake (which is what I was previously doing, and was slowing things down quite a bit)...What is the best way to achieve this? I'm no SQL pro, so I'm not sure if there is a function available that could make this easier, or what the fastest way to accomplish this would be.

Thanks for the help.

Joseph at SwiftOtter
  • 4,261
  • 5
  • 36
  • 55
Andrew Backes
  • 1,734
  • 2
  • 18
  • 36

4 Answers4

36

The key observation is that a sequence of numbers minus another sequence is a constant. We can generate another sequence using row_number. This identifies all the groups:

select id, MIN(number) as low, MAX(number) as high
from (select t.*,
             (number - ROW_NUMBER() over (partition by id order by number) ) as groupnum
      from t
     ) t
group by id, groupnum

The rest is just aggregation.

Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624
3

Solution with CTE and recursion:

WITH CTE AS (
  SELECT T.ID, T.NUMBER, T.NUMBER AS GRP
  FROM T 
  LEFT OUTER JOIN T T2 ON T.ID = T2.ID AND T.NUMBER -1 = T2.NUMBER 
  WHERE T2.ID IS NULL
  UNION  ALL
  SELECT T.ID, T.NUMBER, GRP
  FROM CTE 
  INNER JOIN T
  ON T.ID = CTE.ID AND T.NUMBER  = CTE.NUMBER + 1
)
SELECT ID, MAX( NUMBER ), MIN(NUMBER)
FROM CTE
GROUP BY ID, GRP

Results at fiddlesql

dani herrera
  • 39,746
  • 4
  • 87
  • 153
0

I'd suggest using a WHILE loop structure with a table variable instead of the cursor.

For example,

DECLARE @TableVariable TABLE
(
    MyID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
    [ID] int,
    [Number] int
)

DECLARE @Count int, @Max int

INSERT INTO @TableVariable (ID, Number)
SELECT ID, Number
FROM YourSourceTable

SELECT @Count = 1, @Max = MAX(MyID)
FROM @TableVariable

WHILE @Count <= @Max
BEGIN

    ...do your processing here...


    SET @Count = @Count + 1

END
Darth Continent
  • 2,251
  • 2
  • 25
  • 41
  • What? No; loops in SQL are almost always the wrong thing to do. SQL was designed with set logic in mind, and implementations will usually be most performant when written to take advantage of this. I also have a feeling this sort of `WHILE` loop will use some of the same underlying processes as a cursor, which would leave you in just as bad a spot as before. Besides which, you haven't shown the necessary conditions to reset `@Count`... – Clockwork-Muse Jan 02 '13 at 16:36
  • 1
    In conjunction with a table variable there've been a lot of cases where the performance gains have been well worth it. `@Count` in this case wouldn't need to be reset, just incremented as each record in the table variable is processed, though that might change depending on OP's requirements. – Darth Continent Jan 02 '13 at 16:41
0
CREATE TABLE Table1
    ([ID] int, [Number] int)
;

INSERT INTO Table1
    ([ID], [Number])
VALUES
    (1, 25),
    (1, 26),
    (1, 30),
    (1, 24),
    (2, 4),
    (2, 8),
    (2, 5)
;

    select ID, 
           MIN(Number)
         ,(SELECT MIN(Number) 
                  FROM (SELECT TOP 2 Number from Table1 WHERE ID =
                  T1.Id ORDER BY Number DESC) as DT)
    from Table1 as T1
    GROUP BY ID
    UNION 
    SELECT ID, MAX(Number), MAX(Number)
    FROM Table1 as T1
    GROUP BY ID;

Live Example

Vishwanath Dalvi
  • 31,604
  • 36
  • 115
  • 146