0

I hoping to find the answer on how I can fill in lead and lag values.

Input:

BRAND   Promo_Start VALUE
TESLA   2016-06-05  NULL
TESLA   2016-06-12  NULL
TESLA   2016-06-19  40000
TESLA   2016-06-26  75000
TESLA   2016-07-03  75000
TESLA   2016-07-10  NULL

My desired output is:

BRAND   Promo_Start VALUE
TESLA   2016-06-05  40000
TESLA   2016-06-12  40000
TESLA   2016-06-19  40000
TESLA   2016-06-26  75000
TESLA   2016-07-03  75000
TESLA   2016-07-10  75000

I've been able to fill in the last value, however, I've not been successful with filling in the first two values.

BRAND   Promo_Start VALUE   FILLED_VALUE
TESLA   2016-06-05  NULL    NULL
TESLA   2016-06-12  NULL    NULL
TESLA   2016-06-19  40000   40000
TESLA   2016-06-26  75000   75000
TESLA   2016-07-03  75000   75000
TESLA   2016-07-10  NULL    75000

Using this query:

WITH help1 as (
SELECT *,
CASE WHEN [VALUE] IS NULL THEN 0 ELSE 1 END CHANGEINDICATOR
FROM #SOExample
)
, help2 as (
SELECT *, SUM(CHANGEINDICATOR) OVER (ORDER BY [Promo_Start]) RowGroup from help1
)
SELECT [BRAND],[Promo_Start],[VALUE],
CASE WHEN [VALUE] IS NOT NULL THEN [VALUE]
ELSE FIRST_VALUE([VALUE]) OVER (PARTITION BY RowGroup ORDER BY [Promo_Start])
END [FILLED_VALUE]
FROM help2
GO

Table was created using:

CREATE TABLE #SOExample
    ([BRAND] varchar(10),[Promo_Start] varchar(10), [VALUE] varchar(15))
;
GO

INSERT INTO #SOExample
    ([BRAND],[Promo_Start], [VALUE])
VALUES
    ('TESLA', '2016-06-05',NULL),
    ('TESLA', '2016-06-12',NULL),
    ('TESLA', '2016-06-19','40000'),
    ('TESLA', '2016-06-26','75000'),
    ('TESLA', '2016-07-03','75000'),
    ('TESLA', '2016-07-10',NULL)
;
GO

I think this question may be similar to: LAG() / LEAD() of the next rank (Postgresql) and I've looked at this for creating a flag and this because it seems to be a similar question.

I've also looked into dense_rank and using a change indicator (change from NULL to a value in next row and value to NULL in next row).

Uwe Keim
  • 36,867
  • 50
  • 163
  • 268
Mariah Akinbi
  • 234
  • 1
  • 3
  • 13
  • what is the required logic ? Get the next or previous non-null VALUE ? if `2016-06-26` has a NULL value, what should be the value ? – Squirrel Aug 24 '18 at 05:01
  • @Squirrel Hi! Yes, the logic is shown in the output table listed below the input table. If the original value is not NULL, then the output value should be the same as the original. So because 2016-06-26 has a non-null value, then it's output value should be the same as its original, 75000. – Mariah Akinbi Aug 24 '18 at 05:09

2 Answers2

1

there there are 2 OUTER APPLY, first one get the next non-null value, second one get the prev non-null value.

SELECT  *, NEW_VALUE = COALESCE ( so.VALUE, n.FILLED_VALUE, p.FILLED_VALUE)
FROM    #SOExample so
        OUTER APPLY
        (
            SELECT  TOP 1 FILLED_VALUE = x.VALUE
            FROM    #SOExample x
            WHERE   x.BRAND     = so.BRAND
            AND     x.Promo_Start   >= so.Promo_Start
            AND     x.VALUE     IS NOT NULL
            ORDER BY x.Promo_Start
        ) n
        OUTER APPLY
        (
            SELECT  TOP 1 FILLED_VALUE = x.VALUE
            FROM    #SOExample x
            WHERE   x.BRAND         = so.BRAND
            AND     x.Promo_Start   <= so.Promo_Start
            AND     x.VALUE     IS NOT NULL
            ORDER BY x.Promo_Start DESC
        ) p
Squirrel
  • 17,861
  • 3
  • 31
  • 30
  • this works beautifully! Thank you!! How can I drop the FILLED_VALUE columns? Could I possibly use a CASE WHEN statement? – Mariah Akinbi Aug 24 '18 at 05:52
  • 1
    Don't use `SELECT *`. Specify the column name explicitly – Squirrel Aug 24 '18 at 06:19
  • If I'm looking to fill in multiple values, how could this be reapplied? I have [BRAND],[Promo_Start],[VALUE],[Resale Value],[Otra Value] and I want [BRAND],[Promo_Start],[VALUE],[FILLED_VALUE],[Resale Value], [Filled Resale Value], [Otra Value], [Filled Otra Value]. Often times the first non-null value for VALUE, Resale Value, and Otra Value are not in the same row. – Mariah Akinbi Aug 28 '18 at 20:39
  • you will required multiple `OUTER APPLY` section. Each for one column – Squirrel Aug 29 '18 at 02:03
  • that's what I ended up doing! Thank you! – Mariah Akinbi Aug 30 '18 at 03:34
1

You just need to handle the case when RowGroup = 0

WITH help1 as (
SELECT *,
CASE WHEN [VALUE] IS NULL THEN 0 ELSE 1 END CHANGEINDICATOR
FROM #SOExample
)
, help2 as (
SELECT *, SUM(CHANGEINDICATOR) OVER (ORDER BY [Promo_Start]) RowGroup from help1
)
SELECT [BRAND],[Promo_Start],[VALUE],
CASE WHEN RowGroup = 0 THEN (Select [value] from help2 where RowGroup = 1)
ELSE FIRST_VALUE([VALUE]) OVER (PARTITION BY RowGroup ORDER BY [Promo_Start])
END [FILLED_VALUE]
FROM help2
Ajay Gupta
  • 1,592
  • 1
  • 7
  • 21