2

I'm trying to figure out how to calculate date ranges (islands) where all workers worked all together. In other words, if one of the workers is not at some date, then this date should be excluded from the result. Below sample data:

insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(1, '2019-10-01', '2020-04-30');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(1, '2020-05-01', '2020-07-19');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(1, '2020-10-01', '9999-01-01');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(2, '2019-10-01', '2020-04-30');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(2, '2020-05-01', '2020-07-31');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(2, '2020-11-01', '9999-01-01');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(3, '2018-03-12', '2018-08-20');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(3, '2019-10-01', '2020-04-15');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(3, '2020-07-01', '2020-07-31');
insert into WORK_DAYS(WORKER_ID, DATE_FROM, DATE_TO) VALUES(3, '2020-11-01', '9999-01-01');

sample data

I work with Firebird database, but you can show result in any database like SQL Server (but please without cross apply only something defined in SQL standard). This is simplified data retrieved already from gaps/islands problem. As really worker_id in my sample is whole team.

I know how to find overlapping date ranges, but I do not know how to find overlapping date ranges applied to all workers at the same time.

Mark Rotteveel
  • 82,132
  • 136
  • 114
  • 158
Livius
  • 866
  • 1
  • 5
  • 18

1 Answers1

2

Assuming that the workers have no overlaps, you can use a counting trick for this. Count the number of workers working on each date. Then the dates with all the workers are the dates you want.

You don't actually need every date. Assuming that date_to is included as a workday, you can unpivot the data and use a cumulative sum.

The following in the logic as expressed in Postgres (your question does specify that a solution in any database is acceptable and I find that Postges is closest to Standard SQL):

with wd as (
      select worker_id, date_from as dte, 1 as inc
      from work_days wd
      union all
      select worker_id, date_to + interval '1 day', -1 as inc
      from work_days wd
     ),
     wd_cnt as (
      select wd.dte, sum(sum(inc)) over (order by dte) as num_on_date,
             lead(wd.dte) over (order by wd.dte) as next_dte
      from wd
      group by wd.dte
     )
select dte, next_dte - interval '1 day'
from wd_cnt
where num_on_date = (select count(distinct worker_id) from work_days);

This is all standard SQL, but date/time functions differ among databases. Do note that this adds one day to date_to, so don't use the absolute maximum date for this value.

Here is a db<>fiddle.

Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624
  • Greate solution! Thank you. I see that you sum number of dates and compare to number of workers. Really simple. Here is link for modified fiddle solution for Firebird 3 https://dbfiddle.uk/?rdbms=firebird_3.0&fiddle=fa83e359e1b1134a9b92884a77b0303e – Livius Feb 11 '21 at 17:50
  • @Livius . . . I forgot that db<>fiddle supports Firebird. I would've used that to start with. Cheers! – Gordon Linoff Feb 11 '21 at 18:18
  • Perhaps i would even materialize it and make an accumulator table (there is no materialized views in FB) of {DATE, WORKERS_COUNT} structure. Then i would put `AFTER UPDATE OR INSERT OR DELETE` SQL trigger on FB `WORK_DAYS` table so that that new table would be always auto-updated. Or perhaps better {DATE, WORKERS_COUNT_EMPLOYED, WORKERS_COUNT_WORKING}. Granted, if granularity would refine form days to hours or seconds this approach would be less and less efficient. But for mere days should be okay. – Arioch 'The Feb 12 '21 at 09:35