-1

I have 3 tables and I need to check if they have a positive number in the "rate" column. Tables have different schema but they all have a column named "rate". I am working with airflow and I created a DAG which has a task to check if the rate in not a positive number then fail. So in a simple word, I want my SQL fails if there is an incorrect rate in any of those tables. I have done this:

WITH t1 AS (
SELECT deliverydate, rate
  FROM table1
  WHERE rate <= 0
),
t2 AS (
  SELECT deliverydate, rate
  FROM table2
  WHERE rate <= 0
),
t3 AS (
  SELECT deliverydate, rate
  FROM table3
  WHERE rate <= 0
)
SELECT 1 FROM (
  SELECT * FROM t1
  UNION ALL
  SELECT * FROM t2
  UNION ALL
  SELECT * FROM t3
)

In this case, t1, t2 and t3 will report rows with wrong rate(0 or negative rates). And Select 1 fails if there is no wrong rate.

But my goal is: pass if all the rates are correct and fail if at least one record has wrong rate and show me that record to fix it.

SQL queries run on bigquery standard

Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624
Sana
  • 216
  • 2
  • 12
  • Tag your question with the database you are using. Also, explain what you want for the results. A value that indicates passing versus failing? Some sort of error (which `select` generally is not used for)? 1 row for failure and 0 rows for success? – Gordon Linoff Sep 18 '20 at 11:11
  • @GordonLinoff bigquery standard sql – Sana Sep 18 '20 at 11:15

1 Answers1

1

I would suggest logic like this:

select (case when exists (select 1 from table1 where rate <= 0) then 'FAIL'
             when exists (select 1 from table2 where rate <= 0) then 'FAIL'
             when exists (select 1 from table3 where rate <= 0) then 'FAIL'
             else 'PASS'
        end)

This stops at the first failure and should be pretty efficient. For added performance, include an index on rate in each of the subtables.

Some databases require a from clause, such as from dual.

I doubt this is the best way to do things, but this should generate an error only when there is an error condition using error():

select (case when exists (select 1 from table1 where rate <= 0) then error('Bad rate in table1')
             when exists (select 1 from table2 where rate <= 0) then error('Bad rate in table2')
             when exists (select 1 from table3 where rate <= 0) then error('Bad rate in table3')
             else 'PASS'
        end)
Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624
  • thanks for your comment. I don't want to get FAIL and PASS words. I want to actually the query fails or pass. Fail means e.g. if you have an empty table and you run select 1 from table; then it fails and pass means the query runs successfully and return some rows which you queried. – Sana Sep 18 '20 at 13:31
  • @Saba . . . Seems odd to use an *error* from a `select` for this purpose. I would rather use the return value. But you can use the `error()` function in BigQuery. – Gordon Linoff Sep 18 '20 at 13:39
  • thank you for your effort to answer my question. In the first piece of code there will be always a result, fail or pass. I actually want to run this query as an airflow task and I need it to fail if at least one target rate is negative. For example if I run this query: SELECT 1 FROM (SELECT * FROM table1 WHERE age = 1000) after running this query in bigquery because there is no age 1000 in table1 so it shows "This query returned no results.". So it failed. I just want to fail for my task if there is at least one row with rate<=0 – Sana Sep 18 '20 at 15:17