1

Folks!

I need some ideas, with the follow problem:

I have two tables:

Table 1:

+-------+------------+---------+
| ID    | field_name | value   |
+-------+------------+---------+
| 1     | usd        |  10.08  |
| 1     | gross_amt  |  52.0   |
| 1     | jpy        |  30.05  |
| 2     | usd        |  50.0   |
| 2     | eur        |  50.0   |
| 3     | real_amt   |  210.43 |
| 3     | total      |  320    |
| 4     | jpy        |  23.45  |
| 4     | name       |  john   |
| 4     | city       |  utah   |
+-------+------------+---------+

Table 2:

+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID  |  name | last_name |   date1  | counrty | city |  usd  |  eur  |  jpy  | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1   |  jane | doe       | 19900108 |   usa   | LA   | 9.08  | 0.00  | 29.05 | 50.0      |  52.0    | 900.0 | ... | value200 |
| 2   |  lane | smith     | 19900108 |   usa   | LA   | 40.8  | 40.0  | 0.00  | 100.0     |  70.0    | 290.0 | ... | value200 |
| 3   |  mike | hoffa     | 19900108 |   usa   | SF   | 5.05  | 0.00  | 0.00  | 10.0      |  25.0    | 100.0 | ... | value200 |
| 4   |  paul | doe       | 19900108 |   usa   | NY   | 1.00  | 0.00  | 29.05 | 45.0      |  55.0    | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+

I need to update the values of the fields in the table 2, which are in table 1 in column field_name, with the values of table 1 column value, both IDs are the same in both tables, beside that, the datatype of column value in table 1 are string, but the data type of the columns to update in the table 2 are diferent, especially the numbers(numeric, int64, float64)

The tables above are an example, table 2 of the real problem has 200 fields and in table 1 for an ID there can be up to 40 value modifications for thousands of records to be modified daily

Thanks

I have tried the following two solutions:

Solution 1 (it works, but very slow, it is a lot of records):

DECLARE SQLSCRIPT STRING DEFAULT '';
DECLARE col, val, id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;

SET n= (SELECT COUNT(*) FROM `project.dataset.table1`);

WHILE i <= n DO
    SET col = (SELECT col FROM `project.dataset.table1`  LIMIT 1);
    SET val = (SELECT val FROM `project.dataset.table1`  LIMIT 1);
    SET id = (SELECT id FROM `project.dataset.table1`  LIMIT 1);
    SET SQLScript = (SELECT CONCAT('UPDATE `project.dataset.table2`` SET ',col,' = ',val,' WHERE id = ','"',id,'"'));
    SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE  SQLSCRIPT;

Solution 2 (I can't get it to work, it gives me the following error):

[error execution Big Query] [1]: https://i.stack.imgur.com/Pv44T.png

EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.col||'="'||x.val||'" WHERE id = "'||x.id||'"', ';')
                      FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, col, val))
                                    FROM `project.dataset.table1`)) AS x);
Leo
  • 47
  • 5
  • 1
    please stop making updates to your original question - it is fully answered already. if you have new question(s) - please post a new question!! and please revert back your original question so there is no confusions and provided answer is still valid as it was given for original question! – Mikhail Berlyant Oct 08 '20 at 15:38
  • and please do not worry about your new questions to not be addressed - a bunch of experts and enthusiasts are here 24:7 so for sure someone will pick your new question and answer it - so no need to call out specific users - and btw. be sure sooner (mostly) or later I check all bq related questions even those which answered and even accepted already :o) – Mikhail Berlyant Oct 08 '20 at 15:59
  • Done sir!, sorry my ignorance in the use of the site, return the post to its previous state, I really appreciate your comments and advice, it would be very helpful if you take a look at the new post, thanks for everything: https://stackoverflow.com/questions/64266729/how-update-a-table-in-big-query-and-store-the-original-value-replaced-and-her-di – Leo Oct 08 '20 at 16:30

2 Answers2

1

Below is for BigQuery Standard SQL

EXECUTE IMMEDIATE '''
CREATE TEMP TABLE pivot1 AS
SELECT id, ''' || (
  SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
  FROM `project.dataset.table1`
  JOIN (
    SELECT column_name, data_type
    FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
    WHERE tablename = 'table2' 
  ) ON field_name = column_name
) || '''  
FROM `project.dataset.table1`
GROUP BY id
''';

EXECUTE IMMEDIATE '''
MERGE `project.dataset.table2` AS t2
USING pivot1 AS t1
ON t2.id = t1.id
WHEN MATCHED THEN
  UPDATE SET
''' || (
  SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
  FROM `project.dataset.table1` 
);

SELECT * FROM `project.dataset.table2` ORDER BY id;

If to apply to sample data (table1 and table2) from your question - output is (updates are highlighted)

enter image description here

Mikhail Berlyant
  • 117,385
  • 6
  • 77
  • 139
0

One way is to create a pivoted (wide column) view of table_1 and then update table_2.

Creating tables with sample data

CREATE OR REPLACE TABLE `project.dataset.table_1` AS
  SELECT 1 AS id, 'usd' AS field_name, 10.08 AS value UNION ALL
  SELECT 1 AS id, 'gross_amt' AS field_name, 52.0 AS value UNION ALL
  SELECT 1 AS id, 'jpy' AS field_name, 30.05 AS value UNION ALL
  SELECT 2 AS id, 'usd' AS field_name, 50.0 AS value UNION ALL
  SELECT 2 AS id, 'eur' AS field_name, 50.0 AS value UNION ALL
  SELECT 3 AS id, 'real_amt' AS field_name, 210.43 AS value UNION ALL
  SELECT 3 AS id, 'total' AS field_name, 320.66 AS value UNION ALL
  SELECT 4 AS id, 'jpy' AS field_name, 23.45 AS value;


CREATE OR REPLACE TABLE `project.dataset.table_2` AS
  SELECT 1 AS id, 9.08 as usd, 0.00 as eur, 29.05 AS jpy, 50.0 AS gross_amt,52.0 as real_amt, 900.0 AS total UNION ALL
  SELECT 2, 40.8, 40.0, 0.00, 100.0, 70.0, 290.0 UNION ALL
  SELECT 3, 5.05, 0.00, 0.00, 10.0, 25.0, 100.0 UNION ALL
  SELECT 4, 1.00, 0.00, 29.05, 45.0, 55.0, 110.0;

Now, create a pivoted view first and then use MERGE to update using the pivoted view

BEGIN
  -- pivot rows from table (rows to columns) and create a view
  EXECUTE IMMEDIATE '''
  CREATE OR REPLACE VIEW `project.dataset.view_1` AS
  SELECT id, ''' || (SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', value, 0.0)) AS " || field_name)
  FROM `project.dataset.table_1` 
  ) || '''  
  FROM `project.dataset.table_1`
  GROUP BY 1
  ''';

  -- update table_2 based on table_1 values
  EXECUTE IMMEDIATE '''
  MERGE `project.dataset.table_2` AS TABLE_2
  USING `project.dataset.view_1` AS view_1
  ON
    table_2.id = view_1.id
  WHEN MATCHED THEN
    UPDATE SET
  ''' || (SELECT STRING_AGG(DISTINCT field_name || ' = view_1.' || field_name)
  FROM `project.dataset.table_1` 
  );
END;
raj
  • 914
  • 5
  • 8
  • Thanks @rajesh, I have updated the question, with more information to keep in mind, I appreciate if you can give it another look – Leo Oct 03 '20 at 12:02
  • I see you added you have couple hundred fields and tens of modifications. Did you try the above solution? What is it not working? – raj Oct 03 '20 at 14:29
  • another problems @rajesh, when pivoted into view_1 all data type are cast to string, and when execute the merge, several fields have datatype incompatibilities, all fields numerics, float64 and int64 – Leo Oct 05 '20 at 19:48