I've just started using the luigi library. I am regularly scraping a website and inserting any new records into a Postgres database. As I'm trying to rewrite parts of my scripts to use luigi, it's not clear to me how the "marker table" is supposed to be used.
Workflow:
- Scrape data
- Query DB to check if new data differs from old data.
- If so, store the new data in the same table.
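For concreteness, the workflow above looks roughly like the sketch below. It uses Python's built-in sqlite3 in place of Postgres so that it runs on its own, and it does not involve luigi at all. The records table, its id and payload columns, and both function names are made up for illustration, and a row counts as "new" when its id is not stored yet.

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory database with a hypothetical 'records' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, payload TEXT)")
    return conn


def insert_new_records(conn, scraped_rows):
    """Insert only the scraped rows whose id is not stored yet.

    scraped_rows is an iterable of (id, payload) tuples.
    Returns the number of rows actually inserted.
    """
    cur = conn.cursor()

    # Step 2 of the workflow: ask the DB which ids it already has.
    cur.execute("SELECT id FROM records")
    seen = {row[0] for row in cur.fetchall()}

    # Keep rows with unseen ids, also skipping duplicates within this batch.
    new_rows = []
    for row in scraped_rows:
        if row[0] not in seen:
            seen.add(row[0])
            new_rows.append(row)

    # Step 3 of the workflow: store the new rows in the same table.
    cur.executemany("INSERT INTO records (id, payload) VALUES (?, ?)", new_rows)
    conn.commit()
    return len(new_rows)
```

Calling insert_new_records a second time with a batch that overlaps the first one inserts only the rows that were not there before, which is the behaviour I'd like to reproduce with luigi.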
However, using luigi's postgres.CopyToTable, if the table already exists, no new data will be inserted. I guess I should be using the inserted column in the table_updates table to figure out what new data should be inserted, but it's unclear to me what that process looks like and I can't find any clear examples online.