I have a table with these columns: Person_Name, Email, Latest_Post_Title.
Every day I scrape new records that include Email and Latest_Post_Title.
If a scraped email matches an existing one in my table, I want to update the Latest_Post_Title for that row with the scraped Latest_Post_Title.
If a scraped email doesn’t match an existing one in my table, I want to create a new row for it.
I’ve been experimenting with a Full Outer Join, but I’m not sure that’s the best way to do this. (There are actually more columns than described in my simplified example.)
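The update-or-insert behaviour described above is commonly called an "upsert". As a minimal sketch of the logic only, assuming the table lives in a SQL database: the snippet below uses Python's built-in sqlite3 module with an in-memory database. The table name people, the helper upsert_posts, and the sample addresses are made up for illustration; only the three column names come from the question.

```python
import sqlite3


def upsert_posts(conn, scraped):
    """Update Latest_Post_Title for known emails; insert a row for new ones.

    `scraped` is an iterable of (email, latest_post_title) pairs.
    """
    cur = conn.cursor()
    for email, title in scraped:
        cur.execute(
            "UPDATE people SET Latest_Post_Title = ? WHERE Email = ?",
            (title, email),
        )
        if cur.rowcount == 0:
            # No existing row matched this email, so create one.
            # Person_Name is not part of the scrape and stays NULL.
            cur.execute(
                "INSERT INTO people (Email, Latest_Post_Title) VALUES (?, ?)",
                (email, title),
            )
    conn.commit()


# Hypothetical demo data: an in-memory table with one existing person.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "Person_Name TEXT, Email TEXT PRIMARY KEY, Latest_Post_Title TEXT)"
)
conn.execute("INSERT INTO people VALUES ('Ann', 'ann@example.com', 'Old title')")

upsert_posts(
    conn,
    [("ann@example.com", "New title"), ("bob@example.com", "First post")],
)
rows = conn.execute(
    "SELECT Person_Name, Email, Latest_Post_Title FROM people ORDER BY Email"
).fetchall()
print(rows)
```

Only the columns named in the UPDATE are touched, so any additional columns on an existing row keep their values. Many databases also offer a single-statement form (for example MERGE, or INSERT ... ON CONFLICT in PostgreSQL and SQLite); whether one is available depends on the database in use.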
@RIchardC I put together a (hopefully) complete example using H2 and the Northwind database with customerid.
Here we first create a database table with customerid as the primary key, filled with a random selection from the central DB. Then every 10 seconds another random batch is drawn. Rows whose customerid already exists are updated. The workflow then determines which customerid values are new and inserts them. When a row is first inserted, a timestamp first_inserted is stored. With every update, a second timestamp, last_updated, marks the time the update was performed.
Maybe you can take that example and work with that.
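For readers who want to see the same steps outside the workflow, here is a rough sketch in Python. It is not the workflow itself: it uses the built-in sqlite3 module as a stand-in for H2, and the table name customers, the companyname column, the helper apply_batch, and the sample values are assumptions for illustration. Only customerid, first_inserted and last_updated come from the description above.

```python
import sqlite3
from datetime import datetime, timezone


def apply_batch(conn, batch, now=None):
    """Apply one batch: update known customerids, insert new ones.

    `batch` maps customerid -> companyname. Returns (n_updated, n_inserted).
    """
    now = now or datetime.now(timezone.utc).isoformat()
    # Find out which keys are already in the table.
    existing = {row[0] for row in conn.execute("SELECT customerid FROM customers")}
    updates = [(name, now, cid) for cid, name in batch.items() if cid in existing]
    inserts = [(cid, name, now) for cid, name in batch.items() if cid not in existing]
    # Existing rows: refresh the data and stamp last_updated.
    conn.executemany(
        "UPDATE customers SET companyname = ?, last_updated = ? WHERE customerid = ?",
        updates,
    )
    # New rows: stamp first_inserted; last_updated stays NULL until an update.
    conn.executemany(
        "INSERT INTO customers (customerid, companyname, first_inserted) "
        "VALUES (?, ?, ?)",
        inserts,
    )
    conn.commit()
    return len(updates), len(inserts)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customerid TEXT PRIMARY KEY, companyname TEXT, "
    "first_inserted TEXT, last_updated TEXT)"
)

# Two hypothetical batches with fixed timestamps so the result is easy to read.
first = apply_batch(conn, {"ALFKI": "Name A"}, now="2024-01-01T00:00:00")
second = apply_batch(
    conn, {"ALFKI": "Name A2", "ANATR": "Name B"}, now="2024-01-01T00:00:10"
)
result = conn.execute("SELECT * FROM customers ORDER BY customerid").fetchall()
print(first, second)
print(result)
```

Splitting each batch into an update set and an insert set mirrors the two branches of the workflow and keeps first_inserted untouched on later updates.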