SCD Type 1

<aside> 💡 In SCD Type 1, if an item is updated, the existing record is overwritten. If an item is added, the existing record is added

</aside>

SCD type 1 ensures that there are no duplicate records in the table and that the data reflects the most recent current dimension.

Initial Table

Product Sales Revenue
Widget A 1,250 $12,500
Widget B 875 $8,750
Widget C 1,500 $15,000
Widget D 2,100 $21,000
Widget E 1,375 $13,750

Transactions

After Upserting

Product Sales Revenue
Widget A 1,250 $12,500
Widget B 900 $10,000
Widget C 1,500 $15,000
Widget D 2,100 $21,000
Widget E 1,375 $13,750
Widget H 250 $2500

Implementation

This can be done efficiently in Databricks using the Merge Command

MERGE INTO old_sales
USING new_sales
ON old_sales.Product = new_sales.Product
WHEN MATCHED THEN
  UPDATE SET 
    old_sales.Sales = new_sales.Sales,
    old_sales.Revenue = new_sales.Revenue
WHEN NOT MATCHED THEN
  INSERT (old_sales.Product, 
  old_sales.Sales, 
  old_sales.Revenue) 
  VALUES (
    new_sales.Product, 
    new_sales.Sales, 
    new_sales.Revenue
  )

SCD Type 2