ETL+ Load Replace
Load Replace landing page for help links from ETL+.
⚒️ The Load Replace panel
The user interface for Load Replace appears to represent a conventional, SQL like, filter as might appear in a SQL WHERE clause. However, the internal/behind-the-scene process is a multi-step process.
Internally, (1) first the T-SQL expression (right hand side) is evaluated, then (2) a filtered extract (staging table) is created from the source using the query system used in the connection to the source object, and (3) finally the staging table is loaded into the target table.
Most of the time the internal logic anticipates your needs and the end results work out as expected.
Concepts
Replace is a complement to Load All. Replace works like a Load All based on a selected records (a filtered extract) from the source table. For comparison, Load All extracts all records from the source table and loads them all into the target table in the data warehouse, completely ‘replacing’ all rows in the target table with new data.
Load All and Load Replace Adds, Updates or Deletes Rows in the Target Table
Internally, Load All ensures that all rows in the target table are ‘replaced’ by first deleting all records in that target before beginning to add (load) new rows.
Use Cases
Replace works most easily with:
Data that doesn’t change after some date or interval of time or
Data where you only need to extract the more recently added records.
Load Replace Page
The on page instructions for Load Replace read:
“1st: Insert rows based on the condition. Ongoing: Replace rows based on the condition.” WHERE
“1st” means the load operation when the target table is empty (contains zero rows).
The “condition” refers to the conditional expression Source Field >= Target Value or Constant.
Source Field. Name of the field/row on the source table.
This column should identify the order of the rows creation in the table.Target Value or Constant. Is a T-SQL expression performed on the target table.
This expression is evaluated against the target table and the resulting value is used to filter the data extracted from the source table.
The expression can be:a Constant/literal value. E.g.; ‘
1/1/2022'
A T-SQL expression using the MAX function.
A T-SQL expression using a date/datetime function. For examples see T-SQL Date Expressions
Example
When to Use Replace
When there is a single column identifying the relative order in which rows were added to the table and where the value assigned to the column is not changed. For example a Document Date or an always ascending Document Number where the value is assigned once and not changed.
Do Not use columns that can have their content periodically updated such as LastModifiedDateTime columns. If all date columns can be periodically updated then use Load Upsert instead.
When you only need to re-load those records from a recent period that might have had records modified and/or deleted.
Target Value or Constant field Expressions for Dates and Fiscal Periods
Examples of T-SQL expressions for reloading with Load Replace from recent periods:
Date expressions:
Last 7 days:
DATEADD(DAY, - 7, GETDATE())
Last day of of the prior calendar month:
EOMONTH(GETDATE(), - 1)
For more examples see T-SQL Date Expressions
Fiscal Period expressions (YYYYPP)
Beginning of this year:
SELECT [FY_NoInt4]*100 FROM [SchemaName].[4_Today]
For more examples see T-SQL Date Expressions
Load Replace Requirements
Availability of a column to control the delta load subset. Example: a date or a sequentially growing value.
The source table allows filtered data extraction.
After the initial load, delta loads need columns and formats compatible with the data warehouse table. Addition of new columns and/or increase in column formats require a Load All load to sync up.
Rows prior to the max control column value don’t change in the data warehouse. If that can happen, use
Force Load All Option
The Force Load All option on the Job properties panel of the the ETL+ Job page supports maintaining tables normally refreshed with Load Replace.
The Replace Concept
⚒️ Scenarios & Use Cases
Using a Constant/literal value in the T-SQL/right hand expression.
Using a MAX function in the T-SQL/right hand expression.
Using a T_SQL datetime function in the T-SQL/right hand expression.
Using a datetime field for the source field/left hand side.
Using a sequentially incremented key (e.g. a invoice id that is based on a counter that is incremented by 1 after every new invoice record is added).
Concepts
Replace is a type of delta loading option that dramatically speeds up the load after the initial one.
The initial replace is a special case. When a MAX( )
aggregate function is specified in Target T-SQL entry and Replace is run against a new (and therefor empty) target table the load operation is Load All.
Replace deletes all target table’s rows that meet the filter criteria and re-loads rows that meet that criteria from the source table. For example, all data is loaded in the initial InvoiceHeader table extraction, and after it only adds/replaces rows where the control column values >= last extracted InvoiceDate.
Replace Process
1st / Initial Replace (Special Cases)
Concepts. Before Load Replace can find data to replace in the target data warehouse table there must be rows in the table. Pragmatically, the first time Load Replace is run on a target table we want to load all data from the source.
Logic. What actually triggers a Load All, or it’s logical equivalent, is the resolution of the T-SQL entered on the right hand / Target T-SQL side. Any resolution of the Target T-SQL clause that results in a NULL triggers a Load All operation.
A T-SQL
MAX
function against a table that is empty (contains zero records) or doesn't yet exist is evaluated as a NULL.
Ongoing
Evaluates the Target T-SQL clause.
In the screenshot above, this would be the maximum value of Invoice Date.IF the Target T-SQL clause from the step above resolves to NULL
THEN run Load All and end the load.
Otherwise continue with the next step.Delete all data warehouse rows where Filter Column >= Filter Value.
Load (append) all source table rows where Filter Column >= Filter Value onto the data warehouse table.
The load is first written into a temporary table. Next the appended to the target table.
Related Pages
T-SQL String Constants (String Literals) The Replace option can specify literals.
Job properties panel, Load type options