Type 0
The Type 0 method is passive. It manages
dimensional changes and no action is performed. Values remain as they were at
the time the dimension record was first inserted
The most common types are I, II, and III.
Type I
This methodology overwrites old with new data, and therefore does
not track historical data. Its common uses are for misspelled names
Example of a Customer table:
Customer_Key
|
Customer_Code
|
Customer_Name
|
Customer_State
|
1
|
ABC
|
Oracle
|
CA
|
In the above example, Customer_Code is the natural key and Customer_Key
is a surrogate key.
If the Customer relocates the headquarters to New Jersey. The
updated table would overwrite this record:
Customer_Key
|
Customer_Code
|
Customer_Name
|
Customer_State
|
1
|
ABC
|
Oracle
|
NJ
|
Cons:
·
No history in the data
warehouse.
·
If you have calculated
an aggregate table summarizing facts by state, it will need to be recalculated
when the Customer_State is changed.
Pros: It's easy to maintain.
Type II
This method tracks historical data by creating multiple records
for a given natural key in
the dimensional tables with separate surrogate keys and/or different version numbers. Unlimited history is
preserved for each insert.
For example, if the Customer relocates to New Jersey the version
numbers will be incremented sequentially:
Customer_Key
|
Customer_Code
|
Customer_Name
|
Customer_State
|
Version.
|
1
|
ABC
|
Oracle
|
CA
|
0
|
2
|
ABC
|
Oracle
|
NJ
|
1
|
Another method is to add 'effective date' columns.
Customer_Key
|
Customer_Code
|
Customer_Name
|
Customer_State
|
Start_Date
|
End_Date
|
1
|
ABC
|
Oracle
|
CA
|
01-Jan-2005
|
21-Dec-2009
|
2
|
ABC
|
Oracle
|
NJ
|
22-Dec-2009
|
The null End_Date in row two indicates the current record version.
Pros: An aggregate table summarizing facts by state
continues to reflect the historical state, i.e. the state the Customer was in
at the time of the transaction; no update is needed.
Cons: If there are retrospective changes made to the
contents of the dimension, or if new attributes are added to the dimension (for
example a Customer_Rep column) which have different effective dates from those
already defined, then this can result in the existing transactions needing to
be updated to reflect the new situation. This can be an expensive database
operation, so Type 2 SCDs are not a good choice if the dimensional model is
subject to change.
No comments:
Post a Comment