Page 1 of 1

How to do a change capture for a Slowly Changing Dimension

Posted: Thu Nov 21, 2013 6:45 pm
by pavi
Hello Team,

Can any one please resolve this issue?The scenario is

After:

Key,Value
ABC,10
ABC,10

Before:
Key,Value
ABC,20

O/P:

Key,Value,Change code
ABC,10,Update
ABC,10,Insert


Expectation is both after records are updates.

Am I missing any thing

Posted: Thu Nov 21, 2013 9:27 pm
by Mike
Your missing the fact that the change capture stage requires unique keys on both of its inputs.

Mike

Posted: Fri Nov 22, 2013 12:31 pm
by pavi
Thanks Mike for your response, But it didn't work.

If I understood well, the modified scenario is
Today:

Key,Value ,QTY
ABC,10,1
ABC,10 ,2

Yesterday:
Key,Value
ABC,20

O/P:

Key,Value,Change code,QTY
ABC,10,Update ,1
ABC,10,Insert,2

Posted: Fri Nov 22, 2013 1:19 pm
by Mike
You didn't understand... How many rows with a key value of "ABC" do you have in your today input? If the answer is not one, then the change capture stage won't produce expected output.

If you have a scenario where yesterday is a "master" file and today contains multiple update transactions, the Merge stage might supply functionality you can use.

Mike

Posted: Fri Nov 22, 2013 1:56 pm
by pavi
Thanks alot Mike for quick response.

Our source is a transaction DB where I can get duplicate keys with same value and the fact information might vary based on quantity of the products bought by customers. We are using change capture to see if there is an update of insert. I believe Merge stage doesn't tell whether a record is an update or delete. So I am thinking to do remove duplicates on Key before applying change capture.

DB---> RemDup---> Change capture--> apply Type2 logic---> insert/updateDB

We know from requirements that we will get dups from source with same value but the fact Qty might change.

Am I on the right direction? Please suggest

Posted: Fri Nov 22, 2013 2:13 pm
by Mike
You'll need to think through your requirements...

Do you need to record every transaction?
Do you only need to record the last transaction?
How do you identify the last transaction?
etc.

Mike

Posted: Fri Nov 22, 2013 2:25 pm
by pavi
Mike we record every transaction. So if we have 3 transactions which occurred for a product we are pulling all three. One of the subset requirements is if these three transactions have same key then it will be either an update or an insert. We need to push three records into fact and 1 record as an update into dimension. As it is a Type 2, we need to insert one record in a dimension and update the previous one.

Posted: Fri Nov 22, 2013 2:36 pm
by Mike
Well, if you need to record all transactions, then removing duplicates is not going to be a good idea.

The SCD Stage handles multiple transactions and was made for maintaining slowly changing dimensions.

Mike

Posted: Fri Nov 22, 2013 3:34 pm
by pavi
Thanks Mike for quick response!!

Out of my curiosity how to implement this scenario in DataStage 7 where we don't have SCD stage?

Posted: Fri Nov 22, 2013 9:01 pm
by ray.wurlod
Your original post specified 8x, which is probably why Mike responded as he did.

In earlier versions you have to do all the coding yourself; the lookup on the business key, the testing to see whether the row exists and, if so, whether there are any changes, the generation of the new key value and the handling of the SCD2 dimension (SCD1 is a simple update).

Posted: Sat Nov 23, 2013 9:05 am
by Mike
pavi wrote:Out of my curiosity ...
Since it was just your curiosity, I didn't feel the urge to describe it for a DataStage version that is out of support.

Ray's description of the old design pattern makes it obvious just how convenient the new SCD stage is.

One thing to add... since you have multiple change transactions coming on the stream input, the testing for type 2 changes becomes a little more complex... you need to check both the lookup results and the previous input record to determine changes.

Of course, if you'd really like to satisfy your curiosity, you can still create the 7X design pattern in an 8X environment.

Mike

Posted: Sat Nov 23, 2013 9:39 am
by chulett
ray.wurlod wrote:In earlier versions you have to do all the coding yourself
It's not really all that hard. You dang kids with your fancy smancy SCD stage, get off my lawn! :P

Posted: Wed Nov 27, 2013 2:23 pm
by pavi
Thanks a lot Mike, Ray,Chullet..

Even though SCD is a robust stage to handle this scenorio.I implemented it in the way we implement in 7x as it shows me whats going on rather than bundling it in single stage.

Thanks a lot again for you time and response.