dwh-setup using compare and difference-stages

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

dwh-setup using compare and difference-stages

Post by jasper »

I'm setting up a new DWH and now we're working on the templates for filling dimensions. What we now have is that everything is coming in daily(no delta's but complete tables), and we transform this into the same format as the dimension table.
At this point we need to do 3 actions
-check for inserts: all records in input and not in dwh
-check for updates : all records in input for which we find the key in dwh, and for which relevant columns have changed
-check for deletes:records in dwh that are not in the inputs anymore.

When going trough the documentation of DS7.5 I've found 2 stagetypes which seem to be ideal for this, but with which I'm not familiar: the difference-stage and the compare stage.

Would it be a wise setup to use the following as template:
-create 2 ordered sets:one for the input, one for the current status of the dimension.
-use a difference with input as after and dwh as before : this will give the inserts
-use a compare :this will give us the updates
-use a difference with dwh as after and input as before: this will give us the deletes

main advantage I see is that this can easily be reused, main disadvantage seems the large amount of data we'll need to process.
Any one else any idea's?
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

Post by jasper »

next time I'll think longer before posting, but Actually it seems one difference-stage can provide me with all 3 outputs
mandyli
Premium Member
Premium Member
Posts: 898
Joined: Wed May 26, 2004 10:45 pm
Location: Chicago

Post by mandyli »

Hi

I hope you are going to achieve SCD (Slowly changing dimension). Instead of compare stage use Change Capture Stage. Change Capture Stage only fulfills your requirements.

More information about Change Capture Stage please ref parjdev.pdf document at page 30.
jasper
Participant
Posts: 111
Joined: Mon May 06, 2002 1:25 am
Location: Belgium

Post by jasper »

thanx for the hint, but when I try this it seems harder to get information for deleted records.
In the compare stage I can easily log some info from the before record for the ones that are not in the after image (which can make it easier to do the correct updates for this delete). this looks more difficult from the change capture (or am i missing something?)
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Jasper,

We are doing exactly what you want.
Jasper wrote:At this point we need to do 3 actions
-check for inserts: all records in input and not in dwh
-check for updates : all records in input for which we find the key in dwh, and for which relevant columns have changed
-check for deletes:records in dwh that are not in the inputs anymore.
As mandy suggested earlier we are using the change capture stage and it works like a charm.

Pls search this forum. I have already discussed this in detail.

HTH
--Rich

Pride comes before a fall
Humility comes before honour
Post Reply