Transaction Processing

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

myukassign
Premium Member
Posts: 238
Joined: Fri Jul 25, 2008 8:55 am

Transaction Processing

Post by myukassign »

Hello...

I would like to get some assistance and guidance regarding a scenario I need to handle. I have some ideas in mind but am looking for the best practice.


One of the requirements is that the ETL load should never leave the tables with half-loaded data. Either the load completes successfully, or the table should be reset to the state it was in before the load began. On the target side I am using the DB2 Enterprise stage. How can I achieve this? Would the DB2 Connector stage be of any help?


One idea that popped into my mind is to maintain a temporary table in the staging layer with the same structure as the foundation target table. The ETL process loads the data into the temporary table, and a SQL MERGE then moves the data from staging to foundation. Keep the SQL MERGE between a begin and end transaction so that if the final load fails, the data will not be committed.
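As a rough sketch of what I have in mind (the table and column names below are only placeholders, not our real tables), the MERGE would run with autocommit switched off so nothing becomes visible in the foundation table unless the COMMIT succeeds:

    -- Run with autocommit off, e.g. "db2 +c -tvf merge_load.sql" from the CLP,
    -- so the MERGE and the COMMIT form a single unit of work.
    MERGE INTO fnd.sales AS tgt
    USING stg.sales AS src
      ON tgt.sale_id = src.sale_id
    WHEN MATCHED THEN
      UPDATE SET tgt.amount  = src.amount,
                 tgt.load_ts = CURRENT TIMESTAMP
    WHEN NOT MATCHED THEN
      INSERT (sale_id, amount, load_ts)
      VALUES (src.sale_id, src.amount, CURRENT TIMESTAMP);
    COMMIT;
    -- If the MERGE fails, a ROLLBACK leaves the foundation table exactly
    -- as it was before the load.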

Please advise whether there is any way I can implement this in ETL, or whether there is a better approach.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Many of the places where I've worked preferred an 'all or nothing' approach like the one you are describing. We managed this back in the Server world by simply setting the Transaction Size to zero, which meant a single commit was issued at the end of the load. Yes, that does mean you need to work with your DBA team to ensure you have the rollback space you need, but I never found that to be an insurmountable obstacle. And there's no need for an intermediate table / two-step merge - we did the work directly (whatever it was) and let the database roll it all back on a failure.

You are effectively saying the same thing by wrapping the work inside some kind of transaction control mechanism. I know you can accomplish this with an ODBC stage, but otherwise I'm thinking it would require a database procedure to run the merge and commit at the end. I'm not aware of any stage that could manage that other than the one designed for use with MQ.
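If you do go the procedure route, something along these lines is what I had in mind - a rough sketch only, and the procedure, schema and column names are all made up for illustration:

    -- DB2 SQL PL sketch: merge staging into foundation, commit at the end.
    -- If the MERGE throws an error the procedure fails and nothing is committed.
    CREATE OR REPLACE PROCEDURE etl.merge_stg_to_fnd()
    LANGUAGE SQL
    BEGIN
      MERGE INTO fnd.customer AS tgt
      USING stg.customer AS src
        ON tgt.customer_id = src.customer_id
      WHEN MATCHED THEN
        UPDATE SET tgt.name   = src.name,
                   tgt.status = src.status
      WHEN NOT MATCHED THEN
        INSERT (customer_id, name, status)
        VALUES (src.customer_id, src.name, src.status);

      COMMIT;
    END

You would call it from an after-job routine or a separate job step once the parallel load of the staging table has finished.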

In all honesty, I don't know how easily you can 'commit once' with a Parallel job unless you constrain it to a single node; otherwise I'm assuming you could get into a situation where one or more nodes succeed and commit while others fail. Even then, from what I recall commits are controlled by two environment variables - one time-based and one count-based - neither of which allows a zero value. Could be wrong, but that's what is popping into my head right now. :wink: I can see where in your approach you'd do parallel loads into the 'staging' table and then have a second, non-parallel step to perform the MERGE. It does sound like it could be made to work, however.

We've taken a different approach where I am now. It's specific to how Informatica handles the 'target load plan', but I'd imagine you could manage this in DataStage as well. We track begin and end times in a 'job run' table for everything that runs. When the job starts, a new record is inserted with a start time and an active status. When the job completes, the record is updated to show a completion status and an end time. If the job aborts, the status of this job run record still shows as active with no end time.

This is checked first thing at the beginning of every run, and if an active record is found for the job being run it 'knows' that the last run failed and triggers a custom-crafted back-out procedure. Other tables record the target tables involved and the rollback mechanism needed for each target. It's not always practical, but our updates all track a run id and history in some fashion, so inserts are deleted and updates reverted before the load starts over again and a new active control record is created. Just thought I'd throw that out there for the sake of discussion.
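In SQL terms the control table and the start-of-run check look roughly like this - sketch only, every name below is invented for illustration:

    -- One row per job run; an active row with no end time at job start
    -- means the previous run never finished.
    CREATE TABLE etl.job_run (
      run_id    INTEGER     NOT NULL PRIMARY KEY,
      job_name  VARCHAR(64) NOT NULL,
      start_ts  TIMESTAMP   NOT NULL,
      end_ts    TIMESTAMP,
      status    CHAR(1)     NOT NULL   -- 'A' = active, 'C' = complete
    );

    -- Checked first thing in every run:
    SELECT run_id
      FROM etl.job_run
     WHERE job_name = 'LOAD_SALES'
       AND status   = 'A';

    -- If a row comes back, trigger the back-out, e.g. for a target where
    -- every insert carries the run id of the load that wrote it:
    DELETE FROM fnd.sales
     WHERE load_run_id = 123;   -- the run_id returned by the check above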
-craig

"You can never have too many knives" -- Logan Nine Fingers