restartability in db2 enterprise stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

lakshmipriya
Participant
Posts: 31
Joined: Tue Jul 13, 2004 5:26 am
Location: chennai
Contact:

restartability in db2 enterprise stage

Post by lakshmipriya »

While writing to the DB2 Enterprise stage, if the job aborts, is there any way to restart it from within the same job by deleting the records written in the previous unsuccessful run, other than looking up the same table again in the same job before writing, or running an additional job for this purpose?

Is there any feature in this stage to solve this problem?
Lakshmi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Best practice dictates that every record loaded is marked with some form of identification of the load run, perhaps a batch ID, perhaps a timestamp.
If you do this, unwinding inserts becomes a simple constrained DELETE statement.

If you've staged a before-image of the table you're loading, unwinding becomes an only slightly more complex operation using rows from the before-image. This would be the approach to take if you ever perform UPDATE or DELETE operations against DB2.
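For illustration only, a minimal Python sketch of both unwind approaches, assuming a DB-API connection object conn from whatever DB2 driver is in use; the table name TARGET_TBL, the before-image copy TARGET_TBL_BI, and the BATCH_ID column are all hypothetical names, not part of any DataStage or DB2 API:

# A constrained DELETE unwinds an insert-only load for the failed run's batch ID.
def unwind_inserts(conn, batch_id):
    cur = conn.cursor()
    cur.execute("DELETE FROM TARGET_TBL WHERE BATCH_ID = ?", (batch_id,))
    conn.commit()

# If UPDATEs or DELETEs were performed (and updates also stamp BATCH_ID), remove
# the failed batch, then re-insert the rows staged in the before-image for this run.
def unwind_with_before_image(conn, batch_id):
    cur = conn.cursor()
    cur.execute("DELETE FROM TARGET_TBL WHERE BATCH_ID = ?", (batch_id,))
    cur.execute("INSERT INTO TARGET_TBL SELECT * FROM TARGET_TBL_BI")
    conn.commit()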

There's nothing in the ETL tool; what you're asking to do isn't strictly ETL. So you have to plan and design your recovery strategy yourself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Re: restartability in db2 enterprise stage

Post by dsxdev »

lakshmipriya wrote:While writing to the DB2 Enterprise stage, if the job aborts, is there any way to restart it from within the same job by deleting the records written in the previous unsuccessful run, other than looking up the same table again in the same job before writing, or running an additional job for this purpose?

Is there any feature in this stage to solve this problem?
In the DB2 stage you have the options Update Commit Interval and Row Commit Interval. Both are available when the write mode is Upsert; only Row Commit Interval is available when the write mode is Write (insert only).

Row Commit Interval commits after the specified number of records has been inserted; if the job aborts, the rows inserted since the last commit are rolled back.
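Purely as a conceptual illustration of what a row commit interval means (the stage property does this for you; the function, table, and column names below are made up, assuming a DB-API connection conn):

# Commit every `interval` rows so an abort only rolls back the rows
# inserted since the last commit, not the whole load.
def load_with_commit_interval(conn, rows, interval=2000):
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO TARGET_TBL (C1, C2) VALUES (?, ?)", row)
        if i % interval == 0:
            conn.commit()
    conn.commit()  # commit any remaining rows after the last full interval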
Happy DataStaging
lakshmipriya
Participant
Posts: 31
Joined: Tue Jul 13, 2004 5:26 am
Location: chennai
Contact:

Post by lakshmipriya »

I am using Write mode. Thanks for your great response; I hope this will be helpful. Could you please throw some more light on the maximum commit interval that we can give to DB2?
Lakshmi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's a question best posed to your DB2 DBA, who must configure sufficient space to handle transactions of the size you desire. Or, more likely, who will tell you to use smaller transactions. :cry:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

You face a larger issue with a partitioned database load. At high volumes you simply cannot issue a commit at the end, as the data streams independently to each node. PX bypasses the coordinator node because it has the ability to read the system catalog. It automatically partitions the data to match the database's partitioning and fires off independent processes to write each partition directly. You will not be able to "commit at the end" because of the independent nature of this activity.

What you will have to do is have an audit number in your target table that will identify each row to a given load cycle, like a batch number. To make a self-correcting and restartable load job, you'll have to query the database to find rows you've already loaded and take them out of the load set as you stream it.

What we just finished designing for a major DataStage PX customer is just such a restartable template job. The job takes the load-ready data, cuts out the primary keys, loads them into an ETL work table in the target database, and inner-joins that work table to the target table, extracting any keys already there into a filtering stream. The load data then streams against that filter, and only rows that aren't in the filter stream are allowed through. To optimize this process, we added a job parameter set to the state of that job's run to indicate whether it's a first try or a retry. On a first try, no primary-key rows get written to the ETL work table, so that extra stream of logic cuts off immediately and lets all rows pass through efficiently. A lot of work, but necessary to achieve a high-performance restartable load template job.
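As a rough sketch of the filtering idea only (not the actual PX template, which does this with an ETL work table and an inner join inside the job; the names TARGET_TBL, PK, and BATCH_ID are hypothetical), the retry path boils down to something like:

# On a retry run, look up the keys already loaded for this batch and drop
# them from the load set; on a first try, skip the lookup entirely.
def filter_already_loaded(conn, load_rows, batch_id, is_retry):
    if not is_retry:
        return load_rows  # first try: nothing to filter, pass everything through
    cur = conn.cursor()
    cur.execute("SELECT PK FROM TARGET_TBL WHERE BATCH_ID = ?", (batch_id,))
    already_loaded = {r[0] for r in cur.fetchall()}
    return [row for row in load_rows if row[0] not in already_loaded]  # row[0]: the key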
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle