Db2 bulk load job- Performance Issue
Hi,
We have a multi-instance DB2 bulk-load job that reads a dataset and loads data into a DB2 target via the DB2 Connector stage. No other stages are involved in the job apart from these two.
Other details as below
- Approximate volume it handles is around 13 to 15 million records
- Time consumed - normally 1 hr 30 mins plus; occasionally 2 hrs plus
- Write Mode - Bulk Load
- Table Action - Append
- Transaction and array size - 50000 (although I believe it's insignificant in this case)
- Load control properties
Load mode - Insert
Load method - Named Pipe
Save count - 0
Non recoverable load - Yes
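For context, the settings above correspond roughly to a db2 LOAD command like the one below. This is an illustration only, not the exact command the connector generates; the pipe path, message file, schema, and table name are placeholders rather than values from the job:

```shell
# Rough sketch of the LOAD command implied by the connector settings
# (Named Pipe, Insert, Save count 0, Non recoverable = Yes).
# /tmp/ds_load.pipe, load_msgs.txt and MYSCHEMA.MYTABLE are placeholders.
db2 "LOAD FROM /tmp/ds_load.pipe OF DEL
     SAVECOUNT 0
     MESSAGES load_msgs.txt
     INSERT INTO MYSCHEMA.MYTABLE
     NONRECOVERABLE"
```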
If we observe the detailed log in Director after the job starts, it stays at "db2_Insert_Target,0: DB2 Load is starting ... " and completes only after the timeframe mentioned above. The rows/second value observed is around 2800.
My questions are:
Is this acceptable behaviour, or should the job finish much earlier given it is a straight bulk load?
Are there any issues with the parameters configured in the DB2 Connector stage?
Happy to provide any additional information required.
Thanks,
Sachin
Sachin C
1) You haven't said anything about average record size... and that's one of the most critical pieces of information.
2) How many records total are in the table?
3) Is this DB2 z/OS or regular DB2?
4) Do you access this database with any other jobs? Do you have ANY jobs that access this database with good (or great) performance?
5) Why do you think Array / Transaction size is inconsequential?
Thank you, Andy, for the reply.
1. The record layout is as below; nothing indicates a big record size:
Field 1 - integer
Field 2 - varchar(250)
Field 3 - date
Field 4 - time
Field 5 - timestamp
Field 6 - integer
Field 7 - integer
Field 8 - integer
Field 9 - timestamp
Field 10 - char(1)
2. Target table count is something I need to check
3. Regular DB2 - Database server = DB2/AIX64 10.1.3
4. We are getting better performance in another similar job, which loads around 30-40 million records in half an hour or so
5. What I understood from reading the DB2 Connector load control properties is that it's not the transaction size but the savecount property that defines consistency points in a bulk load. It looks like I am not interpreting that correctly
Based on the 4th point above, I had a look at the better-performing job, and the only difference between the two jobs is the array size and transaction size. The better-performing job has them set to 2000, while the job with the issue has 50000.
Time to tweak the transaction size and array size in the badly performing job?
Thanks,
Sachin
I haven't done a bulk load to DB2 in ages, and I can't check it at the moment, so this is basically from memory...
The total size of the table and the number of indices can be important. If I remember correctly, it has to rebuild every index after the bulk load completes - which can take some time and affect the overall job completion time (which lowers RPS).
Also - I don't remember that Array size was ignored. If it isn't, then it is set WAY too large and would be negatively affecting performance.
If someone with more recent experience can chime in... please do so!
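If you can get onto the database while the load runs, a couple of commands should show what it is spending its time on; the BUILD phase reported for the LOAD utility is the index rebuild. The schema and table names below are placeholders:

```shell
# Show running utilities and their current phase (e.g. LOAD / BUILD /
# DELETE); a long BUILD phase means index rebuilding dominates the run.
db2 "LIST UTILITIES SHOW DETAIL"

# List the indexes defined on the target table (placeholder names).
db2 "SELECT INDNAME, COLNAMES, UNIQUERULE FROM SYSCAT.INDEXES
     WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'MYTABLE'"
```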
Sorry, not one of the chimers requested as my experience is primarily with Oracle not DB2 but...
If it's anything like Oracle, you can't bulk load into a table with indexes; if they exist, they must be dropped and rebuilt afterwards. There are times when that fact alone rules it out, as the added speed of the load is more than offset by the time taken to rebuild the indexes.
And I'm not sure why the OP mentions Array Size and Transaction Size together all the time, as if they were one setting. A bulk load is not 'transactional' and thus does not issue commits, but it sounds like DB2 has a 'save count' property? Not sure that has much, if any, impact on performance.
Now, as noted and in my experience, Array Size is an issue no matter what kind of load you are doing, and I've seen plenty of examples in the past where setting it too high had an adverse effect on performance. I think you'll find some formulas posted here to compute it based on packet size and average record length (ARL), but sometimes there's more art than science involved. Finding the 'sweet spot' can involve trying a series of sizes in a repeatable scenario and tracking the results: crank it up until performance starts to degrade, then back it back down. This rather than just pulling a number out of... the air.
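As an illustration only, that kind of sweep could be scripted with dsjob. This assumes the job exposes Array Size through a job parameter (called ARRAY_SIZE here, which yours may not have); the project and job names are placeholders too:

```shell
# Hypothetical sweep: run the same load with different array sizes and
# time each run. ARRAY_SIZE, MyProject and MyLoadJob are placeholders.
for sz in 500 1000 2000 5000 10000 50000; do
    start=$(date +%s)
    dsjob -run -mode NORMAL -wait -param ARRAY_SIZE=$sz MyProject MyLoadJob
    echo "ARRAY_SIZE=$sz elapsed=$(( $(date +%s) - start ))s"
done
```

Run each size more than once if you can; load times against a shared database can vary for reasons unrelated to the array size.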
Still looking for someone with more recent (and relevant) experience to chime in.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thank you, Andy and Craig, for your replies. We have started investigating from the DB2 tables' perspective (record counts, indexes, etc.) and have the information below:

                        Table1           Table2
Record count            7,595,154,159    13,564,905,740
Daily avg load volume   50-70 million    10-20 million
Time taken              20-30 mins       100 mins plus
Load method             bulk-append      bulk-append
Commit size             2000             2000 (recently changed)
Index                   single column    four columns

So, as rightly pointed out by Andy, it might be something to do with the existing records in the table and the indexes. We are doing more investigation and will keep this forum posted as well.
Apologies for the formatting. Need to learn that.
Thanks,
Sachin
[code] tags preserve whitespace, followed by lots of 'preview' passes until it all lines up. :wink:
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers