Db2 bulk load job - Performance Issue

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Db2 bulk load job - Performance Issue

Post by SachinCho »

Hi,
We have a multi-instance DB2 bulk load job that reads a dataset and loads the data to a DB2 target via the DB2 Connector stage. There are no other stages in the job except these two.
Other details are as below:
- Approximate volume handled - around 13 to 15 million records
- Time taken - normally 1 hr 30 mins plus; occasionally 2 hrs+
- Write Mode - Bulk Load
- Table Action - Append
- Transaction and array size - 50000 (although I believe it's insignificant in this case)
- Load control properties:
  Load mode - Insert
  Load method - Named Pipe
  Save count - 0
  Non recoverable load - Yes

If we watch the detailed log in Director after the job starts, it stays at "db2_Insert_Target,0: DB2 Load is starting ... " and completes only after the timeframe mentioned above. The rows/second value observed is around 2800.
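For reference, my understanding is that with these settings the connector effectively drives a LOAD along the lines of the sketch below (the pipe path and table name are placeholders, not our actual ones):

Code: Select all

# rough equivalent of the LOAD the connector issues for Named Pipe / Insert /
# Save count 0 / Non recoverable load (placeholder names throughout)
db2 "LOAD FROM /tmp/ds_load_pipe OF DEL
     SAVECOUNT 0
     MESSAGES /tmp/ds_load.msg
     INSERT INTO MYSCHEMA.MYTABLE
     NONRECOVERABLE"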

My questions are:
Is this acceptable behaviour, or should the job finish much earlier given it is a straight bulk load?
Are there any issues with the parameters configured in the DB2 Connector stage?

Happy to provide any additional information required.

Thanks,
Sachin
Sachin C
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

Hi All,
Any pointers on the above issue would be much appreciated.

Additional information on this job: RCP is enabled for this one.

Happy to provide any additional information required.
Sachin C
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

1) You haven't said anything about average record size... and that's one of the most critical pieces of information.
2) How many records total are in the table?
3) Is this DB2 z/OS or regular DB2?
4) Do you access this database with any other jobs? Do you have ANY jobs that access this database with good (or great) performance?
5) Why do you think Array / Transaction size is inconsequential?
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

Thank you Andy for the reply.
1. The record layout is as below. Nothing indicates a big record size - even with the varchar fully populated it works out to roughly 300 bytes per row:
Field 1 - integer
Field 2 - varchar(250)
Field 3 - date
Field 4 - time
Field 5 - timestamp
Field 6 - integer
Field 7 - integer
Field 8 - integer
Field 9 - timestamp
Field 10 - char(1)

2. The target table count is something I need to check.
3. Regular DB2 - database server = DB2/AIX64 10.1.3.
4. We are getting better performance from another, similar job which loads around 30-40 million records in half an hour or so.
5. What I understood from reading the DB2 Connector load control properties is that it's not the transaction size but the Save count property that defines consistency points for a bulk load. It looks like I'm not interpreting that correctly.

Based on the 4th point above, I had a look at the better performing job, and the only difference between the two jobs is the array size and transaction size. The better performing job has it as 2000, while the job with the issue has it as 50000.

Is it time to tweak the transaction size and array size in the badly performing job?

Thanks,
Sachin
Sachin C
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I haven't done a bulk load to DB2 in ages, and I can't check it at the moment, so this is basically from memory...

The total size of the table and the number of indexes can be important. If I remember correctly, DB2 has to rebuild every index after the bulk load completes - which can take some time and affect the overall job completion time (which lowers the overall rows/second figure).
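From memory again, something along these lines should show whether the index rebuild is where the time goes (schema and table names are placeholders - adjust to suit):

Code: Select all

# which indexes will LOAD have to rebuild? (placeholder names)
db2 "SELECT INDNAME, COLNAMES, UNIQUERULE
     FROM SYSCAT.INDEXES
     WHERE TABSCHEMA = 'MYSCHEMA' AND TABNAME = 'MYTABLE'"

# while the job sits at 'DB2 Load is starting ...', this shows which phase
# (LOAD vs. BUILD - the index rebuild) the utility is actually in
db2 LIST UTILITIES SHOW DETAIL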

Also - I don't remember Array Size being ignored for bulk loads. If it isn't ignored, then 50000 is WAY too large and would be negatively affecting performance.

If someone with more recent experience can chime in... please do so!
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry, not one of the chimers requested as my experience is primarily with Oracle not DB2 but...

If it's anything like Oracle, you can't bulk load into a table with indexes; if they exist, they must be dropped and rebuilt afterwards. There are times when that fact alone rules bulk loading out, as the added speed of the load is more than offset by the time taken to rebuild the indexes.

And I'm not sure why the OP mentions Array Size and Transaction Size together all the time as if they were one setting. A bulk load is not 'transactional' and thus does not issue commits, but it sounds like DB2 has a 'save count' property? Not sure that has much, if any, impact on performance.

Now, as noted and in my experience, Array Size is an issue no matter what kind of load you are doing, and I've seen plenty of examples in the past where setting it too high had an adverse effect on performance. I think you'll find some formulas posted here to compute it based on packet size and average row length (ARL), but sometimes there's more art than science involved. Finding the 'sweet spot' can involve trying a series of sizes in a repeatable scenario and tracking the results: crank it up until performance starts to degrade, then back it down. This rather than just pulling a number out of... the air.
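If you want to make that repeatable, a dsjob loop is one way to do it. This assumes the job exposes Array Size as a job parameter - the project, job and parameter names below are made up:

Code: Select all

# sweep a range of array sizes, pulling the run stats after each pass
# (ArraySize, MyProject and Db2LoadJob are illustrative names)
for sz in 500 1000 2000 5000 10000
do
    dsjob -run -mode NORMAL -param ArraySize=$sz -wait MyProject Db2LoadJob
    dsjob -report MyProject Db2LoadJob BASIC
done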

Still looking for someone with more recent (and relevant) experience to chime in. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
SachinCho
Participant
Posts: 45
Joined: Thu Jan 14, 2010 1:23 am
Location: Pune

Post by SachinCho »

Thank you Andy and Craig for your replies. We have started investigating from the DB2 table perspective (counts, etc.) and have the information below.

Code: Select all

                         Table1                  Table2
Record count             7,595,154,159           13,564,905,740
Daily avg load volume    50-70 million           10-20 million
Time taken               20-30 mins              100 mins plus
Load method              bulk-append             bulk-append
Commit size              2000                    2000 (recently changed)
Index                    only single column      four columns
So, as rightly pointed out by Andy, it might be something to do with the existing records in the table and the indexes. We are doing more investigation and I will keep this forum posted as well.
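If the index rebuild (the BUILD phase) does turn out to be the culprit, one thing we plan to look at is the LOAD utility's incremental index maintenance. This is a sketch only, with placeholder names, and I still need to check whether the connector exposes the option:

Code: Select all

# with 13+ billion rows and a four-column index, rebuilding the whole index
# on every load is expensive; INDEXING MODE INCREMENTAL asks LOAD to extend
# the existing indexes instead (placeholder pipe/table names)
db2 "LOAD FROM /tmp/ds_load_pipe OF DEL
     SAVECOUNT 0
     INSERT INTO MYSCHEMA.TABLE2
     NONRECOVERABLE
     INDEXING MODE INCREMENTAL"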

Apologies for the formatting - I need to learn how to do that properly.

Thanks,
Sachin
Sachin C
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

[code] tags preserve whitespace, followed by lots of 'preview' passes until it all lines up. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers