DSXchange: DataStage and IBM Websphere Data Integration Forum
jreddy



Group memberships:
Premium Members

Joined: 03 Feb 2004
Posts: 202

Points: 1953

Posted: Wed May 09, 2018 11:54 am

DataStage® Release: 11x
Job Type: Parallel
OS: Windows
Looking for some guidance on how we can set/use the transaction handling options in the ODBC Connector (or other SQL Server stages) to get the best performance when inserting/updating data in a SQL Server database.

For a job that inserts 50K rows, for example, I set Array Size and Record Count to 2000 in the SQL Server Enterprise stage for the INSERT, but on the DB side I still see 50K individual insert statements issued.

I tried changing Array Size and Record Count to 20K each, but still see the same thing on the DB side.

What option(s) could be set so we can reduce the number of insert statements? I was going to try the same thing with the SQL Server Bulk stage and the ODBC Connector stage, but wanted to see if anyone has been able to gain performance improvements by playing with these options, and also commit frequencies, for example.

Any guidance is appreciated, thanks
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42753
Location: Denver, CO
Points: 220316

Posted: Wed May 09, 2018 12:01 pm

None of those settings will affect the number of inserts issued. If you've got 50k rows to process, 50k insert statements will be issued. Array Size controls how many are "bundled together" to be sent to the database at any given time. And Record Count, doesn't that control how often a commit is performed?
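
As a rough analogue of how I picture those two settings (not DataStage itself, just a Python sketch using pyodbc, with a made-up DSN, table and columns):

Code:
# Rough analogue of Array Size vs. Record Count using Python + pyodbc.
# The DSN, credentials, table and columns are made up for illustration.
import pyodbc

ARRAY_SIZE = 2000      # rows bundled into each round trip to the database
RECORD_COUNT = 20000   # rows between commits

conn = pyodbc.connect("DSN=sqlserver_dev;UID=etl;PWD=secret", autocommit=False)
cur = conn.cursor()

rows = [(i, "name_%d" % i) for i in range(50000)]   # stand-in for the 50K source rows
sql = "INSERT INTO MyTable (id, name) VALUES (?, ?)"

uncommitted = 0
for start in range(0, len(rows), ARRAY_SIZE):
    chunk = rows[start:start + ARRAY_SIZE]
    cur.executemany(sql, chunk)    # one round trip, but still one insert per row
    uncommitted += len(chunk)
    if uncommitted >= RECORD_COUNT:
        conn.commit()              # Record Count: how often the work is committed
        uncommitted = 0

conn.commit()
conn.close()

Either way the database still ends up executing 50K individual inserts; the array just cuts down on the number of round trips.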

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
jreddy



Group memberships:
Premium Members

Joined: 03 Feb 2004
Posts: 202

Points: 1953

Posted: Wed May 09, 2018 12:07 pm

Thank you, Craig.

Our DBA pointed out that these 50K inserts are being issued one by one and asked me to see if there is a way to group them for better efficiency, so the database isn't hit that many times. Is that something we can do with a bulk load?
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42753
Location: Denver, CO
Points: 220316

Posted: Wed May 09, 2018 1:03 pm

Not sure why there would be an issue with that number of inserts. Sure, they get processed 'one by one', but that's not a volume where I would think switching to bulk loads is warranted. Are they taking an inordinate amount of time, and is that why the DBA is involved?

Bulk inserts are a different beast. I'm not familiar with SQL Server and all of its nuances, but I have dealt with this in Oracle. They use a different API, bypass a lot of the things a "conventional" insert needs to deal with, and there's no longer a concept of "commits". For Oracle, you cannot have any indexes on the table. And then it either all processes successfully or it fails, and when there's a bulk load failure it can leave your table in a bit of a compromised state. That's why we save them for Loads of Unusual Size - say, for example, when our normal daily load of 1 to 1.5 million rows (which we still load conventionally) expands to 100 to 150 million, as it does once a year.

And from the Oracle side I could also write a PL/SQL procedure to generate and bulk collect the inserts and let the database handle them in large chunks that way. It's been a heck of a long time since I've had a situation where that looked like a viable solution. Never mind that it's not an ETL solution, more of a PITA solution.

Not sure how much of that helps. I'm just wondering if you are trying to solve something that really isn't a problem, per se.

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
qt_ky



Group memberships:
Premium Members

Joined: 03 Aug 2011
Posts: 2813
Location: USA
Points: 21315

Posted: Thu May 10, 2018 5:10 am

Sounds quite normal.

_________________
Choose a job you love, and you will never have to work a day in your life. - Confucius
jreddy



Group memberships:
Premium Members

Joined: 03 Feb 2004
Posts: 202

Points: 1953

Posted: Fri May 11, 2018 11:57 am

Thank you, Craig.
Yes, we are having performance issues with the load jobs, so we are looking into every possible way to tune them. The whole process breezes through up until the last stage, where the insert happens, and then the insert takes anywhere between 2 and 11 minutes, seemingly at random.
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42753
Location: Denver, CO
Points: 220316

Posted: Fri May 11, 2018 4:25 pm

I would hazard a guess that the load on the database, whatever other processes are running on it, or just the server itself, is what is driving your variances. That, and probably volumes, since I don't imagine each run is always 50K records. Another factor could simply be the number of indexes on the table. Or network congestion. Lots of factors could be at play here.

Just from personal experience I would caution against going down the bulk load path for such a small number of rows considering the (mostly single digit) load times you are seeing. I don't believe the level of complexity it would add would be worth whatever performance gains you may or may not see. All IMHO, of course.

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy

Last edited by chulett on Sun May 20, 2018 4:40 pm; edited 1 time in total
UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 335

Points: 3391

Posted: Tue May 15, 2018 10:05 am

Have you done stats and indexes on the target? Inserts can run long if the stats are a mess, or if the table isn't indexed well for the job.

Bulk works well too, if you go that route. Hardware and systems vary, so it's hard to say where the cutoff between inserts and bulk mode should fall. What seems like a small volume to a big system can be overwhelming to a tiny one. Try both and see what you get.
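
If you want a rough feel for that difference outside of DataStage, something like the sketch below (Python with pyodbc; the DSN, table and columns are made up) times a plain parameter-array insert against pyodbc's fast_executemany, which pushes the whole parameter array to SQL Server in bulk. Truncate the table between runs so the comparison is fair.

Code:
# Quick, unscientific comparison of a conventional parameter-array insert
# vs. a bulk-style push.  DSN, table and columns are made up for illustration.
import time
import pyodbc

rows = [(i, "name_%d" % i) for i in range(50000)]
sql = "INSERT INTO MyTable (id, name) VALUES (?, ?)"

def load(bulk):
    conn = pyodbc.connect("DSN=sqlserver_dev;UID=etl;PWD=secret", autocommit=False)
    cur = conn.cursor()
    cur.fast_executemany = bulk    # True = send the whole parameter array at once
    started = time.perf_counter()
    cur.executemany(sql, rows)
    conn.commit()
    elapsed = time.perf_counter() - started
    conn.close()
    return elapsed

print("conventional: %.1f seconds" % load(False))
print("bulk-style:   %.1f seconds" % load(True))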
rschirm

Premium Poster



Group memberships:
Premium Members, Inner Circle

Joined: 13 Dec 2002
Posts: 27

Points: 203

Posted: Sun May 20, 2018 3:26 pm

I would try the following.

Calculate the width of each row (add up the lengths of all the columns). Divide 64,000 by that value and round the result down to the nearest whole number. Use that number for the Array Size. Then make the Record Count a multiple of the Array Size so that it comes out to around 5,000.
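
For example, with a hypothetical 120-byte row, the arithmetic works out like this (just a Python sketch of the calculation above):

Code:
# Worked example of the Array Size / Record Count calculation,
# using a made-up 120-byte row width.
row_width = 120                    # sum of the lengths of all columns, in bytes

array_size = 64000 // row_width    # 64,000 / 120 = 533.33, rounded down to 533
record_count = array_size * round(5000 / array_size)   # nearest multiple of Array Size to ~5,000

print(array_size)      # 533
print(record_count)    # 4797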



Please post back with the stats of this change.