Multi-instance dataset to Oracle job - slow

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
koti9
Participant
Posts: 52
Joined: Wed Nov 09, 2005 10:51 am

Multi-instance dataset to Oracle job - slow

Post by koti9 »

Hello all,

We have a multi-instance dataset to Oracle job which runs slow if I run it against all database instances in parallel. However it runs quite fast when I run it against single database instance using bulk load?

My job design is pretty straight forward.

DataSet ----> Transformer ----> Oracle Connector (Bulk Load)

I am not using copy stage to load multiple databases as i am doing multi-instance.

Do you have any ideas to get the same throughput as if i am running against single database instance while doing multi-instance run in parallel?

Thanks & Regards
Koti
samyamkrishna
Premium Member
Premium Member
Posts: 258
Joined: Tue Jul 04, 2006 10:35 pm
Location: Toronto

Re: Multi-instance dataset to Oracle job - slow

Post by samyamkrishna »

Hi,

As a first step, you can or ask your admins to monitor the resources on Datastage server and the Oracle server to find out the bottle neck and then take the necessary actions.
Cheers,
Samyam
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

When you say it goes slow when you run it "against all database instances in parallel", do you mean you are targeting the same table in separate databases with each invocation id? Assuming that's what you meant but wanted to clarify.
-craig

"You can never have too many knives" -- Logan Nine Fingers
koti9
Participant
Posts: 52
Joined: Wed Nov 09, 2005 10:51 am

Yes that's Right Craig

Post by koti9 »

Yes that's Right Craig, I am loading the same table in separate databases
PaulVL
Premium Member
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Are those target databases on the same host?
koti9
Participant
Posts: 52
Joined: Wed Nov 09, 2005 10:51 am

Post by koti9 »

No Paul, They are on different hosts
forest416
Participant
Posts: 3
Joined: Thu Dec 11, 2003 2:05 pm

Post by forest416 »

Hi,

As you said, the connections are from one host to multiple target host. There might be bottleneck at the Datastage server side.

It may hit the limit of Datastage server host bandwidth. (I pretty sure, the bandwidth usage won't reach to 100MiB/s, if you are on gigabit network. The actual maximum bandwidth of function of the network latency and TCP recevie/send buffer on both side. ). If you reached 1/3 of the eithernet bandwidth (30MiB/s on giga ethernet), I suggest you contact network admin to tune the network performace.

Another thing I noticed is that when use BulkLoad (with DB2), the varchar field are transferred through network in FULL-length.

For example, if I have a table to load with BULD load by DB2 connector.
The table structure is like:

num int,
desc varchar(255)

The source is also a DB2 connection from another database. And the maximun length of value of desc is 10

on Datastage host, I can see the outbound network usage be 10+ times than the inbound.

That means, if you have a big varchar field, you need either tune the varchar maximum length, or tune the network bandwidth, or avoid to use BULK.


On Linux/Unix, network bandwidth usage can be seen by:
sar -n DEV 3 #instance , every 3 second
sar -n DEV # history.
Post Reply