ISSUE WITH EXECUTION MODE IN PARALLEL EXECUTING IN 3 NODES

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

JMRodriguez
Premium Member
Posts: 6
Joined: Wed Dec 05, 2007 7:48 am
Location: Venezuela, CCS

Post by JMRodriguez »

The problem occurs in a parallel job with two DB2 API stages, one performing a SELECT and the other an INSERT into the database. With Execution Mode = Parallel (3 nodes), the number of rows read/inserted is 3 times the number of rows obtained with Execution Mode = Sequential.

Here is the DataStage configuration file:
{
    node "node1"
    {
        fastname "capp_ux"
        pools ""
        resource disk "/dsadm/Ascential/DataStage/Datasets" {pools ""}
        resource scratchdisk "/dsadm/Ascential/DataStage/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "capp_ux"
        pools ""
        resource disk "/dsadm/Ascential/DataStage/Datasets" {pools ""}
        resource scratchdisk "/dsadm/Ascential/DataStage/Scratch" {pools ""}
    }
    node "node3"
    {
        fastname "capp_ux"
        pools ""
        resource disk "/dsadm/Ascential/DataStage/Datasets" {pools ""}
        resource scratchdisk "/dsadm/Ascential/DataStage/Scratch" {pools ""}
    }
}
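
For comparison, here is a minimal single-node version of the same file (a sketch reusing the host and paths above). Running the job under it should match what Execution Mode=Sequential produces, since only one partition performs the read:

{
    node "node1"
    {
        fastname "capp_ux"
        pools ""
        resource disk "/dsadm/Ascential/DataStage/Datasets" {pools ""}
        resource scratchdisk "/dsadm/Ascential/DataStage/Scratch" {pools ""}
    }
}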
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

I don't think the configuration file has anything to do with this.

Did you try with 2, 4, and 6 nodes?

Try writing the output to a DataSet and check whether all partitions contain a full copy of the source.

Is the table partitioned? Did you check/set the DB2 partition mode?
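
For example, something like this from the server shell, assuming the DataStage environment (e.g. APT_CONFIG_FILE) is sourced. The dataset path is hypothetical, and the exact orchadmin options vary by release, so check the orchadmin usage output:

# Describe the dataset the job wrote; if each of the 3 partitions
# reports the full source row count, the read itself is duplicated.
orchadmin describe /dsadm/Ascential/DataStage/Datasets/test_output.ds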
JMRodriguez
Premium Member
Posts: 6
Joined: Wed Dec 05, 2007 7:48 am
Location: Venezuela, CCS

Post by JMRodriguez »

Thanks for your prompt response.

I couldn't try a different node configuration because I don't have administrator rights.

I also tried writing to a DataSet instead of the DB2 API stage (for the INSERT), and the result was the same. The problem starts at the source DB2 API stage (the SELECT), where the records are read 3 times.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What settings are you using in the source DB2 stage? What partitioning is in use in the job? You seem to have something behaving like 'Entire', where each node gets a complete copy of the selected data.
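
For reference, 'Entire' deliberately gives every partition a full copy of the data, so on 3 nodes each link shows 3x the rows. Hypothetically, it would show up on the downstream stage as:

Input>Partitioning Tab:
Partition type=Entire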
-craig

"You can never have too many knives" -- Logan Nine Fingers
JMRodriguez
Premium Member
Posts: 6
Joined: Wed Dec 05, 2007 7:48 am
Location: Venezuela, CCS

Post by JMRodriguez »

Hi there,

Here are the settings in the source DB2 API stage:

Output>General Tab:
Transaction isolation=Cursor Stability
Output>SQL Tab:
Query type=Generate SELECT clause from column list; enter other clauses
Output>Advanced Tab:
Buffering mode=(Default)

Stage>Advanced Tab:
Execution mode=Parallel
Combinability mode=(Auto)
Preserve partitioning=Default(Propagate)
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Check the Input tab of your insert stage: what partition type is selected?
JMRodriguez
Premium Member
Posts: 6
Joined: Wed Dec 05, 2007 7:48 am
Location: Venezuela, CCS

Post by JMRodriguez »

Hi there,

Here are the settings on the target DB2 API stage:

Input>General Tab:
Update action=Insert rows without clearing
Create table action=Do not create target table
Input>Partitioning Tab:
Partition type=(Auto)
Input>Advanced Tab:
Buffering mode=(Default)

Stage>Advanced Tab:
Execution mode=Parallel
Combinability mode=(Auto)

Note:
The job design looks like:
DB2 -> TRANSFORMER -> DB2

The Transformer stage performs no transformations, only direct column-to-column mappings.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Ah, sorry. I read your original post as having only two stages, the select and the insert. In that case, have a look at the partition type on the Input tab of the Transformer.

When you look at the Job Monitor, where do you see the multiplication of row counts? (Expand the full branch.)
JMRodriguez
Premium Member
Posts: 6
Joined: Wed Dec 05, 2007 7:48 am
Location: Venezuela, CCS

Post by JMRodriguez »

No problem, Kryt0n.

In fact, I modified the job design by adding the Transformer stage to get a better view of the problem.

Here are the settings on the Transformer stage:

Stage>Advanced Tab:
Execution mode=Parallel
Combinability mode=(Auto)
Preserve partitioning=Default(Propagate)

Input>Partitioning Tab:
Partition type=(Auto)
Input>Advanced Tab:
Buffering mode=(Default)

Output>Advanced Tab:
Buffering mode=(Default)

In the Job Monitor, expanding all stages, every stage shows the same row count (3 times the source).

The log also shows 3 messages for each DB2 API stage, one per node; in other words, 6 messages in total:

"DB2_UDB_API_20,0: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing."
"DB2_UDB_API_20,2: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing."
"DB2_UDB_API_20,1: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing. "

"Copy_of_DB2_UDB_API_20,0: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing. "
"Copy_of_DB2_UDB_API_20,1: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing. "
"Copy_of_DB2_UDB_API_20,2: Info: DTWResSaldoPasivoDInsert_Prueba3.Copy_of_DB2_UDB_API_20: Transaction committed as part of link close processing. "
suman27
Participant
Posts: 33
Joined: Wed Jul 15, 2009 6:52 am
Location: London

Post by suman27 »

Hi,
I am facing a similar issue with the WebSphere MQ stage. With a 2-node configuration it reads twice the number of records, whereas with one node it works fine. I am still investigating the problem.

Regards,
Suman
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

The DB2 API stage is not capable of understanding database partitioning; when set to parallel, it performs the full read on each node and duplicates the data. If someone has gotten it to work otherwise, please correct me and I will relay the information back to support.
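
Put another way: with Execution mode=Parallel under the 3-node configuration shown earlier, the stage starts one player per node and each player issues the same generated statement (table and column names here are made up):

node 1: SELECT COL1, COL2 FROM MYSCHEMA.MYTABLE
node 2: SELECT COL1, COL2 FROM MYSCHEMA.MYTABLE
node 3: SELECT COL1, COL2 FROM MYSCHEMA.MYTABLE

Everything downstream therefore sees 3 times the source row count, which matches the three per-node messages in the log above.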

Note: I went back to my notes and found the PMR I opened on this issue, dated 02/2008. It is included here for your reference: PMR 713338 499 000.
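
Given that, a minimal workaround sketch is to run only the source stage sequentially and leave the rest of the job parallel, i.e. on the source DB2 API stage:

Stage>Advanced Tab:
Execution mode=Sequential

The single reader then feeds the downstream parallel stages, which repartition the rows across the 3 nodes as usual.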