Parallel job with multiple joined tables not fully optimized

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
audev
Participant
Posts: 11
Joined: Wed Jul 18, 2018 7:25 am

Parallel job with multiple joined tables not fully optimized

Post by audev »

Hello,
I am trying to optimize a parallel job that contains 7 teradata connector as input, 1 teradata connector as output. In between a join stage (and a few transfomer stages) that put all input tables together.

see a picture:
Image

The result is a non fully optimized job.

here are some more details:

Optimizer
- release of optimizer release 11.5.0.0B2016.04.14c-201411004
- options: isGenOrdBy,PushProcToSrcs,PushProcToTrgts,PushAllToDB

Job
all input teradata connectors have access method "bulk", the output teradata connector has access method "bulk" + load type "update.
all input teradata connectors have a custom select statement (some with subqueries)


the optimization log displays warnings that say "WARNING: Impossibile generare istruzioni di query combinate per i link link_to_join_anag_cli,link_to_join_segm,link_to_join_rat_int,link_to_join_rat_moodys,link_to_join_rat_stpoor,link_to_join_rat_fitch,link_to_join_rat_dbrs,link_to_trasf. Il pattern viene abbandonato."
Which means more or less "it is not possible to create combined query instructions for the link: [link_names]"

has anyone ever encountered a similar issue? how can it be solved?
thank you,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What options did you select when running Balanced Optimization?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
audev
Participant
Posts: 11
Joined: Wed Jul 18, 2018 7:25 am

Post by audev »

Standard options

All the following are checked:
-Generate order by clause
-Push processing to data sources
-Push processing to data targets
-Push all processing into target data server

Advanced option (among others)

Maximum number of SQL nested joins: unlimited
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Can you please show us the "optimized" design?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
audev
Participant
Posts: 11
Joined: Wed Jul 18, 2018 7:25 am

Post by audev »

Apparently all those subqueries in the input connector, together with the multiple links to single join stage, were the cause of the problem. After changing the structure a little bit I managed to solved and fully optimize it.

Here is the new original job configuration:

<a href="https://ibb.co/bZK9M8"><img src="https://preview.ibb.co/fAN78o/Cattura.png" alt="Cattura" border="0"></a>
Post Reply