
Order of stage variables fails a job

Posted: Tue Aug 02, 2016 6:32 pm
by Novak
Hi all,

We have a parallel job that needs some improvements but is by no means complex. The last Transformer in the job is preceded by a Lookup stage that drops unmatched records. The Transformer writes to a Data Set with some simple derivations.
The job was running fine in non-prod environments, but started failing on 4 production engine nodes (fifth one dedicated to conductor).
The failure message is "Player 9 terminated unexpectedly" with Player 9 being the transformer.
The stage variables causing this failure are:

svZoneLenId
------
Len(Trim(DecimalToString( To_Xfm.POS_ZONE_ID,"suppress_zero")) )

svZoneId
------
Trim(DecimalToString( To_Xfm.POS_ZONE_ID,"suppress_zero"))

When written as such the Dump Score advises there are 21 datasets.

If, however, the order of the stage variables is swapped so that svZoneId appears first, the job completes and the Dump Score advises there are 20 datasets instead.
I know these stage variables are not written in the best way but even as they are they should not be causing a failure?

Regards,

Novak

Posted: Wed Aug 03, 2016 3:25 am
by priyadarshikunal
A few questions first,

Is it the complete derivation?
Does it fail even if you do null handling properly?

I would at least use Len(svZoneId) as the derivation for the length, though.
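
Something along these lines, as a rough sketch only. It assumes svZoneId is ordered first (stage variables are evaluated top to bottom, so it has to be set before anything that references it) and that POS_ZONE_ID may be nullable, which I am guessing at since the column definition was not posted:

svZoneId
------
If IsNull(To_Xfm.POS_ZONE_ID) Then "" Else Trim(DecimalToString(To_Xfm.POS_ZONE_ID, "suppress_zero"))

svZoneLenId
------
Len(svZoneId)

That way the decimal-to-string conversion is done once per row rather than twice, and the length no longer has to repeat the same expression.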

Posted: Wed Aug 03, 2016 7:18 am
by chulett
Since they've already said that they "know these stage variables are not written in the best way" I wasn't going to bother to point that out. :wink:

And just for the record, no clue why the order would matter here seeing as how one is not dependent on the other. :? Seems like something to take to support.

Posted: Wed Aug 03, 2016 3:40 pm
by ray.wurlod
Reordering stage variables in a Transformer will not produce a change in the number of Data Sets in the score. Something else has also been changed in the job design, about which you have chosen not to inform us.

Perhaps you could post the two scores?

Posted: Wed Aug 03, 2016 3:45 pm
by Novak
This will very likely be due to buffering.
I was trying to avoid changing too much of a job in production, but prior to this there were two Lookup stages linked one to the other.
In DEV and SIT, they tell me it was never an issue, but it was never tested with large volumes.
After changing from a Lookup stage to a Join stage, the job is behaving better.

Thanks for chiming in.

Novak