Two Source Match Stage, long running job and fails at last

dslearner45
Premium Member
Posts: 2
Joined: Wed Feb 07, 2018 1:28 pm

Post by dslearner45 »

Hi All,

Thanks to all of you for helping me resolve this issue.

I have a job that uses a "Two-Source Match" stage with two input links for data matching. The job fetches the data from both source tables, but then gets stuck at the "Two-Source Match" stage: it runs for a long time and finally fails with the log details below. We have 20 GB of scratch space allocated. I would appreciate your help.

Following are log details
---------------------------------------------------
Source1TBL_Data,0: Number of rows fetched on the current node: 1679746.

TwoSourceMatch_StageName,0: Unsupported close in APT_FileBufferOutput::spillToNextFile(): Input/output error.

TwoSourceMatch_StageName,0: write failed: Output file full, and no more output files

TwoSourceMatch_StageName,0: Failure during execution of operator logic.

TwoSourceMatch_StageName,0: Fatal Error: Tsort merger aborting: mergeOneRecord() punted


TwoSourceMatch_StageName,0: Failure during execution of operator logic.

TwoSourceMatch_StageName,0: Fatal Error: Pipe read failed: short read

node_node1: Player 21 terminated unexpectedly.


main_program: APT_PMsectionLeader(1, node1), player 21 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 11 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 12 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 10 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 11 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 12 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 11 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 15 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 13 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 14 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 13 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 14 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 15 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 14 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 17 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 18 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 15 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 16 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 17 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 15 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 16 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 17 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 18 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 19 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 20 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 21 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 22 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 23 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 24 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 25 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 26 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 27 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 28 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 29 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 20 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 21 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 22 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 23 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 24 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 25 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 26 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 27 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 28 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 29 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 18 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 19 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 20 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 21 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 22 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 23 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 24 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 25 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 26 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 27 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 28 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 16 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 17 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 18 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 19 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 20 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 22 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 23 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 24 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 25 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 26 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 27 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 28 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 29 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 30 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 31 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 32 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 30 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 31 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 32 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 29 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 30 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 31 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 30 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 31 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 32 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 35 - Unexpected exit status 1.
APT_PMsectionLeader(2, node2), player 36 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 35 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 36 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 34 - Unexpected exit status 1.
APT_PMsectionLeader(4, node4), player 35 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 35 - Unexpected exit status 1.
APT_PMsectionLeader(3, node3), player 36 - Unexpected exit status 1.


TwoSourceMatch_StageName,0: Input 0 consumed 100714435 records.
dslearner45
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Ok, so what is your question?


You ran out of scratch disk. Did you calculate how much data you are pulling back from the queries? Did you multiply that value by the number of times you are sorting the data? Did you have TMPDIR set to your scratch disk location as well (thus chewing up that drive space too)?


20GB... pretty small after all.

How many rows of data do you have and how many did you expect?


Your job failed because you ran out of scratch disk.
dslearner45
Premium Member
Posts: 2
Joined: Wed Feb 07, 2018 1:28 pm

Post by dslearner45 »

Hi PaulVL, thank you for your response.

This was the only job running at the time. It failed while processing 1679746 rows, with the error log I provided, but it did not fail with 1000 rows. Is 20 GB not enough to process this amount of data? Below are the stages used in this job and some additional information, so you can judge whether my job design is wrong.

1. Job design (the numbers are link labels):

Oracle Stg (1679746 rows) --1--> Transformer Stg --3--> Two-Source Match Stage --5--> Funnel Stage --7--> Dataset
Oracle Stg (1679746 rows) --2--> Transformer Stg --4--> Two-Source Match Stage --6--> Funnel Stage

Both branches feed the same Two-Source Match stage, and both of its outputs (5 and 6) go to the same Funnel stage.

2. The job fails at the Two-Source Match stage, which uses a Match Specification with one blocking column and three columns for the Match command.

3. We are not doing much in the Transformer stages, just null handling. I can remove this.

Please suggest if I am wrong anywhere.

Thanks a lot for all your help

Regards
dslearner45
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Quantity of data is not the same as quantity of rows.

It's ROWS * bytes per row.

You have to look at your job design to see if those stages are sorting.
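
As a rough illustration of the "ROWS * bytes per row" point, here is a minimal Python sketch. The row count comes from the job log above; the bytes per row and the number of sort passes are made-up placeholders (the thread never states them), and the function name is hypothetical. It ignores any extra buffering the Match stage does, so treat the result as a lower bound, not a sizing formula.

```python
def estimate_scratch_bytes(rows, bytes_per_row, sort_passes=1):
    """Rough estimate: rows * bytes per row * number of times the data is sorted."""
    return rows * bytes_per_row * sort_passes


ROWS_PER_SOURCE = 1_679_746   # from the job log
ASSUMED_ROW_BYTES = 2_000     # hypothetical: check your real record width
ASSUMED_SORT_PASSES = 2       # hypothetical: count the sorts in your job design
SOURCES = 2                   # the job has two input links

per_source = estimate_scratch_bytes(
    ROWS_PER_SOURCE, ASSUMED_ROW_BYTES, ASSUMED_SORT_PASSES
)
total = SOURCES * per_source

# Report the estimate in GiB so it can be compared with the 20 GB allocation.
print(f"Estimated scratch need: {total / 1024**3:.1f} GiB")
```

With wide rows or more sorts than assumed here, the estimate quickly passes 20 GB, which is why the row count alone does not answer the question.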

The Match stage also uses scratch disk as buffer space. Look to see what your project is using for the TMPDIR location as well.

Don't remove the null handling.

Look at the scratch disk while the job runs:

df -h /the_scratch_mount

Run it over and over while the job runs. VALIDATE that you are actually maxing out the mount.

See which mount is hitting max.
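
If retyping df gets tedious, something like the following Python sketch can poll the mount for you. It uses the standard library's shutil.disk_usage; the function names, the interval, and the sample count are assumptions for illustration, and /the_scratch_mount is the same placeholder path as above. The percentage is used/total, which can differ slightly from the Use% column that df prints.

```python
import shutil
import time


def usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def watch(path, interval_seconds=5.0, samples=3):
    """Sample disk usage `samples` times, printing and returning each reading."""
    readings = []
    for i in range(samples):
        pct = usage_percent(path)
        readings.append(pct)
        print(f"{path}: {pct:.1f}% used")
        if i < samples - 1:
            time.sleep(interval_seconds)
    return readings


# Example (placeholder path, run it while the job is executing):
# watch("/the_scratch_mount", interval_seconds=10, samples=360)
```

Point it at the scratch mount and at the TMPDIR mount in separate runs to see which one fills up first.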
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

Is the job using bounded length varchars in the data? If yes, then setting the APT_OLD_BOUNDED_LENGTH environment variable will reduce runtime disk usage. Note that setting APT_OLD_BOUNDED_LENGTH might have adverse performance effects.
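
To see why this can matter, here is a small Python sketch of the arithmetic. It assumes that, without the variable, a bounded varchar takes its full declared length on disk, and that with it only the actual data length is stored; that is my understanding of the behaviour, so check the IBM documentation for your version. The declared and average lengths are made-up numbers, the function names are hypothetical, and per-field length overhead is ignored.

```python
def padded_bytes(rows, declared_length):
    """Bytes for one column if every value occupies its full declared length."""
    return rows * declared_length


def actual_bytes(rows, average_length):
    """Bytes for one column if only the actual data length is stored."""
    return rows * average_length


ROWS = 1_679_746        # rows per source, from the job log
DECLARED_LENGTH = 255   # hypothetical VarChar(255) column
AVERAGE_LENGTH = 20     # hypothetical average length of the real data

full = padded_bytes(ROWS, DECLARED_LENGTH)
trimmed = actual_bytes(ROWS, AVERAGE_LENGTH)
print(f"Full-length storage: {full / 1024**2:.0f} MiB per column")
print(f"Actual-length storage: {trimmed / 1024**2:.0f} MiB per column")
```

The gap grows with every wide, mostly empty varchar column in the record, and again with every sort or buffer that spills to scratch.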