Error in Join stage

subashri
Premium Member
Posts: 2
Joined: Tue Nov 16, 2010 11:21 pm
Location: Chennai

Post by subashri »

Hi,

I am getting the following error when trying to join two tables, each with 40 million records.
Source 1 --> Checksum --> Join (primary link)
Source 2 --> Join (reference link)

No other complex transformations are involved. Partitioning is set to Auto on the Join stage. The job runs on 4 nodes in a Big Integ environment.

APT_CombinedOperatorController,0: APT_IOPort: read failed on [fd 5: L16.34.70.10:40212, R16.34.70.10:11028], errno 104 (Connection reset by peer)

APT_CombinedOperatorController,0: Virtual data set.; input to "inserted tsort operator {key={value=C1, subArgs={asc, nulls={value=first}, cs}}, key={value=C2, subArgs={asc, nulls={value=first}, cs}}, key={value=C3, subArgs={asc, nulls={value=first}, cs}}, key={value=C4, subArgs={asc, nulls={value=first}, cs}}, key={value=C5, subArgs={asc, nulls={value=first}, cs}}}(1)": getRecord DM read error; returning false.

Please help.

Thanks,
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well... the first bit of advice in situations like this is always going to be to add $APT_DISABLE_COMBINATION to the job and set it to TRUE. This will allow you to know what the actual origin of the error is.
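
For anyone who starts the job from a script rather than from the client, the override above could be passed at run time. The following is only a sketch under assumptions: it assumes $APT_DISABLE_COMBINATION has already been added to the job as an environment variable parameter, and the project name "MyProject" and job name "JoinJob" are hypothetical placeholders. It only builds the dsjob command line and prints it; it does not run anything.

```python
import shlex


def build_dsjob_run(project, job, disable_combination=True):
    """Build an argv list for a dsjob run that overrides the variable.

    Assumes the job already exposes $APT_DISABLE_COMBINATION as an
    environment variable parameter; project and job are placeholders.
    """
    value = "True" if disable_combination else "False"
    return [
        "dsjob", "-run",
        "-param", f"$APT_DISABLE_COMBINATION={value}",
        "-jobstatus",  # wait for the job and report its finishing status
        project, job,
    ]


# Hypothetical names, for illustration only.
argv = build_dsjob_run("MyProject", "JoinJob")

# shlex.join quotes the "$" so a shell would not try to expand it.
print(shlex.join(argv))
```

Keeping the command as a list avoids shell quoting problems with the leading "$" if it is later handed to subprocess.run.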

And this probably isn't important but I'm not really sure what a "Big Integ environment" is. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
subashri
Premium Member
Posts: 2
Joined: Tue Nov 16, 2010 11:21 pm
Location: Chennai

Post by subashri »

Hi Craig,
The issue was resolved by adding the environment variable. I think this will impact job performance, since operator combination is disabled.
I would like to know whether there is another workaround that does not impact job performance.

By the way, I was trying to shorten the term Big Integrate in my original post :)

Thanks for your help
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just so you know, that variable was not meant to solve anything. It simply helps with debugging by stopping the operator combination that the underlying framework does to make things more efficient. The goal was to get from an error message attributed to "APT_CombinedOperatorController" to one that names the actual operator with the issue.
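
As a rough illustration of that point, here is a minimal Python sketch that checks whether a log line is attributed to the combined controller instead of a named operator. The "operator,partition: message" layout is an assumption inferred from the two log lines quoted earlier in this thread, and the function name triage is made up for the example.

```python
import re

# Assumed layout, inferred from the log lines quoted in this thread:
#   <operator>,<partition>: <message>
LOG_LINE = re.compile(
    r"^(?P<operator>[^,:]+),(?P<partition>\d+):\s*(?P<message>.*)$"
)
ERRNO = re.compile(r"errno (\d+) \(([^)]*)\)")


def triage(line):
    """Split a log line and flag whether the real operator is hidden."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # line does not follow the assumed layout
    err = ERRNO.search(match.group("message"))
    operator = match.group("operator")
    return {
        "operator": operator,
        "partition": int(match.group("partition")),
        # True when the message comes from the combined controller,
        # so the originating operator cannot be read from the line.
        "combined": operator == "APT_CombinedOperatorController",
        "errno": int(err.group(1)) if err else None,
        "errno_text": err.group(2) if err else None,
    }


sample = (
    "APT_CombinedOperatorController,0: APT_IOPort: read failed on "
    "[fd 5: L16.34.70.10:40212, R16.34.70.10:11028], "
    "errno 104 (Connection reset by peer)"
)
info = triage(sample)
if info and info["combined"]:
    print("Origin hidden by operator combination; "
          "rerun with $APT_DISABLE_COMBINATION set to True")
```

Once combination is disabled, the same check should show a specific operator name in the "operator" field for the failing line.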

Others who have been in the same situation can chime in with advice on what to do next. I'd suggest opening a support case, letting them know what's going on, and seeing what they suggest. Are you current on the fix packs for your version?
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I would second Craig on that. Your error is likely intermittent and will happen again. It would help to prepare and open a Support case about it.
Choose a job you love, and you will never have to work a day in your life. - Confucius