It's a parallel job with the following design:
              Lk1  Lk2  Lk3 .. Lkn
               |    |    |      |
ORADB -------> Lookup -----------> Transformer -------> DataSet
This job usually runs fine, but at times I get the following error:
main_program: Fatal Error: Unable to start ORCHESTRATE job: APT_PMwaitForPlayersToStart failed while waiting for players to confirm startup. This likely indicates a network problem.
Status from APT_PMpoll is 0; node name is node2
I reset the job, and when I reran it I got a different error:
node_node2: Fatal Error: Unable to start ORCHESTRATE process on node node2 (colby): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space
When I ran it for the third time, I got:
repartition(2),0: Fatal Error: Unable to start ORCHESTRATE network connection on node node1(colby):READYWAIT failed: parallel {natural="/chshttp/dsoweb/group4/ncbd/clients/dell/hra/logs/NGNHraEductnLvlCd_129.ds", synthetic="input repartition(2)"}(2, 0)
Finally, the job completed successfully when I ran it for the fourth time.
It's a 4-node APT configuration, and there is enough space on the resource disk and scratch disk.
Free memory was also available.
I thought splitting the job could help, but since the final run succeeded without any changes, I am not sure what needs to be done.
Any suggestions would be helpful.
Thanks,
Vinod