Timed out while waiting for an event

richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

We are running 2 parallel jobs in a Job Sequence. If we run the jobs individually there are no problems. But if we run the jobs in a Job Sequence we are getting this timeout problem.

Code:

Warning :
GFSTCPMCWS2150AcctDept..JobControl (@PJOB_GFSTCPMCWJ2150AcctDeptCdc): Controller problem: Error calling DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1), code=-14
[Timed out while waiting for an event]

Info :
GFSTCPMCWS2150AcctDept..JobControl (@Coordinator): Summary of sequence run
08:58:07: Sequence started (checkpointing on)
08:58:07: PJOB_GFSTCPMCWJ2150AcctDeptCdc (JOB GFSTCPMCWJ2150AcctDeptCdc) started
08:59:09: Exception raised: @PJOB_GFSTCPMCWJ2150AcctDeptCdc, Error calling DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1), code=-14 [Timed out while waiting for an event]
08:59:09: Sequence failed (restartable)

Fatal :
GFSTCPMCWS2150AcctDept..JobControl (fatal error from @Coordinator): Sequence job (restartable) will abort due to previous unrecoverable errors

Any help on this will be appreciated.

Thanks and Regards
Rich
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This has been discussed before, for example here.
Essentially, your job is taking too long to start. You need to investigate why this is. What's the elapsed time between the start request and the actual job start (these times should be logged)? There's a hard-wired limit that generates the -14 error code.
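To put a number on that gap, the timestamps from the sequence summary in the first post can simply be subtracted. This is only a minimal Python sketch: the helper name `elapsed_seconds` is made up for illustration, and it assumes both timestamps are in HH:MM:SS form and fall on the same day.

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS log timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Timestamps copied from the sequence summary in the first post.
started = "08:58:07"  # job activity started
failed = "08:59:09"   # exception raised with code=-14

print(elapsed_seconds(started, failed))
```

For the log in the first post this gives 62 seconds, so the sequence gave up roughly a minute after requesting the job start.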
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Ray,

Before I posted this topic I searched the forum and found two posts on the same topic, including the one you mentioned. One of the solutions was to change the "Reset if required and then Run" option. We were using this option for all the jobs in the sequence, so I changed it to "Run" and ran the sequence again, but the result was the same.

The odd thing is that the job the sequence is waiting for actually runs and ends with a status of Finished, yet the sequence still aborts.

Any thoughts on this?

Thanks
Rich

Pride comes before a fall
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Quick first thoughts are to monitor the system to see whether you are running out of some critical resource. For CPU, monitor %Idle (very low is bad); for memory, monitor PF/sec (page faults per second; high is bad); for physical disk I/O, monitor IO/sec (high is bad).
You could also check whether you're running out of slots in the T30FILE table for open dynamic hashed files. Yes, I know your jobs are parallel jobs, but sequences use hashed files in the Repository. What else is happening in DataStage at these times? What else is happening on the machine at these times? Are there any Orchestrate bottlenecks (sorry, I can't help with how to diagnose that one)?
rsrikant
Participant
Posts: 58
Joined: Sat Feb 28, 2004 12:35 am
Location: Silver Spring, MD

Post by rsrikant »

I believe this error is very misleading.

I am going through the error message you have posted.

Code:

DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1)

Is the option "Allow Multiple Instances" enabled in your sequence? It should be, which is why the job name is followed by .1
But why is there a space between the . and the 1?
DSRunJob won't take more than one argument (if I am not wrong), and the job name cannot contain spaces.

Is this a typing mistake, or is it the exact error message you got?

Do check this and see if it gives any clue to your problem.
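A check like this can also be run mechanically over the log text. The following is a rough Python sketch, not anything DataStage provides: the helper names `find_job_argument` and `has_whitespace` are hypothetical, and the message string is copied from the log posted above.

```python
import re

def find_job_argument(message: str):
    """Return the text inside DSRunJob(...) from a log message, or None."""
    match = re.search(r"DSRunJob\(([^)]*)\)", message)
    return match.group(1) if match else None

def has_whitespace(text: str) -> bool:
    """True if any character in the text is whitespace."""
    return any(ch.isspace() for ch in text)

# Message copied from the sequence log in the first post.
message = ("Error calling DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1), code=-14 "
           "[Timed out while waiting for an event]")

arg = find_job_argument(message)
if arg is not None and has_whitespace(arg):
    # Split on the first dot: job name before it, invocation id after it.
    job, _, invocation = arg.partition(".")
    print(f"suspicious whitespace: job={job!r} invocation={invocation!r}")
```

Printing the parts with `!r` puts them in quotes, which makes a leading or trailing space easy to spot.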

Thanks,
Srikanth
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi Srikanth,

Thanks for the post. It has resolved the issue. We were passing a parameter as the invocation id, and the string value supplied for it had a leading space.

After correcting the parameter value, the sequence ran successfully without any problems.
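For anyone hitting the same thing, the idea of the fix can be sketched as below. This is illustrative Python only, with a hypothetical helper `job_reference`; in a real sequence the equivalent trimming has to happen wherever the parameter value is set, before it is used as the invocation id.

```python
def job_reference(job_name: str, invocation_id: str) -> str:
    """Build 'job.invocation' after trimming surrounding whitespace."""
    cleaned = invocation_id.strip()
    if not cleaned:
        raise ValueError("invocation id is empty after trimming")
    if any(ch.isspace() for ch in cleaned):
        # Whitespace inside the id cannot be fixed by trimming alone.
        raise ValueError(f"invocation id contains whitespace: {cleaned!r}")
    return f"{job_name}.{cleaned}"

# The value with a leading space, as in the failing run.
print(job_reference("GFSTCPMCWJ2150AcctDeptCdc", " 1"))
```

With the leading space stripped, the reference comes out as GFSTCPMCWJ2150AcctDeptCdc.1 rather than the "GFSTCPMCWJ2150AcctDeptCdc. 1" seen in the error message.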

Thanks
Rich