Timed out while waiting for an event

Posted: Thu Jun 10, 2004 9:46 am
by richdhan
Hi,

We are running 2 parallel jobs in a Job Sequence. If we run the jobs individually there are no problems. But if we run the jobs in a Job Sequence we are getting this timeout problem.

Warning :
GFSTCPMCWS2150AcctDept..JobControl (@PJOB_GFSTCPMCWJ2150AcctDeptCdc): Controller problem: Error calling DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1), code=-14
[Timed out while waiting for an event]

Info :
GFSTCPMCWS2150AcctDept..JobControl (@Coordinator): Summary of sequence run
08:58:07: Sequence started (checkpointing on)
08:58:07: PJOB_GFSTCPMCWJ2150AcctDeptCdc (JOB GFSTCPMCWJ2150AcctDeptCdc) started
08:59:09: Exception raised: @PJOB_GFSTCPMCWJ2150AcctDeptCdc, Error calling DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1), code=-14 [Timed out while waiting for an event]
08:59:09: Sequence failed (restartable)

Fatal :
GFSTCPMCWS2150AcctDept..JobControl (fatal error from @Coordinator): Sequence job (restartable) will abort due to previous unrecoverable errors
Any help on this will be appreciated.

Thanks and Regards
Rich

Posted: Thu Jun 10, 2004 6:18 pm
by ray.wurlod
This has been discussed before, for example here.
Essentially, your job is taking too long to start. You need to investigate why this is. What's the elapsed time between the start request and the actual job start (these times should be logged)? There's a hard-wired limit that generates the -14 error code.
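
As a rough illustration of the elapsed-time check suggested above, the two timestamps in the posted sequence summary can be compared directly. This is only a sketch in Python, not DataStage code; the helper name `elapsed_seconds` is hypothetical and it assumes both times fall on the same day.

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two HH:MM:SS log times (same day assumed)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Times taken from the sequence summary in the first post:
# job activity started at 08:58:07, exception raised at 08:59:09.
print(elapsed_seconds("08:58:07", "08:59:09"))  # prints 62
```

So the sequence gave up about a minute after issuing the start request.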

Posted: Thu Jun 10, 2004 11:33 pm
by richdhan
Hi Ray,

Before I posted this topic I searched the forum and found two posts on the same topic. I had already gone through the post you mentioned. One of the solutions was to change the "Reset if required and then Run" option. We were using this option for all the jobs in the sequencer, so I changed it to "Run" and ran the sequencer again, but the result was the same.

The odd thing is that the job the sequencer is waiting for actually runs and ends with a status of finished, yet the sequencer aborts.

Any thoughts on this?

Thanks
Rich

Pride comes before a fall

Posted: Fri Jun 11, 2004 6:42 pm
by ray.wurlod
Quick first thoughts are to monitor the system to see whether you are running out of some critical resource. For CPU, monitor %Idle (very low is bad); for memory, monitor PF/sec (high is bad); for physical disk I/O, monitor IO/sec (high is bad).
You could also check whether you're running out of slots in the T30FILE table for open dynamic hashed files. Yes, I know your jobs are parallel jobs, but sequences use hashed files in the Repository. What else is happening in DataStage at these times? What else is happening on the machine at these times? Are there any Orchestrate bottlenecks (sorry, I can't help with how to diagnose that one)?

Posted: Sun Jun 13, 2004 10:33 pm
by rsrikant
I believe this error is very misleading.

I am going through the error message you have posted.

DSRunJob(GFSTCPMCWJ2150AcctDeptCdc. 1)
Is the option "Allow Multiple Instances" enabled in your sequence? It must have been enabled; that's why the job name is followed by a .1
But why is there a space between the . and the 1?
DSRunJob won't take more than one argument (if I am not wrong), and the job name cannot contain spaces.

Is it a typing mistake, or is it the exact error message you got?

Do check this and see if it gives any clue to your problem.

Thanks,
Srikanth

Posted: Mon Jun 14, 2004 3:32 am
by richdhan
Hi Srikanth,

Thanks for the post. It has resolved the issue. We were using a parameter as the invocation id, and that parameter had been given a string value with a leading space.

After changing the parameter value and running the sequence it ran successfully without any problems.

Thanks
Rich
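
A minimal sketch of the fix described above, written in Python purely for illustration (it is not DataStage BASIC and does not call any DataStage API). The helper name `qualified_job_name` is hypothetical; it only shows the idea of trimming the invocation id parameter before it is appended to the job name, so that `" 1"` no longer produces `JobName. 1`.

```python
def qualified_job_name(job_name: str, invocation_id: str) -> str:
    """Build 'JobName.InvocationId' after trimming stray whitespace.

    Hypothetical helper: it mirrors the 'job.invocation' form seen in the
    error message, but is not part of any DataStage interface.
    """
    cleaned = invocation_id.strip()
    if not cleaned:
        raise ValueError("invocation id is empty after trimming")
    if any(ch.isspace() for ch in cleaned):
        raise ValueError(f"invocation id contains whitespace: {invocation_id!r}")
    return f"{job_name}.{cleaned}"

# The parameter value with a leading space that triggered the failure:
print(qualified_job_name("GFSTCPMCWJ2150AcctDeptCdc", " 1"))
# prints GFSTCPMCWJ2150AcctDeptCdc.1
```

In a real sequence the equivalent step would be to correct or trim the parameter value wherever it is set, as was done here.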