
Long time in completion of job, after executing all stages

Posted: Fri Jun 22, 2018 8:49 am
by sensiva
Hello

I have been encountering a strange issue with my jobs and sequences since last week. Jobs and sequences that normally take around 2 minutes to complete have been taking 15 minutes. Disk space and CPU consumption are normal. I noticed something strange when I looked at the logs: the job actually completes as expected in a couple of minutes, with the INFO log saying "Parallel job reports successful completion", which is followed by the CONTROL log saying "Finished Job xxx". The strange thing is that the difference between these two log entries is nearly 5 minutes.

We are planning to restart the server next week to see if it resolves, but thought of posting here to see if someone could share any pointers to find the actual root cause.

The delay occurs for each and every job, irrespective of the connectors used (file, DB, etc.). If a job processes millions of records, the difference between the INFO and CONTROL entries is 5 minutes; if it processes thousands of records, the difference is 1-2 minutes.
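To compare the gap across jobs, something like the following could be used. It is only a sketch: it assumes the log entries have already been exported and parsed into (timestamp, type, message) tuples, and the function name `completion_gap` and the sample values are made up for illustration, not part of any DataStage API.

```python
from datetime import datetime

def completion_gap(entries):
    """Return the time between the success INFO entry and the Finished CONTROL entry.

    `entries` is an iterable of (timestamp, type, message) tuples (assumed layout).
    Returns None if either entry is missing.
    """
    done = finished = None
    for stamp, kind, message in entries:
        if kind == "INFO" and "reports successful completion" in message:
            done = stamp
        elif kind == "CONTROL" and message.startswith("Finished Job"):
            finished = stamp
    if done is None or finished is None:
        return None
    return finished - done

# Illustrative values only, not taken from a real log.
sample = [
    (datetime(2018, 6, 22, 8, 2, 0), "INFO", "Parallel job reports successful completion"),
    (datetime(2018, 6, 22, 8, 7, 0), "CONTROL", "Finished Job xxx"),
]
gap = completion_gap(sample)
print(gap.total_seconds())
```

Running this over several jobs of different sizes would show whether the gap really scales with the number of records processed.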

Thanks

Posted: Fri Jun 22, 2018 11:15 am
by PaulVL
Log message tip of the day:

The timestamp of the entry in the log is not the timestamp of the event happening, it is the timestamp of the actual WRITE to the log of that event.

Think of the log as a print queue. You "may" have had some messages stuck in the queue to be written to the log, and they finally got written when the flush of the queue was done.
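A toy illustration of the print-queue idea, in Python. This is not how the engine actually implements its logging; the class `QueuedLog` and its methods are hypothetical and only show why two events raised at different times can end up with the same log timestamp.

```python
from collections import deque
from datetime import datetime, timedelta

class QueuedLog:
    """Toy log whose entries are stamped when written, not when raised."""

    def __init__(self):
        self.pending = deque()   # messages raised but not yet written
        self.written = []        # (write_time, message) pairs

    def emit(self, event_time, message):
        # The event time is known here, but it is not what the log stores.
        self.pending.append((event_time, message))

    def flush(self, write_time):
        # Every queued message receives the time of the flush.
        while self.pending:
            _event_time, message = self.pending.popleft()
            self.written.append((write_time, message))

start = datetime(2018, 6, 22, 8, 0, 0)
log = QueuedLog()
log.emit(start, "message A")                          # raised at 08:00:00
log.emit(start + timedelta(seconds=30), "message B")  # raised at 08:00:30
log.flush(start + timedelta(minutes=5))               # written at 08:05:00

for stamp, message in log.written:
    print(stamp.time(), message)
```

Both entries come out stamped 08:05:00, even though the events were raised 30 seconds apart and several minutes earlier.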



============


This may or may not be associated with what you are seeing, but I thought it best to mention that little tidbit.

Posted: Sun Jun 24, 2018 11:46 pm
by ray.wurlod
Do you have a long running (and silent) after-job subroutine?

Posted: Mon Jun 25, 2018 1:32 am
by sensiva
@ Paul --> I completely agree with you. Even in that case, all the INFO entries are written to the log at the same time or with the expected delay for each operation. I can see the log entries spread over 45-50 seconds, showing the job doing its work in the correct sequential order. But the CONTROL entry 2 to 5 minutes later is completely out of sync, and there is no trace in the logs to explain that delay.

@ Ray --> No, we don't have any after-job subroutine. It happens for all jobs.

Another observation was that the Job Monitor, which normally shows the CPU usage and the number of records processed per second, always stays in status "READY" even after the job has finished, and doesn't show any statistics.

We did restart the server and we don't see the issue any more. But I am still eager to know the root cause. Let me check from my side whether I can find any other strange observations.

Thanks