Long time in completion of job, after executing all stages

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am


Post by sensiva »

Hello

I have been seeing a strange issue with my jobs and sequences since last week. Jobs and sequences that normally take around 2 minutes to complete have been taking about 15 minutes. Disk space and CPU consumption are normal. Looking at the logs, I noticed something odd: the job actually completes as expected in a couple of minutes, with an INFO entry saying "Parallel job reports successful completion", followed by a CONTROL entry saying "Finished Job xxx". The strange thing is that the gap between these two entries is nearly 5 minutes.

We are planning to restart the server next week to see if that resolves it, but I thought I would post here in case someone could share pointers for finding the actual root cause.

The delay occurs for every job, irrespective of the connectors used (file, DB, etc.). If the job processes millions of records, the gap between the INFO and CONTROL entries is about 5 minutes; if it processes thousands of records, the gap is 1-2 minutes.
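
To compare the gap across jobs, something like the following could be used. It is only a minimal sketch: it assumes the log entries have already been exported as (timestamp, entry type, message) records, and the `LogEntry` shape and `completion_gap` name are hypothetical, not part of any DataStage API.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical record shape: (timestamp, entry type, message).
LogEntry = Tuple[datetime, str, str]


def completion_gap(entries: Iterable[LogEntry]) -> Optional[timedelta]:
    """Return the time between the INFO completion entry and the
    CONTROL "Finished Job" entry, or None if either one is missing."""
    done_at = None
    for ts, kind, msg in entries:
        if kind == "INFO" and "reports successful completion" in msg:
            # Remember when the parallel job reported completion.
            done_at = ts
        elif (kind == "CONTROL" and msg.startswith("Finished Job")
              and done_at is not None):
            return ts - done_at
    return None
```

Running this over the exported entries of several jobs would show whether the gap really scales with the record count.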

Thanks
sen
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Log message tip of the day:

The timestamp of an entry in the log is not the time the event happened; it is the time the event was actually WRITTEN to the log.

Think of the log as a print queue. You "may" have had some messages stuck in the queue to be written to the log, and they finally got written when the flush of the queue was done.
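
The print-queue idea can be illustrated with a toy model. This is only a sketch of the general behaviour, not how DataStage implements its logging; the `QueuedLog` class, the injected `clock`, and `demo` are all hypothetical names made up for the illustration.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class QueuedLog:
    """Toy log whose entries are stamped when written, not when raised."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._pending: Deque[Tuple[float, str]] = deque()
        self.entries: List[Tuple[float, str]] = []

    def emit(self, message: str) -> None:
        # Record the real event time, then park the message in the queue.
        self._pending.append((self._clock(), message))

    def flush(self) -> None:
        # Each queued message is stamped with the time of the write,
        # so the real event time is lost from the visible log.
        while self._pending:
            _event_time, message = self._pending.popleft()
            self.entries.append((self._clock(), message))


def demo() -> List[Tuple[float, str]]:
    now = [0.0]                      # fake clock, in seconds
    log = QueuedLog(clock=lambda: now[0])
    log.emit("job finished")         # event happens at t=0
    now[0] = 300.0                   # five minutes pass before the flush
    log.flush()
    return log.entries
```

In this model the entry shows a timestamp five minutes after the event, even though nothing was running during that time.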



============


This may or may not be associated with what you are seeing, but I thought it best to mention that little tidbit.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Do you have a long running (and silent) after-job subroutine?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am

Post by sensiva »

@ Paul --> I completely agree with you. Even so, all the INFO entries are written to the log either at the same time or with the expected delay for each operation. I can see the log entries spread over 45-50 seconds, covering the actual work of the job in the correct sequential order. But the CONTROL entry arriving 2 to 5 minutes later is completely out of sync, and there is no trace in the logs explaining that delay.

@ Ray --> No, we don't have any after-job subroutine. It happens for all jobs.

Another observation: the Job Monitor, which shows the CPU usage and the number of records processed per second, always stays in status "READY" even after the job has finished, and does not display any statistics.

We did restart the server and we no longer see the issue. I am still eager to know the root cause, though. I will check on my side whether I can find any other strange observations.

Thanks
sen