Unable to end the Process DSD.OshMonitor MSEVENTS.FALSE

Posted: Tue Jul 25, 2017 5:27 am
by kagliwal.a
Hello Experts,

We are facing a strange issue with one of our jobs.
The job completes, but it takes more than 1-2 hours to return its status to the head node.
Below is the relevant output from the job log:

Item #: 111
Event ID: 4151752
Timestamp: 2017-07-25 02:58:36
Type: Info
User Name: test
Message Id: IIS-DSTAGE-RUN-I-0124
Message: Parallel job reports successful completion
Item #: 112
Event ID: 4152177
Timestamp: 2017-07-25 03:58:34
Type: Control
User Name: test
Message Id: IIS-DSTAGE-RUN-I-0077
Message: Finished Job <jobname>


We have also checked on our end that the load and other parameters are normal.
While checking from the backend, we found that a process is still running even after the job has completed successfully.
Below is the output of the command, run after the job finished:

ps -ef | grep -i <Jobname>
test 10144 1 99 Jul24 ? 1-02:20:41 phantom DSD.OshMonitor <jobname> 9411 MSEVENTS.FALSE
root 13692 9540 0 05:19 pts/0 00:00:00 grep --color=auto -i <jobname>
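
The check above can be narrowed so that it reports only job-monitor processes that have been re-parented to init (PPID 1), like PID 10144 in the listing. This is a minimal sketch, not an official DataStage tool: the function name find_orphan_monitors is hypothetical, and it assumes the usual Linux `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD).

```shell
# Read `ps -ef` output on stdin and print PID, CPU column and accumulated
# CPU time for every "phantom DSD.OshMonitor" process whose parent is PID 1.
# Assumption: columns are UID PID PPID C STIME TTY TIME CMD, with no
# embedded spaces in STIME (as in the listing above).
find_orphan_monitors() {
    awk '$3 == 1 && /phantom DSD\.OshMonitor/ { print $2, $4, $7 }'
}

# Typical use (run as a user allowed to see the DataStage processes):
#   ps -ef | find_orphan_monitors
```

Fed the two lines shown above, the filter keeps only the DSD.OshMonitor line and skips the grep process itself, because the grep process has a different parent PID.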

Thanks in advance :)

Posted: Tue Jul 25, 2017 6:16 am
by priyadarshikunal
Did you check if you have any after job subroutine in that job?

Posted: Tue Jul 25, 2017 6:30 am
by kagliwal.a
Yes, I checked; there is no after-job subroutine in that job.
Earlier the job used to finish in just 12 minutes, but for the last 2 days it has been taking more than 1 hour because it gets stuck at a particular point.
We also found this process running in the background, and it has been there since that day:
test 10144 1 99 Jul24 ? 1-02:20:41 phantom DSD.OshMonitor <jobname> 9411 MSEVENTS.FALSE
root 13692 9540 0 05:19 pts/0 00:00:00 grep --color=auto -i <jobname>

Posted: Thu Jul 27, 2017 1:38 am
by priyadarshikunal
You can try setting APT_NO_JOBMON = 'True', just to check whether it is the job monitor that is causing the problem.

Setting this environment variable disables job monitoring, so the statistics will not be visible in Director.
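
One way to try this for a single run, without changing project-wide defaults, is to pass the variable as a job parameter on the dsjob command line. The sketch below only prints the command (a dry run). It assumes the variable, named APT_NO_JOBMON in IBM's documentation, has already been added to the job as an environment-variable parameter ($APT_NO_JOBMON). The helper name nojobmon_run_cmd and the project and job arguments are placeholders.

```shell
# Dry-run helper: prints the dsjob invocation instead of executing it.
# $1 = project name, $2 = job name (both supplied by the caller).
# Assumption: $APT_NO_JOBMON is defined as a parameter of the job.
nojobmon_run_cmd() {
    printf "dsjob -run -param '\$APT_NO_JOBMON=True' -jobstatus %s %s\n" "$1" "$2"
}

# Example: nojobmon_run_cmd <project> <jobname>
```

If the job then returns its status promptly, that points at the job monitor. Remove the override afterwards so that statistics show up in Director again.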