Determine which jobs have failed without using Director
Posted: Mon Aug 02, 2010 4:15 pm
Hi all,
Here's the technical stuff to set the stage for the question(s):
- Running DataStage Server Edition v.8.0.1 on AIX Unix.
- Source and target db's are SQL Server 2005 and 2000 respectively.
- The project is one code base parameterized to handle 22 separate companies.
- The project was converted from 7.5.1a, so no parameter sets are in use.
- All databases involved have the same structure, one company per source/target pair of db's.
- Total job count in project is 1253.
- Average job count per company cycle each night is around 700-800 depending on size and volume.
- Logs are retained for one day only.
Here is the problem: when we have to reset a bunch of jobs after multiple failures for whatever reason (usually error -14, but that's another story and question entirely), we currently have to use the DataStage Director and manually go through each directory, checking the logs for jobs that either aborted or ended up in some status other than runnable. We then manually reset them and resurrect the batch (we use Ken Bland's job control process).
The response times in the Director are horrid, especially if other companies are still running. I have organized the project to limit the number of jobs in a folder as this helps with the return times on the list, but it still is not fun when debugging at 2 AM when I am half awake.
Is there a way to query for the jobs that are in status 3? I want to say there is also a status in the 90's that requires resetting. I don't want to automatically reset all the jobs each time because that pushes my processing time beyond my processing window (I only have 4.5 hours to finish everything). Usually a failure leaves no more than 100 jobs in a failed condition, so it would seem inefficient to check the status of every job in the project just to find the handful needing a reset.
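To make the idea concrete, here is a rough sketch of the kind of filter I have in mind, not a tested solution. It assumes the `dsjob -jobinfo <project> <job>` command line prints a line like `Job Status : RUN FAILED (3)`, and it assumes the status in the 90's is 96 (crashed); both need verifying on the actual install. The function names and the job list passed in are hypothetical.

```python
import re
import subprocess

# Assumption: 3 = run failed; 96 = crashed is my guess at the "90's" status.
RESET_CODES = frozenset({3, 96})

# Assumption: dsjob -jobinfo prints a line such as "Job Status : RUN FAILED (3)".
STATUS_RE = re.compile(r"^Job Status\s*:.*\((\d+)\)\s*$", re.MULTILINE)


def parse_status(jobinfo_text):
    """Return the numeric status from dsjob -jobinfo output, or None if absent."""
    match = STATUS_RE.search(jobinfo_text)
    return int(match.group(1)) if match else None


def needs_reset(jobinfo_text, reset_codes=RESET_CODES):
    """True when the reported status is one of the codes that require a reset."""
    return parse_status(jobinfo_text) in reset_codes


def jobs_needing_reset(project, jobs, dsjob="dsjob"):
    """Check only the given jobs (e.g. tonight's batch), not the whole project."""
    failed = []
    for job in jobs:
        result = subprocess.run(
            [dsjob, "-jobinfo", project, job],
            capture_output=True,
            text=True,
        )
        if needs_reset(result.stdout):
            failed.append(job)
    return failed
```

Feeding it only the jobs that belong to the company cycle that failed, rather than all 1253, would keep the check short, but it still costs one `dsjob` call per job, which is the part I would like to avoid if there is a better way.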
Thoughts?
As usual, your expert input is truly appreciated.