Realistic upper limit on the quantity of jobs in one Project

PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Have a question for the other big shops out there...

What is the realistic max number of unique jobs (jobs + invocation IDs) that the Director tool can handle in one Project?

I have a project that generates about 8,000 unique entries in Director per day. They are looking to expand their flow to include other countries, so I'm looking at about 4 times as many jobs per day.

Has anyone encountered the max realistic upper limit on Director entries yet?

I know what is stated in the manual, but that has not been encountered in the real world yet.

Who's out there that is pushing the upper limits of the tool?
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

My suggestion is, don't use Director. Roll your own system that runs dsjob to query the job logs and statuses that you need to know about.
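A minimal sketch of that idea in Python, not the poster's actual tooling. It assumes `dsjob` is on the PATH of the engine host and that `dsjob -jobinfo <project> <job>` prints `Key : Value` lines such as `Job Status : RUN OK (1)`. The function names and the sample text are hypothetical, and the exact output layout may differ between versions.

```python
import subprocess


def jobinfo_command(project, job):
    """Build the dsjob command line for one job (or Job.InvocationId)."""
    return ["dsjob", "-jobinfo", project, job]


def parse_jobinfo(text):
    """Turn 'Key : Value' lines into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            info[key.strip()] = value.strip()
    return info


def job_status(project, job):
    """Run dsjob on the engine host and return the 'Job Status' field, if any."""
    result = subprocess.run(
        jobinfo_command(project, job),
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves rather than raising on a non-zero exit
    )
    return parse_jobinfo(result.stdout).get("Job Status")


# Illustrative sample only, not captured from a real server.
SAMPLE_OUTPUT = (
    "Job Status\t: RUN OK (1)\n"
    "Job Start Time\t: Tue Mar 13 10:00:00 2012\n"
)

if __name__ == "__main__":
    print(parse_jobinfo(SAMPLE_OUTPUT))
```

Looping `job_status` over only the jobs you care about avoids having Director fetch details for every entry in the project.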

I do this for my own purposes from an Excel spreadsheet: VBA code generates a script, the script is executed on the Unix server, and the results are scraped back into the spreadsheet.

I've also built some mainframe jobs and shell scripts to check job statuses and fetch all the warnings and errors plus the "Summary of sequence run" entry from the log so a mainframe operator can see all the info without leaving the comfort of the 3270 emulator.
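A hedged Python sketch of pulling warnings and errors from a job log (the poster used shell scripts and mainframe jobs; this is only an illustration). It assumes `dsjob -logsum -type <TYPE> [-max n] <project> <job>` is available on the engine host and that errors appear as `FATAL` entries. The function names are hypothetical, and the raw output is returned unparsed because its layout is not shown in this thread.

```python
import subprocess


def logsum_command(project, job, entry_type, max_entries=None):
    """Build a dsjob -logsum command for one log entry type."""
    cmd = ["dsjob", "-logsum", "-type", entry_type]
    if max_entries is not None:
        cmd += ["-max", str(max_entries)]
    return cmd + [project, job]


def fetch_problem_entries(project, job, run=subprocess.run):
    """Return raw log summary text per entry type for WARNING and FATAL.

    `run` is injectable so the function can be exercised without a server.
    """
    output = {}
    for entry_type in ("WARNING", "FATAL"):
        result = run(
            logsum_command(project, job, entry_type),
            capture_output=True,
            text=True,
            check=False,
        )
        output[entry_type] = result.stdout
    return output
```

Keeping command construction separate from execution makes it easy to log or review the exact command lines before they run in a locked-down environment.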

The Excel method is also a great way of pulling back large amounts of Peek stage information because I can just copy and paste it all from there into a text editor. Search this forum for DataStageAnalysis if you want to have a look at my Excel implementation.
Phil Hibbs | Capgemini
Technical Consultant
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

But I am certain that the extra 8,000 entries in Director are instances, which are stored in the single log file for that job and don't impact performance, apart from the size of the log file.
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

Whichever it is - one job with 8,000 instances or 16 jobs with 500 instances each - there will be a performance hit on Director retrieving the stats and displaying all 8,000 entries in the browse list. I think that is what the OP is concerned about: that Director will retrieve the Status, Started, On Date, Last Ran, etc. for all 8,000 job instances when the user clicks on the folder.

On my last project, using v7, it took 10 minutes just to start up Director due to the number of Job Categories in the project. This was the impetus behind the creation of my Excel VBA tool.
Phil Hibbs | Capgemini
Technical Consultant
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

As an admin, I tend to turn off Folder View from Director.

It's killing me to open that project and find aborted jobs. 10 to 15 minutes is a normal wait time for that bad boy.

Then I have to turn off auto refresh interval since that would cause naughty words to leave my mouth.

Now they will throw 4 times more jobs into the project.

So ya... Who else out there has encountered these issues and how did you overcome them?

(I'll be looking at your Excel sheets, Phil)

But that will be a very hard sell since we have a secure PROD environment where users do not have access to the back end Linux server (engine).
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Paul, my answer looks a bit like bashing you with the obvious, but I have read your concerns and I'm not denigrating them.

Simply put: break your 8,000 jobs down into smaller, logical work units and build a folder system to organize them.

I believe that means making each work unit its own project, and trading off the long load/refresh for having to go through the log-on process for each project.

Our largest project is about 1/4 the size of yours, and it spans a dozen or so DataStage projects. The documentation for it all is cumbersome, but again that's part of the trade-off with having everything "at your fingertips".
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I would script a nightly compile. I can post either a DOS batch file or a VBScript to do it. That many instances would drive me crazy.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Each job log is a hashed file, with 32-bit addressing. The limit on the total volume of entries is, therefore, 2GB by default. How many actual entries this is depends on how large the various entries are.

If you choose to convert particular log hashed files to 64-bit addressing, then the limit on volume becomes ridiculously large (theoretically 1.9 million TB) but you'll run out of disk space before hitting that. Even at your site!
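As a rough worked example of the default 32-bit case, the sketch below turns the 2GB ceiling into an approximate entry count. The average entry sizes are purely hypothetical inputs, and the result is an upper bound because it ignores any overhead inside the hashed file itself.

```python
# 2GB ceiling implied by 32-bit addressing, taken here as 2**31 bytes.
LIMIT_32BIT_BYTES = 2 ** 31


def max_log_entries(avg_entry_bytes, limit_bytes=LIMIT_32BIT_BYTES):
    """Upper bound on entries that fit under the limit for a given average size."""
    if avg_entry_bytes <= 0:
        raise ValueError("avg_entry_bytes must be positive")
    return limit_bytes // avg_entry_bytes


if __name__ == "__main__":
    # Hypothetical average entry sizes, in bytes.
    for size in (256, 512, 1024):
        print(size, max_log_entries(size))
```

Measuring the real average entry size in one of your own logs would give a far better estimate than any of these assumed values.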
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.