Interaction between a Grid and WLM?

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Can anyone please share their experiences with (and any technical documentation describing) the interaction between execution of DataStage jobs on a grid and the DataStage workload management (WLM) system?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
daignault
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm

Post by daignault »

When you install the Grid Resource Toolkit, you substitute scripts for the 'osh' process, which is renamed to osh.exe. You also create a "prototype" apt file with dummy values that can be substituted later. When you submit jobs under DataStage, you execute the dsjob command, and when the conductor invokes the osh script, the script queries Platform LSF.

The script reviews specific environment variables, such as the number of requested compute nodes and partitions. It then creates a REAL apt file from the prototype file, naming the compute nodes required. The osh script then runs osh.exe to execute the job against that APT file.
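To make the prototype-to-real substitution concrete, here is a rough Python sketch. It is not the toolkit's actual code: the placeholder names, the disk paths, and the function `build_apt_config` are all hypothetical, and the host list stands in for whatever compute nodes the resource manager hands back. Only the general layout of a parallel configuration file (node, fastname, pools, resource disk, resource scratchdisk) is taken from how APT files normally look.

```python
from string import Template

# Hypothetical "prototype" node entry with dummy values to be substituted.
# The placeholder names ($node_name, $host) are assumptions for illustration.
NODE_PROTOTYPE = Template(
    '  node "$node_name"\n'
    "  {\n"
    '    fastname "$host"\n'
    '    pools ""\n'
    '    resource disk "$disk" {pools ""}\n'
    '    resource scratchdisk "$scratch" {pools ""}\n'
    "  }\n"
)


def build_apt_config(hosts, partitions_per_host,
                     disk="/data/datasets", scratch="/data/scratch"):
    """Return the text of a configuration file with one node entry
    per partition on each granted compute host."""
    entries = []
    node_number = 1
    for host in hosts:
        for _ in range(partitions_per_host):
            entries.append(NODE_PROTOTYPE.substitute(
                node_name=f"node{node_number}",
                host=host,
                disk=disk,
                scratch=scratch,
            ))
            node_number += 1
    return "{\n" + "".join(entries) + "}\n"


if __name__ == "__main__":
    # Pretend the resource manager granted two compute hosts (made-up names).
    print(build_apt_config(["compute1", "compute2"], partitions_per_host=2))
```

With two hosts and two partitions per host, this produces four node entries, two pointing at each host. The real toolkit script would write the result to a file and point the job at it before handing off to osh.exe.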

My site is running the CA7 scheduler. There are sites here running ControlM and other scheduling software. Although I have not actually dispatched any jobs to the grid with Workload Manager, I'm fairly confident it will work, based on the way the Grid toolkit works.

Hope this helps
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

WLM is not involved with GRID at all. It simply throttles the work on the Head Node. So you could have both enabled, but if the Head Node gets too busy the next job waits at a point even before it would reach the Dynamic_grid.sh code that the toolkit introduces.

I turn WLM off on my grid setups. I find that the threshold for allowing jobs to run is a tad too low, and I get more support calls about jobs hanging than I care for.
ray.wurlod

Post by ray.wurlod »

Thanks, Ray and Paul. This is pretty much how I understood things to be. But my client is hyper-cautious and asked me to ask around.
PaulVL

Post by PaulVL »

IMHO...

WLM is useful when you are using the internal job scheduler. If you are using an external one, you can throttle the concurrent jobs that way. WLM is best used when it's just a solo box. I have it turned off on all of my setups, as I take the job scheduler approach. Throttling just the submission of a DataStage job doesn't mean much when the customer often has wrapper scripts that may chew up a ton of CPU and memory. If you do not throttle those, you get a catch-22 situation where your DataStage job is held while your wrapper script keeps running and will only complete once the DS job does...
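A toy model of that catch-22, in Python, may help picture it. This is only an illustration and not how WLM actually makes its decisions: the single CPU threshold, the per-wrapper and per-job CPU costs, and the function `simulate` are all made-up assumptions. The point is just that unthrottled wrappers can keep the gate closed for the very jobs they are waiting on.

```python
def simulate(wrappers, wrapper_cpu, job_cpu, cpu_threshold, max_ticks=100):
    """Return how many jobs complete within max_ticks.

    Assumptions (hypothetical): every wrapper burns wrapper_cpu percent
    until its job finishes, a queued job is admitted only while total CPU
    is below cpu_threshold, and an admitted job finishes one tick later.
    """
    waiting = wrappers   # wrappers whose job is still held in the queue
    running = 0          # jobs admitted on the previous tick
    done = 0
    for _ in range(max_ticks):
        # Jobs admitted last tick finish now, and their wrappers exit.
        done += running
        running = 0
        if done == wrappers:
            return done
        # Admit held jobs one at a time while the CPU gate allows it.
        while waiting > 0:
            cpu = (waiting + running) * wrapper_cpu + running * job_cpu
            if cpu >= cpu_threshold:
                break
            waiting -= 1
            running += 1
    return done


if __name__ == "__main__":
    # A few light wrappers: every job gets through.
    print(simulate(wrappers=3, wrapper_cpu=10, job_cpu=20, cpu_threshold=80))
    # Many wrappers: they alone exceed the threshold, so no job ever starts.
    print(simulate(wrappers=10, wrapper_cpu=10, job_cpu=20, cpu_threshold=80))
```

In the second call the wrappers by themselves sit at or above the threshold, so every job stays held and every wrapper keeps running, which is the hang described above.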


Small shop... could be useful, big boy shop... not so much.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

The one client I was at that initially had both enabled encountered problems. The site ran tens of thousands of jobs daily. With WLM enabled, they were seeing startup delays of 15 to 30 seconds per job. That would have been a serious impact on their daily load, so WLM was immediately disabled.

Once we disabled WLM the delay went away. We didn't really need WLM, so no further research was performed to see if it was a configuration problem or if that was "normal behavior" when WLM was enabled on a GRID.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020