Throttling performance on a grid?

dqm
Premium Member
Posts: 18
Joined: Fri Oct 26, 2007 12:03 am

Post by dqm »

Hi everyone,

We have a grid setup that is highly utilised already.
Now another group of developers needs access, but we want to limit the amount of resources they can use, to reduce the potential performance impact on the other jobs that are running (some of them quite critical).

Is there a way to do this? How do you configure things so some projects have access to some of the CPUs, and some have access to all of them?

TIA.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Electrify their keyboards when you want them not to work.
:lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Usually in a grid environment, that's where your resource management software, such as PBS Pro, comes into play. A situation like yours is a piece of cake for PBS Pro to handle. You can even collect the resource usage data that PBS Pro generates to do charge-back to each project, department, etc.
dqm
Premium Member
Posts: 18
Joined: Fri Oct 26, 2007 12:03 am

Post by dqm »

I'm not worried about charging the right cost centre, but about being able to limit the amount of processing power they can possibly divert from the other, more critical processes.
Is it as simple as defining a config file with a reduced list of nodes? Or is there more to it?
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Yes, you configure that kind of information (how many CPUs, the job's priority, etc.) in PBS Pro, and it passes that information on to the Grid enablement software. There's much more to it.
dqm
Premium Member
Posts: 18
Joined: Fri Oct 26, 2007 12:03 am

Post by dqm »

lstsaur wrote: Yes, you configure that kind of information (how many CPUs, the job's priority, etc.) in PBS Pro, and it passes that information on to the Grid enablement software. There's much more to it.
So it's not something you can do in the DataStage config file, to limit how much of the grid a project's jobs can use?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You can certainly create a configuration file that requests fewer nodes than usual, but you can't prevent "them" from starting zillions of jobs each of which will dynamically request that number of nodes from the grid.

Some grid management software will let you throttle the number of nodes that a particular user may request, but that's not a feature within DataStage.
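
To picture the first point, here is a minimal sketch (Python, purely illustrative) that writes a parallel configuration file restricted to a short list of hosts. The host names, the directory paths and the helper name build_apt_config are assumptions made up for the example. As noted elsewhere in this thread, a grid-enabled install may generate the file for you, so treat this only as a picture of what "fewer nodes" looks like in the file.

```python
def build_apt_config(hosts, partitions_per_host=1,
                     disk="/data/ds/resource", scratch="/data/ds/scratch"):
    """Return the text of a parallel config file that uses only `hosts`.

    All host names and paths passed in are hypothetical placeholders.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    lines = ["{"]
    node_no = 0
    for host in hosts:
        # One logical node per requested partition on each host.
        for _ in range(partitions_per_host):
            node_no += 1
            lines += [
                f'  node "node{node_no}"',
                "  {",
                f'    fastname "{host}"',
                '    pools ""',
                f'    resource disk "{disk}" {{pools ""}}',
                f'    resource scratchdisk "{scratch}" {{pools ""}}',
                "  }",
            ]
    lines.append("}")
    return "\n".join(lines) + "\n"


# Two hosts out of the whole grid, one partition each (hypothetical names).
restricted = build_apt_config(["compute01", "compute02"])
print(restricted)
```

A job picks the file up through APT_CONFIG_FILE. This only shrinks what a single job asks for; it does nothing about how many such jobs are started at once.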
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

No, once your DataStage server is "gridified", your config_file is generated dynamically for you, based on the resource allocation configured in PBS Pro. That means I can configure your project to always run on a particular server with only 1 CPU. You can also control your project's job priority and time parameters in there, so even if zillions of jobs are submitted from that project, they will always be held in the queue until the priority and time conditions are met.
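
As a rough illustration of the "held in the queue" idea, here is a toy model in Python. It is not PBS Pro's actual scheduling algorithm; the function name dispatch, the per-project cap and the job and project names are all assumptions for the sketch. It only shows that jobs from a capped project wait while other projects' jobs keep starting.

```python
from collections import deque


def dispatch(pending, running, caps):
    """Start queued jobs in order, holding jobs whose project is at its cap.

    pending: deque of (job_name, project) in submission order (mutated)
    running: dict project -> number of jobs currently running (mutated)
    caps:    dict project -> max concurrent jobs (missing means unlimited)
    Returns the list of job names started; held jobs stay in `pending`.
    """
    started = []
    held = deque()
    while pending:
        job, project = pending.popleft()
        cap = caps.get(project)
        if cap is not None and running.get(project, 0) >= cap:
            held.append((job, project))  # stays queued for a later pass
            continue
        running[project] = running.get(project, 0) + 1
        started.append(job)
    pending.extend(held)
    return started


# Hypothetical workload: the "dev" project is limited to one running job.
pending = deque([("dev1", "dev"), ("crit1", "prod"),
                 ("dev2", "dev"), ("dev3", "dev")])
running = {}
started = dispatch(pending, running, {"dev": 1})
print(started, list(pending))
```

In a real resource manager the limit would be expressed in its own configuration (CPUs, priority, time windows) rather than in code like this.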
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Do ALL variants of grid management software work that way? In particular, does Sun Grid Manager behave thus?
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Yes, Sun Grid Engine can perform all those functions too. Even Sun's command-line tools, such as qsub, qrls, qhold, and qdel, are exactly the same as in PBS Pro.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Good question, very good answers. Learned something new today. Thanks!


...and no Ray - I wasn't talking about electrifying keyboards!

:D
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
dqm
Premium Member
Posts: 18
Joined: Fri Oct 26, 2007 12:03 am

Post by dqm »

Agreed, Andy: awesome responses. Thanks for all of your help, everyone.
:D
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

We're a grid shop.

I would investigate submitting Project X (which you want to run on a subset of your total GRID compute nodes) to a different Queue.

Your Grid Resource Manager defines which servers can service which queues. So, if you want to limit them to 5 out of 20 grid servers, do it via your queue.

You can also submit them to a lower priority queue. This will only affect their wait time in the queue. Once dispatched to the GRID, you will not be able to lower their priority since DataStage is not really using the GRM to its fullest potential.
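
For what it's worth, here is a small Python sketch of what "submit to a different queue at a lower priority" looks like as a plain qsub command line. The queue name projx_q, the script name and the helper build_qsub_command are made up for the example. In a grid-enabled DataStage setup the submission is normally done for you by the grid tooling rather than by hand, so this is only meant to show the idea. The -q (queue) and -p (priority) options exist in both PBS Pro and Sun Grid Engine; the range check below is a conservative assumption that should fit both.

```python
import shlex


def build_qsub_command(script, queue, priority=None):
    """Build (but do not run) a qsub argument list for a given queue."""
    cmd = ["qsub", "-q", queue]
    if priority is not None:
        # Conservative range assumed to be valid for both PBS Pro and SGE.
        if not -1023 <= priority <= 1023:
            raise ValueError("priority must be between -1023 and 1023")
        cmd += ["-p", str(priority)]
    cmd.append(script)
    return cmd


# Hypothetical queue that the resource manager maps to 5 of the 20 servers.
cmd = build_qsub_command("run_project_x.sh", "projx_q", priority=-100)
print(shlex.join(cmd))
```

Which hosts serve projx_q is still defined on the resource manager side; the command only chooses the queue and the wait-time priority.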