did you know? PX does not manipulate the OS scheduler

bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Post by bigpoppa »

A common misconception about PX is that it has some control over the operating system's scheduler. In fact, PX only starts the parallel instances of a job's processes; the OS scheduler decides when each process runs and on which CPU.

For example, if a 4-way PX configuration file is in use, PX will in general start 4 parallel instances of each parallel stage in a job. You may notice several processes prefixed with "APT" when you are monitoring a PX job on a UNIX system. Some of these processes are the parallel instances of the job's stages.
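
To make the division of labour concrete, here is a rough Python sketch of the same idea. It is not PX code and says nothing about PX internals: `run_stage`, `WORKER_SOURCE` and the round-robin split are all hypothetical names and choices made for illustration. The program starts one OS process per partition and does nothing to place them on CPUs; that part is left entirely to the operating system.

```python
import subprocess
import sys

# Hypothetical stand-in for one stage instance: sums the integers it reads on stdin.
WORKER_SOURCE = "import sys; print(sum(int(line) for line in sys.stdin))"


def run_stage(rows, degree):
    """Start `degree` OS processes, one per partition, and gather their results."""
    # Round-robin partitioning (an assumption for this sketch): row i goes to partition i % degree.
    partitions = [rows[i::degree] for i in range(degree)]

    # Start every instance up front. Nothing here pins a process to a CPU;
    # the OS scheduler alone decides where and when each one runs.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in partitions
    ]

    # Feed each instance its partition and read back its single result line.
    results = []
    for proc, part in zip(procs, partitions):
        out, _ = proc.communicate("\n".join(str(row) for row in part))
        results.append(int(out))
    return results
```

Calling `run_stage(list(range(1, 9)), 4)` starts four child processes, which you could watch with `ps` on a UNIX system much as you would watch the "APT" processes of a PX job.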

Some PX stages, such as 'generator', are set to run sequentially by default. PX does not invoke parallel instances of these stages, and all of the data that needs to be processed by a sequential stage must be collected before that stage can process any data.
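
The collect-then-process step can be pictured with a small Python sketch. Again, this is only an illustration under assumptions: `collect_round_robin` and `sequential_stage` are hypothetical names, and round-robin is just one possible way of merging partitions, not a statement about what PX does by default.

```python
from itertools import zip_longest

_MISSING = object()  # marks the gaps left by partitions that run out of rows early


def collect_round_robin(partitions):
    """Merge per-partition outputs into one stream, taking one row from each in turn."""
    merged = []
    for group in zip_longest(*partitions, fillvalue=_MISSING):
        merged.extend(row for row in group if row is not _MISSING)
    return merged


def sequential_stage(rows):
    """Hypothetical single-instance stage: numbers the rows it receives."""
    return list(enumerate(rows, start=1))


# Four upstream partitions feed one sequential instance.
collected = collect_round_robin([[1, 5], [2, 6], [3], [4, 8]])
numbered = sequential_stage(collected)
```

Only one instance of `sequential_stage` ever runs here, no matter how many partitions feed it.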
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There is, of course, a little more to it than that. Licensing is per-CPU both for DataStage and PX. This means, to preserve BP's example, that if a four-way PX configuration file is set, then at most four CPUs will be devoted to this parallel job, even though there may be more CPUs in the machine.