How Many Stages in a PX Job

nrevezzo
Participant
Posts: 15
Joined: Mon Sep 08, 2003 2:36 pm

How Many Stages in a PX Job

Post by nrevezzo »

PX creates one UNIX process for each parallel stage multiplied by the number of nodes. Given that, should there be a limit on the number of stages in a PX job, and if so, what should it be based on?
bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am

Amdahl's Law

Post by bigpoppa »

Amdahl's law, which gives an upper bound on the "speedup" you can get from parallelism, is below and can generally be used to 'fit' your PX job to your machine.

Amdahl's law:
If F is the fraction of a calculation that is sequential, and (1-F) is the fraction that can be parallelised, then the maximum speedup that can be achieved by using P processors is 1/(F+(1-F)/P).
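
As a rough illustration of the formula above, here is a minimal Python sketch. The function name `amdahl_speedup` and the sample value F = 0.1 are assumptions chosen for the example, not measurements from any real PX job.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Upper bound on speedup per Amdahl's law: 1 / (F + (1 - F) / P)."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed example: 10% of the work is sequential (F = 0.1).
    for p in (1, 2, 4, 8, 16):
        print(f"P={p:>2}  max speedup = {amdahl_speedup(0.1, p):.2f}")
```

With F = 0.1 the bound can never exceed 1/F = 10 no matter how many processors are added, which is why extra nodes eventually stop paying off.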

But, in practice, I like to write several configuration files and run my PX job against each one. Then I mentally build a graph of # of processes (which increases as the degree of parallelism increases) vs. job completion time. This way I get a good idea of where the # of processes (and, therefore, the degree of parallelism) starts to give me diminishing returns for a given job.
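
The "mental graph" described above can also be written down. The following is only a sketch: the helper name `diminishing_returns_point`, the 10% threshold, and the sample timings are all made up for illustration, and you would substitute the process counts and elapsed times you actually measured for each config file.

```python
def diminishing_returns_point(timings, min_gain=0.10):
    """Return the process count after which adding processes stops helping.

    timings  -- list of (process_count, elapsed_seconds), one per config file
    min_gain -- smallest relative improvement still considered worthwhile
    """
    ordered = sorted(timings)
    best = ordered[0][0]
    # Compare each run with the next larger configuration.
    for (_, prev_time), (count, elapsed) in zip(ordered, ordered[1:]):
        gain = (prev_time - elapsed) / prev_time
        if gain < min_gain:
            break  # improvement too small (or the job got slower)
        best = count
    return best


# Hypothetical measurements: (number of processes, job time in seconds).
sample_timings = [(8, 600), (16, 330), (32, 200), (64, 190), (128, 210)]

if __name__ == "__main__":
    print(diminishing_returns_point(sample_timings))
```

For the invented numbers above, going from 32 to 64 processes saves only 5% of the run time, so the sketch reports 32 as the point of diminishing returns.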

To help out, you may want to set the PX env var APT_DUMP_SCORE=1. The score dump lists, among other things, the number of processes generated by a PX job for a specific config file. The score dump is a handy report to understand, so I encourage all PX users to become familiar with it.

HTH,
BP
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Bigpoppa

Geez, I hope we are graded on a curve. It has been a while since I took Calculus. Very cool answer.

Kim.
Mamu Kim
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

I'm a rocket scientist and that answer boggles my mind. I prefer the "TransGalactic-Vogon-Super-Highway It's-A-Small-Moon-No-It's-The-DeathStar My-God-It's-Full-Of-Stars We-Need-A-Bigger-Boat" equation, which says: write one job that completely maximizes every ounce of resources, then back it off a little and try again. Keep backing off as long as it goes faster, and stop when it degrades again.

Now that you've tuned for your development environment, do it again in your UAT environment. Now that you've tuned for your UAT environment, do it again in your production environment.

And keep doing it every time you make design changes.
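
For what it's worth, the back-off procedure above amounts to a simple search loop. This is a sketch under assumptions: `back_off_tuning` and `run_job` are hypothetical names, `run_job` stands in for however you launch the job at a given degree of parallelism and time it, and the simulated timings are invented.

```python
from typing import Callable


def back_off_tuning(max_parallelism: int,
                    run_job: Callable[[int], float],
                    step: int = 1) -> int:
    """Start at the maximum, back off while the job keeps getting faster."""
    best = max_parallelism
    best_time = run_job(best)
    candidate = best - step
    while candidate >= 1:
        elapsed = run_job(candidate)
        if elapsed >= best_time:
            break  # it degraded again, so stop here
        best, best_time = candidate, elapsed
        candidate -= step
    return best


# Invented timings (seconds) keyed by degree of parallelism, used in place
# of real job runs.
simulated_times = {8: 500.0, 7: 450.0, 6: 400.0, 5: 420.0, 4: 300.0}

if __name__ == "__main__":
    print(back_off_tuning(8, lambda n: simulated_times[n]))
```

Note that the loop stops at the first slowdown, exactly as described, so it can miss a better setting further down (4 in the invented data). The result is also only valid for the environment it was measured in, which is why it has to be repeated for UAT and production.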
Kenneth Bland
