How does a join stage work?


Manu1
Participant
Posts: 36
Joined: Mon Aug 31, 2009 5:51 am
Location: Hyderabad


Post by Manu1 »

Hi,
Could anyone explain how a Join stage works from a conceptual perspective?

If I have some 10 million records on each input, how and where does the Join stage store that data in order to compare rows and find matches? How does it make use of memory, buffers, etc.?

Please let me know if you need more details.
Manu
DataStage Developer
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Please post in the correct forum going forward. I've moved your two posts here; any other questions specific to the EE/PX product belong here as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The Join stage requires inputs sorted on the join key (the tsort operator is what performs that sort). It loads the rows having the first join key value from both inputs into memory and performs the specified join type on that set of records (those with the first join key value). Any result is pushed onto the output link and the memory is freed. It then loads the rows having the second join key value from both inputs into memory and processes these, and so on until end-of-data is reached. By default 20MB of real memory per partition is allocated, so scratch disk ought not to be needed.
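
To make the key-by-key processing concrete, here is a minimal sketch in Python of a generic sort-merge inner join. It is not DataStage code and does not reflect the engine's internals; the function name `merge_join`, the tuple row layout, and the sample data are all assumptions made for illustration. It only shows the idea described above: with both inputs sorted, just the rows for the current key value need to be held in memory at one time.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two row iterables that are ALREADY sorted on `key`.

    Illustrative only: rows are tuples whose first field is the join key.
    """
    left_groups = groupby(left, key=key)
    right_groups = groupby(right, key=key)
    lgrp = next(left_groups, None)
    rgrp = next(right_groups, None)

    while lgrp is not None and rgrp is not None:
        lkey, lrows = lgrp
        rkey, rrows = rgrp
        if lkey < rkey:
            # No match possible for this left key: skip it.
            lgrp = next(left_groups, None)
        elif lkey > rkey:
            # No match possible for this right key: skip it.
            rgrp = next(right_groups, None)
        else:
            # Hold only the rows for the current key value in memory.
            lbuf = list(lrows)
            rbuf = list(rrows)
            for lrow in lbuf:
                for rrow in rbuf:
                    # Emit the left row plus the right row's non-key fields.
                    yield lrow + rrow[1:]
            # Buffers go out of scope here; move on to the next key value.
            lgrp = next(left_groups, None)
            rgrp = next(right_groups, None)


# Hypothetical sample inputs, both sorted on the first field.
left_input = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_input = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

for row in merge_join(left_input, right_input):
    print(row)
```

Because each input is read once, front to back, the memory needed depends on the largest group of rows sharing a single key value rather than on the total row count, which is why 10 million rows per input need not be held in memory at once.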
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.