Dataset lookup

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

rggoud
Participant
Posts: 15
Joined: Thu Nov 06, 2003 9:59 am

Dataset lookup

Post by rggoud »

Hi,

Why does the record count shown in the monitor window for the reference link of a dataset lookup always match the primary link's record count? I created a dataset with 20,000 records in Job X. In Job Y, I use this dataset as a reference link. The primary link has about 3,000 records. When I ran Job Y, the performance statistics showed 3,000 records on the primary link and 3,000 records (instead of 20,000) on the reference link. Is there a reason for this? Is it doing some sort of join internally before the lookup?

Thanks.

Raj.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Basically, the link row count shows the number of rows that resulted from the lookup operation itself.
In your case, all lookups succeeded, so there were 3000 rows returned along that link.
If some of the lookups had failed, fewer than 3000 rows would have been processed along that link.
The fact that there are 20,000 rows in the dataset to which the lookup refers is immaterial. That figure would be pertinent if you had a job that pre-loaded the dataset: you would then see 20,000 rows being processed along that link. But if PX loads the dataset for you, this is not reported directly.
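To illustrate the counting described above, here is a toy Python model. It is not DataStage code: the function name, the `id` key, and the row layout are made up for illustration, and it simply assumes that the whole reference dataset is loaded into memory while the reported reference-link count goes up by one for each successful lookup.

```python
def lookup_link_counts(primary_rows, reference_rows, key):
    """Return (primary_count, reference_count) as a monitor might report them.

    Hypothetical model: every reference row is loaded into a table, but the
    reference-link counter only advances when a primary row finds a match.
    """
    # All reference rows are loaded, yet this load is not what gets counted.
    table = {row[key]: row for row in reference_rows}

    primary_count = 0
    reference_count = 0
    for row in primary_rows:
        primary_count += 1
        if row[key] in table:
            # One row is "returned" along the reference link per match.
            reference_count += 1
    return primary_count, reference_count


# 20,000 reference rows, 3,000 primary rows, every key present.
reference = [{"id": i, "name": f"n{i}"} for i in range(20000)]
primary = [{"id": i} for i in range(3000)]
print(lookup_link_counts(primary, reference, "id"))  # (3000, 3000)

# Replace 100 primary keys with values that are not in the reference data.
primary_with_misses = primary[:2900] + [{"id": 20000 + i} for i in range(100)]
print(lookup_link_counts(primary_with_misses, reference, "id"))  # (3000, 2900)
```

In this model the size of the reference dataset never shows up in either count, which matches the 3,000/3,000 figures in the question.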
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.