usage of join,merge and lookup

legendkiller · Post by **legendkiller** » Wed Nov 24, 2004 6:27 am

in what scenario we used join, lookup or merge, I guess they serve common purpose

vigneshra · Post by **vigneshra** » Wed Nov 24, 2004 6:55 am

You have enough details in Parallel job developer guide. Anyway, lookups have to be done only if the data to be looked up is very small. In that case, lookups' performance will be better than counter parts. On the other hand, if the data volume is high, you have to go for join or merge, depending on what the requirement is.

Regards,
Vignesh.

mandyli · Post by **mandyli** » Thu Nov 25, 2004 12:13 am

Hi

the same topic coverd many times. any way please check previous post.

or check in ParallelDevel.pdf doc.

Thanks
man

legendkiller · Post by **legendkiller** » Thu Nov 25, 2004 5:49 am

Hi

Thanks for response.

But what generally defines the small volume of data and what range defines the large volumns of data.

Are there are some standard ranges of volumn of data, so that based on range we clearly say if this is volume of data use lookup other wise join like that.

richdhan · Post by **richdhan** » Thu Nov 25, 2004 8:03 am

Hi,

Although lookup, join and merge may serve the same purpose it depends on the scenario as well as the volume of data that you are dealing with.

A lookup is nothing but a left outer join and is used for less volume. A join has more options other than left outer join but it cannot handle rejects. The merge stage is similar to the join stage and is capable of handling rejects.

HTH
--Rich

Pride comes before a fall
Humility comes before honour

legendkiller · Post by **legendkiller** » Fri Nov 26, 2004 6:17 am

Thanks again,

but agin my question remains unanswerd. Are there are certain ranges of volume of data for which we can say if this is range use join. if this is range then use lookup etc.

vigneshra · Post by **vigneshra** » Fri Nov 26, 2004 8:43 am

As its name suggests, lookups are used for looking up some values based on key values. Generally dimension tables are looked up. Volume of data lookups can handle depends on your hardware configuration of the server. Generally tables or files that do not grow enormously over a period of time are looked up. Also look up will return only one row even if there are more than one row available for the match (ODBC multirow lookup is an exception).

Vignesh.

rajeev_prabhuat · Post by **rajeev_prabhuat** » Sun Nov 28, 2004 10:31 pm

Hi Vignesh,

If the reference to a Lookup is directly from a table, and the number of input rows is significantly more than (eg. 1:100 or more), and if job is small enough to fit into available physical memory, then go for lookup. Else go for join or Merge Stage.

legendkiller wrote:in what scenario we used join, lookup or merge, I guess they serve common purpose

vigneshra · Post by **vigneshra** » Sun Nov 28, 2004 11:00 pm

Thanks for the info rajeev !!

mandyli · Post by **mandyli** » Sun Nov 28, 2004 11:38 pm

Hi Rajeev,

U r correct.

Thanks
Man

T42 · Post by **T42** » Mon Nov 29, 2004 11:28 pm

legendkiller wrote:but agin my question remains unanswerd. Are there are certain ranges of volume of data for which we can say if this is range use join. if this is range then use lookup etc.

It should be based on the amount of memory you have available to you at that particular moment. This is not a cut n' dry answer, but with some smart calculations, and careful observation of the job during its run, you can determine what your general limits are.

And be careful -- I'm referring to production machines. That is why it is always best to have an equivalent machine for development with PX -- easy to tune without risking a production box due to mischeft.

DSXchange

usage of join,merge and lookup

usage of join,merge and lookup

Re: usage of join,merge and lookup