QS: undup VS undup independent

ponzio
Participant
Posts: 165
Joined: Mon Dec 05, 2005 9:13 am
Location: Italy

QS: undup VS undup independent

Post by ponzio »

Hi all.
In my experience, the "undup" match type has some limitations:

1) groups created in one pass are not merged in later passes

2) only master records and residuals are made available to subsequent passes


Undup independent seems to remove those limitations. As far as I can tell, a record joins a group if it satisfies the match criteria of at least one pass, and the match does not have to be with the master record: it can be with any record in the group.
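
The grouping rule described above (any pass may link a record to any member of a group) behaves like a connected-components merge. The following is a minimal Python sketch of that idea only. It is not QualityStage's actual algorithm; the function name `group_independent`, the record IDs, and the link lists are all hypothetical.

```python
from collections import defaultdict

def group_independent(record_ids, pass_links):
    """Toy model: any link found by any pass joins two records into one group.

    record_ids: list of record identifiers.
    pass_links: one list of (a, b) pairs per pass, each pair meaning
                "this pass considers a and b a match".
    """
    parent = {r: r for r in record_ids}

    def find(r):
        # Follow parent pointers to the group representative (path halving).
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for links in pass_links:
        for a, b in links:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for r in record_ids:
        groups[find(r)].append(r)
    return sorted(sorted(members) for members in groups.values())

# Made-up example: pass 1 links A-B and C-D, pass 2 links B-C.
pass1 = [("A", "B"), ("C", "D")]
pass2 = [("B", "C")]
groups = group_independent(["A", "B", "C", "D", "E"], [pass1, pass2])
print(groups)  # [['A', 'B', 'C', 'D'], ['E']]
```

In this model the B-C link from pass 2 is enough to merge the two groups formed in pass 1, even though neither B nor C needs to be a master record.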

The problem with undup independent is that I can't tell which pass a weight (as seen in the extract file) relates to, because all the records in a group carry the same pass number.

My question is: how can I learn more about undup independent?
How can I work with cutoffs?

Thanks
ponzio
Participant
Posts: 165
Joined: Mon Dec 05, 2005 9:13 am
Location: Italy

Re: QS: undup VS undup independent

Post by ponzio »

Any news?
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

Post by ashok »

Weight cutoffs are explained very well in MatchConcepts. My suggestion is to work with 10 to 20 sample records and apply cutoffs depending on the matching fields.

To understand it better, use only one pass at first; later you can experiment with more passes.

Thanks
Ashok
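
The advice above (take a small sample and try different cutoffs) can be tried outside the tool with a few lines of Python. This is only a sketch: the weights are made up, the parameter names `match_cutoff` and `clerical_cutoff` are illustrative, and the inclusive boundaries are an assumption rather than something stated in this thread.

```python
def classify(weight, match_cutoff, clerical_cutoff):
    """Label a composite weight against two cutoffs (assumed inclusive)."""
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical cutoff must not exceed match cutoff")
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"

# Made-up composite weights for a handful of sample record pairs.
sample_weights = [18.5, 12.0, 9.5, 4.0, -3.0]

# Try two hypothetical cutoff settings and compare the outcomes.
results = {}
for match_cutoff, clerical_cutoff in [(10.0, 5.0), (15.0, 8.0)]:
    labels = [classify(w, match_cutoff, clerical_cutoff) for w in sample_weights]
    results[(match_cutoff, clerical_cutoff)] = labels
    print(match_cutoff, clerical_cutoff, labels)
```

Raising the match cutoff from 10.0 to 15.0 moves the pair with weight 12.0 from "match" to "clerical", which is the kind of effect worth checking on a small sample before running the full data set.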
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

Post by ashok »

Regarding the first question:

Undup independent will also bring in records that you might not want to treat as matches, so be careful when selecting passes for undup independent, for example when pass one is a hard key match and pass two is a fuzzy match.

Example:
Pass 1: 10-char name & tax ID
Pass 2: Name & address

Do some analysis and decide which method is the best option.
ponzio
Participant
Posts: 165
Joined: Mon Dec 05, 2005 9:13 am
Location: Italy

Post by ponzio »

Hi ashok,
sorry for the delay... I forgot to check for new posts

:wink:

Thanks for your suggestions, but my question is quite different...
and honestly, I didn't understand your last post :oops:

I know how weight cutoffs work with "undup" (I've been working on "big" projects with QS for 4 years, including for Ascential/IBM)
...but the problem with cutoffs in "undup" is a bit more specific...

Suppose the first match pass groups 4 similar records, then you set a new cutoff for that pass and those records are split into 2 match groups...
Then suppose you set up a second match pass with other criteria.
Under the criteria introduced by the second pass, the 4 records should be considered similar and grouped into a single match group...
but that is not possible because of the split made by the FIRST pass!


I know QS works this way: with "undup", only "single" records are added to match groups created in previous passes; groups created in previous passes are never added to other groups...
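
The four-record scenario above can be reproduced with a toy model of the behaviour as it is described in this thread: after each pass only masters and residuals remain visible, single records may join an existing group, and existing groups are never merged. This sketch is an assumption-laden illustration, not QualityStage's implementation; `pair_matcher`, `undup_sequential`, and the record IDs R1 to R5 are hypothetical.

```python
from itertools import combinations

def pair_matcher(pairs):
    """Build a pass from an explicit set of matching pairs (made-up data)."""
    linked = {frozenset(p) for p in pairs}
    return lambda a, b: frozenset((a, b)) in linked

def undup_sequential(record_ids, passes):
    """Toy model of multi-pass "undup" as described in the thread."""
    groups = {}                   # master -> members, multi-record groups only
    residuals = list(record_ids)  # single records still looking for a group
    for is_match in passes:
        pending = {}              # candidate masters found during this pass
        for rec in residuals:
            masters = list(groups) + list(pending)
            target = next((m for m in masters if is_match(m, rec)), None)
            if target is None:
                pending[rec] = [rec]          # rec becomes a candidate master
            elif target in groups:
                groups[target].append(rec)    # single record joins an old group
            else:
                pending[target].append(rec)
        # Masters of old groups are never compared with each other, so no merge.
        residuals = [m for m, members in pending.items() if len(members) == 1]
        groups.update({m: mem for m, mem in pending.items() if len(mem) > 1})
    return groups, residuals

# Pass 1 (after the cutoff change) only links R1-R2 and R3-R4.
pass1 = pair_matcher([("R1", "R2"), ("R3", "R4")])
# Pass 2 would consider every pair of records similar.
pass2 = pair_matcher(combinations(["R1", "R2", "R3", "R4", "R5"], 2))

groups, residuals = undup_sequential(["R1", "R2", "R3", "R4", "R5"], [pass1, pass2])
print(groups)     # {'R1': ['R1', 'R2', 'R5'], 'R3': ['R3', 'R4']}
print(residuals)  # []
```

In this model the single record R5 joins the R1 group in pass 2, but the R1 and R3 groups stay separate even though pass 2 would link all of their members, which is the limitation described above.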


My real question was about the behaviour of "undup independent" and how to use it :D