Suspect matching in QualityStage

InfoSphere's Quality Product

Moderators: chulett, rschirm

vputta
Premium Member
Posts: 47
Joined: Wed Oct 08, 2008 7:35 am
Location: Charlotte

Post by vputta »

Hi,

I have a requirement to perform suspect matching in QualityStage.
All the source data is in one dataset, which has the following columns:

First name, last name, address, gender, phone number

I need to identify the suspects in this file.
Rules: if first name, last name, address, and phone number match between any two records and gender does not match, identify the records as suspects.
If first name, last name, and address match and gender and phone number do not match, identify the records as suspects.
I have about 15 rules along these lines.

Which stage in QualityStage is appropriate for this? Can it be the Reference Match stage?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Use special variable handling on gender (CRITICAL MISSING OK), or apply a heavy disagreement weight override to gender. This does not seem to be a candidate for a reference match (where's your second input?); prefer an Unduplicate match in this case.
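
To illustrate the idea behind a heavy disagreement weight, here is a minimal Python sketch, not QualityStage code. It assumes a composite weight formed by summing a per-column agreement or disagreement weight, with a match cutoff and a clerical cutoff. The weight values, the cutoffs, and the names `WEIGHTS`, `composite_weight`, and `classify` are all hypothetical and chosen only for the example.

```python
# Hypothetical (agreement, disagreement) weights per column.
# Real weights would come from the match specification and frequency data.
WEIGHTS = {
    "first": (4.0, -2.0),
    "last": (5.0, -3.0),
    "address": (6.0, -3.0),
    "phone": (5.0, -1.0),
    "gender": (1.0, -12.0),  # heavy disagreement weight on gender
}

MATCH_CUTOFF = 18.0    # composite weight at or above this: match
CLERICAL_CUTOFF = 0.0  # between the two cutoffs: suspect (clerical review)


def composite_weight(a, b):
    """Sum the agreement or disagreement weight for every column."""
    return sum(
        agree if a[col] == b[col] else disagree
        for col, (agree, disagree) in WEIGHTS.items()
    )


def classify(a, b):
    """Label a record pair as match, suspect, or nonmatch."""
    weight = composite_weight(a, b)
    if weight >= MATCH_CUTOFF:
        return "match"
    if weight >= CLERICAL_CUTOFF:
        return "suspect"
    return "nonmatch"


base = {"first": "JOHN", "last": "SMITH", "address": "12 MAIN ST",
        "phone": "555-0100", "gender": "M"}
gender_differs = dict(base, gender="F")
gender_and_phone_differ = dict(base, gender="F", phone="555-0111")

print(classify(base, base))                     # all columns agree
print(classify(base, gender_differs))           # only gender disagrees
print(classify(base, gender_and_phone_differ))  # gender and phone disagree
```

With these made-up numbers, a pair that agrees on everything except gender drops below the match cutoff but stays above the clerical cutoff, which is how it ends up flagged as a suspect rather than as a match.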
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vputta
Premium Member
Posts: 47
Joined: Wed Oct 08, 2008 7:35 am
Location: Charlotte

Post by vputta »

I have all 100 million records in one dataset. My objective is to identify the suspects. Can we achieve this through QualityStage?
Can you please explain the detailed job design?

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What I provided was, pretty much, the detailed design. Standardize the data, create the match specification (only you can decide the appropriate blocking columns and matching commands for each pass), and calculate match frequencies before performing the matching step.
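
The sequence above (standardize, count value frequencies, block, then compare within blocks) can be sketched in plain Python to show why blocking matters at 100 million records: only records sharing a blocking key are ever compared. This is an illustration under stated assumptions, not QualityStage code; the function names, the trivial uppercase-and-trim standardization, and the choice of last name plus address as blocking columns are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def standardize(record):
    """Toy standardization: uppercase and collapse whitespace in every field."""
    return {key: " ".join(str(value).upper().split()) for key, value in record.items()}


def value_frequencies(records, columns):
    """Count how often each value occurs in each column."""
    return {col: Counter(rec[col] for rec in records) for col in columns}


def candidate_pairs(records, blocking_cols):
    """Yield record pairs that share the same values in the blocking columns."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[tuple(rec[col] for col in blocking_cols)].append(rec)
    for members in blocks.values():
        yield from combinations(members, 2)


raw = [
    {"first": "John", "last": "Smith", "address": "12  Main St", "gender": "M", "phone": "555-0100"},
    {"first": "john", "last": "SMITH", "address": "12 main st", "gender": "F", "phone": "555-0100"},
    {"first": "Ann", "last": "Lee", "address": "9 Oak Ave", "gender": "F", "phone": "555-0199"},
]

records = [standardize(rec) for rec in raw]
frequencies = value_frequencies(records, ["last", "gender"])
pairs = list(candidate_pairs(records, ["last", "address"]))

# Pairs inside a block that disagree on gender are the suspects for the first rule.
suspects = [(a, b) for a, b in pairs if a["gender"] != b["gender"]]
print(len(pairs), len(suspects))
```

Each pass in a real match specification would use its own blocking columns, so a pair missed by one pass (for example, because of a misspelled last name) can still be picked up by another.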
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.