Suspect matching in QualityStage

vputta · Post by **vputta** » Mon Mar 12, 2012 8:01 am

Hi,

I have a requirement to perform suspect matching in QualityStage.
All of the source data is in one dataset, which contains the following columns:

First name, last name, address, gender, phone number

I need to identify the suspects in this file.
Rules: if first name, last name, address and phone number match between any two records and gender does not match, identify those records as suspects.
If first name, last name and address match but gender and phone number do not match, identify the records as suspects.
I have about 15 rules of this kind.
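
To make the two rules above concrete, here is a minimal Python sketch of the pairwise test they describe. It is only an illustration of the logic, not QualityStage code; the field names (`first_name`, `last_name`, `address`, `phone`, `gender`) and the rule labels are assumptions, and it uses exact equality where a real match would compare standardized values.

```python
from typing import Optional

# Hypothetical field names; the post only lists the columns informally.
RULE1_MATCH = ("first_name", "last_name", "address", "phone")
RULE2_MATCH = ("first_name", "last_name", "address")


def suspect_rule(a: dict, b: dict) -> Optional[str]:
    """Return the label of the first rule that flags the pair, else None."""

    def same(cols):
        return all(a[c] == b[c] for c in cols)

    def differ(cols):
        return all(a[c] != b[c] for c in cols)

    # Rule 1: name, address and phone agree, gender disagrees.
    if same(RULE1_MATCH) and differ(("gender",)):
        return "rule1"
    # Rule 2: name and address agree, gender and phone both disagree.
    if same(RULE2_MATCH) and differ(("gender", "phone")):
        return "rule2"
    return None
```

The remaining rules would follow the same pattern: one tuple of columns that must agree and one tuple that must disagree.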

Which stage in QualityStage is appropriate for achieving this result? Could it be the Reference Match stage?

Thanks

ray.wurlod · Post by **ray.wurlod** » Mon Mar 12, 2012 11:37 am

Use special variable handling on gender (CRITICAL MISSING OK), or apply a heavy disagreement weight override to gender. This does not look like a candidate for a Reference Match (where is your second input?); prefer an Unduplicate Match in this case.
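
As a rough illustration of what a disagreement weight override does to a pair's score, the following plain Python sketch sums per-column weights. It is not QualityStage code, and every number in it is hypothetical: real agreement and disagreement weights come from the match specification and the frequency data.

```python
# Hypothetical (agreement, disagreement) weights per column.
WEIGHTS = {
    "first_name": (4.0, -2.0),
    "last_name": (5.0, -3.0),
    "address": (6.0, -4.0),
    "phone": (5.0, -3.0),
    "gender": (1.0, -1.0),
}


def composite_weight(a, b, overrides=None):
    """Sum agreement or disagreement weights across all columns.

    `overrides` replaces the (agree, disagree) pair for selected columns,
    mimicking a weight override on one match command.
    """
    weights = dict(WEIGHTS)
    weights.update(overrides or {})
    total = 0.0
    for col, (agree, disagree) in weights.items():
        total += agree if a[col] == b[col] else disagree
    return total
```

With these made-up numbers, two records that differ only in gender score 19.0 by default and 0.0 once gender's disagreement weight is overridden to -20.0, so a gender mismatch alone can pull the pair below a cutoff.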

vputta · Post by **vputta** » Wed Apr 11, 2012 6:14 am

All 100 million records are in one dataset, and my objective is to identify the suspects. Can this be achieved in QualityStage?
Could you please explain the detailed job design?

Thanks

ray.wurlod · Post by **ray.wurlod** » Wed Apr 11, 2012 4:16 pm

What I provided was, pretty much, the detailed design. Standardize the data, create the match specification (only you can decide the appropriate blocking columns and matching commands for each pass), and calculate match frequencies before performing the matching step.
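
For a sense of why the blocking columns matter at 100 million records, here is a small Python sketch of the idea: records are compared only with other records that share a blocking key. It is an illustration, not QualityStage code; the blocking key (last name plus address) and the field names are hypothetical choices, and the comparison is a simplified gender check rather than a weighted match.

```python
from collections import defaultdict
from itertools import combinations


def block_key(rec):
    # Hypothetical blocking choice: last name plus address.
    return (rec["last_name"], rec["address"])


def candidate_pairs(records):
    """Yield index pairs (i, j), i < j, of records sharing a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[block_key(rec)].append(idx)
    for members in blocks.values():
        yield from combinations(members, 2)


def gender_suspects(records):
    """Pairs in the same block with equal first name but different gender."""
    return [
        (i, j)
        for i, j in candidate_pairs(records)
        if records[i]["first_name"] == records[j]["first_name"]
        and records[i]["gender"] != records[j]["gender"]
    ]
```

Only pairs inside the same block are ever compared, which is what keeps a single-file match tractable; each pass in a match specification would use its own blocking columns.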