matching

This forum is in support of all issues about Data Quality regarding DataStage and other strategies.

Moderators: chulett, rschirm

nag0143
Premium Member
Posts: 159
Joined: Fri Nov 14, 2003 1:05 am

matching

Post by nag0143 »

After matching, I got records typed XA and DA. How do I relate these to the output file I want to use in DataStage?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

XA means the "master record" in a block of probable duplicates. It tends to have the highest composite weight.

DA means one of the other records in a block of probable duplicates.

RA means a "residual", a record with no probable duplicates. You have to remember to include these in your survivorship rules.

CR means "clerical review", a record whose composite weight is between the two cutoffs you establish to segregate confirmed duplicates (composite weight above the upper threshold), confirmed non-duplicates (composite weight below the lower threshold), and "clerical review" records which should be inspected by a clerk for decision.
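As a rough illustration of the two-cutoff idea (not QualityStage's own code), the decision can be sketched in Python. The function name, the cutoff values, and the treatment of a weight exactly equal to a cutoff as "clerical review" are all assumptions made for this sketch.

```python
def classify_by_cutoffs(composite_weight, lower_cutoff, upper_cutoff):
    """Sort a record into one of three buckets using two cutoffs.

    Assumption: a weight exactly equal to either cutoff is sent to
    clerical review; the real product may treat boundaries differently.
    """
    if lower_cutoff > upper_cutoff:
        raise ValueError("lower cutoff must not exceed upper cutoff")
    if composite_weight > upper_cutoff:
        return "duplicate"
    if composite_weight < lower_cutoff:
        return "non-duplicate"
    return "clerical review"


# Hypothetical cutoffs, chosen only for illustration.
for weight in (12.0, 7.5, 3.0):
    print(weight, classify_by_cutoffs(weight, lower_cutoff=5.0, upper_cutoff=10.0))
```

Moving the two cutoffs closer together shrinks the clerical review pile but puts more trust in the automatic decision.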

There is also a number associated with this. For example RA1 is a residual found in the first match pass, DA2 is a duplicate found in the second match pass, and so on.
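If you need to split such a value downstream, a small helper along these lines would do it. This is only a sketch: the function name and the description table are hypothetical, and it assumes the value is always two letters followed by an optional pass number.

```python
import re

# Descriptions taken from the explanation above; the dict itself is
# just an illustrative lookup, not something the product provides.
MATCH_TYPE_MEANINGS = {
    "XA": "master record",
    "DA": "duplicate",
    "RA": "residual",
    "CR": "clerical review",
}


def parse_match_type(code):
    """Split a value such as 'DA2' into ('DA', 2).

    Assumption: two letters, optionally followed by the pass number.
    A value with no number yields None for the pass.
    """
    found = re.fullmatch(r"([A-Z]{2})(\d*)", code.strip().upper())
    if found is None:
        raise ValueError(f"unrecognised match type: {code!r}")
    type_code, pass_text = found.groups()
    pass_number = int(pass_text) if pass_text else None
    return type_code, pass_number


type_code, pass_number = parse_match_type("DA2")
print(type_code, pass_number, MATCH_TYPE_MEANINGS.get(type_code, "unknown"))
```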

The blocks into which these probable duplicates fall form the basis for your decision making during the next phase of processing, survivorship.
Last edited by ray.wurlod on Sat Feb 03, 2007 2:10 pm, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ashok
Participant
Posts: 43
Joined: Tue Jun 22, 2004 3:04 pm

Post by ashok »

In the extract file you need to create a field that the match stage can populate with match set numbers. This is explained in the QS documents.

Example:
TYPE SET
XA 1
DA 1
DA 1
XA 2
DA 2
RA 3
RA 4
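To show how the set number ties the records together, here is a minimal Python sketch that groups the sample rows above by set. The whitespace-separated layout and the function name are assumptions for illustration; a real extract would carry the set number alongside your other fields.

```python
from collections import defaultdict

# The sample rows from above, assumed to be whitespace-separated.
SAMPLE = """TYPE SET
XA 1
DA 1
DA 1
XA 2
DA 2
RA 3
RA 4"""


def group_by_set(text):
    """Return {set number: [match types in that set]}."""
    groups = defaultdict(list)
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        match_type, set_number = line.split()
        groups[int(set_number)].append(match_type)
    return dict(groups)


groups = group_by_set(SAMPLE)
for set_number, types in sorted(groups.items()):
    print(set_number, types)
```

Each XA then shares its set number with the DA records that belong to it, while each RA sits alone in its own set, which is what you would join or group on in the DataStage job.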