Generate same Key in the matched record

Infosphere's Quality Product

Moderators: chulett, rschirm

hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Generate same Key in the matched record

Post by hitmanthesilentassasin »

Hi,

I am generating a golden record from the matched records. After the match process I replace the QSMatchSetID with the primary key of the matched record, and that record is treated as the golden record. This works fine as long as the set of records stays the same. However, when a record is added to the group, the golden record seems to change, which makes it hard to maintain the history of the data. Can anyone suggest a way to fix this? I have already tried sorting the data, but the sort order no longer holds once a new record is introduced. I can't think of anything else yet.

Thanks for any inputs or ideas

Cheers
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hi,

Sorry, but I do not completely understand the question. Are you using the Survive stage? Can you describe the QualityStage job you have created and give data examples?
Regards,
Robert
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

Hi,

Here is the sample input and output.

First input

Firstname  lastname  phone  primarykey
Jack       Smith     1234   123
Jack       Smith     1234   345

After processing through the match specification:

Firstname  lastname  phone  primarykey  QSDataId  QSMatchsetId
Jack       Smith     1234   123         1         1
Jack       Smith     1234   345         2         1

I replace the QSMatchSetID with the primary key before storing it in the database:

Firstname  lastname  phone  primarykey  QSMatchsetId
Jack       Smith     1234   123         123
Jack       Smith     1234   345         123

This stays the same as long as the group stays the same. But when a new record is introduced, the QSMatchSetID changes. Since it was previously generated as 123, it should remain 123. The scenario below illustrates this.

Second input

Firstname  lastname  phone  primarykey
Jack       Smith     1234   123
Jack       Smith     1234   345
Jack       Smith     1234   000

Expected output this time:

Firstname  lastname  phone  primarykey  QSMatchsetId
Jack       Smith     1234   123         123
Jack       Smith     1234   345         123
Jack       Smith     1234   000         123

My question is: is it possible to keep generating the same QSMatchSetID key regardless of whether a record has been added?
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hi,

No, you cannot reliably control QSMatchSetID. It can be different for each run.

I would not recommend re-purposing QSMatchSetID - call it 'NewPrimaryKey' or something like that (to avoid confusion).

After the first match, you would have:

Firstname lastname phone primarykey  QSDataId  QSMatchsetId NewPrimaryKey
Jack      Smith    1234  123         1         1            123
Jack      Smith    1234  345         2         1            123

The NewPrimaryKey would be populated by you after the match processing completed.

Now after the second match, your output may look like this:

Firstname lastname phone primarykey  QSDataId  QSMatchsetId QSMatchType NewPrimaryKey
Jack      Smith    1234  567         100       100          MP
Jack      Smith    1234  123         200       100          DA          123
Jack      Smith    1234  345         300       100          DA          123

Note that in my example, the new record actually became the master - an assumption for this illustration; it could be anywhere in the group.

You can now process this output to populate the missing 'NewPrimaryKey'.
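
To make that post-processing step concrete, here is a minimal sketch in plain Python, not a QualityStage job. The function name fill_new_primary_key is hypothetical, and two rules in it are assumptions that the thread does not settle: a group with no earlier key takes the smallest primarykey among its members, and a group that carries more than one earlier key keeps the smallest of them.

```python
from collections import defaultdict

def fill_new_primary_key(rows):
    """Copy an already-assigned NewPrimaryKey to every row in the same match set."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["QSMatchSetID"]].append(row)

    for members in groups.values():
        # Keys carried over from an earlier run; blank means a new record.
        existing = sorted({m["NewPrimaryKey"] for m in members if m["NewPrimaryKey"]})
        if existing:
            # Assumption: if several earlier keys meet in one group, keep the smallest.
            key = existing[0]
        else:
            # Assumption: a brand-new group takes the smallest primarykey of its members.
            key = min(m["primarykey"] for m in members)
        for m in members:
            m["NewPrimaryKey"] = key
    return rows

# The second match output from the post above: the new record is the master (MP).
rows = [
    {"primarykey": "567", "QSMatchSetID": 100, "QSMatchType": "MP", "NewPrimaryKey": ""},
    {"primarykey": "123", "QSMatchSetID": 100, "QSMatchType": "DA", "NewPrimaryKey": "123"},
    {"primarykey": "345", "QSMatchSetID": 100, "QSMatchType": "DA", "NewPrimaryKey": "123"},
]
result = fill_new_primary_key(rows)
```

Because the key is taken from what was stored earlier rather than from QSMatchSetID or the master record, it does not matter where the new record ends up in the group.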
Regards,
Robert
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You might even countenance using a Survivorship rule to accomplish that, such as "most frequently occurring non-blank".
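
As a rough illustration of what "most frequently occurring non-blank" would pick for the NewPrimaryKey column, here is a sketch in plain Python. It is not Survive stage rule syntax, and the helper name most_frequent_non_blank is hypothetical. Ties are not handled specially here.

```python
from collections import Counter

def most_frequent_non_blank(values):
    """Return the most common non-blank value, or None if all are blank."""
    non_blank = [v for v in values if v and v.strip()]
    if not non_blank:
        return None
    return Counter(non_blank).most_common(1)[0][0]

# NewPrimaryKey values of the group after the second match: one blank, two "123".
keys = ["", "123", "123"]
survivor = most_frequent_non_blank(keys)
```

For the group in the example, the blank key of the new record is ignored and the existing key survives.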
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.