Generate same Key in the matched record
Posted: Wed Jul 31, 2013 7:53 pm
by hitmanthesilentassasin
Hi,
I am generating a golden record from the matched records. After the match process, I replace the qsmatchsetid with the primary key of the matched record, which is treated as the golden record. This works fine as long as the set of records stays the same. However, when a record is added to the group, the golden record seems to change, which makes it hard to maintain the history of the data. Can anyone suggest a way to fix this? I have already tried sorting the data, but the sort order stops being reliable once a new record is introduced. I can't think of anything else yet.
Thanks for any inputs or ideas
Cheers
Posted: Thu Aug 01, 2013 5:56 am
by rjdickson
Hi,
Sorry, but I do not completely understand the question. Are you using the Survive stage? Can you describe the QualityStage job you have created and give data examples?
Posted: Thu Aug 01, 2013 6:13 am
by hitmanthesilentassasin
Hi,
Here is the sample input and output.
First input
Firstname lastname phone primarykey
Jack Smith 1234 123
Jack Smith 1234 345
After processing through the match specification:
Firstname lastname phone primarykey QSDataId QSMatchsetId
Jack Smith 1234 123 1 1
Jack Smith 1234 345 2 1
I replace the qsmatchsetid with the primarykey of the matched record before storing it in the database:
Firstname lastname phone primarykey QSMatchsetId
Jack Smith 1234 123 123
Jack Smith 1234 345 123
This stays the same as long as the group stays the same. When a new record is introduced, however, the qsmatchsetid changes. Since it was previously generated as 123, it should remain 123. The scenario below shows this.
Second input
Firstname lastname phone primarykey
Jack Smith 1234 123
Jack Smith 1234 345
Jack Smith 1234 000
Expected output this time:
Firstname lastname phone primarykey QSMatchsetId
Jack Smith 1234 123 123
Jack Smith 1234 345 123
Jack Smith 1234 000 123
My question is: is it possible to keep generating the same qsmatchsetid key regardless of whether a record is added to the group?
Posted: Thu Aug 01, 2013 2:59 pm
by rjdickson
Hi,
No, you cannot reliably control QSMatchSetID. It can be different for each run.
I would not recommend re-purposing QSMatchSetID - call it 'NewPrimaryKey' or something like that (to avoid confusion).
After the first match, you would have:
Firstname lastname phone primarykey QSDataId QSMatchsetId NewPrimaryKey
Jack Smith 1234 123 1 1 123
Jack Smith 1234 345 2 1 123
You would populate NewPrimaryKey yourself after the match processing has completed.
Now after the second match, your output may look like this:
Firstname lastname phone primarykey QSDataId QSMatchsetId QSMatchType NewPrimaryKey
Jack Smith 1234 567 100 100 MP
Jack Smith 1234 123 200 100 DA 123
Jack Smith 1234 345 300 100 DA 123
Note that in my example the new record actually became the master. That is an assumption for this illustration; it could be anywhere in the group.
You can now process this output to populate the missing 'NewPrimaryKey' by copying the value the other records in the group already carry (123 here).
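To illustrate that last step, here is a minimal sketch in Python rather than in QualityStage itself. It assumes the match output has been read into a list of dicts using the column names from the example above; the function name `fill_new_primary_key` is made up for the illustration. A group that carries two different existing keys (for example, when a new record bridges two old groups) is treated as an error here, and a brand-new group with no existing key is left blank for you to assign.

```python
from collections import defaultdict

def fill_new_primary_key(rows):
    """Copy the NewPrimaryKey already present in a match set to every row in it."""
    # Collect the non-blank keys already assigned within each match set.
    keys_by_set = defaultdict(set)
    for row in rows:
        if row["NewPrimaryKey"]:
            keys_by_set[row["QSMatchsetId"]].add(row["NewPrimaryKey"])

    filled = []
    for row in rows:
        keys = keys_by_set.get(row["QSMatchsetId"], set())
        if len(keys) > 1:
            # Assumption: conflicting keys need a manual or rule-based decision.
            raise ValueError(f"match set {row['QSMatchsetId']} has several keys: {sorted(keys)}")
        new_row = dict(row)  # leave the input rows untouched
        if keys:
            new_row["NewPrimaryKey"] = next(iter(keys))
        filled.append(new_row)
    return filled

# The second match output from the example above.
rows = [
    {"primarykey": "567", "QSMatchsetId": 100, "QSMatchType": "MP", "NewPrimaryKey": ""},
    {"primarykey": "123", "QSMatchsetId": 100, "QSMatchType": "DA", "NewPrimaryKey": "123"},
    {"primarykey": "345", "QSMatchsetId": 100, "QSMatchType": "DA", "NewPrimaryKey": "123"},
]
filled = fill_new_primary_key(rows)
```

After this, all three rows carry NewPrimaryKey 123, even though the new record became the master and QSMatchsetId changed to 100.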
Posted: Thu Aug 01, 2013 6:08 pm
by ray.wurlod
You might even countenance using a Survivorship rule to accomplish that, such as "most frequently occurring non-blank".
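For anyone unsure what "most frequently occurring non-blank" would pick, here is a rough Python sketch of the idea. It is not Survive stage rule syntax, and the function name `most_frequent_non_blank` is only for illustration. It assumes the NewPrimaryKey values of one match group are passed in as strings.

```python
from collections import Counter

def most_frequent_non_blank(values):
    """Return the most common non-blank value, or None if all values are blank."""
    counts = Counter(v for v in values if v and v.strip())
    if not counts:
        return None
    # On a tie, Counter.most_common keeps the value that was seen first.
    return counts.most_common(1)[0][0]

# NewPrimaryKey column of the group from the earlier example: new record first.
survivor_key = most_frequent_non_blank(["", "123", "123"])
```

For that group the surviving key would be 123, since the blank value of the new record is ignored.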