Survivorship Rules with a pecking order

Infosphere's Quality Product

Moderators: chulett, rschirm

Derrickfn
Participant
Posts: 11
Joined: Fri Jul 18, 2008 4:07 pm
Location: Alexandria

Post by Derrickfn »

Greetings dsxChange

I'm surviving an individual's First, Middle and Last Names. I have to use the following pecking order to break ties.

1) Name Part with the greatest Frequency (Use next rule if a tie)
2) Longest text (Use next rule if a tie)
3) Alphabetical order (I told our customer this is not an option)

For each name part I created two simple rules: the first using the Longest technique and the second using the Most Frequent (Non-Blank) technique. However, when we have a tie on frequency, it's just taking the last record.

Is there any way to implement a simple pecking order in the Survivorship stage?

The following is an example using only the Middle Name:

Rec 1: J
Rec 2: Frank
Rec 3: Frank
Rec 4: J
Rec 5: A

Using the pecking order, the middle name should be Frank. It's currently picking one at random. This is only an issue when there is a tie on frequency.
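For reference, the pecking order described above can be sketched outside the Survivorship stage. This is only a minimal Python illustration of the intended tie-breaking, not how the stage itself evaluates rules. The function name `survive` is made up, blanks are ignored, and the alphabetical comparison is assumed to be a plain case-sensitive string comparison.

```python
from collections import Counter

def survive(values):
    """Pick one value: most frequent, then longest, then alphabetical."""
    # Ignore blank or missing values (assumption: blanks never survive).
    counts = Counter(v for v in values if v and v.strip())
    if not counts:
        return None
    # Smallest key wins: higher frequency, then longer text, then A-Z.
    return min(counts, key=lambda v: (-counts[v], -len(v), v))

# The middle-name example from this post: J and Frank tie on frequency,
# so the longer value is expected to win.
print(survive(["J", "Frank", "Frank", "J", "A"]))
```

Because all three criteria sit in one sort key, a later rule only matters when every earlier one is tied, which is the behaviour being asked for.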

I would really like to use the Survivorship stage rather than my own custom DS job. I have over 120 million records and the performance is very good. If I have to calculate frequency and length for each part and join them back together, that is going to kill my job performance. I did not see anything in the Complex rule set that could help me.

Thanks in advance
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hi,

Hm... Everything works for me in 8.7 when I use your data and order with the following two rules (in this order):
MiddleName: Most Frequent NonBlank
MiddleName: Longest

Input:

Rec 1: J
Rec 2: Frank
Rec 3: Frank
Rec 4: J
Rec 5: A

Output:
Rec 1: Frank

If you do not consistently see 'Frank', regardless of record order, then you should probably open a PMR.

Regarding alphabetical order, you can try inserting a sort on key + middlename before the survive. Sort middlename 'descending' so the 'a's are sorted to the bottom. That way, when the last record is picked, you will get the one that comes first in the alphabet. Insert two 'James' records in your data to try it.
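A rough Python sketch of the sort idea, assuming (as observed earlier in this thread) that the last record in a group is the one that survives a tie. The keys and names below are made-up sample data, and `last_per_key` is just an illustrative variable, not anything from the product.

```python
from itertools import groupby
from operator import itemgetter

# (group key, middle name) pairs; sample values are hypothetical.
records = [("K1", "Jason"), ("K1", "James"), ("K2", "Bob"), ("K2", "Al")]

# Stable two-pass sort: middle name descending, then group key ascending.
records.sort(key=itemgetter(1), reverse=True)
records.sort(key=itemgetter(0))

# Mimic "the last record in the group wins".
last_per_key = {
    key: list(group)[-1][1]
    for key, group in groupby(records, key=itemgetter(0))
}
print(last_per_key)
```

With the descending sort in place, the last record of each group carries the name that comes first alphabetically.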
Regards,
Robert
Derrickfn
Participant
Posts: 11
Joined: Fri Jul 18, 2008 4:07 pm
Location: Alexandria

Post by Derrickfn »

Greetings rjdickson,

Thanks for the reply!

If you add another J to the list, it will still give you Frank instead of J, even though J now has the highest frequency. Since both the frequency and length rules always get a hit, it never uses the previous rule's best record when there is a tie condition in the last rule. It just passes through the last value evaluated, which leads to random results.

With the sorting, I have all three name parts to process. I would have to re-sort for each name part, because the best name part may not be in the record I'm sorting on.
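To illustrate the point about the three name parts, here is a small Python sketch with made-up records in which the best First, Middle and Last values do not all come from the same record. The `best` helper and the column names are hypothetical, and it applies the frequency, length, alphabetical order per column rather than modelling the Survivorship stage.

```python
from collections import Counter

def best(values):
    """Most frequent non-blank value, then longest, then alphabetical."""
    counts = Counter(v for v in values if v)
    if not counts:
        return None
    return min(counts, key=lambda v: (-counts[v], -len(v), v))

# One match group; sample data is invented for illustration.
group = [
    {"first": "Rob", "middle": "J", "last": "Smith"},
    {"first": "Robert", "middle": "Frank", "last": "Smyth"},
    {"first": "Robert", "middle": "Frank", "last": "Smyth"},
    {"first": "Rob", "middle": "J", "last": "Smith"},
]

# Each column is survived independently of the others.
survivor = {col: best([rec[col] for rec in group])
            for col in ("first", "middle", "last")}
print(survivor)
```

No single input record holds the surviving combination here, which is why one sort order cannot serve all three columns.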

Derrick
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hmmm....

Try reversing the order of the rules in the GUI, so that the logic reads from the bottom up (instead of top down):

First rule in the GUI: Longest
Second rule in the GUI: Most Frequent non-blank

So, processing the rules from the bottom up:
Rec 1: J
Rec 2: Frank
Rec 3: J
Rec 4: Frank
Rec 5: J
Rec 6: A

Produces expected output:
Output: J

Next test, processing the rules from the bottom up:
Rec 1: J
Rec 2: Frank
Rec 3: J
Rec 4: Frank
Rec 5: A

Produces expected output:
Output: Frank

Next test, processing the rules from the bottom up (records 3 and 4 were swapped to make sure the last record is not being selected):
Rec 1: J
Rec 2: Frank
Rec 3: Frank
Rec 4: J
Rec 5: A

Produces expected output:
Output: Frank

Give that a shot and see what happens.
Regards,
Robert