Hi,
I want to manipulate the standardization output columns. Can I do that? Suppose I want to generate an RVSNDX column for Primaryword1, Primaryword2 and Primaryword3 together. Can I do that by concatenating all three columns?
Manipulate standardization rule
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
For those particular combinations you are correct. It is not possible to assert that you are generally correct. Other words are similar in the right hand end, particularly names of corporate entities.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 527
- Joined: Thu Apr 19, 2007 1:25 am
- Location: Melbourne
You're obviously not using QualityStage to do this soundex. Soundex in a Transformer stage?
If you want a soundex that works on longer strings, you'll have to write it yourself. Though why you would, I'm not sure: phonetically it's quite loose, and it also falls down where the first letters of the strings differ.
So KrispyKreeme and CrispyKreeme will never match, regardless of how long you make the key.
If you want to use the ruleset properly (and back towards the original question), you'll have to look at the fields you are given and understand what the ruleset does. If you need to, you can change the PAT and DCT files to add RVSNDX and NYSIIS fields to MatchPrimaryWord3 (and 4 and 5 if need be) as well.
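To illustrate the first-letter point, here is a minimal sketch of classic American Soundex in Python with an adjustable key length. It is only an illustration: it is not the QualityStage implementation, the names `soundex` and `reverse_soundex` are made up for this example, and it assumes "reverse soundex" simply means Soundex applied to the reversed string, which may differ in detail from what RVSNDX does.

```python
# Consonant groups of classic American Soundex.
CODES = {
    ch: digit
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    )
    for ch in letters
}


def soundex(text, length=4):
    """Return a Soundex key of `length` characters (hypothetical helper)."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]                    # the first letter is kept verbatim
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if len(key) == length:
            break
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code                     # a vowel resets prev to ""
    return key.ljust(length, "0")       # pad short keys with zeros


def reverse_soundex(text, length=4):
    """Assumed definition: Soundex of the reversed string."""
    return soundex(text[::-1], length)


# Concatenating several word columns before encoding is just a join.
words = ["Krispy", "Kreeme"]
print(soundex("".join(words)))              # K621
print(soundex("CrispyKreeme"))              # C621
print(soundex("CrispyKreeme", length=8))    # C6212650
print(reverse_soundex("KrispyKreeme"))      # E562
print(reverse_soundex("CrispyKreeme"))      # E562
```

Because the key always starts with the literal first letter, the forward keys differ however long they are, while the reversed keys agree at the usual length of 4.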
Stuart,
Yes, you are right. I want to implement a more powerful phonetic algorithm (reverse Soundex, Metaphone, Double Metaphone, etc.), but as I am new to this tool, I really don't understand how to do that. I have never changed PAT or DCT files, and I guess it is very tricky to change anything in them. Can you describe in detail how I can do that?
-
- Participant
- Posts: 527
- Joined: Thu Apr 19, 2007 1:25 am
- Location: Melbourne
You can't change the phonetic algorithms that are used in the QS rulesets.
They are part of the PAL.
The DCT file is just the output metadata for the ruleset. The QS user guide will explain it to you. As for the PAT file, read the Pattern Action Language Reference to understand what is in there. If I were doing it, I'd look at how it currently populates MatchPrimaryWord2NYSIIS and MatchPrimaryWord2RVSNDX, and apply the same logic to MatchPrimaryWord3.
You would be able to write your own custom function in C/C++ to implement any phonetic algorithm and use it from within a transformer stage (although some like Double Metaphone may produce 2 output strings, so that will affect how you will use it). That is not QualityStage, however.
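As a rough illustration of why two output strings change how you use the result, here is a small Python sketch of the comparison logic (Python only for brevity; a Transformer routine would be written in C/C++ as described above). Everything in it is hypothetical: `toy_two_key_encoder` is a stand-in that returns a primary and a secondary key and is NOT Double Metaphone, and `keys_match` is a made-up helper name.

```python
def toy_two_key_encoder(name):
    """Stand-in encoder returning (primary, secondary) keys.

    This is NOT Double Metaphone. The secondary key merely treats a
    leading C as K so that the example has two distinct outputs.
    """
    primary = name.upper()
    if primary.startswith("C"):
        secondary = "K" + primary[1:]
    else:
        secondary = primary
    return primary, secondary


def keys_match(a, b, encode=toy_two_key_encoder):
    """Two names match if any key of one equals any key of the other."""
    keys_a = {k for k in encode(a) if k}
    keys_b = {k for k in encode(b) if k}
    return bool(keys_a & keys_b)


print(keys_match("CrispyKreeme", "KrispyKreeme"))  # True
print(keys_match("Krispy", "Crumbs"))              # False
```

With a single-key algorithm one key column is enough; with two keys you would typically carry both as separate columns and compare them in every combination, as `keys_match` does.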