DSXchange: DataStage and IBM Websphere Data Integration Forum
nilanjan
Participant



Joined: 18 Jan 2013
Posts: 16

Points: 178

Posted: Tue Mar 05, 2013 8:27 am

DataStage® Release: 8x
Job Type: Server
OS: Unix
Hi,
I want to manipulate the standardization output columns. Can I do that? Suppose I want to generate an RVSNDX column for Primaryword1, Primaryword2 and Primaryword3 together. Can I do that by concatenating all three columns?
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54524
Location: Sydney, Australia
Points: 295662

Posted: Tue Mar 05, 2013 1:14 pm

Yes you can, but it's probably a waste of time. RVSNDX (and Soundex) will only look at 4-6 characters.
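To illustrate the point about length, here is a minimal Python sketch of classic American Soundex. It is an assumption that the DataStage and QualityStage routines behave similarly; their handling of spaces, repeated codes and key length may differ, and the company names are made up for the example.

```python
# Classic American Soundex; an illustrative sketch, not the product's implementation.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(text, length=4):
    """Return the Soundex code of text, ignoring anything that is not A-Z."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if len(code) == length:      # key is full, the rest is never read
            break
        if ch in "HW":               # H and W do not separate equal codes
            continue
        digit = CODES.get(ch, "")    # vowels and Y have no code
        if digit and digit != prev:
            code += digit
        prev = digit
    return code.ljust(length, "0")


short = soundex("PETROLEUM")
longer = soundex("PETROLEUM PRODUCTS PRIVATE")   # hypothetical concatenation
print(short, longer)   # P364 P364
```

Both calls return the same code because encoding stops once the key is full, so the extra words in the concatenated string never contribute.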

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nilanjan

Posted: Wed Mar 06, 2013 12:08 am

ray.wurlod wrote:
Yes you can, but it's probably a waste of time. RVSNDX (and Soundex) will only look at 4-6 characters. ...


Because I am using the Soundex function to generate the phonetic code, it does not produce codes that let me distinguish the data. E.g. "AVP DISTRIBUTORS", "AB PETROLEUM" and "ABOVE THE DUNES" all generate the same phonetic code 'A113'. If I use reverse Soundex, it will consider four letters from the end of the word, right? If so, it will generate different phonetic codes. Am I right?
ray.wurlod


Posted: Wed Mar 06, 2013 12:17 am

For those particular combinations you are correct. It is not possible to assert that you are generally correct. Other words are similar in the right hand end, particularly names of corporate entitie ...
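As a rough check of the reverse Soundex idea, the Python sketch below applies classic American Soundex to the reversed string and encodes the three names from the question. Treating RVSNDX as "Soundex of the reversed string" is an assumption here, the product may produce different codes, and `XYZ PETROLEUM` is a hypothetical name added only to show a collision.

```python
# Classic American Soundex applied to the reversed string; an illustrative sketch only.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(text, length=4):
    """Return the Soundex code of text, ignoring anything that is not A-Z."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if len(code) == length:
            break
        if ch in "HW":               # H and W do not separate equal codes
            continue
        digit = CODES.get(ch, "")    # vowels and Y have no code
        if digit and digit != prev:
            code += digit
        prev = digit
    return code.ljust(length, "0")


def reverse_soundex(text, length=4):
    """Soundex of the string read right to left (assumed meaning of RVSNDX)."""
    return soundex(text[::-1], length)


names = ["AVP DISTRIBUTORS", "AB PETROLEUM", "ABOVE THE DUNES"]
codes = [reverse_soundex(n) for n in names]
print(codes)                              # ['S631', 'M463', 'S533']
print(reverse_soundex("XYZ PETROLEUM"))   # M463, same as "AB PETROLEUM"
```

The three names separate under this sketch, but any two names that share their right-hand end collapse to one code, which is why the result cannot be generalised.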

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Rate this response:  
Not yet rated
stuartjvnorton
Participant



Joined: 19 Apr 2007
Posts: 523
Location: Melbourne
Points: 3890

Posted: Wed Mar 06, 2013 6:30 pm

You're obviously not using QualityStage to do this soundex. Soundex in a Transformer stage?

If you want a soundex that works on longer strings, you'll have to write it yourself. Though why you would, I'm not sure: phonetically it's quite loose, and it also falls down where the first letters of the strings differ.
So KrispyKreeme and CrispyKreeme will never match, regardless of how long you make the key.
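A small Python sketch of classic American Soundex with an adjustable key length shows the first-letter problem. This is an illustration under the assumption that the Transformer's Soundex follows the classic rules; the `length` parameter is hypothetical and not something the built-in function is known to offer.

```python
# Classic American Soundex with a configurable key length; an illustrative sketch only.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        CODES[letter] = digit


def soundex(text, length=4):
    """Return the Soundex code of text, ignoring anything that is not A-Z."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]                # the first letter is kept verbatim
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if len(code) == length:
            break
        if ch in "HW":               # H and W do not separate equal codes
            continue
        digit = CODES.get(ch, "")    # vowels and Y have no code
        if digit and digit != prev:
            code += digit
        prev = digit
    return code.ljust(length, "0")


# Compare the two spellings at several key lengths.
pairs = {n: (soundex("KrispyKreeme", n), soundex("CrispyKreeme", n))
         for n in (4, 8, 12)}
for n, (k_code, c_code) in pairs.items():
    print(n, k_code, c_code, k_code == c_code)
```

At length 4 the codes are K621 and C621. Because the first letter is copied into the key unchanged, the two codes differ at every length even though everything after the first character is identical.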

If you want to use the ruleset properly (and back towards the original question), you'll have to look at the fields you are given and understand what it does. If you need to, you can change the PAT and DCT files to add RVSNDX and NYSIIS fields to MatchPrimaryName3 (and 4 and 5 if need be) as well.
nilanjan

Posted: Wed Mar 06, 2013 10:58 pm

Stuart,
Yes, you are right. I want to implement a more powerful phonetic algorithm (reverse Soundex, Metaphone, Double Metaphone etc.), but as I am new to this tool I really don't understand how to do that. I have never changed PAT or DCT files. I guess it is very tricky to change anything in those files. Can you describe in detail how I can do that?
nilanjan

Posted: Wed Mar 06, 2013 10:59 pm

ray.wurlod wrote:
For those particular combinations you are correct. It is not possible to assert that you are generally correct. Other words are similar in the right hand end, particularly names of corporate entitie ...


Ray,
As I am not a premium user, I am unable to see your complete reply.
stuartjvnorton

Posted: Wed Mar 06, 2013 11:33 pm

You can't change the phonetic algorithms that are used in the QS rulesets.
They are part of the PAL.

The DCT file is just the output metadata for the ruleset. The QS user guide will explain it to you. As for the PAT file, read the Pattern Action Language Reference to understand what is in there. If I were doing it, I'd look at how it currently populates MatchPrimaryWord2NYSIIS and MatchPrimaryWord2RVSNDX, and apply the same logic to MatchPrimaryWord3.

You would be able to write your own custom function in C/C++ to implement any phonetic algorithm and use it from within a transformer stage (although some like Double Metaphone may produce 2 output strings, so that will affect how you will use it). That is not QualityStage, however.
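On the two-output point, one possible way to consume such a function is to treat two records as candidates when they share any key. The Python sketch below only shows that comparison; the `keys_match` helper is hypothetical, and the key values are placeholders, not real Double Metaphone output.

```python
def keys_match(keys_a, keys_b):
    """True when two records share at least one non-empty phonetic key."""
    set_a = {k for k in keys_a if k}
    set_b = {k for k in keys_b if k}
    return bool(set_a & set_b)


# Placeholder (primary, secondary) key pairs, invented for illustration.
rec1 = ("KEY1", "KEY2")
rec2 = ("KEY2", "")      # no secondary key
rec3 = ("KEY3", "")

print(keys_match(rec1, rec2))   # True: rec1's secondary equals rec2's primary
print(keys_match(rec1, rec3))   # False: no key in common
```

In a job this would mean carrying two key columns per record rather than one, so the blocking or match logic has to compare both.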
All times are GMT - 6 Hours