soundex/phonetic for non-english languages

Infosphere's Quality Product

Moderators: chulett, rschirm

dj
Participant
Posts: 78
Joined: Thu Aug 24, 2006 5:03 am
Location: india


Post by dj »

Hi,

Are there any functions available in QualityStage for generating phonetic (NYSIIS) values for Chinese or Thai data?

Thanks.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Basically, no. NYSIIS was designed for US English names (the NYS is New York State; the full name is New York State Identification and Intelligence System).

Chinese and Thai are both tonal languages, which means that the same glyph can represent different sounds. Usually this has to be inferred from the context. This makes it tricky to create any "sounds like" algorithm.

Once upon a time I created a Soundex-equivalent function using a DataStage server routine. This was for Pacific Islander languages, which are actually a little simpler than English. But incorporating this into QualityStage meant a pre-processing step of calculating the values BEFORE applying standardization and matching.
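
To illustrate the pre-processing idea, here is a minimal sketch in Python. It is not the server routine mentioned above, which is not shown in this thread. It implements the classic American Soundex rules, which are English-oriented, and the names soundex, add_phonetic_key and phonetic_key are hypothetical choices made for this example. The key is computed and attached to each record first, so that a later standardization and matching step could use it as an ordinary column.

```python
# Letter-to-digit table for classic American Soundex (English-oriented).
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the 4-character Soundex key, or "" if the input has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != prev:
            key += code
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            prev = code
        if len(key) == 4:
            break
    return key.ljust(4, "0")


def add_phonetic_key(records, field="name"):
    """Pre-processing step: return copies of the records with a phonetic_key added."""
    return [{**rec, "phonetic_key": soundex(rec.get(field, ""))} for rec in records]


if __name__ == "__main__":
    rows = [{"name": "Robert"}, {"name": "Rupert"}, {"name": "王"}]
    for row in add_phonetic_key(rows):
        print(row)
```

Robert and Rupert both come out as R163, while the Chinese name produces an empty key because this sketch only looks at the letters A to Z. For non-Latin scripts the mapping table would have to be replaced by language-specific rules, which is exactly the hard part.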
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.