Soundex/phonetic for non-English languages

Posted: Fri Jan 19, 2018 9:15 am
by dj
Hi,

Are there any functions available in QualityStage for getting phonetic (NYSIIS) values for Chinese or Thai?

Thanks.

Posted: Fri Jan 19, 2018 10:29 am
by ray.wurlod
Basically no. NYSIIS was designed for US English (the NYS is New York State).

Chinese and Thai are both tonal languages, which means that the same glyph can represent different sounds. Usually the intended sound has to be inferred from the context. This makes it tricky to create any "sounds like" algorithm.

Once upon a time I created a Soundex-equivalent function using a DataStage server routine. This was for Pacific Islander languages, which are actually a little simpler than English. But incorporating this into QualityStage meant a pre-processing step of calculating the values BEFORE applying standardization and matching.
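
To illustrate the pre-processing idea, here is a minimal sketch in Python of standard American Soundex, applied to each record to add a phonetic key column before any standardization or matching. It is not the server routine mentioned above (that code is not shown in this thread), and it does not solve the Chinese/Thai case: it only handles ASCII letters and returns an empty key for anything else. The names soundex, add_phonetic_key, name and name_key are hypothetical, chosen for the example.

```python
# Letter-to-digit table for standard American Soundex.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex key, or "" if there are no ASCII letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]                      # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")     # but its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not separate equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code                       # a vowel resets prev to ""
    return (key + "000")[:4]              # pad or truncate to 4 characters


def add_phonetic_key(records, field="name", key_field="name_key"):
    """Hypothetical pre-processing step: add a phonetic key to each record."""
    return [dict(rec, **{key_field: soundex(rec.get(field, ""))})
            for rec in records]
```

With this sketch, "Robert" and "Rupert" both get the key R163, so a later matching step could block or compare on the name_key column. A routine for another language would swap in its own letter table and rules, which is the hard part for Chinese and Thai.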