Retyping to any of the known classes

Infosphere's Quality Product

Moderators: chulett, rschirm

hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Retyping to any of the known classes

Post by hitmanthesilentassasin »

Hi,

I am trying to remove the dots that follow initials and combine the initials into a single token. After combining them, I want to retype the resulting token to the appropriate class, but I have not been able to do so.

Below is an example of what I am trying to achieve.

Code:

Input string:
1. abc xyz Pty L.T.D
2. J.O.H.N CAFE
The output pattern I am looking to achieve is:

Code:

1. abc xyz pty LTD -> ++WO
2. JOHN CAFE -> FW
I have already tried retyping the token as shown below, but nothing seems to work.

Code:

1. RETYPE [1] & "JOHN"
2. RETYPE [1] * "JOHN"
3. RETYPE [1] + "JOHN"
Thanks for your help!!
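
To make the cleansing step concrete outside QualityStage, here is a minimal Python sketch of collapsing dotted initials into one token. It is not Pattern Action Language; the function name `collapse_initials` and the regular expression are illustrative assumptions, and it treats any run of single letters each followed by a dot as initials.

```python
import re

# Assumption: "initials" are single letters each followed by a dot,
# optionally ending in one more single letter (as in "L.T.D").
# The lookarounds keep the match from starting or ending inside a longer word.
_INITIALS = re.compile(r"(?<![A-Za-z])(?:[A-Za-z]\.)+[A-Za-z]?(?![A-Za-z])")


def collapse_initials(text):
    """Remove the dots inside runs of dotted initials, leaving other text alone."""
    return _INITIALS.sub(lambda m: m.group(0).replace(".", ""), text)


print(collapse_initials("abc xyz Pty L.T.D"))
print(collapse_initials("J.O.H.N CAFE"))
```

Note that a lone initial such as "A." also loses its dot under this rule, which may or may not be what a given rule set wants.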
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hi,

Is the output pattern significant, or just how you want to deal with the data?

The most direct way is to test for the specific words toward the top of the rule set:

*&="L" | . | &="T" | . | &="D"
RETYPE [1] O "LTD" "LTD"
RETYPE [2] 0
RETYPE [3] 0
RETYPE [4] 0
RETYPE [5] 0
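
For anyone reading along without a QualityStage install, the effect of the rule above can be mimicked in Python. This is only a rough sketch, not PAL: it assumes tokens are (value, class) pairs, that each dot arrives as its own token, that "0" marks a nulled token, and that class letters such as "+" and "W" on the other tokens are placeholders. The name `retype_ltd` is hypothetical.

```python
def retype_ltd(tokens):
    """Find the token run L . T . D, retype the first token to class O
    with value LTD, and null out the remaining four tokens."""
    out = [list(t) for t in tokens]          # work on a mutable copy
    target = ["L", ".", "T", ".", "D"]
    for i in range(len(out) - len(target) + 1):
        window = [out[i + k][0].upper() for k in range(len(target))]
        if window == target:
            out[i] = ["LTD", "O"]            # like RETYPE [1] O "LTD" "LTD"
            for k in range(1, len(target)):
                out[i + k][1] = "0"          # like RETYPE [n] 0
            break
    # Nulled tokens drop out of the pattern, so filter them here.
    return [tuple(t) for t in out if t[1] != "0"]


tokens = [("abc", "+"), ("xyz", "+"), ("Pty", "W"),
          ("L", "+"), (".", "."), ("T", "+"), (".", "."), ("D", "+")]
print(retype_ltd(tokens))
```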

Note that the xxNAME rule sets already do this.

For JOHN, the same routine in xxNAME (the Periods subroutine) also outputs 'JOHN'.

I hope this helps!
Regards,
Robert
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

Thanks for the reply, rjdickson.

I want to classify the tokens to the appropriate class after suppressing noise. The dot separator was just an example. If I can classify the tokens to the appropriate classes, they will be handled by the patterns that follow; otherwise I would have to handle each word individually in the pattern.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

hitmanthesilentassasin wrote: the dot separator was just an example
And yet in your post all you mentioned was "I am trying to remove dots following initials and combine them" so that's why you got the answer you did. Always best to lead with a full example / explanation of what you need so you don't have people waste time spinning up an answer that doesn't really help you.

So what exactly falls into your "noise" category?
-craig

"You can never have too many knives" -- Logan Nine Fingers
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

Craig, in the example I provided, the dot separators are the noise. My question was about retyping the cleansed tokens to the classes defined in the classification table, without having to retype each word to a specific class.
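
To show the idea in general terms, here is a small Python sketch of looking up each cleansed token in a classification table and building the pattern from the result. It is not PAL, and the table contents are assumptions chosen to reproduce the patterns in my first post; "+" for an unclassified word and "^" for a number follow the usual QualityStage defaults, simplified here.

```python
# Hypothetical classification table: token value -> class letter.
CLASSIFICATION = {"PTY": "W", "CAFE": "W", "LTD": "O", "JOHN": "F"}


def classify(token):
    """Return the class for one token, falling back to a default class."""
    key = token.upper()
    if key in CLASSIFICATION:
        return CLASSIFICATION[key]
    if token.isdigit():
        return "^"      # numeric token
    return "+"          # unclassified word (mixed tokens are not handled here)


def pattern(text):
    """Build the pattern string for an already cleansed input string."""
    return "".join(classify(t) for t in text.split())


print(pattern("abc xyz pty LTD"))
print(pattern("JOHN CAFE"))
```

The point is that the lookup is driven by the table, so a new word only needs a new table entry rather than a new RETYPE line.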

Thanks for your help!!
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

I found the URL below, which describes exactly what I was looking for. All I need to do is maintain a separate list of classifications in a reference table.

http://www-01.ibm.com/support/knowledge ... okens.html
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Excellent.
-craig

"You can never have too many knives" -- Logan Nine Fingers