Name Cleansing PTY LTD

Infosphere's Quality Product

Moderators: chulett, rschirm

U
Participant
Posts: 230
Joined: Tue Apr 17, 2007 8:23 pm
Location: Singapore

Post by U »

We're now trying to cleanse some name data in which some of the name fields have PTY LTD at the beginning rather than at the end, for example "PTY LTD ACME SYSTEMS".

There may be further standardization to be done within the remainder of the name. What we'd like to be able to do, probably in Input_Modifications subroutine, is to move the PTY LTD tokens to the end of the buffer. There may or may not be punctuation involved ("PTY. LTD. ACME SYSTEMS" for example).

Can anyone offer advice about whether this is feasible?

Thank you for your time.
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Hi,

In this particular case, you may want to insert some Pattern Action Language elsewhere in the rule set. One possibility is right after the comment block that says:

Code:

;-------------------------------------------------
; Suffix (coming before the organization name)
;-------------------------------------------------

Insert the following:

Code:

W =A= "PTY" | O =A= "LTD" | & ;Added 26 June 2014
COPY_A [1] temp
CONCAT " " temp
CONCAT_A [2] temp
RETYPE [1] O temp temp
RETYPE [2] 0
COPY "Y" Suffix_Flag
SET_L_MARGIN OPERAND [3]

Notes:
1) Periods have already been removed by this point in the rule set, so PTY. LTD. and PTY LTD are effectively the same.
2) The data must begin with PTY LTD, so "ACME SYSTEMS PTY LTD" will not be hit by this pattern (but will be hit by others).
3) Make sure you comment the code with something you can find later, so that you can re-insert it if you ever update to a new IBM-provided rule set.
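
To make the intended effect on the text concrete, here is a minimal sketch in Python (not Pattern Action Language). The function name is hypothetical, and it only mimics what happens to the name string: it assumes periods are stripped first (note 1) and that only names beginning with PTY LTD followed by at least one more token are affected (note 2). It does not model the dictionary fields, Suffix_Flag, or margins that the real rule set manipulates.

```python
def move_leading_pty_ltd(name: str) -> str:
    """Hypothetical helper: move a leading PTY LTD to the end of the name."""
    # Note 1: periods are assumed to be gone before the pattern is tested.
    tokens = name.replace(".", " ").split()
    # Note 2: only names that BEGIN with PTY LTD (plus something else) match.
    if len(tokens) > 2 and tokens[0] == "PTY" and tokens[1] == "LTD":
        tokens = tokens[2:] + ["PTY", "LTD"]
    return " ".join(tokens)


print(move_leading_pty_ltd("PTY. LTD. ACME SYSTEMS"))
print(move_leading_pty_ltd("ACME SYSTEMS PTY LTD"))
```

Both calls should print the same string, since the second name already has the suffix at the end and is left alone.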

I hope this helps.
Regards,
Robert
U
Participant
Posts: 230
Joined: Tue Apr 17, 2007 8:23 pm
Location: Singapore

Post by U »

Thank you, that works well.

Now, of course, "they" want more. The new problem is to find W =A= "PTY" | O =A= "LTD" anywhere in the string and move it to the end.

Examples

Code:

ALLEN & UNWIN PTY LTD      (ok)
ALLEN & PTY LTD UNWIN
PTY LTD ALLEN & UNWIN      (ok)
ALLEN PTY LTD & UNWIN

Thank you for your time.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Can you simply standardise them in QualityStage, and re-arrange them afterwards in DataStage?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
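
As a rough illustration of the "re-arrange afterwards" idea, the sketch below uses Python rather than DataStage. The function name is hypothetical. It assumes the name has already been standardised to plain space-separated tokens, and it moves the first adjacent PTY LTD pair to the end wherever it occurs. A real job would do the equivalent in a Transformer or similar stage.

```python
def move_pty_ltd_to_end(name: str) -> str:
    """Hypothetical post-processing step: move the first PTY LTD pair to the end."""
    tokens = name.replace(".", " ").split()
    for i in range(len(tokens) - 1):
        if tokens[i] == "PTY" and tokens[i + 1] == "LTD":
            # Drop the pair from its current position and append it at the end.
            return " ".join(tokens[:i] + tokens[i + 2:] + ["PTY", "LTD"])
    # No PTY LTD pair found: return the name with whitespace normalised.
    return " ".join(tokens)


for sample in ("ALLEN & PTY LTD UNWIN", "ALLEN PTY LTD & UNWIN"):
    print(move_pty_ltd_to_end(sample))
```

Names that already end in PTY LTD come back unchanged, so the step can be applied to every record without first testing where the suffix sits.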
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Ray has an excellent idea.

You can also change the Pattern Action Language I provided earlier to scan from the left by starting the pattern with a *. However, the margins (and the subsequent PAL that deals with the margins) will likely be wrong.

You could also trap PTY LTD and put it off to the side in a temporary variable, and then test for non-blank when evaluating the suffix.
Regards,
Robert
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

I'd go with Robert's "also" suggestion.

I don't like the way it puts PTY at the end of PrimaryName_AUNAME and LTD in NameSuffix_AUNAME. I prefer to move them both to NameSuffix_AUNAME. The concatenated result at the end doesn't change, but it can help marginally with matching.

You could do that early and not have to worry about trying to juggle tokens and margins etc.