DSXchange: DataStage and IBM Websphere Data Integration Forum
U
Participant



Joined: 17 Apr 2007
Posts: 225
Location: Singapore
Points: 2795

Posted: Wed Jun 25, 2014 11:41 pm

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
We're now trying to cleanse some name data in which some of the name fields have PTY LTD at the beginning rather than at the end, for example "PTY LTD ACME SYSTEMS".

There may be further standardization to be done within the remainder of the name. What we'd like to be able to do, probably in Input_Modifications subroutine, is to move the PTY LTD tokens to the end of the buffer. There may or may not be punctuation involved ("PTY. LTD. ACME SYSTEMS" for example).

Can anyone offer advice about whether this is feasible?

Thank you for your time.
rjdickson
Participant



Joined: 16 Jun 2003
Posts: 378
Location: Chicago, USA
Points: 2531

Posted: Thu Jun 26, 2014 7:31 am

Hi,

In this particular case, you may want to insert some Pattern Action Language elsewhere in the rule set. One possibility is right after the comment block that says:
Code:
;-------------------------------------------------
; Suffix (coming before the organization name)
;-------------------------------------------------


Insert the following:
Code:
W =A= "PTY" | O =A= "LTD" | & ;Added 26 June 2014
COPY_A [1] temp
CONCAT " " temp
CONCAT_A [2] temp
RETYPE [1] O temp temp
RETYPE [2] 0
COPY "Y" Suffix_Flag
SET_L_MARGIN OPERAND [3]


Notes:
1) Periods have already been removed by this point in the rule. So PTY. LTD. and PTY LTD will effectively be the same.
2) The data must begin with PTY LTD, so "ACME SYSTEMS PTY LTD" will not be hit by this pattern (but will be by others).
3) Make sure you comment the code with something you can find later, so that you can re-insert this code if you ever update to a new IBM-provided rule set.

I hope this helps.

_________________
Regards,
Robert
U
Participant



Joined: 17 Apr 2007
Posts: 225
Location: Singapore
Points: 2795

Posted: Thu Jun 26, 2014 9:46 pm

Thank you, that works well.

Now, of course, "they" want more. The new problem is to find W =A= "PTY" | O =A= "LTD" anywhere in the string and move it to the end.

Examples
Code:
ALLEN & UNWIN PTY LTD      (ok)
ALLEN & PTY LTD UNWIN
PTY LTD ALLEN & UNWIN      (ok)
ALLEN PTY LTD & UNWIN


Thank you for your time.
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54519
Location: Sydney, Australia
Points: 295643

Posted: Thu Jun 26, 2014 11:01 pm

Can you simply standardise them in QualityStage, and re-arrange them afterwards in DataStage?
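
To make the intended rearrangement concrete, here is a minimal Python sketch of the logic only. It is not DataStage or QualityStage code, and the function name move_pty_ltd_to_end is hypothetical. It assumes only the first PTY LTD occurrence needs moving and that optional periods after each token can be dropped.

```python
import re

# PTY followed by LTD as whole words, each optionally followed by a period.
PTY_LTD = re.compile(r"\bPTY\b\.?\s+LTD\b\.?", re.IGNORECASE)


def move_pty_ltd_to_end(name: str) -> str:
    """Move the first PTY LTD occurrence to the end of the name.

    Hypothetical helper for illustration; names without PTY LTD are
    returned unchanged.
    """
    match = PTY_LTD.search(name)
    if match is None:
        return name
    # Join the text before and after the match, then collapse extra spaces.
    remainder = name[:match.start()] + " " + name[match.end():]
    remainder = " ".join(remainder.split())
    return f"{remainder} PTY LTD".strip()


if __name__ == "__main__":
    for sample in (
        "ALLEN & UNWIN PTY LTD",
        "ALLEN & PTY LTD UNWIN",
        "PTY. LTD. ACME SYSTEMS",
        "ALLEN PTY LTD & UNWIN",
    ):
        print(sample, "->", move_pty_ltd_to_end(sample))
```

In a real job the same idea would need to be expressed in whatever the downstream stage supports; the sketch is only meant to pin down the expected output for each example.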

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rjdickson
Participant



Joined: 16 Jun 2003
Posts: 378
Location: Chicago, USA
Points: 2531

Posted: Mon Jun 30, 2014 2:33 pm

Ray has an excellent idea.

You can also change the Pattern Action Language I provided earlier to scan from the left by starting the pattern with a *. However, the margins (and the subsequent PAL that deals with the margins) will likely be wrong.

You could also trap PTY LTD and put it off to the side in a temporary variable, and then test for non-blank when evaluating the suffix.
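
As a rough illustration of that trap-and-test idea, here is a Python sketch of the token logic. It is not Pattern Action Language, and the names trap_pty_ltd and build_suffix are hypothetical. It assumes the tokens are already upper-cased with periods removed, as they are by that point in the rule.

```python
def trap_pty_ltd(tokens):
    """Pull the first adjacent PTY, LTD pair out of the token list.

    Returns (remaining_tokens, trapped), where trapped is "PTY LTD" if the
    pair was found and "" otherwise. Hypothetical helper for illustration.
    """
    remaining = []
    trapped = ""
    i = 0
    while i < len(tokens):
        is_pair = (
            not trapped
            and i + 1 < len(tokens)
            and tokens[i] == "PTY"
            and tokens[i + 1] == "LTD"
        )
        if is_pair:
            trapped = "PTY LTD"  # set aside in a "temporary variable"
            i += 2               # skip both tokens
            continue
        remaining.append(tokens[i])
        i += 1
    return remaining, trapped


def build_suffix(existing_suffix, trapped):
    """Append the trapped value to the suffix only when it is non-blank."""
    return " ".join(part for part in (existing_suffix, trapped) if part)


if __name__ == "__main__":
    rest, side = trap_pty_ltd("ALLEN PTY LTD & UNWIN".split())
    print(" ".join(rest), "|", build_suffix("", side))
```

Because the pair is removed before any other processing, the remaining tokens can be handled as if PTY LTD had never been there, which avoids having to adjust margins.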

_________________
Regards,
Robert
stuartjvnorton
Participant



Joined: 19 Apr 2007
Posts: 523
Location: Melbourne
Points: 3890

Posted: Mon Jun 30, 2014 5:51 pm

I'd go with Robert's "also" suggestion.

I don't like the way it puts PTY at the end of PrimaryName_AUNAME and LTD in NameSuffix_AUNAME. I prefer to move them both to NameSuffix_AUNAME. The concatenated result at the end doesn't change, but it can help marginally with matching.

You could do that early and not have to worry about trying to juggle tokens and margins etc.
All times are GMT - 6 Hours