DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic has been marked "Resolved."
U
Participant



Joined: 17 Apr 2007
Posts: 223
Location: Singapore
Points: 2778

Posted: Mon Sep 09, 2013 9:44 pm

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
We have a problem in standardizing addresses like 123 JOHN ST ST LEONARDS using a modified rule set in which ST is standardised to STREET.

QualityStage parses the above address as "123", "JOHN STREET", "ST" (which is not what we want).

Short of using a PREP rule set, is there a smart way to parse the above address?

Thank you for your time.
stuartjvnorton
Participant



Joined: 19 Apr 2007
Posts: 518
Location: Melbourne
Points: 3853

Posted: Mon Sep 09, 2013 10:41 pm

This is probably a combination of things:

- You are trying to parse with the locality at the end. Normally that works OK, but localities that start with classified terms such as St will cause issues.

- Did you update Multiple_Semantics when you set ST to be standardised to STREET?

- The default rule for Streets (the subroutine) takes up to the last known Street Type in the address, which includes the St out of St Leonards.
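The last point can be illustrated with a toy model. This is plain Python, not QualityStage Pattern Action Language, and the `STREET_TYPES` table and `naive_parse` function are hypothetical names invented for the sketch. It only assumes the behaviour described above: everything up to the last known Street Type is treated as the street.

```python
# Toy model only: NOT QualityStage code. Names and table contents are assumptions.
STREET_TYPES = {"ST": "STREET", "RD": "ROAD", "AVE": "AVENUE"}


def naive_parse(address):
    """Treat everything up to the LAST street-type token as the street."""
    tokens = address.upper().split()
    type_positions = [i for i, t in enumerate(tokens) if t in STREET_TYPES]
    if not type_positions:
        return {"house": None, "street": None, "street_type": None, "rest": tokens}

    last = type_positions[-1]
    house = tokens[0] if tokens[0].isdigit() else None
    start = 1 if house else 0

    # Any earlier street-type token is swallowed into the street name
    # and standardised along the way (ST -> STREET).
    name_tokens = [STREET_TYPES.get(t, t) for t in tokens[start:last]]

    return {
        "house": house,
        "street": " ".join(name_tokens),
        "street_type": tokens[last],
        "rest": tokens[last + 1:],
    }


print(naive_parse("123 JOHN ST ST LEONARDS"))
```

With the example address, the ST that belongs to ST LEONARDS is the last Street Type found, so it is taken as the street type and JOHN ST becomes the street name, much like the result reported in the first post.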

As for the whole STREET vs ST thing, I think it might depend on what you did in Multiple_Semantics. It seems like it converted them to STREET, but then later converted the last one back as part of the

COPY_A [3] {StreetType}


To go ahead with this without using AUPREP, you'd probably need a list of the localities that start with St (plus other Street Types that start locality names, and any localities that start with other classified words, especially the directionals for Street Type Direction, which could be worse).

Check those before or at the start of Multiple_Semantics:

e.g.

? | M =T= "ST" | & = @ST_LOCALITIES.TBL | $
RETYPE [2] + "SAINT" "SAINT"

etc.

Doable, but not overly nice...
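To make the idea concrete outside the rule set, here is a rough Python sketch of the same lookup-then-retype step. It is not Pattern Action Language and only loosely mirrors the pattern above. The `ST_LOCALITIES` set is a hypothetical stand-in for ST_LOCALITIES.TBL, and the function names are made up for illustration. It handles only a single-word locality after ST at the end of the input.

```python
# Toy model only: NOT QualityStage code. Table contents are assumptions.
ST_LOCALITIES = {"LEONARDS", "KILDA", "IVES"}  # stand-in for ST_LOCALITIES.TBL
STREET_TYPES = {"ST": "STREET"}


def retype_saint(tokens):
    """If the input ends with ST + a listed locality word, rewrite that ST as SAINT."""
    if len(tokens) >= 2 and tokens[-2] == "ST" and tokens[-1] in ST_LOCALITIES:
        return tokens[:-2] + ["SAINT", tokens[-1]]
    return tokens


def standardise(address):
    """Retype locality STs first, then standardise the remaining street types."""
    tokens = retype_saint(address.upper().split())
    return " ".join(STREET_TYPES.get(t, t) for t in tokens)


print(standardise("123 JOHN ST ST LEONARDS"))
print(standardise("45 HIGH ST"))
```

The ordering is the point: the locality check runs before ST is standardised to STREET, so the second ST is no longer seen as a Street Type when the street rule runs.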
U

Posted: Tue Sep 10, 2013 3:03 am

Thank you. That's basically what we did - handled all instances of "ST" within Multiple_Semantics explicitly, and made sure that the locality information is not processed by the address rule set. So it's all good now.

Thank you again for your time and advice.
All times are GMT - 6 Hours