ST and other multi-use tokens

Infosphere's Quality Product

Moderators: chulett, rschirm

U
Participant
Posts: 230
Joined: Tue Apr 17, 2007 8:23 pm
Location: Singapore


Post by U »

We have a problem standardising addresses like "123 JOHN ST ST LEONARDS" using a modified rule set in which ST is standardised to STREET.

QualityStage parses the above address as "123", "JOHN STREET", "ST" (which is not what we want).

Short of using a PREP rule set, is there any smart way to parse the above address?

Thank you for your time.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

This is probably a combination of things:

- You are trying to parse with the locality at the end. Normally that works OK, but localities that start with classified terms like St will cause issues.

- Did you update Multiple_Semantics when you set ST to be standardised to STREET?

- The default rule for Streets (the subroutine) takes up to the last known Street Type in the address, which includes the St out of St Leonards.
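
The last point can be illustrated outside QualityStage. The following is only a rough Python sketch of the "take everything up to the last known Street Type" behaviour, not the actual rule set logic; the `STREET_TYPES` table and the function name are made up for illustration.

```python
# Hypothetical classification table: tokens treated as Street Types.
STREET_TYPES = {"ST": "STREET", "RD": "ROAD", "AVE": "AVENUE"}


def split_at_last_street_type(address):
    """Split tokens into (street part, remainder) at the LAST Street Type token."""
    tokens = address.upper().split()
    # Positions of every token that is classified as a Street Type.
    positions = [i for i, tok in enumerate(tokens) if tok in STREET_TYPES]
    if not positions:
        return tokens, []
    last = positions[-1]
    return tokens[: last + 1], tokens[last + 1 :]


street, rest = split_at_last_street_type("123 JOHN ST ST LEONARDS")
print(street)  # ['123', 'JOHN', 'ST', 'ST']
print(rest)    # ['LEONARDS']
```

Because the second ST is also classified as a Street Type, it ends up in the street part and LEONARDS is left on its own.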

As for the whole STREET vs ST thing, I think it might depend on what you did in Multiple_Semantics. It seems like it's converted them to STREET, but then later converted the last one back as part of the

COPY_A [3] {StreetType}


To go ahead with this without using AUPREP, you'd probably need a list of the localities that start with St, plus any other localities that start with a Street Type or with another classified word. The directionals used for Street Type Direction are a particular concern and could be worse.

Check those before or at the start of Multiple_Semantics:

eg

? | M =T= "ST" | & = @ST_LOCALITIES.TBL | $
RETYPE [2] + "SAINT" "SAINT"

etc.

Doable, but not overly nice...
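
For anyone who wants to see the idea outside the pattern-action language, here is a minimal Python sketch of the same check: if ST is the second-to-last token and the final token is in a locality list, retype it to SAINT before street parsing. This is an assumption-laden illustration, not QualityStage code; `ST_LOCALITIES` is a hypothetical stand-in for ST_LOCALITIES.TBL, its entries are examples only, and `retype_saint` is an invented name.

```python
# Hypothetical stand-in for ST_LOCALITIES.TBL: second words of "St ..." localities.
ST_LOCALITIES = {"LEONARDS", "KILDA", "IVES"}


def retype_saint(tokens):
    """Return a copy of tokens with a trailing 'ST <locality>' changed to 'SAINT <locality>'.

    Mirrors the shape of the pattern above: something, then ST, then one
    token found in the locality table, then end of input.
    """
    out = list(tokens)
    if len(out) >= 3 and out[-2] == "ST" and out[-1] in ST_LOCALITIES:
        out[-2] = "SAINT"
    return out


print(retype_saint("123 JOHN ST ST LEONARDS".split()))
# ['123', 'JOHN', 'ST', 'SAINT', 'LEONARDS']
```

Only the ST directly in front of the listed locality is changed, so the earlier ST is still available to be standardised as a Street Type. Multi-word localities would need a longer lookahead than this sketch has.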
U

Post by U »

Thank you. That's basically what we did - handled all instances of "ST" within Multiple_Semantics explicitly, and made sure that the locality information is not processed by the address rule set. So it's all good now.

Thank you again for your time and advice.