QS, Standardize, and MNS primer

Infosphere's Quality Product

Moderators: chulett, rschirm

Post Reply
bobyon
Premium Member
Premium Member
Posts: 200
Joined: Tue Mar 02, 2004 10:25 am
Location: Salisbury, NC

QS, Standardize, and MNS primer

Post by bobyon »

Let me begin with I am a DataStage developer. I probably don't know enough about QS to even be qualified to ask questions in this forum.

Hoowever, I have been asked to standardize some address data and am doing my best.

I have figured out how to use the US Name and address rule sets to do some very basic standardization of names and addresses.

Now, I have been asked to standardize some international addresses.
I've reviewed the few (very few) pages of documentation I can find and found the following:

Code: Select all

WAVES or MNS stages process only address and postal code information. So you must remove the extraneous data from the file before you run the job.

Before you configure the WAVES or MNS stage, add a preprocessor stage such as the Standardize stage to the Designer client canvas to remove data unrelated to address specifics from the address columns. For example, "in care of," "contact," or "attention to" information is acceptable as part of the address but notes such as 'undeliverable" "requires signature" "secure entry" are not acceptable.

Also, the MNS and WAVES stages work best when the input data contains a country code. You can use the COUNTRY rule set in a Standardize stage to assign country codes.

Since the data I am working with includes such things as "FC FORGN" in many, if not most, of these addresses I would like to remove that data from the address line columns.

My thought was to build a job that would take the data through the COUNTRY rule set first to attempt to get a country code that can be used in the MNS stage. However, it appears that every address that includes "FC" and/or "FORGN" in one of the 4 address line columns results in "US" in the ISO countryCode column (because I added literal "ZQUSZQ" to the beginning of the Slected columns list) and "N" in the IdentifierFlag.

The data consists of 7 columns: Address Lines 1 - 4, City, State, Zip.
It appears that whatever has touched the data before being sent to me has recognized "FC" and/or "FORGN" iin one of the address line columns and copied those values to the State and zip columns.
I realize I could use DataStage stages to remove those values from the state and zip columns, but what is the right way to get those values out of the address line columns? If a Standardize stage, what rule set should be used?

I'm appreciative of any and all responses but am also perfectly willing to go read the primer on QS, Standardize, and MNS stages...... assuming one exists.

Thanks,
Bob
Bob
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Are the addresses with FC FORGN all foreign addresses? If so, you could use DS to not only strip it out, but divert them to a path that runs COUNTRY first and goes from there. The ones that don't, if they are reliably US, can just be put through the US sets.

If it's not reliable, you can strip it out in DS or QS and pass it all through COUNTRY etc.
Post Reply