DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
bobyon



Group memberships:
Premium Members

Joined: 02 Mar 2004
Posts: 200
Location: Salisbury, NC
Points: 2021

Post Posted: Fri Oct 18, 2013 3:02 pm Reply with quote    Back to top    

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Let me begin with I am a DataStage developer. I probably don't know enough about QS to even be qualified to ask questions in this forum.

Hoowever, I have been asked to standardize some address data and am doing my best.

I have figured out how to use the US Name and address rule sets to do some very basic standardization of names and addresses.

Now, I have been asked to standardize some international addresses.
I've reviewed the few (very few) pages of documentation I can find and found the following:
Code:
WAVES or MNS stages process only address and postal code information. So you must remove the extraneous data from the file before you run the job.

Before you configure the WAVES or MNS stage, add a preprocessor stage such as the Standardize stage to the Designer client canvas to remove data unrelated to address specifics from the address columns. For example, "in care of," "contact," or "attention to" information is acceptable as part of the address but notes such as 'undeliverable" "requires signature" "secure entry" are not acceptable.

Also, the MNS and WAVES stages work best when the input data contains a country code. You can use the COUNTRY rule set in a Standardize stage to assign country codes.


Since the data I am working with includes such things as "FC FORGN" in many, if not most, of these addresses I would like to remove that data from the address line columns.

My thought was to build a job that would take the data through the COUNTRY rule set first to attempt to get a country code that can be used in the MNS stage. However, it appears that every address that includes "FC" and/or "FORGN" in one of the 4 address line columns results in "US" in the ISO countryCode column (because I added literal "ZQUSZQ" to the beginning of the Slected columns list) and "N" in the IdentifierFlag.

The data consists of 7 columns: Address Lines 1 - 4, City, State, Zip.
It appears that whatever has touched the data before being sent to me has recognized "FC" and/or "FORGN" iin one of the address line columns and copied those values to the State and zip columns.
I realize I could use DataStage stages to remove those values from the state and zip columns, but what is the right way to get those values out of the address line columns? If a Standardize stage, what rule set should be used?

I'm appreciative of any and all responses but am also perfectly willing to go read the primer on QS, Standardize, and MNS stages...... assuming one exists.

Thanks,
Bob

_________________
Bob
stuartjvnorton
Participant



Joined: 19 Apr 2007
Posts: 523
Location: Melbourne
Points: 3890

Post Posted: Fri Oct 18, 2013 7:07 pm Reply with quote    Back to top    

Are the addresses with FC FORGN all foreign addresses? If so, you could use DS to not only strip it out, but divert them to a path that runs COUNTRY first and goes from there. The ones that don't, if they are reliably US, can just be put through the US sets.

If it's not reliable, you can strip it out in DS or QS and pass it all through COUNTRY etc.
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours