DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic has been marked "Resolved."
jneasy (Participant, Australia)
Posted: Mon Aug 05, 2013 10:38 pm

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
As my knowledge of and exposure to QualityStage is limited, I am unsure what the best approach would be to match client addresses to G-NAF addresses.

My approach would be to first split the addresses into Australian, non-Australian and unknown addresses using the COUNTRY rule set. From there, would it be best practice to standardise the FullAddress field into street name, street type, suburb, state, etc., and then try to match against G-NAF?

To give you a bit of context, a lot of the addresses that end up in the "Unknown" bucket seem to be valid Australian addresses that are missing State and Country. My thought was to generate a reference list of suburbs, then perform some sort of lookup to determine whether a suburb substring in the FullAddress field appears in the valid G-NAF suburb list. Or is this a bit of overkill? Would it be better to standardise the address using AUAREA and AUADDR and then attempt to match against G-NAF?
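To make the suburb lookup idea concrete, here is a rough sketch in plain Python, outside DataStage. The suburb list is a made-up sample rather than real G-NAF data, and `find_suburb` is a hypothetical helper, not a QualityStage function.

```python
import re

# Hypothetical sample; in practice this would be loaded from the
# G-NAF locality data rather than hard-coded.
VALID_SUBURBS = {"RICHMOND", "ST KILDA", "SOUTH YARRA", "PARRAMATTA"}

def find_suburb(full_address, suburbs=VALID_SUBURBS):
    """Return the longest reference suburb found as whole words, else None."""
    # Normalise: upper-case, turn punctuation into spaces, collapse whitespace.
    text = " ".join(re.sub(r"[^A-Z0-9 ]", " ", full_address.upper()).split())
    padded = f" {text} "
    # Whole-word containment check, so "RICHMONDS" would not match "RICHMOND".
    hits = [s for s in suburbs if f" {s} " in padded]
    # Prefer the longest hit; break length ties alphabetically for determinism.
    return max(hits, key=lambda s: (len(s), s)) if hits else None
```

One caveat with this kind of substring check: street names often share names with suburbs (e.g. "Richmond Rd"), so a hit is only a hint that the address is Australian, not a confirmed locality.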

"G-NAF is the authoritative index of geocoded Australian addresses." http://www.psma.com.au/?product=g-naf

Cheers,
Joe.
stuartjvnorton (Participant, Melbourne)
Posted: Wed Aug 07, 2013 1:15 am

Do you have a raw source of G-NAF data sitting in DB tables, or another product like QAS, where the data is encrypted and proprietary and you use the engine to access/match against the data?

If the former, you could use Country to split out the obviously foreign addresses first, and then either push the unknowns through the AU rulesets or try to enrich the unknowns with State/Country from G-NAF or the Aus Post locality file, using the locality and/or postcode. If you get a match and can enrich, then reclassify as AU and go.
You could also use AUPREP to try to split out the address data from the locality data (limited benefit from what I've seen in the past, though it might be better now).
Once it's parsed, then create your match strategy and go.
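As a rough illustration of the enrich-then-reclassify step, here is a minimal Python sketch outside DataStage. It assumes the locality and postcode have already been parsed out of the address; the reference rows, field names and `enrich` helper are illustrative assumptions, not the G-NAF schema or a QualityStage API.

```python
# Illustrative sample rows; in practice these would come from the
# G-NAF or Aus Post locality tables. Keys are (LOCALITY, POSTCODE).
LOCALITY_REF = {
    ("RICHMOND", "3121"): "VIC",
    ("RICHMOND", "2753"): "NSW",
    ("PARRAMATTA", "2150"): "NSW",
}

def enrich(record, ref=LOCALITY_REF):
    """Return a copy of record with state/country filled in when it matches."""
    locality = (record.get("locality") or "").strip().upper()
    postcode = (record.get("postcode") or "").strip()
    # First try the exact locality + postcode pair.
    state = ref.get((locality, postcode))
    if state is None:
        # Fall back to locality alone, but only if it maps to a single state.
        states = {s for (loc, _), s in ref.items() if loc == locality}
        state = states.pop() if len(states) == 1 else None
    out = dict(record)
    if state is not None:
        # Matched: enrich and reclassify the record as Australian.
        out.update(state=state, country="AU", bucket="AU")
    return out
```

Records that come back without a `state` stay in the unknown bucket; an ambiguous locality such as Richmond with no postcode is deliberately left alone rather than guessed.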

If the latter, you call it however you want, with or without DataStage.

If you don't have either you could get the plugin for DataStage/QualityStage, which gives you a stage to put in your job and it takes care of the matching for you.
http://www-01.ibm.com/support/docview.wss?uid=swg24032994
All times are GMT - 6 Hours