COUNTRY rule set is identifying the Country code

Infosphere's Quality Product

Moderators: chulett, rschirm

akonda
Participant
Posts: 97
Joined: Wed Feb 28, 2007 6:15 am


Post by akonda »

I'm trying to identify the country code of international addresses using the COUNTRY rule set in the Standardize stage. I also specified ZQUSZQ as a literal and moved it to the selected columns as shown below:

Input_Column
<literal> ZQUSZQ

But after executing the job, the output column ISOCountryCode_COUNTRY shows "US" for all the records, which is wrong. Can somebody please suggest what I should change?

Thanks
arun
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The literal ZQUSZQ instructs the rule set to use country code US in cases where it cannot determine the country code. The default country code, if you like. In this case the flag is set to "N" to indicate that it's not established as a confident selection.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Can you give us some examples of records that shouldn't be returning US?
Maybe your data makes sense to you but doesn't have enough for QS to go on.

Not putting "common sense" and "obvious" domain knowledge aside when you work on this stuff is a common mistake.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Also try moving the literal ahead of the input column.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

ray.wurlod wrote: Also try moving the literal ahead of the input column. ...
I thought that too to start with, but I think it might be a typo.
You get ERR for the ISO country code if the parameter isn't at the start.
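For anyone checking their own setup, here is a rough Python sketch of the layout being discussed: the ZQ<code>ZQ literal first, followed by the input columns. It only imitates how the selected columns line up; it is not QualityStage code. The function name build_country_input is made up for illustration, and both the two-letter check and the space-separated join are assumptions based on the ZQUSZQ example in this thread.

```python
def build_country_input(default_code: str, *columns: str) -> str:
    """Return the default-country literal followed by the address columns.

    Hypothetical helper: it only imitates the order of the selected
    columns discussed in this thread, it does not call QualityStage.
    """
    code = default_code.strip().upper()
    # Assumption: the literal wraps a two-letter code, as in ZQUSZQ.
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"expected a two-letter country code, got {default_code!r}")
    literal = f"ZQ{code}ZQ"
    # The literal goes first; empty columns are skipped.
    parts = [literal] + [c.strip() for c in columns if c and c.strip()]
    return " ".join(parts)


print(build_country_input("US", "MISSISSAUGA", "ON", "CAN"))
# prints: ZQUSZQ MISSISSAUGA ON CAN
```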
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Something else to remember is that unless your data contains the name of the country in a form the ruleset can recognise, it can only make a guess from the state/province or city, and only for Australia, Canada, Germany, Spain, France, Great Britain, Italy or USA. Most of those only use the state/province; Germany and France use the city, and only GB uses both.
akonda
Participant
Posts: 97
Joined: Wed Feb 28, 2007 6:15 am

Post by akonda »

Thanks for the response. Here is some sample data:

Input -
City: MISSISSAUGA
state: ON
country_Code: CAN

Output -
ISOCountryCode_Country = US
IdentifierFlag_Country = N

I've provided three input columns to the COUNTRY rule set, but the rule set couldn't identify the country and defaulted to 'US'. Is there anything I am missing before I feed the input columns to the COUNTRY rule set?
arun
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

The classification file doesn't recognise ISO 3166-1 alpha-2 or alpha-3 codes, only the (more or less) full names. So CANADA works where CA or CAN won't.

Seems like a fairly strange oversight, to be honest.
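One possible workaround, sketched in Python rather than in the rule set itself, is to translate codes into full names before the data reaches the Standardize stage. The table and the function name expand_country_code are hypothetical. The keys are ISO 3166-1 alpha-3 codes, but of the names only CANADA is confirmed to work in this thread; the other spellings are assumptions and should be checked against the classification file.

```python
# Hypothetical lookup from ISO 3166-1 alpha-3 codes to full country names.
# Only "CANADA" is confirmed in this thread; the other spellings are
# assumptions about what the classification file recognises.
ALPHA3_TO_NAME = {
    "AUS": "AUSTRALIA",
    "CAN": "CANADA",
    "DEU": "GERMANY",
    "ESP": "SPAIN",
    "FRA": "FRANCE",
    "GBR": "GREAT BRITAIN",
    "ITA": "ITALY",
}


def expand_country_code(value: str) -> str:
    """Replace a known alpha-3 code with a full name, else return the value trimmed."""
    cleaned = value.strip()
    return ALPHA3_TO_NAME.get(cleaned.upper(), cleaned)


print(expand_country_code("CAN"))     # prints: CANADA
print(expand_country_code("Mexico"))  # prints: Mexico (not in the table, left alone)
```

Alpha-2 codes are left out of the table on purpose, because a value like CA is ambiguous in address data.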
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Even so, you'd have thought it would have nailed ON for the province Ontario. Ah well.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

ray.wurlod wrote: Even so, you'd have thought it would have nailed ON for the province Ontario. Ah well. ...
It's very limited in the patterns for Canada it will accept without an area code.
That said, if the "CAN" had been "CA" it would have found that (hard coded because no alpha-2)...
rjdickson
Participant
Posts: 378
Joined: Mon Jun 16, 2003 5:28 am
Location: Chicago, USA

Post by rjdickson »

Yes, the postal code would really help here.

'CA' is a bit generic - could be California :) But yes, CAN should probably be there...

How about adding a classification override:
Input Token: CAN
Standard Form: CA
Classification: C

This works for me...
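To illustrate the idea only, here is a toy Python model of an override being checked before the shipped classifications. It is not how QualityStage stores or applies overrides. The names Entry, BASE, OVERRIDES and classify are made up, and the base entry for CANADA is an assumption; only the CAN / CA / C override comes from the values above.

```python
from typing import Dict, NamedTuple, Optional


class Entry(NamedTuple):
    standard_form: str
    token_class: str


# Assumed stand-in for the shipped classification: full names only.
BASE: Dict[str, Entry] = {"CANADA": Entry("CA", "C")}

# The override from this post: Input Token CAN, Standard Form CA, Classification C.
OVERRIDES: Dict[str, Entry] = {"CAN": Entry("CA", "C")}


def classify(token: str,
             overrides: Dict[str, Entry] = OVERRIDES,
             base: Dict[str, Entry] = BASE) -> Optional[Entry]:
    """Look the token up in the overrides first, then in the base table."""
    key = token.strip().upper()
    if key in overrides:
        return overrides[key]
    return base.get(key)


print(classify("CAN"))                # found through the override
print(classify("CAN", overrides={}))  # None: the base table alone does not know CAN
```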
Regards,
Robert