Review column analysis results in 'Domain & Completeness'

Posted: Thu Jul 19, 2012 10:09 am
by akonda
Hello,

I've run column analysis using Information Analyzer. Now I'm trying to review the results under Frequency Distribution -> 'Domain & Completeness'. Since I have millions of records, it's hard to go through every entry. Is there a way to write a condition for each column and have the review done automatically? For example, if the city name contains numeric values, set the status to invalid; otherwise, valid.

Example:

Atlanta - Valid
Atalata12345 - Invalid
atla - Default

Posted: Thu Jul 19, 2012 12:31 pm
by stuartjvnorton
Under domain you can set valid ranges or use a list of valid values (e.g. for AU you could load in the Aus Post locality DB), and under format you can mark invalid patterns, but you can't use custom rules in here. It would be a handy thing, though...
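
To illustrate why marking patterns scales better than marking individual values, here is a minimal Python sketch of the idea. It is not Information Analyzer code: the pattern symbols (A, a, 9) and the function names are assumptions chosen for illustration, and the tool's own format notation may differ.

```python
from collections import Counter


def value_pattern(value):
    """Map each character to a format symbol (illustrative notation, assumed)."""
    symbols = []
    for ch in value:
        if ch.isdigit():
            symbols.append("9")          # any digit
        elif ch.isalpha():
            symbols.append("A" if ch.isupper() else "a")
        else:
            symbols.append(ch)           # keep punctuation and spaces as-is
    return "".join(symbols)


def pattern_frequencies(values):
    """Count how many values share each format pattern."""
    return Counter(value_pattern(v) for v in values)


def invalid_patterns(frequencies):
    """Hypothetical review rule: any pattern containing a digit is invalid."""
    return {pattern for pattern in frequencies if "9" in pattern}


cities = ["Atlanta", "Atalata12345", "atla", "Boston"]
freq = pattern_frequencies(cities)
flagged = invalid_patterns(freq)
```

With millions of rows the number of distinct patterns is usually far smaller than the number of distinct values, so reviewing the pattern list is much less work than reviewing every entry.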

Posted: Thu Jul 19, 2012 1:57 pm
by akonda
Thanks for your reply.

Is it possible to change the status for multiple columns at a time?

Apparently it's not possible, but I just want to make sure that I'm not missing a facility in the tool.

Posted: Wed Jul 25, 2012 7:24 pm
by stuartjvnorton
No, you can't, and I'm not sure why you'd want to.

Sure it would be quicker, but it only makes sense if you don't plan on reviewing the results you just produced. Which doesn't make sense.

Posted: Thu Jul 26, 2012 6:24 pm
by vmcburney
Have a look at data quality (DQ) rules in Information Analyzer - I would recommend version 8.7 rollup 1. This lets you create the type of DQ rules you are looking for, where the rule is defined as a multi-criteria statement and produces data quality metrics. For data that has millions of rows you will find DQ rules more effective than manual data checking. You can then bind these city rules to different instances of city columns.
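
As a rough sketch of what such a multi-criteria rule does, here is the city check from the original question in plain Python. This is not Information Analyzer rule syntax. The reference list, the function names, and the meaning given to "default" (no digits, but not in the reference list) are all assumptions for illustration.

```python
import re
from collections import Counter

HAS_DIGIT = re.compile(r"\d")


def city_status(value, reference_cities):
    """Classify one city value; reference_cities holds lower-case names (assumed)."""
    if HAS_DIGIT.search(value):
        return "invalid"                      # criterion 1: no numeric characters
    if value.strip().lower() in reference_cities:
        return "valid"                        # criterion 2: known city name
    return "default"                          # assumed meaning: left for review


def rule_metrics(values, reference_cities):
    """Return the share of values in each status, as a fraction of all rows."""
    counts = Counter(city_status(v, reference_cities) for v in values)
    total = sum(counts.values())
    if total == 0:
        return {"valid": 0.0, "invalid": 0.0, "default": 0.0}
    return {s: counts[s] / total for s in ("valid", "invalid", "default")}


reference = {"atlanta", "boston"}             # hypothetical reference list
sample = ["Atlanta", "Atalata12345", "atla", "Atlanta"]
metrics = rule_metrics(sample, reference)
```

The same function can be applied to any column that holds city names, which is roughly what binding one rule definition to several columns gives you.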

Have a look at this article on pre-built address rules:
Using pre-built rule definitions with IBM InfoSphere Information Analyzer

You will also find QualityStage more suitable for cleansing these fields - for example, standardising Atalata into Atlanta.

Posted: Fri Jul 27, 2012 4:51 pm
by ray.wurlod
There are some consoles coming along to aid with managing these rules.

Can't tell you when, but they're in development now.