Whole name and salutation

This forum is in support of all issues about Data Quality regarding DataStage and other strategies.

Moderators: chulett, rschirm

zhaicaibing
Participant
Posts: 49
Joined: Wed Jun 11, 2003 12:49 am

Post by zhaicaibing »

Hi,

I have a salutation table that contains all the valid salutations, e.g.
Dr.
Dato Dr.
Tan Sri
Dato Dr. Tan Sri
YB
YB DATIN
Yang Berhormat

I have another table that contains people information, including the name and the name with salutation in a column called WholeName.

The Name column contains e.g.
Peter Goh
Anita Mui Yen Fong

The WholeName column contains e.g.
Tan Dri Peter Goh
YB Anita

I would like to find the salutations in the people table's WholeName column that cannot be found in the salutation table.

Please advise.
timwalsh
Participant
Posts: 29
Joined: Tue Mar 04, 2003 7:48 am

Post by timwalsh »

What Tools do you have to work with?
DataStage?
INTEGRITY?
zhaicaibing
Participant
Posts: 49
Joined: Wed Jun 11, 2003 12:49 am

Partial String references a column in another table

Post by zhaicaibing »

I am using Quality Manager
timwalsh
Participant
Posts: 29
Joined: Tue Mar 04, 2003 7:48 am

Post by timwalsh »

I think that's going to be a little difficult with Quality Manager. That type of data profiling I would normally perform in INTEGRITY via pattern investigation.

The reason is that the "Salutation" has varying lengths and a varying number of words within the "WholeName w/ Salutation" field. Therefore, Quality Manager cannot simply substring out or parse out the value.

Option 1: Via Pattern Investigation
I would first classify each term as salutation or name, then parse out the salutations from the whole name, then compare the parsed-out salutations to the list of salutations that you already have. You can then determine which salutations are not in your list.
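
To make the steps in Option 1 concrete, here is a rough sketch in plain Python. It is not INTEGRITY or Quality Manager functionality, only an illustration of the classify / parse / compare idea. The sample values come from the question; the function names, the "S" / "N" / "?" class codes, and the rule that everything before the first name word counts as the salutation are all assumptions made for the example.

```python
# Sample values taken from the question; in practice these come from the tables.
SALUTATIONS = ["Dr.", "Dato Dr.", "Tan Sri", "Dato Dr. Tan Sri",
               "YB", "YB DATIN", "Yang Berhormat"]
NAMES = ["Peter Goh", "Anita Mui Yen Fong"]


def words_of(phrases):
    """Collect the case-folded words that occur in a list of phrases."""
    return {word.casefold() for phrase in phrases for word in phrase.split()}


def classify(word, salutation_words, name_words):
    """Classify one word: S = salutation word, N = name word, ? = unknown.

    A word found in both vocabularies is treated as a salutation word
    (an assumption; a real rule set would need to be smarter).
    """
    key = word.casefold()
    if key in salutation_words:
        return "S"
    if key in name_words:
        return "N"
    return "?"


def investigate(whole_name, salutation_words, name_words):
    """Return (pattern, parsed_salutation) for one WholeName value."""
    words = whole_name.split()
    classes = [classify(w, salutation_words, name_words) for w in words]
    # Assumption: everything before the first name word is the salutation.
    cut = classes.index("N") if "N" in classes else len(words)
    return "".join(classes), " ".join(words[:cut])


def unknown_salutations(whole_names, salutations, names):
    """List (WholeName, pattern, salutation) where the salutation is not valid."""
    salutation_words = words_of(salutations)
    name_words = words_of(names)
    valid = {" ".join(s.split()).casefold() for s in salutations}
    report = []
    for whole_name in whole_names:
        pattern, parsed = investigate(whole_name, salutation_words, name_words)
        if parsed and parsed.casefold() not in valid:
            report.append((whole_name, pattern, parsed))
    return report


report = unknown_salutations(["Tan Dri Peter Goh", "YB Anita"], SALUTATIONS, NAMES)
```

With the sample values, "Tan Dri Peter Goh" gets the pattern "S?NN" and the parsed salutation "Tan Dri", which is not in the list, so it is reported. "YB Anita" parses to "YB", which is valid, so it is not reported.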

Option 2: Via Pattern Investigation
If you can identify names, but not salutations, then I would use the above method, but parse out the names and not the salutations.
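
Option 2 can be sketched the same way, again in plain Python and not as anything built into INTEGRITY or Quality Manager. This version assumes each row carries both the Name and the WholeName value, as described in the question, and that the salutation is whatever precedes the first word of Name inside WholeName. The function names are made up for the example.

```python
def normalize(text):
    """Case-fold and collapse whitespace so 'YB  Datin' matches 'YB DATIN'."""
    return " ".join(text.split()).casefold()


def words_before_name(name, whole_name):
    """Return the words of whole_name that come before the first word of name.

    Returns None when the first word of name does not occur in whole_name.
    """
    name_words = name.split()
    whole_words = whole_name.split()
    if not name_words:
        return None
    first = name_words[0].casefold()
    for i, word in enumerate(whole_words):
        if word.casefold() == first:
            return " ".join(whole_words[:i])
    return None


def salutations_not_in_table(people, salutations):
    """List (WholeName, salutation) pairs whose salutation is not in the table.

    `people` is an iterable of (Name, WholeName) pairs.
    """
    valid = {normalize(s) for s in salutations}
    missing = []
    for name, whole_name in people:
        prefix = words_before_name(name, whole_name)
        if prefix is None:
            continue  # name not found in WholeName; needs manual review
        if prefix and normalize(prefix) not in valid:
            missing.append((whole_name, prefix))
    return missing


valid_salutations = ["Dr.", "Dato Dr.", "Tan Sri", "Dato Dr. Tan Sri",
                     "YB", "YB DATIN", "Yang Berhormat"]
people = [("Peter Goh", "Tan Dri Peter Goh"),
          ("Anita Mui Yen Fong", "YB Anita")]
missing = salutations_not_in_table(people, valid_salutations)
```

One caveat with this sketch: if a salutation word is also the first word of the name (for example a person whose name starts with "Tan"), the match happens too early and the salutation is cut short, so those rows would still need a manual check.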

There are a few different methods that I can think of, but they all require pattern investigation.

Pattern investigation is time-consuming. I would be very interested in different methods that would accomplish the same thing!

Tim