Splitting name into title, forename, surname

AlexD
Participant
Posts: 3
Joined: Fri Mar 03, 2006 9:43 am
Location: London

Post by AlexD »

Hi,

I'm probably posting about a fairly common problem, but the posts I've found by searching only solve it for strings in a consistent format. We have a fullname field that needs to be split into title, forename/initials and surname fields. The problem is that the fullname field contains a variety of name formats

e.g.
Mr Edward Smith
D Jones
Miss F G H Underhill
Peter Morrison
T.D. Watson
etc.

Obviously, using Field with a space as the delimiter is not going to work reliably for this, and DataStage itself is possibly unsuitable. Has anyone devised some code for this scenario, or should I be looking at QualityStage?

Thanks in advance,
Alex
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

This is precisely the task that QualityStage performs, almost out of the box. You can invoke a QualityStage standardization task through a QualityStage stage in a DataStage job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jhmckeever
Premium Member
Posts: 301
Joined: Thu Jul 14, 2005 10:27 am
Location: Melbourne, Australia

Post by jhmckeever »

ray.wurlod wrote: This is precisely the task that QualityStage performs, almost out of the box. You can invoke a QualityStage standardization task through a QualityStage stage in a DataStage job.

The USNAME/GBNAME rulesets achieve this with little or no customization effort. You can try your examples in the Rules Analyzer to see whether the rulesets meet your needs.

J.
AlexD
Participant
Posts: 3
Joined: Fri Mar 03, 2006 9:43 am
Location: London

Post by AlexD »

Thanks for your responses. We shall look into these areas of QualityStage more closely, but does anyone know a way of doing it in DataStage?

regards,
Alex
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Alex,

There is no built-in function that does this accurately in DataStage. There are third-party products for name cleansing out there (Trillium Software comes to mind).

In the past I've programmed my own logic in short routines: using spaces, commas, and periods as delimiters, using a list of known prefixes and titles to strip out the non-name portions, then taking the last word as the family name and whatever is left as the first and second names.
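
A rough sketch of that logic, written in Python rather than DataStage BASIC just to show the steps. The function name split_name and the KNOWN_TITLES list are made up for illustration, and only a leading title is stripped, so treat it as a starting point and not a finished routine:

```python
import re

# Hypothetical list of titles/prefixes; extend it to suit your own data.
KNOWN_TITLES = {"mr", "mrs", "miss", "ms", "dr", "prof", "sir", "rev"}


def split_name(fullname):
    """Split a free-form full name into (title, forenames, surname)."""
    # Spaces, commas and periods all act as delimiters; drop empty tokens.
    tokens = [t for t in re.split(r"[ ,.]+", fullname.strip()) if t]

    # Strip a leading title, but only if something is left to be the name.
    title = ""
    if len(tokens) > 1 and tokens[0].lower() in KNOWN_TITLES:
        title = tokens.pop(0)

    # The last word is taken as the family name.
    surname = tokens.pop() if tokens else ""

    # Whatever remains is treated as forenames or initials.
    forenames = " ".join(tokens)
    return title, forenames, surname


if __name__ == "__main__":
    for name in ["Mr Edward Smith", "D Jones", "Miss F G H Underhill",
                 "Peter Morrison", "T.D. Watson"]:
        print(split_name(name))
```

With the sample names from the first post this gives ('Mr', 'Edward', 'Smith'), ('', 'D', 'Jones'), ('Miss', 'F G H', 'Underhill'), ('', 'Peter', 'Morrison') and ('', 'T D', 'Watson'). It will get multi-word surnames (e.g. "van der Berg") and "Surname, Forename" ordering wrong, which is where a ruleset-driven tool earns its keep.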