Page 1 of 1

Split multiple State tokens from single field

Posted: Thu May 12, 2005 1:56 pm
by mpine
I am trying to clean up a file. The file is in the following format:
40212 VA, TN
40419 CA,KY - IN
40520 RI,P.A.

The pattern report comes out like this:
00082394 63.618% S [X] | NH
00024016 18.543% SS [X] | OH,MS
00009435 7.285% SSS [X] | CA,FL,OH

It also has other entries but I would be happy with getting these
paterns cleaned up. I need to resulting file to look like:

40312 VA TN
40319 CA KY IN
40320 RI PA

Can you tell me the best way to accomplish this?

be happy with getting these

Posted: Thu May 12, 2005 2:32 pm
by DSkkk
Hi- So basically what u want is u need to eliminate the special characters from ur Data file. I am sure this can be achieved by using a Parse Stage
in Quality Stage where u can specify which character should be replaced by what. But this is slightly Cumbersome because u need to specify all the special characters that might occur and which u dont want. So the bottomline is that u can do this in a much efficient way in Datastage.

Cheers
DSkkk

Posted: Thu May 12, 2005 6:55 pm
by ray.wurlod
Can't you just make sure that "-", "," and "." are in your STRIPLIST ?

Not sure why you'd want to force the third character of the first column to be "3", so I'm assuming that the problem there was with the keyboard (or, at least, between the keyboard and the chair). :lol:

follow up

Posted: Fri May 13, 2005 9:31 am
by mpine
ray.wurlod wrote:Can't you just make sure that "-", "," and "." are in your STRIPLIST ?

Not sure why you'd want to force the third character of the first column to be "3", so I'm assuming that the problem there was with the keyboard (or, at least, between the keyboard and the chair). :lol:
It was a problem between the keyboard and the chair :D

Running a word investigate stage determines the states without a problem using the STRIPLIST, but I am not sure how to actually move the S (state) tokens it identifies into separate fields so I can use the pivot stage in DataStage to make the file how I would like. The final product would be as follows:

Out of QualityStage somehow:

40312 VA TN
40319 CA KY IN
40320 RI PA

Out of DataStage using a pivot stage (I think):

40312 VA
40312 TN
40319 CA
40319 KY
40319 IN
40320 RI
40320 PA

Posted: Fri May 13, 2005 6:17 pm
by ray.wurlod
Pivot stage is a good call. Just make sure that its input link has enough columns to cope with the largest possible list of states. (50 is probably over the top, but is fail safe!)