Reading pipe delimited; but some columns missing

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
karumudi7
Participant
Posts: 20
Joined: Sun Mar 11, 2012 3:30 am
Location: Detroit
Contact:

Reading pipe delimited; but some columns missing

Post by karumudi7 »

Hi,

I have a pipe delimited file with 3 columns. But for some records at the end there are some missing fields and pipes (delimiter) is also not availble.
How can I read that file without losing any records!
Sample File:

Code: Select all

 Name|Email|Country
 ABC|abc@gmail.com|USA
 DEF|
 GHXY||INDIA
Here the problem is with second record, while reading the file I am getting only 2 records. Second one is rejected with error "Email" column not found...

Need to load these 3 records into the table, if data is not available NULL will be loaded for the purticular column.

Thanks!
DataStage 11.3 on AIX 7.1 with DB2.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

This was easier in Server, but there should be a way to do this in parallel jobs. Is this sequential file defined with variable field/record length and with a <cr><lf> or <lf> line terminator?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I experimented a bit and while there is no equivalent to the server "missing columns" mode, you can put in the following filter condition:

Code: Select all

awk 'BEGIN {FS = OFS = "|"} {$3 = $3; print}'
This will pad each input line to make exactly 3 pipe delimited fields; then you need to handle the empty strings appropriately in your job.
karumudi7
Participant
Posts: 20
Joined: Sun Mar 11, 2012 3:30 am
Location: Detroit
Contact:

Post by karumudi7 »

ArndW wrote:I experimented a bit and while there is no equivalent to the server "missing columns" mode, you can put in the following filter condition:

Code: Select all

awk 'BEGIN {FS = OFS = "|"}  ...[/quote]

Unable to read, Premium content.  :(
DataStage 11.3 on AIX 7.1 with DB2.
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York
Contact:

Post by arunkumarmm »

Never mind about your column delimiter. Read the whole record as one single column and split it up inside the transformer using field.
Arun
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a server Sequential File stage in a server Shared Container and make use of its excellent missing columns functionality.

Or, if it's a small to medium sized file, use a server job entirely.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsomiset
Premium Member
Premium Member
Posts: 46
Joined: Fri Sep 21, 2007 7:16 pm

Post by rsomiset »

Did you try to read the whole record as 1 single column in source stage and then split using Field function in a transformer stage using delimiter(pipe in this case)?
--
Raj
Post Reply