Handling nulls in seq files in PX

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
fridge
Premium Member
Premium Member
Posts: 136
Joined: Sat Jan 10, 2004 8:51 am

Handling nulls in seq files in PX

Post by fridge »

Hi, I am trying to read a file from a seq stage in PX , the file contains numeric (int,decimal) values that have spaces to signify nulls.

Is there an easy way to 'map' spaces to nulls and avoid a record going down the reject link without going via a tranformer first.

Sorry for such a basic question, but we will encounter this a lot and I want to avoid my developers having to transform --> if isNull(field) do this else field shennaigans.

Have looked at properties in seq file format stage but cant get a reasonable (i.e low development) way of doing it.

Thanks in advance

Ed
Gazelle
Premium Member
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

If I understand you right, this is done by specifying "null_field" value for each column, in the initial Sequential File stage.

eg. If ColumnA is a fixed-length Integer(10) field, then you specify

Code: Select all

null_field='          '
When reading the file, if ColumnA contains a string of ten spaces, then Datastage will convert the field into its internal representation of null.

You can then pass the field to a transformer and perform whatever null-handling you require.
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

1. What do u mean by "null_field" value. There is no property like "null_field" in the sequential file stage. Do u mean the value of Column A in the sequential file should be 10 spaces. Pls explain

2. If a field contains a value of 10 spaces does Datastage convert it to NULL?

Can u brief me more on these 2 queries

Thanks
Rich

--Pride comes before a fall
Gazelle
Premium Member
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

Sorry for the confusion.

"null field" is how it appears on the internal record definition, but in the Stages it's called "Null field value".
1. Here are the steps to get to the "Null field value" property:
  1. In the "Sequential File" stage, on the "Columns" tab, if you right-click on the columns grid and select "Properties" you'll see a "Grid Properties" dialog box.
  2. Here you can select additional properties to appear on the columns grid. Select the "Field Properties" option (and opt to "Save settings for future display").
  3. Now there should be a column titled "Field properties" on the columns grid. Double-click on this cell, and a dialog box appears, with all sorts of column properties.
  4. If the column is made "nullable", then there will be a properties category called "Nullable", under which you can select the "Null field value".
  5. For the "Null field value", enter the value to be interpreted as NULL (eg. for ColumnA, enter ten spaces).
2. After defining the "Null field value" as ten spaces, then if Datastage finds that ColumnA has a value of 10 spaces, Datastage converts the value to NULL.

Note that if the Column property is consistent for all columns (eg. if "Null field value" is ten spaces for allof the fields in the record), then you can define the "Null field value" on the "Format" tab, under the "Field Defaults" category.

Note also that the reverse can be defined when outputting the column. ie. If a "Null field value" is defined when ouputting to a sequential file, then Datastage will convert NULL to the defined "Null field value".
richdhan
Premium Member
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi,

Iam writing my database rejects to a .csv file. I have a field EFFECTIVE_OUT_DTS which is a timestamp and is a null field.
I got this warning initially
At field "EFFECTIVE_OUT_DTS": Exporting nullable field without null handling properties

I used the techique you had mentioned in your post and set the "null_value" to be 4 spaces

I got this warning
At field "EFFECTIVE_OUT_DTS": "null_field" length (4) must match field's fixed width (19)

I used 19 spaces instead and the job run successfully.

Thanks a ton for your post.

But I have a doubt - I had mentioned the length of the timestamp field to be 38. But the warning says that I have to use the field's fixed width of 19.
Why is that. Can anybody clarify me on this.

Thanks
--Rich
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

38 is (probably) the precision of the underlying numeric data type. 19 is the number of characters in a TimeStamp (without sub-seconds). There are, for example, 26 characters in a TimeStamp(6) - that is, TimeStamp with microseconds. However, DataStage does not seem to recognize precision for timestamps.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Gazelle
Premium Member
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

That issue of null_values for fixed-length fields (eg. a timestamp field) was a real problem for us.

We had a pipe-delimited file with a timestamp field, where || signified a null.

Ascential provided a patch that basically provides a new setting (a project environment variable: "APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL").
I do not know if this allows four spaces to be defined as the null_value, but you could ask Ascential support.
Post Reply