Junk character in sequential file viewer stage

rumu
Participant
Posts: 286
Joined: Mon Jun 06, 2005 4:07 am

Post by rumu »

Hi All,

I am reading a COBOL EBCDIC file using the CFF stage and loading it into a Sequential File stage. Two fields are defined in the CFF stage as PIC X(2) and PIC X(1), which the record layout shows as CHARACTER 2 and CHARACTER 1 respectively.

I mapped those 2 fields directly to the Sequential File stage using the data types Char(2) and Char(1).
For the 2-character field, some values are shown in the DataStage file viewer as

Code:

?|
The second character looks like |, but I cannot copy it: when I paste, only the ? is pasted.
I used the StringToRaw function to display it and got the following:

Code:

{1a 18}
Are these CAN and line feed in hex? How do I remove them? When I view the data in Unix, it shows nothing.

The column has 2 distinct values when I output it in Unix:

RE and blank.

I used the following command to display the values with hexdump:

Code:

-bash-4.2$ cat RDTDP.txt|cut -d'|' -f1|sort|uniq|hexdump
0000000 181a 1a0a 0a1a 4552 000a
0000009
How can I convert these foreign characters to space?
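
A note on reading the dump above: by default hexdump prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped. The following is a minimal Python sketch, assuming a little-endian host and the words exactly as posted, that restores the byte order and splits the result on line feed. The helper name `words_to_bytes` is made up for illustration.

```python
import struct

def words_to_bytes(words, total_len):
    """Rebuild the byte sequence from default `hexdump` words.

    Assumes the words are 16-bit values printed on a little-endian host.
    """
    raw = b"".join(struct.pack("<H", int(w, 16)) for w in words)
    return raw[:total_len]  # drop the padding byte of the last word

words = "181a 1a0a 0a1a 4552 000a".split()
data = words_to_bytes(words, 9)      # final offset 0000009 means 9 bytes
values = data.split(b"\n")[:-1]      # one value per line, last piece is empty

for v in values:
    print(v.hex(" "), repr(v))
```

Under that assumption the output of the pipeline contains three distinct values rather than two: 1A 18, 1A 1A and RE, each followed by a line feed (0A). In ASCII, 0x18 is CAN and 0x1A is SUB; the line feeds come from the line endings, not from the field. Running `hexdump -C` instead prints the bytes in file order and avoids the swapping.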
Rumu
IT Consultant
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What character set are you using? Could these be double-byte representations of Unicode characters?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rumu
Participant
Posts: 286
Joined: Mon Jun 06, 2005 4:07 am

Post by rumu »

Hi Ray,

The NLS map is set to the project default (UTF-8).

I used the following derivation in the Transformer, and those characters no longer appeared:
Trim(Trim(DSLink3.RDT_ADDL_SEG_KEY_PROD,char(24)),char(26))
I used 24 as the decimal representation of hex 18 and 26 as the decimal representation of hex 1A.
Is that approach OK?
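
For comparison, here is a small Python sketch (not DataStage code) of the difference between stripping the two control characters and replacing them with spaces. The sample values are assumptions based on the StringToRaw output {1a 18} and the value RE from the first post, and the function names are illustrative only.

```python
# Translation table: CAN (0x18) and SUB (0x1A) each become a space.
TO_SPACE = bytes.maketrans(b"\x18\x1a", b"  ")

def replace_with_space(value: bytes) -> bytes:
    """Keep the field width; each control byte becomes one space."""
    return value.translate(TO_SPACE)

def strip_controls(value: bytes) -> bytes:
    """Remove leading and trailing control bytes; the result may be shorter."""
    return value.strip(b"\x18\x1a")

# Assumed sample field values (hypothetical, taken from the posted dump).
for field in (b"\x1a\x18", b"\x1a\x1a", b"RE"):
    print(repr(replace_with_space(field)), repr(strip_controls(field)))
```

Stripping shortens the value, so what ends up in a fixed-width Char(2) column then depends on how the target pads it. Replacing keeps the width at two characters, which is closer to the stated goal of converting the characters to spaces.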
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Who knows? You've condemned what might be valid characters to be "junk". I'd examine that assumption pretty closely.
rumu
Participant
Posts: 286
Joined: Mon Jun 06, 2005 4:07 am

Post by rumu »

Hi Ray,

I used the StringToRaw function to check the values. How can I identify whether it is a double-byte character?
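
One way to approach that question, sketched in Python rather than DataStage: in UTF-8, every byte of a multi-byte character is 0x80 or higher, so bytes below 0x80 can only be single-byte characters. The sketch below assumes the raw bytes are the pair shown by StringToRaw and that the data is UTF-8 as stated; `describe_utf8` is a hypothetical helper name.

```python
def describe_utf8(raw: bytes):
    """Decode raw bytes as UTF-8 and report the byte length of each character."""
    text = raw.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]

# The pair reported by StringToRaw: both bytes are below 0x80.
print(describe_utf8(b"\x1a\x18"))

# For contrast, a character that really does take two bytes in UTF-8.
print(describe_utf8("é".encode("utf-8")))
```

The first call reports two separate one-byte characters (U+001A and U+0018), so under these assumptions the pair is not a single double-byte character.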