DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic has been marked "Resolved."
premkishore1983
Participant



Joined: 08 Oct 2007
Posts: 13
Location: India
Points: 243

Posted: Thu Jun 30, 2016 11:51 am

DataStage® Release: 11x
Job Type: Server
OS: Unix
Hi All,

I'm currently migrating DataStage code from DS 9.1 (hosted on AIX) to DS 11.3 (hosted on Red Hat 4.4.7-16).

The universal code below scrubs particular character positions of the incoming EBCDIC file by converting them to spaces.

It works fine in DataStage 9.1 on AIX, and the downstream is able to process the files without any issues.

But when I run the same code against the same input EBCDIC file in DataStage 11.3 on Linux, I find that the output file size is getting increased by 2.5 times, and the downstream processing of the output file generated by this code is impacted.

I also tried commenting out the conversion piece of code and redirecting the input data directly to the output file, but the output file size still increases 2.5 times.

Any help or suggestions in this regard would be really helpful.

Code:
FILE_NAME = SOURCE_DIR:"/":SOURCE_NAME:".new"
GOSUB INIT
*
OPENSEQ SOURCE_DIR:"/":SOURCE_NAME:".src" TO F.SRC ELSE
  Call DSLogInfo('Unable to open ':SOURCE_DIR:'/':SOURCE_NAME:'.src',"Output")
  ErrorCode = 1
  GOTO 10000
END
*
OPENSEQ SOURCE_DIR:"/":SOURCE_NAME:".new" TO F.NEW ELSE
  Call DSLogInfo('Unable to open ':SOURCE_DIR:'/':SOURCE_NAME:'.new',"Output")
  ErrorCode = 1
  GOTO 10000
END
*
CNT = 1
EOF = 0
LOOP UNTIL EOF DO
  READBLK RECORD FROM F.SRC,LENGTH THEN
    RECORD[76,6] = STR(CHAR(32),6) ;* overwrite 6 characters from position 76 with CHAR(32) spaces
    WRITEBLK RECORD ON F.NEW ELSE
      Call DSLogInfo('Unable to write to ':SOURCE_DIR:'/':SOURCE_NAME:'.new',"Output")
      ErrorCode = 1
      GOTO 10000
    END
    CNT = CNT + 1
    IF CNT/100000 = INT(CNT/100000) THEN
      Call DSLogInfo(CNT:' records processed...',"Output")
    END
  END ELSE
    EOF = 1
  END
  *
REPEAT
*
Call DSLogInfo(CNT:' records processed...',"Output")

CLOSESEQ F.SRC
CLOSESEQ F.NEW
*
GOTO 10000
*
INIT: * - - Initialize Unix Flat File - - *
*
  Command = "rm ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
  Command = "touch ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
  Command = "chmod 660 ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
RETURN
10000
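
For comparison only, here is a minimal byte-level sketch of the same scrub in Python. It is not the DataStage routine. It assumes fixed-length records, with the record length passed in the way LENGTH is in the code above, and it assumes the blanked range should be clamped to the end of the record so the record length never changes. The name scrub_records and its parameters are made up for this illustration.

```python
def scrub_records(data: bytes, record_len: int, start: int = 76,
                  width: int = 6, fill: bytes = b" ") -> bytes:
    """Blank `width` bytes from 1-based position `start` in each fixed-length record."""
    out = bytearray()
    for offset in range(0, len(data), record_len):
        record = bytearray(data[offset:offset + record_len])
        lo = start - 1                       # 1-based position -> 0-based index
        hi = min(lo + width, len(record))    # assumption: never grow the record
        if lo < hi:
            record[lo:hi] = fill * (hi - lo)
        out += record                        # bytes in, bytes out: no character mapping
    return bytes(out)


# One 80-byte record filled with 0xF1 (EBCDIC "1"), used purely as sample data.
sample = bytes([0xF1]) * 80
result = scrub_records(sample, record_len=80)
print(len(result), result[74:].hex())
```

Because everything stays in bytes, the output is always exactly as long as the input, which is the property the downstream process relies on.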
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42622
Location: Denver, CO
Points: 219444

Posted: Thu Jun 30, 2016 12:19 pm

Without looking at the code itself, the first thing that comes to mind when someone says "the output file size is getting increased by 2.5 times" is that there is a codepage / character set issue. Can you confirm or deny that?

_________________
-craig

And I'm hovering like a fly, waiting for the windshield on the freeway...

premkishore1983
Posted: Thu Jun 30, 2016 1:14 pm

Thanks for the reply Chuck.

What do you mean by the codepage/characterset issue?
Can you please elaborate?

NLS Mapping was set to Project default (ISO8859-1) in both the environments.

chulett
Posted: Thu Jun 30, 2016 1:35 pm

Craig, actually... not Chuck. ;)

I'm wondering what character encoding the file is being created with and if, perhaps, you've gone from a single-byte encoding to a multi-byte one. ISO 8859-1 is an 8-bit, single-byte coded graphic character set, so I'm guessing that's not the issue.

Have you compared the two files, either visually or through your tool of choice? Can you determine where the increase in file size is coming from? If not, it might help to post a couple of records from each version, wrapped in [code] tags so we can see them.
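
A small illustration of the single-byte versus multi-byte distinction, not taken from the thread: the same four-character string takes four bytes in ISO 8859-1 but five in UTF-8, because the accented character needs two bytes there. The sample string is arbitrary.

```python
text = "caf\u00e9"                      # 4 characters, the last one is e-acute

latin1 = text.encode("iso-8859-1")      # one byte per character
utf8 = text.encode("utf-8")             # e-acute becomes two bytes (c3 a9)

print(latin1.hex(" "), len(latin1))
print(utf8.hex(" "), len(utf8))
```

A file that passes through a single-byte to multi-byte conversion grows only where it holds characters outside the ASCII range, so the growth factor depends on the data.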

premkishore1983
Posted: Fri Jul 01, 2016 12:09 pm

Below is a sample record,

AIX Output:

Code:
@@@@@@  PS0<@@@@@@@


LINUX Output:

Code:
�������@@@������������������������������������@�@@������� � PS0<������@@     



Input:

Code:
@@@@@@  PS0<@@     

chulett
Posted: Fri Jul 01, 2016 12:31 pm

How about in a hex dump format? od -h should shed more light on this, I would think.

premkishore1983
Posted: Fri Jul 01, 2016 1:38 pm

Input :

Code:
0000000 f1f0 f6f1 e3c1 40d4 4040 f0f0 f0f5 f0f0
0000020 f0f8 f9f9 f1f0 f4f5 f6f6 f1f4 f0f0 f0f5
0000040 f1f1 d9f6 c5c4 d6d4 e3d5 c9c5 d5c1 c440
0000060 4040 f2f0 f0f4 f0f0 02f1 8120 0c01 5001
0000100 3053 c33c f3f2 d2f4 40c3 2040 2020 2020
0000120


AIX Output:

Code:
0000000 f1f0 f6f1 e3c1 40d4 4040 f0f0 f0f5 f0f0
0000020 f0f8 f9f9 f1f0 f4f5 f6f6 f1f4 f0f0 f0f5
0000040 f1f1 d9f6 c5c4 d6d4 e3d5 c9c5 d5c1 c440
0000060 4040 f2f0 f0f4 f0f0 02f1 8120 0c01 5001
0000100 3053 c33c f3f2 d2f4 40c3 4040 4040 4040
0000120



LINUX Output :

Code:
0000000 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000020 bdbf bfef 40bd 4040 bfef efbd bdbf bfef
0000040 efbd bdbf bfef efbd bdbf bfef efbd bdbf
0000060 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000100 bdbf bfef efbd bdbf bfef efbd bdbf bfef
0000120 efbd bdbf bfef efbd bdbf bfef efbd bdbf
0000140 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000160 bdbf bfef efbd bdbf bfef efbd bdbf bfef
0000200 efbd bdbf ef40 bdbf 4040 bfef efbd bdbf
0000220 bfef efbd bdbf bfef efbd bdbf bfef 02bd
0000240 ef20 bdbf 0c01 5001 3053 ef3c bdbf bfef
0000260 efbd bdbf bfef efbd bdbf bfef 40bd 2040
0000300 2020 2020
0000304
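
One way to read these dumps, offered as a sketch rather than a diagnosis: od -h prints 16-bit words in the byte order of the host it runs on. If all of these dumps were taken on a little-endian machine (an assumption), each word is displayed byte-swapped, and the Linux output becomes a run of EF BF BD sequences. That is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The Python below rebuilds the bytes from the Input and LINUX Output dumps above and checks whether replacing every input byte of 0x80 or above with EF BF BD reproduces the Linux file. The helper name od_words_to_bytes is made up for this illustration.

```python
INPUT_DUMP = """
0000000 f1f0 f6f1 e3c1 40d4 4040 f0f0 f0f5 f0f0
0000020 f0f8 f9f9 f1f0 f4f5 f6f6 f1f4 f0f0 f0f5
0000040 f1f1 d9f6 c5c4 d6d4 e3d5 c9c5 d5c1 c440
0000060 4040 f2f0 f0f4 f0f0 02f1 8120 0c01 5001
0000100 3053 c33c f3f2 d2f4 40c3 2040 2020 2020
"""

LINUX_DUMP = """
0000000 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000020 bdbf bfef 40bd 4040 bfef efbd bdbf bfef
0000040 efbd bdbf bfef efbd bdbf bfef efbd bdbf
0000060 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000100 bdbf bfef efbd bdbf bfef efbd bdbf bfef
0000120 efbd bdbf bfef efbd bdbf bfef efbd bdbf
0000140 bfef efbd bdbf bfef efbd bdbf bfef efbd
0000160 bdbf bfef efbd bdbf bfef efbd bdbf bfef
0000200 efbd bdbf ef40 bdbf 4040 bfef efbd bdbf
0000220 bfef efbd bdbf bfef efbd bdbf bfef 02bd
0000240 ef20 bdbf 0c01 5001 3053 ef3c bdbf bfef
0000260 efbd bdbf bfef efbd bdbf bfef 40bd 2040
0000300 2020 2020
"""


def od_words_to_bytes(dump: str, byteorder: str = "little") -> bytes:
    """Turn `od -h` style output back into bytes (assumes 16-bit hex words)."""
    out = bytearray()
    for line in dump.strip().splitlines():
        for word in line.split()[1:]:          # skip the leading octal offset
            out += int(word, 16).to_bytes(2, byteorder)
    return bytes(out)


REPLACEMENT = "\ufffd".encode("utf-8")         # b'\xef\xbf\xbd'

source = od_words_to_bytes(INPUT_DUMP)
linux = od_words_to_bytes(LINUX_DUMP)

# Replace every byte >= 0x80 with the 3-byte replacement character.
simulated = b"".join(REPLACEMENT if b >= 0x80 else bytes([b]) for b in source)

print(len(source), len(linux), simulated == linux)   # 80 196 True
```

Under that assumption the match is exact: 58 of the 80 input bytes are 0x80 or above, and 80 + 2 * 58 = 196, which is the 0000304 (octal) length of the Linux file and roughly the 2.5x growth reported. This points at something on the Linux side treating the data as UTF-8 text rather than raw bytes. It does not show which setting is responsible.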

premkishore1983
Posted: Tue Jul 05, 2016 9:53 am

Hi All,

Good Morning,
Any suggestions to resolve this issue would be really helpful.

chulett
Posted: Tue Jul 05, 2016 10:49 am

Well... still obviously some kind of codeset issue, though I couldn't tell you what you got there on the LINUX side. For both input and AIX, this:

Code:
"f1f0 f6f1 e3c1 40d4 4040 f0f0 f0f5 f0f0" = "1061TA M  000500"


This, the alleged equivalent on the LINUX side:

Code:
"bfef efbd bdbf bfef efbd bdbf bfef efbd"

I have no idea what that is. Sorry. Hopefully someone else can be more helpful.

I'm wondering if it is a Big Endian (AIX) vs. Little Endian (LINUX I believe) issue?
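
A short sketch of how byte order affects what od -h displays, using four arbitrary sample bytes: the bytes on disk are the same in both cases, and only the grouping into 16-bit words changes. The helper name as_words is made up for this illustration.

```python
def as_words(data: bytes, byteorder: str) -> str:
    """Format bytes as 16-bit hex words, the way `od -h` would on that byte order."""
    return " ".join(
        f"{int.from_bytes(data[i:i + 2], byteorder):04x}"
        for i in range(0, len(data), 2)
    )


data = bytes([0xF0, 0xF1, 0xF1, 0xF6])

print(as_words(data, "little"))   # f1f0 f6f1
print(as_words(data, "big"))      # f0f1 f1f6
```

Two things follow. A byte-order swap only rearranges bytes, so on its own it would not change the file size. And dumping with od -An -tx1 shows single bytes in file order, which takes the host's byte order out of the picture when comparing files across AIX and Linux.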

ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54259
Location: Sydney, Australia
Points: 294280

Posted: Tue Jul 05, 2016 5:00 pm

I think it might be a byte order issue, as Craig suspects.

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne

premkishore1983
Posted: Wed Jul 06, 2016 8:55 am

Thanks for the reply Craig & Ray.

Can you please let me know where I could check for the codeset / byte order issue? I mean, does this have to be checked at the DataStage level or at the OS level?

ray.wurlod
Posted: Wed Jul 06, 2016 5:01 pm

If the source is a file, there may be a Byte Order Mark at the beginning of the file. If not, there is no easy way to check the byte order, but you could try converting the file using dos2unix or ...
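
A minimal sketch of checking for a Byte Order Mark, using the BOM constants from Python's standard codecs module. The function name detect_bom is made up for this illustration. In practice you would read the first four bytes of the file in binary mode and pass them in.

```python
import codecs

# Longer marks first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-8", codecs.BOM_UTF8),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
]


def detect_bom(head: bytes):
    """Return the name of the BOM that `head` starts with, or None."""
    for name, mark in BOMS:
        if head.startswith(mark):
            return name
    return None


print(detect_bom(b"\xef\xbb\xbfhello"))          # UTF-8
print(detect_bom(bytes.fromhex("f0f1f1f6")))     # None
print(detect_bom(bytes.fromhex("efbfbd")))       # None
```

A BOM only appears in Unicode-encoded text. An EBCDIC file coming from a mainframe would not be expected to carry one, and EF BF BD is not a BOM (the UTF-8 BOM is EF BB BF).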

chulett
Posted: Wed Jul 06, 2016 10:59 pm

Ah... that's what BOM stands for. Duh. ;)

There are many resources available online regarding the conversion from one 'endian' to another, for example this one. I'm wondering how the file gets 'downstream' from the Linux system; perhaps this transfer mechanism (something like Connect:Direct, I imagine) could handle the conversion. Do you have any friendly System Administrator types who might be able to help with this?

premkishore1983
Posted: Thu Jul 07, 2016 11:57 am

Thanks, Craig, for sharing the information on how little endian and big endian work.

We get the input file from the mainframe via Connect:Direct to our AIX server. Since the connection between the mainframe and Linux has not been established yet, we scp'ed the input file from AIX to Linux to test our jobs, and that is how we identified this issue.

premkishore1983
Posted: Wed Jul 20, 2016 4:19 pm

I'm trying to convert the byte order to big endian by using FORMAT.CONV, as shown below:

Code:

FILE_NAME = SOURCE_DIR:"/":SOURCE_NAME:".new"
FORMAT.CONV -u FILE_NAME

GOSUB INIT
*
OPENSEQ SOURCE_DIR:"/":SOURCE_NAME:".src" TO F.SRC ELSE
  Call DSLogInfo('Unable to open ':SOURCE_DIR:'/':SOURCE_NAME:'.src',"Output")
  ErrorCode = 1
  GOTO 10000
END
*

OPENSEQ SOURCE_DIR:"/":SOURCE_NAME:".new" TO F.NEW ELSE
  Call DSLogInfo('Unable to open ':SOURCE_DIR:'/':SOURCE_NAME:'.new',"Output")
  ErrorCode = 1
  GOTO 10000
END
*
CNT = 1
EOF = 0
LOOP UNTIL EOF DO
  READBLK RECORD FROM F.SRC,LENGTH THEN
    RECORD[76,6] = STR(CHAR(32),6) ;* overwrite 6 characters from position 76 with CHAR(32) spaces
    WRITEBLK RECORD ON F.NEW ELSE
      Call DSLogInfo('Unable to write to ':SOURCE_DIR:'/':SOURCE_NAME:'.new',"Output")
      ErrorCode = 1
      GOTO 10000
    END
    CNT = CNT + 1
    IF CNT/100000 = INT(CNT/100000) THEN
      Call DSLogInfo(CNT:' records processed...',"Output")
    END
  END ELSE
    EOF = 1
  END
  *
REPEAT
*
Call DSLogInfo(CNT:' records processed...',"Output")

FORMAT.CONV -u FILE_NAME

CLOSESEQ F.SRC
CLOSESEQ F.NEW
*
GOTO 10000
*
INIT: * - - Initialize Unix Flat File - - *
*
  Command = "rm ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
  Command = "touch ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
  Command = "chmod 660 ":FILE_NAME
  Call DSLogInfo("Command = ":Command,"Command")
  CMD = 'sh -c "':Command:'"'
  Call DSExecute("UV", CMD, Output, SystemReturnCode)
  Call DSLogInfo("Output ":Output, "Output")
  Call DSLogInfo("System Returncode ":SystemReturnCode, "SysCode")
  *
RETURN
10000


But I'm getting the following error:


Warning
Quote:

0018 FORMAT.CONV -u FILE_NAME
^
Variable Name (UNDEFINED) unexpected, Was expecting: Assignment Operator
0057 FORMAT.CONV -u FILE_NAME
^
Variable Name (UNDEFINED) unexpected, Was expecting: Assignment Operator

2 Errors detected, No Object Code Produced


Am I missing something here? Any suggestions would be helpful.
All times are GMT - 6 Hours