FTP Enterprise stage to read Mainframe variable blk dataset

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Chandrathdsx
Participant
Posts: 59
Joined: Sat Jul 05, 2008 11:32 am

FTP Enterprise stage to read Mainframe variable blk dataset

Post by Chandrathdsx »

Hi,
I am trying to read data from a mainframe variable-block (VB) dataset with a complex structure (multiple record types) using the FTP Enterprise stage.

Mainframe dataset properties:
RECFM = VB
LRECL = 35
DCB -- none

Code:

 
Layout:
FLD           LEN       FORMAT
REC-LEN       04        BINARY
SEG-TYPE      04        CHAR
ROOT-SEG-KEY  15        CHAR
LINE-NO       03        BINARY
SEGMENT-DATA  VARIABLE  BASED ON THE 'SEG-TYPE'

Sample data [no delimiter; I have just separated the fields into a readable text format. A couple of the fields are binary; I see their values in hex on the mainframe]:
30	SEG1	KEY1	1	XXXXX
35	SEG2	KEY1	2	XXXXXXXXXX
32	SEG3	KEY1	3	XXXXXXX
30	SEG1	KEY2	1	YYYYY
32	SEG3	KEY2	2	YYYYYYY  
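To make sure I am reading the layout right, here is a minimal sketch (Python, purely illustrative, not part of the DataStage job) of how I understand the records: the 4-byte binary REC-LEN drives where each record ends, and the character fields are EBCDIC. The offsets and the code page are my assumptions from the layout above.

Code:

import struct

# Illustrative only. Assumptions: REC-LEN is a 4-byte big-endian binary length
# covering the whole record, the CHAR fields are EBCDIC (code page cp037),
# and LINE-NO is 3-byte big-endian binary.
def split_records(raw: bytes):
    pos = 0
    while pos < len(raw):
        rec_len = struct.unpack(">I", raw[pos:pos + 4])[0]  # REC-LEN
        if rec_len < 26:                                    # guard against malformed data
            break
        rec = raw[pos:pos + rec_len]
        seg_type = rec[4:8].decode("cp037")                 # SEG-TYPE
        root_key = rec[8:23].decode("cp037")                # ROOT-SEG-KEY
        line_no = int.from_bytes(rec[23:26], "big")         # LINE-NO
        seg_data = rec[26:rec_len]                          # SEGMENT-DATA (variable)
        yield seg_type, root_key, line_no, seg_data
        pos += rec_len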
Stage output properties:
source open command: site filetype=seq recfm=VB QUOTE SITE RDW
transfer type: ASCII
Format:
record level format as mainframe COBOL
record type: Implicit
delimiter: none
character set: tried EBCDIC (got all junk characters), also tried ASCII
byte order: native-endian
allow all zeros: yes
Columns tab:
varchar 35 [max of all record types]



Job design:
FTP Enterprise --> Transformer --> Sequential File
Issues:
1. Part of the data is missing for some records. This may be due to the VB format; I do not see a DCB parameter in the JCL that creates the dataset I am trying to read in DataStage.
2. The numeric fields are getting corrupted. I tried numeric out_format = sprintf_string, with no luck.

I saw a couple of FAQs related to mainframe datasets, CFF and FTP Enterprise, which are very useful. But despite all my troubleshooting I am unable to fix the issue with the missing parts of the data; I suspect the DCB and blocking are involved, though I am not sure how to fix it.
I would appreciate any help with this.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

The use of variable length fields is the first problem. In the COBOL formatting, the binary record length "prefix" is explicitly defined as its own field, but I don't see a similar "prefix" for SEGMENT-DATA. That it's variable based on SEG-TYPE is not enough. You need the field length prefix.

You have a basic design flaw at the source of your data. I don't have enough experience with variable length data in COBOL to offer any suggestions. The only thing I can think of is to have a mainframe program that converts your dataset into something more readable. For example, if your maximum record length is 35 bytes, the program would convert the record to fixed-length, and pad SEGMENT-DATA with spaces.
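Conceptually the reformat is just padding every record out to the maximum length. Something like this sketch (Python, only to show the idea; the real step would be a COBOL program or SORT step on the mainframe, and the names here are made up):

Code:

MAX_LRECL = 35          # maximum record length from your layout
EBCDIC_SPACE = b"\x40"  # space character in EBCDIC

def to_fixed_length(records):
    # Pad each variable-length record with EBCDIC spaces so every record
    # comes out exactly MAX_LRECL bytes long.
    for rec in records:
        yield rec.ljust(MAX_LRECL, EBCDIC_SPACE)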

In the meantime, I do have a CFF suggestion: do a binary FTP of the dataset to a sequential file on the DS server. Keep it in EBCDIC, then use CFF to try to read it successfully. I did just that for over 300 jobs, because it's best for maintenance and processing error control, and streaming the data directly from the FTP stage prevents you from using the advantages of CFF.
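If it helps to see the download step outside DataStage, here is a rough sketch with Python's ftplib; the host, credentials, dataset name and landing path are placeholders. The point is simply a binary transfer so the EBCDIC bytes arrive untouched; for a VB dataset you may also want SITE RDW (as in your open command) so the record descriptor words are kept.

Code:

from ftplib import FTP

# Rough sketch only -- host, credentials, dataset name and local path are placeholders.
# retrbinary() performs a TYPE I (binary) transfer, so the EBCDIC bytes land unchanged.
ftp = FTP("mainframe.example.com")
ftp.login("user", "password")
ftp.sendcmd("SITE RDW")   # keep record descriptor words for a VB dataset
with open("/data/landing/myfile.ebcdic", "wb") as out:
    ftp.retrbinary("RETR 'PROD.MY.DATASET'", out.write)
ftp.quit()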

About reading the downloaded EBCDIC file: it's inconvenient if you don't have an editor that converts for you, but you can use View Data in the Output tab of CFF.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
Chandrathdsx
Participant
Posts: 59
Joined: Sat Jul 05, 2008 11:32 am

Post by Chandrathdsx »

Thank you for the suggestion!

The first field (4 bytes, binary) holds the total length of the record.

Regarding the binary FTP in EBCDIC: does it also work for numeric and COMP fields when parsing through the CFF stage?

Thank you!
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

CFF is very good with all of the COBOL formats, with some exceptions outlined in the FAQs. Binary (COMP) fields can have endian issues, for example.

The prefix field showing the record length is the first obstacle, so far as I can tell. Removing it by reformatting to a fixed-length record is the simplest solution for that.
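To illustrate the endian point with a made-up 4-byte COMP value (Python sketch):

Code:

import struct

# Hypothetical 4-byte COMP field holding the value 35, as stored on the mainframe
raw = b"\x00\x00\x00\x23"
struct.unpack(">i", raw)[0]   # 35 -- read as big-endian, which is what the mainframe wrote
struct.unpack("<i", raw)[0]   # 587202560 -- what you get if it is read as little-endian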
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

We use a generic FTP Enterprise stage and job for all mainframe datasets. It has a single column definition of binary with unspecified length. Since the datasets have no delimiters, this creates a file on the server that is identical to the dataset source, a single stream of data full file to full file.

The CFF takes care of reading the downloaded file in EBCDIC and according to the copybook required. Even CFF has problems with variable lengths, which is why I impose a technical requirement on my mainframe developers: no variable length fields or records, no OCCURS DEPENDING, and no low-values (x00, ASCII null) in character fields.

They don't have to modify the dataset they create. All they have to do is insert a step to reformat it to my requirements.
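Once the reformatted dataset is downloaded, a quick check like this confirms it meets those requirements. This is a Python sketch only; the file name is a placeholder, and the character-field offsets are assumed from the layout earlier in this thread.

Code:

LRECL = 35  # fixed record length agreed with the mainframe side (placeholder value)

with open("/data/landing/myfile.ebcdic", "rb") as f:
    data = f.read()

# The file should be an exact multiple of the fixed record length.
assert len(data) % LRECL == 0, "not a whole number of fixed-length records"

# Low-value check: only meaningful on the character fields, since binary (COMP)
# fields can legitimately contain x'00' bytes. Here the character portion is
# assumed to be bytes 4 through 22 of each record (SEG-TYPE + ROOT-SEG-KEY).
for i in range(0, len(data), LRECL):
    assert b"\x00" not in data[i + 4:i + 23], f"low-values in record starting at byte {i}"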
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
Post Reply