DSXchange: DataStage and IBM Websphere Data Integration Forum
rumu
Participant

Posted: Fri Nov 23, 2018 8:32 am

Hi Frank,
You are correct; the copybook given to me was wrong, with all the decimal fields defined as COMP-3. Yesterday I received a corrected copybook in which the columns mentioned above are COMP, so when I imported the metadata they show as smallint, length 4.

Since I received a complete copybook for one type of record, the remaining challenge is to extract the first type of record from the binary file and pass those records to the CFF stage. I am now stuck on filtering one type of data out of the binary file.
Could you please help with this?

_________________
Rumu
IT Consultant
rumu
Participant

Posted: Fri Nov 23, 2018 1:28 pm

Hi,

While loading metadata in the CFF stage, should we select all the group columns? There is also an option for "Create Filler"; I assume this should not be checked.

In the next window for loading metadata, there are three options:
1)Flatten Selective arrays
2)All arrays
3)AS IS

Which option should be selected?

_________________
Rumu
IT Consultant
chulett
Premium Poster

Posted: Sat Nov 24, 2018 5:56 am

rumu wrote:
I used the command in the command line, it did not work as it returns 0 rows.

Figured. You've got an EBCDIC file so a simple grep for ASCII characters wasn't going to work.

Always test your filter commands outside of DataStage first.
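
For instance, a minimal Python sketch for such a test, outside DataStage entirely (the file name, the record length, the 10-byte type field and the cp037 code page are all assumptions to confirm against the copybook and with the mainframe team):

Code:
# Rough sketch: count record types in a fixed-length EBCDIC file outside DataStage.
from collections import Counter

RECORD_LENGTH = 16336   # placeholder -- take the real record length from the copybook
TYPE_LENGTH = 10        # placeholder -- length of the leading record-type field

counts = Counter()
with open("input.bin", "rb") as f:          # hypothetical file name
    while True:
        rec = f.read(RECORD_LENGTH)
        if not rec:
            break
        # cp037 is one common EBCDIC code page; confirm the right one for this file
        counts[rec[:TYPE_LENGTH].decode("cp037", errors="replace")] += 1

for rec_type, count in counts.most_common():
    print(rec_type, count)

That at least tells you which record-type values are actually in the file before any filter is attempted.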

_________________
-craig

Space Available
rumu
Participant

Posted: Mon Nov 26, 2018 6:13 am

Hi Craig,
That's correct... my file is in EBCDIC, and I asked for it to be converted to ASCII during FTP, but the mainframe team did not agree.
Without converting to ASCII, is there a way to filter records before passing them to CFF?

_________________
Rumu
IT Consultant
FranklinE

Posted: Mon Nov 26, 2018 7:06 am

Rumu,

There's too much going on here for you to keep up with the details. I strongly recommend sitting with your mainframe developer and going through an exercise in simplification.

On the copybook, as they created it, get help with flattening the groups manually. Edit the copybook before importing it to DataStage. This gives you two advantages: you don't have to rely on DS to make decisions about groups, and you have a clear set of fields to test your input data.

Ideally, when you finish, you'll have a series of record types and layouts, with just the REDEFINES at the highest level to separate them.

You do not need to convert to ASCII if your mainframe team can be engaged to do any data manipulation before you use FTP and CFF. It is just better to do it there, for many reasons, starting with the fact that the mainframe processes created the file and so are best suited to manipulating it.

One choice I might test: rather than one complex multi-record-type file, break it down into one file for each record type. This is not difficult to do in COBOL (or in most macro-based utilities they might have). DataStage is nice to leverage when the data fits what it is prepared to handle, and not nice when it doesn't.

In short, if you can't make it work easily, you need to put in the work to simplify.

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872
rumu
Participant

Posted: Mon Nov 26, 2018 10:04 am

Hi Frank,

I suggested that the mainframe team split the file by record type, but they said they can't do it. So I was trying to leverage the 'Constraint' tab under the Output tab. I checked 'Single record type' and used the HeaderRecordID field to select the one record type for which the metadata is defined.
But it seems this did not work, as I am getting warnings such as 'short input record' or 'record overrun', leading to import errors.
If we select 'Single record type', does the Constraint tab not work?


While loading the columns, I left the group columns unchecked and checked the Create Filler option. I selected all the columns, then used the 'Flatten selective arrays' option.


Are these settings ok?


My main issue is that I am receiving no help from the mainframe team, as they are a third party. They are just sending the file; no one from the mainframe team is available to discuss it.

I have some queries on your following suggestion:

Code:
On the copybook, as they created it, get help with flattening the groups manually. Edit the copybook before importing it to DataStage. This gives you two advantages: you don't have to rely on DS to make decisions about groups, and you have a clear set of fields to test your input data.

Ideally, when you finish, you'll have a series of record types and layouts, with just the REDEFINES at the highest level to separate them.



Different record types are defined at level 01, so how can I get a series of record types and layouts using the REDEFINES clause? I may be missing your point, so if you can explain a bit more, that would be good for me.
I know this has been going on for a long time, but this really is a cumbersome task that I have been assigned.
No one else on my project can help me out, so I keep coming back to you all. Thanks for bearing with me for such a long time.

_________________
Rumu
IT Consultant
FranklinE

Posted: Mon Nov 26, 2018 10:18 am

Rumu,

The lack of cooperation with you is a serious problem. I'm sorry that you have to deal with it.

You can try to do the split yourself. It won't be easy. If you have a consistent record length for each record type, the "split" job would just identify the record type and write it to its own file. You could then use CFF to read each file based on the copybook section for the record type.

If the records don't have a consistent length, I'm not sure what you would do, but there may be methods others here can suggest.

The basic approach would be a two-column read: the first column would be the record type field, and the second column would be the rest of the record. I would use Transformer constraints to split the records to their files. Each of the split files could then be read by its own CFF stage with the matching copybook section.

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872
rumu
Participant

Posted: Mon Nov 26, 2018 10:54 am

Hi Frank,

In one of my previous posts, I mentioned that I was trying to read the file using a Sequential File stage with one column defined as VarBinary and then split the record into columns using a Transformer stage. The issue I faced there was that, when converting the first 6 bytes (which hold the record ID for each record type) using the RawToString function, I could view the record ID in the DataStage data viewer and in Unix, but the Transformer stage could not read it as a string and could not do the split...
You suggested reading the entire record as Char, but that leaves the job in a hanging state, generating warnings. So I went back to the mainframe team to ask them to split the files on their side, but they said no.
As the file is in EBCDIC format, exactly what data type should be used to split the file in DataStage?

_________________
Rumu
IT Consultant
rumu
Participant

Posted: Tue Nov 27, 2018 9:10 am

Hi All,

Can you please let me know how I can read each COBOL binary record as one record and split it based on the record identifier? I need help mainly with data type selection.

I used VarBinary to read the record as a single column, and then in the Transformer stage applied the RawToString function to the column and took the first 5 characters to compare with the record-type string. But the equality is not working...
My input data is 16326 bytes long and the record ID is 10 bytes long, so I read it in one column as VarBinary of length 16336. Do I need to make any specific changes in the Format tab, like record delimiter, string type, etc.?
If I want to cut positions 11 to 16326 as raw data to feed to the subsequent CFF stage, which data type should be used?

_________________
Rumu
IT Consultant
FranklinE

Posted: Tue Nov 27, 2018 10:44 am

Assumptions:

Input record length is 16336 bytes. The first 10 bytes are the record type identifier.
Every record is consistently of that length.
You must preserve the EBCDIC character set.

Sample table definition:
Code:
REC_TYPE Char(10).
DATA_COLUMN Binary(16326).


You can use a Sequential File stage for this. Set the format of the input file the same as it would be set in CFF. Your Transformer output links would go to Sequential File stages, and each link constraint would select one record type.

It shouldn't matter what SQL type DATA_COLUMN is, because you will not attempt to use any functions on it in the split job. Each output file would then be read using the copybook section for its record type, and CFF would be best for that.
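
If the split turns out to be easier to do before DataStage ever sees the file, the same idea is a short script. This is a rough sketch only, assuming every record really is 16336 bytes (10-byte type plus 16326 bytes of data), that cp037 is the right EBCDIC code page, and with made-up file names; the record bytes are written out untouched so each output file can still be read by CFF against its copybook section:

Code:
# Sketch of the "split" job done outside DataStage (all names and lengths are assumptions).
RECORD_LENGTH = 16336   # 10-byte record type + 16326 bytes of data
TYPE_LENGTH = 10

outputs = {}            # record type -> open output file
with open("input.bin", "rb") as src:                          # hypothetical input file name
    while True:
        rec = src.read(RECORD_LENGTH)
        if not rec:
            break
        if len(rec) != RECORD_LENGTH:
            raise ValueError("short record -- the file is not fixed-length after all")
        rec_type = rec[:TYPE_LENGTH].decode("cp037").strip()  # EBCDIC record-type field
        if rec_type not in outputs:
            outputs[rec_type] = open("recs_%s.bin" % rec_type, "wb")
        outputs[rec_type].write(rec)                          # full record, bytes untouched

for out in outputs.values():
    out.close()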

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872
rumu
Participant

Posted: Tue Nov 27, 2018 11:02 am

Frank... I followed the design. I set Character set to EBCDIC, Byte order to Big Endian, Data format to Binary, Rounding to Nearest value, and Record delimiter to Unix newline. These are the formats set in the CFF stage. In CFF there was one more field called Separator, which was set to the project default.

I kept REC_TYPE as Char(10) and DATA_COLUMN as 16326. In the following Transformer I used the constraint REC_TYPE[1,5]='01MST' to filter out the first type of data.
The output Sequential File has one column, DATA_COLUMN, as Binary(16326).
When I ran the job, it aborted with warnings that REC_TYPE lacks a whitespace delimiter at offset 10. I got the same warning for 50 records, and then the job aborted because the warning limit is set to 50 at the project level.

_________________
Rumu
IT Consultant
FranklinE

Posted: Tue Nov 27, 2018 12:15 pm

If you have a delimiter setting, you should delete it. That's the only reason I can think of for the warning.

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872
eostic
Premium Poster

Posted: Tue Nov 27, 2018 5:55 pm

Reviewing this thread. A truly frustrating situation when the mainframe folks aren't communicating. It has probably been said in earlier parts of this thread, but be ABSOLUTELY CERTAIN you are wor ...

_________________
Ernie Ostic

blogit!
Open IGC is Here!
rumu
Participant

Posted: Wed Nov 28, 2018 7:18 am

Hi Frank,

I tried the job removing the property

Record Delimiter=UnixNewline.

I received the same warnings.

I guess that, since the record is contiguous (without field delimiters), using two columns to read the input record gives an error because the stage expects a delimiter after the first REC_TYPE column, which is not there in the record. REC_TYPE is 10 bytes and the actual data starts at the 11th byte.

I then tried to read all 16336 bytes (10+16326) into one Binary column, split the column using the RawToString function to read the first 6 bytes for the record type, and map the entire 16336 bytes to the data column so it can be read using the copybook later. But this job now throws warnings such as:

Code:
Sequential_File_11: When checking operator: When validating import schema: Unrecognized top level format property: round=round_inf

Sequential_File_11: When checking operator: When validating import schema: Unrecognized top level format property: packed

Sequential_File_11: When checking operator: When validating import schema: Unrecognized top level format property: julian


It then reads about 90% of the file, starts issuing fatal 'input buffer overrun' messages, and aborts.

I tried using the following settings under the Format tab:
Decimal
Packed=yes
Rounding=nearest value
Date=IsJulian

Still, the warnings persist.

Also, I came to know that the input file has 11 types of records, each with a different length, i.e. type '01MST' is 16326 bytes long, '02CHK' is 56778 bytes long, etc.
In that case, if I use the highest record length for DATA_COLUMN, will that work, or will it leave garbage at the trailing end of the shorter records?

_________________
Rumu
IT Consultant
FranklinE

Posted: Wed Nov 28, 2018 9:20 am

At this point, I have an unhelpful observation to make: you are not dealing with standard COBOL formatting. You are dealing with undisciplined formatting out of COBOL coding.

They have set for you an impossible obstacle: reading a file with inconsistent record lengths. One more thing you might check, and I ask you to forgive me if I describe things you already know.

You have a series of variable-length records. In COBOL, the prefix bytes of each record are defined explicitly, and the prefix contains the actual length of the following record. In the object-oriented world, the prefix is always implied, never defined explicitly.

So, if you can see the prefix for each record -- and you will likely need the help of the mainframe developers for that -- it's possible to set up a VarChar column, scale undefined or set to the maximum record length in the file, which would accurately read each record.

I have little experience with variable-length records out of COBOL processes, because the first standard for a COBOL file is that every record in the file, without exception, has the same record length as every other record. The mainframe developer should be padding the records in your file to make them all the same length; they should all be the length of the longest possible record. Since they are not cooperating, your only options are to do the padding yourself -- you've tried and failed so far -- or to create a process, before you try reading the actual data, that splits the records into separate files where every record has a consistent length. Ernie has a good suggestion to look at Server stages that might work. I expect you may need to build a process outside of DataStage.
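
For what it's worth, if the transfer does preserve that length prefix (on the mainframe side it is typically a 4-byte RDW whose first two bytes are a big-endian length that includes the prefix itself -- something only the mainframe team can confirm), a process outside DataStage could walk the records with it. A rough sketch only, with assumed file names and the 10-byte record-type field from earlier in the thread:

Code:
# Sketch: split variable-length records using an assumed 4-byte RDW length prefix.
import struct

TYPE_LENGTH = 10        # leading record-type field, per the copybook (assumption)

outputs = {}            # record type -> open output file
with open("input.bin", "rb") as src:                          # hypothetical input file name
    while True:
        rdw = src.read(4)
        if len(rdw) < 4:
            break
        (rec_len,) = struct.unpack(">H", rdw[:2])             # length includes the 4-byte RDW
        rec = src.read(rec_len - 4)
        rec_type = rec[:TYPE_LENGTH].decode("cp037").strip()  # confirm the EBCDIC code page
        if rec_type not in outputs:
            outputs[rec_type] = open("recs_%s.bin" % rec_type, "wb")
        outputs[rec_type].write(rec)                          # record body only, still EBCDIC

for out in outputs.values():
    out.close()

Each output file would then have one consistent layout and could be read by its own CFF stage, as above.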

I recommend not using a binary format to read the data. The RawTo functions are very limited in scope. I wish I could be more helpful, but the lack of cooperation is the main obstacle, and neither of us has any control over that.

_________________
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: http://www.dsxchange.com/viewtopic.php?t=143596 Using CFF FAQ: http://www.dsxchange.com/viewtopic.php?t=157872