how to validate CFF file structure

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

how to validate CFF file structure

Post by sec105105 »

Hi,
Is there an elegant way to test the structure of a CFF file? We read in dozens of mainframe files, and the source-side personnel sometimes neglect to tell us that they're changing the structure. Usually they make the change in the UA environment a few weeks ahead, so instead of our production jobs bombing when the change gets there, we could run a single UA job to test all the files and see whether the structure has changed.

So far the only thing I can think of is to copy all existing CFF stages into one job and test a specific column in each file. But that means dozens of CFF stage/Transformer/Interprocess stage combinations writing to one seqfile.

Would like a server and parallel job solution, if possible...
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

First, I would respectfully inform the mainframe development group that impact analysis and due notice of changes should be a requirement. I have over 160 Cobol-format files in my batch application, and every single one of them is coded to the copybook for the source data. Our host version control is exact and disciplined.

My application has a formal data mapping requirements document. I have a suite of jobs that essentially copy the CFF stage and transformer from the main processing job, where the transformer's only task is to write the copybook fields to a sequential file for import into a spreadsheet, as an analysis tool and an aid to the data mapping.

That could be a good starting point for you, though if a change is to a PIC X() field it might not be so easy to spot. A change to a numeric field should show up right away, as either a warning or an error on the transformer derivation.

I'm not being cynical when I say that being told ahead of time of any change is the only effective solution. As it stands, you are being required to find their changes after the fact by doing a significant amount of coding, even if you do find an elegant solution.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872
sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

Post by sec105105 »

Hi
Thanks for the comments.

I agree that better communication and a solid process would be best. Sadly, those requests have not proven effective, so I'd like to take control of the situation with an automated approach.

So I've begun importing all the CFF stages that I can see. It's laborious and brain-dead work, so if anyone else has any ideas, feel free to save me a few hours!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I had to build something like this to protect myself from the same thing - unannounced changes to our source system. Rather than go table-by-table, I took a more metadata driven approach utilizing the various views in the databases. Something like this, high level:

1. Create a driving xref table of our staging table owner/name and the associated source table owner/name.
2. Stream in the source DDL for each table, ordered.
3. Stream in the target DDL for each matching table, ordered.
4. Full Outer Join the two datasets.
5. Compare the column names, data types, precision, scale, misses, etc. (a comparison sketch follows the source SQL below).
6. Create a "report" of the results.

A tweaked example of one of the source SQLs, Oracle flavored:

Code: Select all

select atc.OWNER,
       atc.TABLE_NAME, 
       atc.COLUMN_ID, 
       atc.COLUMN_NAME, 
       atc.DATA_TYPE ||  
        case 
         when atc.DATA_TYPE = 'NUMBER' and atc.DATA_PRECISION IS NOT NULL then '(' || atc.DATA_PRECISION ||
           case 
            when NVL(atc.DATA_SCALE,0) > 0 then ',' || atc.DATA_SCALE else '' 
           end || ')'
         when atc.DATA_TYPE = 'VARCHAR2' then '(' || atc.DATA_LENGTH || ')'
         else ''
        end as DATA_TYPE, 
       atc.NULLABLE 
from us.ALL_TAB_COLUMNS atc,
     us.CHK_STG_TABLES_XREF stx
where atc.OWNER = stx.STG_TABLE_OWNER
  and atc.TABLE_NAME = stx.STG_TABLE_NAME
  and stx.OFF_TABLE_NAME is not null
  and atc.COLUMN_NAME not like 'META%' --> columns we add on our side, ignored in the comparison
  and atc.COLUMN_ID is not null
  and stx.CHECK_METADATA_IND = 'Y'
order by atc.TABLE_NAME, atc.COLUMN_ID
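
The comparison in steps 4 and 5 could look roughly like this -- purely a sketch, assuming the output of the source and target queries has been landed in two hypothetical work tables (CHK_SRC_COLUMNS and CHK_STG_COLUMNS) with the same column layout as the query above:

Code: Select all

select coalesce(s.TABLE_NAME, t.TABLE_NAME)   as TABLE_NAME,
       coalesce(s.COLUMN_NAME, t.COLUMN_NAME) as COLUMN_NAME,
       case
        when s.COLUMN_NAME is null then 'MISSING FROM SOURCE'
        when t.COLUMN_NAME is null then 'MISSING FROM STAGING'
        when s.DATA_TYPE <> t.DATA_TYPE
          then 'TYPE CHANGED: ' || s.DATA_TYPE || ' -> ' || t.DATA_TYPE
        when s.NULLABLE <> t.NULLABLE then 'NULLABILITY CHANGED'
        else 'OK'
       end as CHECK_RESULT
from CHK_SRC_COLUMNS s
full outer join CHK_STG_COLUMNS t
  on  s.TABLE_NAME  = t.TABLE_NAME
  and s.COLUMN_NAME = t.COLUMN_NAME
order by 1, 2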
Hopefully somewhat helpful. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And, stupid me, I just caught the fact that you were talking about files rather than tables. :roll: Got all excited and thought I could help yesterday, but it was not to be. I'll leave it here on the chance that perhaps it helps someone else... some other day. :(
-craig

"You can never have too many knives" -- Logan Nine Fingers
sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

Post by sec105105 »

Yes, if it were tables I would've tried RCP, but I don't believe that's suitable for CFF stages.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

The point of "breakage", the point of failure, is in the (lack of) ability to compare an existing table definition to the format and layout of an existing file.

Craig's table solution has some value for files. The problem is in where you get your metadata.

A less tedious but still time-intensive suggestion -- and I doubt you'll find a way to automate it -- is to find the source code from which the table definition is created. For me, this is the copybook, and in a well-controlled source repository the copybook is a distinct entity that can be copied to a text file. That is how I import copybooks to my project via the CFD import wizard.

If you can get the equivalent of a copybook -- which may well mean finding the Cobol program in which the record is defined and manually creating a copybook from it -- your comparison would be to the code instead of the actual data files. That part could easily be automated along the lines of Craig's process.
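
To illustrate, a copybook is just a small piece of fixed-format Cobol source, so what you would be comparing from one release to the next is a plain text file along these lines (the record and field names here are invented):

Code: Select all

      * Hypothetical copybook fragment as it might be copied out of
      * the host source repository for the CFD import wizard.
       01  CUSTOMER-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(30).
           05  CUST-BALANCE    PIC S9(7)V99 COMP-3.
           05  FILLER          PIC X(05).
A simple text compare of the current copybook against the one your CFF stage was imported from would flag a layout change before the data ever arrives.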

The best of luck to you.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872
satyanarayana
Participant
Posts: 13
Joined: Fri Jul 15, 2005 12:01 am

Post by satyanarayana »

Try reading the EBCDIC data file using a Sequential File stage and passing the metadata with a schema file (you can parameterize the schema file name).
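
For example, a schema file for a fixed-width EBCDIC file might look something like this (the field names, lengths and packed field are illustrative only; the record properties must match the real layout):

Code: Select all

// Hypothetical schema for a fixed-width EBCDIC file; adjust the
// record properties and the fields to match the actual layout.
record
  { record_length = fixed, delim = none, ebcdic }
(
  ACCT_NO:      string[10];
  ACCT_NAME:    string[30];
  ACCT_BALANCE: decimal[9,2] {packed};
)
Records that no longer fit the schema should then land on a reject link (or fail the job), giving you a single parameterized job that can sweep all the UA files for structure changes.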