how to validate CFF file structure

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

how to validate CFF file structure

Post by sec105105 »

Hi,
Is there an elegant way to test the structure of a CFF file? We read in dozens of mainframe files, and the source-side personnel sometimes neglect to tell us that they're changing the structure. Usually they make the change in the UA environment a few weeks ahead, so instead of our production jobs bombing when the change gets there, we could run a single UA job to test all the files and see whether the structure has changed.

So far the only thing I can think of is to copy all existing CFF stages into one job and test a specific column in each file. But that means dozens of CFF stage/Transformer/Interprocess stage combinations writing to one seqfile.

Would like a server and parallel job solution, if possible...
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

First, I would respectfully inform the mainframe development group that impact analysis and due notice of changes should be a requirement. I have over 160 Cobol-format files in my batch application, and every single one of them is coded to the copybook for the source data. Our host version control is exact and disciplined.

My application has a formal data mapping requirements document. I have a suite of jobs that essentially copy the CFF stage and transformer from the main processing job, where the transformer's only task is to write the copybook fields to a sequential file for import into a spreadsheet, as an analysis tool and an aid to the data mapping.

That could be a good starting point for you, though if a change is to a PIC X() field it might not be so easy to spot. A change to a numeric field should show up right away, as either a warning or an error on the transformer derivation.

I'm not being cynical when I say that being told ahead of time of any change is the only effective solution. As it stands, you are being required to find their changes after the fact by doing a significant amount of coding, even if you do find an elegant solution.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872
sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

Post by sec105105 »

Hi
Thanks for the comments.

I agree that better communication and a solid process would be best. Sadly, those requests have not proven effective, so I'd like to take control of the situation with an automated approach.

So I've begun importing all the CFF stages that I can see. It's laborious and brain-dead work, so if anyone else has any ideas, feel free to save me a few hours!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I had to build something like this to protect myself from the same thing - unannounced changes to our source system. Rather than go table-by-table, I took a more metadata driven approach utilizing the various views in the databases. Something like this, high level:

1. Create a driving xref table of our staging table owner/name and the associated source table owner/name.
2. Stream in the source DDL for each table, ordered.
3. Stream in the target DDL for each matching table, ordered.
4. Full Outer Join the two datasets.
5. Compare the column names, data types, precision, scale, misses, etc. (a comparison sketch follows the source SQL below).
6. Create a "report" of the results.

A tweaked example of one of the source SQLs, Oracle flavored:

Code: Select all

select atc.OWNER,
       atc.TABLE_NAME, 
       atc.COLUMN_ID, 
       atc.COLUMN_NAME, 
       atc.DATA_TYPE ||  
        case 
         when atc.DATA_TYPE = 'NUMBER' and atc.DATA_PRECISION IS NOT NULL then '(' || atc.DATA_PRECISION ||
           case 
            when NVL(atc.DATA_SCALE,0) > 0 then ',' || atc.DATA_SCALE else '' 
           end || ')'
         when atc.DATA_TYPE = 'VARCHAR2' then '(' || atc.DATA_LENGTH || ')'
         else ''
        end as DATA_TYPE, 
       atc.NULLABLE 
from us.ALL_TAB_COLUMNS atc,
     us.CHK_STG_TABLES_XREF stx
where atc.OWNER = stx.STG_TABLE_OWNER
  and atc.TABLE_NAME = stx.STG_TABLE_NAME
  and stx.OFF_TABLE_NAME is not null
  and atc.COLUMN_NAME not like 'META%' --> columns we add on our side, ignored in the comparison
  and atc.COLUMN_ID is not null
  and stx.CHECK_METADATA_IND = 'Y'
order by atc.TABLE_NAME, atc.COLUMN_ID
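
The comparison in steps 4 and 5 could look roughly like this -- purely a sketch, assuming the output of the source and target queries has been landed in two hypothetical work tables (CHK_SRC_COLUMNS and CHK_STG_COLUMNS) with the same column layout as the query above:

Code: Select all

select coalesce(s.TABLE_NAME, t.TABLE_NAME)   as TABLE_NAME,
       coalesce(s.COLUMN_NAME, t.COLUMN_NAME) as COLUMN_NAME,
       case
        when s.COLUMN_NAME is null then 'MISSING FROM SOURCE'
        when t.COLUMN_NAME is null then 'MISSING FROM STAGING'
        when s.DATA_TYPE <> t.DATA_TYPE
          then 'TYPE CHANGED: ' || s.DATA_TYPE || ' -> ' || t.DATA_TYPE
        when s.NULLABLE <> t.NULLABLE then 'NULLABILITY CHANGED'
        else 'OK'
       end as CHECK_RESULT
from CHK_SRC_COLUMNS s
full outer join CHK_STG_COLUMNS t
  on  s.TABLE_NAME  = t.TABLE_NAME
  and s.COLUMN_NAME = t.COLUMN_NAME
order by 1, 2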
Hopefully somewhat helpful. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And, stupid me, I just caught the fact that you were talking about files rather than tables. :roll: Got all excited and thought I could help yesterday, but it was not to be. I'll leave it here on the chance that perhaps it helps someone else... some other day. :(
-craig

"You can never have too many knives" -- Logan Nine Fingers
sec105105
Participant
Posts: 44
Joined: Fri Mar 20, 2009 7:21 am
Location: Ottawa

Post by sec105105 »

Yes, if it were tables I would've tried RCP, but I don't believe that's suitable for CFF stages.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

The point of "breakage", the point of failure, is in the (lack of) ability to compare an existing table definition to the format and layout of an existing file.

Craig's table solution has some value for files. The problem is in where you get your metadata.

A less tedious but still time-intensive suggestion -- and I doubt you'll find a way to automate it -- is to find the source code from which the table definition is created. For me, this is the copybook, and in a well-controlled source repository the copybook is a distinct entity that can be copied to a text file. That is how I import copybooks to my project via the CFD import wizard.

If you can get the equivalent of a copybook -- which may well mean finding the Cobol program in which the record is defined and manually creating a copybook from it -- your comparison would be to the code instead of the actual data files. That part could easily be automated along the lines of Craig's process.
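
To illustrate, a copybook is just a small piece of fixed-format Cobol source, so what you would be comparing from one release to the next is a plain text file along these lines (the record and field names here are invented):

Code: Select all

      * Hypothetical copybook fragment as it might be copied out of
      * the host source repository for the CFD import wizard.
       01  CUSTOMER-REC.
           05  CUST-ID         PIC X(10).
           05  CUST-NAME       PIC X(30).
           05  CUST-BALANCE    PIC S9(7)V99 COMP-3.
           05  FILLER          PIC X(05).
A simple text compare of the current copybook against the one your CFF stage was imported from would flag a layout change before the data ever arrives.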

The best of luck to you.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872
satyanarayana
Participant
Posts: 13
Joined: Fri Jul 15, 2005 12:01 am

Post by satyanarayana »

Try reading the EBCDIC data file using a Sequential File stage and passing the metadata with a schema file (you can parameterize the schema file name).
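
For example, a schema file for a fixed-width EBCDIC file might look something like this (the field names, lengths and packed field are illustrative only; the record properties must match the real layout):

Code: Select all

// Hypothetical schema for a fixed-width EBCDIC file; adjust the
// record properties and the fields to match the actual layout.
record
  { record_length = fixed, delim = none, ebcdic }
(
  ACCT_NO:      string[10];
  ACCT_NAME:    string[30];
  ACCT_BALANCE: decimal[9,2] {packed};
)
Records that no longer fit the schema should then land on a reject link (or fail the job), giving you a single parameterized job that can sweep all the UA files for structure changes.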