Schemas for flat files and datasets

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

hobocamp
Premium Member
Posts: 98
Joined: Thu Aug 31, 2006 10:04 am

Post by hobocamp »

My search for this hasn't turned up any results, so I figured I'd put the question to the experts.

Is there a method for performing a mass extract of the file and dataset schemas within a DS project? I know that for a particular instance, the columns could be saved as a table definition and then exported.

But in light of GDPR requirements, there is a request to be able to look at these definitions en masse, in order to locate any sensitive data (SSN, Bank info, etc.) being captured and stored.

Thanks in advance for any advice.

Tom Smith
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

None of which I'm aware.

Are the files guaranteed to have column headers? Does it violate any rules to inspect the data to guess the data types?
In a perfect world it would be possible to use the Connector Import Wizard in DataStage Designer, specifying the File Connector but, alas, it's not.

For data sets, you have the orchadmin command; you could build a script to loop through all the descriptor file names, which I hope you keep in a consistent location.
As with the Connector Import Wizard, importing an Orchestrate schema definition only processes one descriptor file at a time.
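
The loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `.ds` suffix, the `dump_schemas` function name and the example directories are hypothetical, and the engine environment (for example `APT_CONFIG_FILE`) is assumed to be set before `orchadmin` is called. The `orchadmin` path and the `describe -s` option are taken from the command quoted later in this thread.

```shell
#!/bin/sh
# Sketch: dump the schema of every data set descriptor found in one directory.
# Assumptions (not from the thread): descriptors end in .ds and the parallel
# engine environment (e.g. APT_CONFIG_FILE) is already set for orchadmin.

# Path as used in the command quoted later in this thread; override if needed.
ORCHADMIN="${ORCHADMIN:-$DSHOME/../PXEngine/bin/orchadmin}"

dump_schemas() {
    src_dir=$1      # directory holding the descriptor files
    out_dir=$2      # directory that receives one .schema file per data set
    mkdir -p "$out_dir" || return 1
    for ds in "$src_dir"/*.ds; do
        [ -e "$ds" ] || continue    # glob matched nothing: skip the literal
        name=$(basename "$ds" .ds)
        # Raw "describe -s" output is kept as-is, header lines included.
        if "$ORCHADMIN" describe -s "$ds" > "$out_dir/$name.schema" 2>/dev/null; then
            echo "$out_dir/$name.schema"
        else
            echo "failed: $ds" >&2
            rm -f "$out_dir/$name.schema"
        fi
    done
}

# Example call (hypothetical paths):
# dump_schemas /data/datasets /tmp/schemas
```

The function prints the name of each schema file it writes, so the result can be piped into whatever review step follows.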
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

You can extract the schema from a dataset like this:

Put this into an Execute Command activity in a sequence job.


echo "record" > #seqfilepath##ds_filename#.schema; echo "{record_delim='\n', final_delim=end,delim=',', quote=double}" >> #seqfilepath##ds_filename#.schema; $DSHOME/../PXEngine/bin/orchadmin describe -s #dataset_path##ds_filename# 2>/dev/null | sed 1,11d >> #seqfilepath##ds_filename#.schema; echo #seqfilepath##ds_filename#.schema


You can figure out the rest.
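
One possible next step, given the GDPR question that started the thread, is to search the generated `.schema` files for column names that look sensitive. The sketch below is an assumption, not something posted in the thread: the `scan_schemas` name, the example directory and the keyword list are hypothetical and would need tuning to your own naming conventions.

```shell
#!/bin/sh
# Sketch: list schema lines whose text looks like a sensitive field.
# The keyword list is a hypothetical starting point, not a GDPR rule set.
SENSITIVE_PATTERN="${SENSITIVE_PATTERN:-ssn|social|bank|account|iban|birth}"

scan_schemas() {
    schema_dir=$1   # directory holding the generated .schema files
    # -i: ignore case, -n: print line numbers, -E: extended regex.
    # /dev/null makes grep print the file name even when only one file matches.
    grep -inE "$SENSITIVE_PATTERN" "$schema_dir"/*.schema /dev/null
}

# Example call (hypothetical path):
# scan_schemas /tmp/schemas
```

Each hit is printed as file name, line number and matching line. A name-based search like this only flags candidates; columns with uninformative names would still need a manual look.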