Dynamic file structure

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
drkumar
Participant
Posts: 2
Joined: Tue Mar 04, 2014 6:37 am
Location: Chennai

Dynamic file structure

Post by drkumar »

Hi All,
I have a pipe-delimited source file that is loaded to a Teradata table. The source file structure can change dynamically, and the job should handle the scenarios below:


1. My ETL job should not fail; it has to process the file with the existing columns.

2. If there is any change in the source file structure, I need to get an email notification.


Thanks
Ratna Kumar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:?

In what world does a source file "change dynamically" and can you give us some ideas as to what exactly that may mean? Specifically wondering what kind of changes you are expecting. With the mention of "existing columns" are we talking about possible new columns being added to the end of the record? That's something that can be both checked for and fairly easily handled. But if we go full dynamic here - columns can be swapped around, new columns added in random spots, those kind of things - then that's a whole different kettle of fish.

Please clarify for us.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

How do you propose to detect a change?

That will affect how your processing runs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
drkumar
Participant
Posts: 2
Joined: Tue Mar 04, 2014 6:37 am
Location: Chennai

Post by drkumar »

Thank you, Craig.
The new columns will be added at the end of the record.
Thanks
Ratna Kumar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Then it seems to me you will need to have a pre-check process for the presence of new columns and it sounds like that's as simple as counting the number of pipe delimiters in any record. There are multiple ways to handle it, could be a DataStage job but seems to me a script would be perfectly acceptable as well. When it finds more than the expected number of pipes, sound the alarm.

Then continue to run the job with the expected number of columns and see what kind of warnings / errors you get when there is a "short read" i.e. there are additional columns not included in the metadata. From what I recall, there was a checkbox on the Server side to suppress them, not sure how PX handles it but I would guess not gracefully. Build a small test harness to see and post your findings... unless someone chimes in with an answer for you before you get that together. :wink:
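The pre-check described above could be sketched as a small shell script like the one below. The file path, expected column count, and mail recipient are all placeholders, and the demo builds its own sample file; in a real before-job script you would point it at the landed source file and pipe the message to `mailx` or `sendmail`.

```shell
#!/bin/sh
# Hypothetical pre-check: count pipe-delimited fields per record and flag
# the first record whose field count deviates from the expected column
# count. All names and paths here are illustrative placeholders.

EXPECTED_COLS=5

# Build a small demo file: two good records, one with an extra column.
cat > /tmp/source_demo.txt <<'EOF'
a|b|c|d|e
f|g|h|i|j
k|l|m|n|o|EXTRA
EOF

# awk reports the record number of the first row whose field count
# differs from EXPECTED_COLS, then stops reading.
BAD_LINE=$(awk -F'|' -v n="$EXPECTED_COLS" 'NF != n { print NR; exit }' /tmp/source_demo.txt)

if [ -n "$BAD_LINE" ]; then
    echo "Structure change detected at record $BAD_LINE"
    # In production, sound the alarm here, e.g.:
    # mail -s "Source file layout changed" etl-support@example.com < /dev/null
fi
```

Counting fields with `awk -F'|'` rather than counting raw `|` characters avoids miscounting when the file is empty, and exiting on the first mismatch keeps the check cheap on large files.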
-craig

"You can never have too many knives" -- Logan Nine Fingers
rrcr
Participant
Posts: 4
Joined: Thu Jul 06, 2017 12:00 am

Post by rrcr »

As chulett suggested, we need a before-job script that checks the number of columns. If there are more or fewer columns than expected, send a mail notification.

As for the column definitions, we have to use a schema file with RCP (Runtime Column Propagation).
The schema file needs to be updated in the before-job script based on the number of columns.
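One way to sketch that before-job step: derive the column count from the first record of the landed file and regenerate a schema file for the RCP job. The generated names (`col1`..`colN`), the generic `string[max=255]` type, and the paths are all assumptions for illustration; a real schema would use the proper names and types for the known leading columns.

```shell
#!/bin/sh
# Hypothetical before-job sketch: regenerate a schema file from the
# current column count of a pipe-delimited source file, so an RCP job
# can pick up the new layout. Paths and column names are placeholders.

# Demo source file with four columns.
cat > /tmp/source_demo.txt <<'EOF'
a|b|c|d
e|f|g|h
EOF

# Field count of the first record.
NCOLS=$(head -1 /tmp/source_demo.txt | awk -F'|' '{ print NF }')

# Emit a schema file with one generic string column per field.
{
    echo "record {final_delim=end, delim='|'} ("
    i=1
    while [ "$i" -le "$NCOLS" ]; do
        echo "    col$i: nullable string[max=255];"
        i=$((i + 1))
    done
    echo ")"
} > /tmp/source.schema

cat /tmp/source.schema
```

Pairing this with the pipe-count alert gives both requirements from the original post: the job keeps running against the regenerated schema, and the mismatch check drives the email notification.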

thanks,
Ramireddy Ch
Post Reply