Creating a Header

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ScottDun
Participant
Posts: 61
Joined: Thu Dec 10, 2015 9:51 am

Creating a Header

Post by ScottDun »

Another brainbuster for you guys. I have a set of data that looks like this:

AD43829T||||4,5
AF87571S||||4,5
CP06459X||||4,5
AD99586W||||4,5
CK71846A||||4,5
In the transformer, I want to add a header to this to make the output look like this:

File partially processed|Total-10|Accepted-0|Rejected-10|
AD43829T||||4,5
AF87571S||||4,5
CP06459X||||4,5
AD99586W||||4,5
CK71846A||||4,5

But only using 1 column. Is this possible? I used the code:
(("File partially processed"):"|":("Total":"-":Detail_Count):"|":("Accepted":"-":(Detail_Count - Reject_Count)):"|":("Rejected":"-":Reject_Count)):"|":(Link_RESP_EXTR.DATA [71,8]):"|":(Link_RESP_EXTR.DATA[3500,11]):"|":(Link_RESP_EXTR.DATA[3511,8]):"|":(Link_RESP_EXTR.DATA[3524,8]):"|":(ErrorCodes) in my transformer but it makes the data look like:

File partially processed|Total-10|Accepted-0|Rejected-10|AD43829T||||4,5
File partially processed|Total-10|Accepted-0|Rejected-10|AF87571S||||4,5
File partially processed|Total-10|Accepted-0|Rejected-10|CP06459X||||4,5

Is there a way to make it just a header using one column?

Thanks

[Note - had to disable Smilies in the post - Andy]
SCOTTDun
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Easy-peasy! :-)

I've done this a lot! The problem is you are getting the column output for each row of data processed and you only want one at the end of data processing. On older versions I've added a row number in front of the column and just used a remove duplicates to only pass along the last row.

However - with your "one column" restriction, I'd suggest using a constraint based on "Last Row" (@EOD or LastRow()) that only lets out the column on the last row. That should work on release 9 or later.

Note - you must output the "header" column on a separate link and then merge that link back into the main feed so it goes to the file. There are several methods you can use to insure it gets written out first - let us know if you need help with that part.

Another method I've seen used is to build each record as a two varchar fields. The first field is a flag to indicate "Header" or "Detail" (and or "Trailer" when required). The second column is a large varchar field that holds the entire record contents. All the detail rows have the normal output condensed into the second field. The last row also outputs the header info on a second link (via last row constraint mentioned earlier).

Then a funnel is used to merge the links back together (they have identical metadata) with the header record first, then detail records, (then trailer if required). Afterwards you can use a modify stage to drop the first column before you send it to the sequential file stage.
Last edited by asorrell on Thu Oct 19, 2017 9:56 pm, edited 1 time in total.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
ScottDun
Participant
Posts: 61
Joined: Thu Dec 10, 2015 9:51 am

Post by ScottDun »

Thanks for responding. So let me give you the entirety of the job. I am taking an FTP to an external filter to get rid of Header/Trailer. Then I am using
a transfomer to load two files and a DB2. For the file in question, I have a constraint based on my stage variables (I am creating error codes for field validations) being more than 0. So when writing the code:
(("File partially processed"):"|":("Total":"-":Detail_Count):"|":("Accepted":"-":(Detail_Count - Reject_Count)):"|":("Rejected":"-":Reject_Count)):"|":FILENAME:"|":(Link_RESP_EXTR.DATA [71,8]):"|":(Link_RESP_EXTR.DATA[3500,11]):"|":(Link_RESP_EXTR.DATA[3511,8]):"|":(Link_RESP_EXTR.DATA[3524,8]):"|":(ErrorCodes) :

I already have a constraint. I should've prefaced this.
SCOTTDun
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Scott,

You need to break up the statement onto two links. On each output link put one VARCHAR column called Record -

First link has a "LastRow()" constraint
Record: ("File partially processed"):"|":("Total":"-":Detail_Count):"|":("Accepted":"-":(Detail_Count - Reject_Count)):"|":("Rejected":"-":Reject_Count)

Second link:
Record: FILENAME:"|":(Link_RESP_EXTR.DATA [71,8]):"|":(Link_RESP_EXTR.DATA[3500,11]):"|":(Link_RESP_EXTR.DATA[3511,8]):"|":(Link_RESP_EXTR.DATA[3524,8]):"|":(ErrorCodes)

After the transformer, use a funnel stage to put the two feeds back together, with the funnel set to "drain" the first link before starting the second link using the "Sequence" mode setting. That will cause the Header record to be the first record on the following sequential file.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
Post Reply