Generate Multiple Files without outputting key column value

JeroenDmt
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Post by JeroenDmt »

I am creating multiple files through one sequential file stage using the "Generate Multiple Files" option.
The filename is based on a root file string plus the value of a key column, so that for each distinct value of the key column a new file is generated containing the key value in the file name.

For the output properties I have set:
- Write Method = Generate Multiple Files
- Exclude Partition String = True
- Key = <fieldname>
- Use Value in Filename=True
- Root File String = <base file name>

That works perfectly fine. However, it forces me to include the key column in the output, which I do not want. I only want to use it to set the names of the output files.
Is there any way I can achieve that while still using the "Generate Multiple Files" property?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'll be curious if anyone can help as you're kind of out there on the bleeding edge of the product, playing with new functionality. You may need to involve your official support provider and then come back and tell us the answer. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
JeroenDmt
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Post by JeroenDmt »

Going the official way as well, but I am betting on two horses. Maybe someone here has run into this problem before. There are always more people playing with new functionality.

If the official support provider horse wins, I will post the results here as well obviously ;)
gsbrown
Premium Member
Posts: 148
Joined: Mon Sep 23, 2002 1:00 pm
Location: USA

Post by gsbrown »

Was this ever resolved? I'm now running into the same issue. I need "File Name" as one of the output columns to use in the file name generation, but I don't want it as a column in my output file.
ssnegi
Participant
Posts: 138
Joined: Thu Nov 15, 2007 4:17 am
Location: Sydney, Australia

Reply

Post by ssnegi »

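Make the unwanted key column the first field in the column definition, then use the Filter property of the Sequential File stage.
Put in this Unix command: cut -d, -f2-n
n --> total number of columns (the open-ended form cut -d, -f2- does the same without knowing n)
This will print all the columns except the first. Note the hyphen: a comma list such as -f2,n would print only fields 2 and n.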
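
A minimal sketch of the filter idea above, assuming the Filter command receives the rows as comma-delimited text on standard input with the key column first. The function name and the sample rows are made up for illustration. It uses the open-ended range -f2-, which selects field 2 through the last field; a comma list such as -f2,5 would select only fields 2 and 5.

```shell
# Hypothetical helper: drop the first comma-separated field from each line.
# Reads rows on stdin and writes the remaining fields to stdout.
drop_key_column() {
  cut -d, -f2-
}

# Made-up sample rows with the key column (Region) in first position.
printf 'EMEA,1001,widget\nAPAC,1002,gadget\n' | drop_key_column
```

cut splits on every comma, so this only suits data where quoted field values never contain the delimiter.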
Last edited by ssnegi on Thu Mar 20, 2014 3:40 am, edited 2 times in total.
JeroenDmt
Premium Member
Posts: 107
Joined: Wed Oct 26, 2005 7:36 am

Post by JeroenDmt »

For now you have to use a workaround, such as the Unix command just mentioned.

The enhancement of the DataStage functionality is planned for the next major release according to IBM Product Management.
babbu9
Premium Member
Posts: 75
Joined: Tue Jun 01, 2004 9:44 am

resolved

Post by babbu9 »

I was able to generate multiple files using the "Generate Multiple Files" option.
You need to define the Key field (e.g. Region) on which you would like to split the data. I also set the Root File String to point to the directory where the files will be created; the file name itself is the last part of the Root File String.

Ex: /....../InformatServer/Projects/Project1/TgtFiles/Region

and it created multiple files with the prefix Region: Region.part00004.001, Region.part00009.001, ...

I am still experimenting with getting the file names into the format I need.
One option would be to use a shell script to rename the files after the job has finished.
But the data in each file is specific to one value of the Region field in my source data, so it seems to be working.
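
A rough sketch of the rename-after-the-job idea, assuming the generated files follow the <root>.partNNNNN.MMM pattern shown above. The function name and the target format <root>_NNNNN_MMM.csv are hypothetical; adjust them to whatever naming you actually need.

```shell
# Hypothetical post-job step: rename <root>.partNNNNN.MMM
# to <root>_NNNNN_MMM.csv inside the given directory.
rename_part_files() {
  dir=$1
  for f in "$dir"/*.part*.*; do
    [ -e "$f" ] || continue      # skip when the glob matched nothing
    base=${f##*/}                # file name without the directory
    root=${base%%.part*}         # e.g. Region
    rest=${base#*.part}          # e.g. 00004.001
    part=${rest%%.*}             # e.g. 00004
    seq=${rest#*.}               # e.g. 001
    mv -- "$f" "$dir/${root}_${part}_${seq}.csv"
  done
}

# Usage (the path is a placeholder):
# rename_part_files /path/to/TgtFiles
```

It could be called from an after-job routine or from the script that launches the job, once the job has completed successfully.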