Need help in creating transposed file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Yarobi
Participant
Posts: 7
Joined: Sun May 05, 2019 8:37 pm

Need help in creating transposed file

Post by Yarobi »

Hello all,

I'm new to this forum and I realy hope someone can help me.

My problem is that I need to create a file with data from a DB2 database.

The file should look like this

Code: Select all

  ID     ID2     Col1     Col2     Col3
  11     1        12        2          
  12     3                 15        1
  12     4         9       12        4
  16     2                 20       16 
Col1 to ColX are values, and the ColX because I don't know how many columns there will be, but around 1000. And not every column has always a value.

And these values are stored in a db2-table like this

Code: Select all

  ID     ID2    Code  Value
 11     1        Col1   12
 11     1        Col2   2     
 12     3        Col2   15
 12     3        Col3   1
 12     4        Col1   9
 12     4        Col2   12
 12     4        Col3   4
 16     2        Col2   20
 16     2        Col3   16
I already made a dataset with all possible Codes, but now I wonder if there is an easy way to transpose this dataset into columns?

Hope you understand my problem.
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

it looks easy enough.
take your input data into a transformer.
output 1 of the transformer has
this:
11 1 Col1 12
and output #2 has this one:
11 1 Col2 2
put them into a funnel stage.
Write to target.
I assume there are a fixed number of columns that can be transposed (11,12,13,... N right).
each one gets an output. each output gets a constraint that prevents sending it if it does not exist (so 11 1 col 13, which does not exist, is created but filtered out before the funnel).

there may also be a way to do what you want with a transformer loop variable, which can generate rows with some logic. I have not used one of those in some time, and I am not sure if it can solve this easily or not.
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

1000+ right... personally I would write something that fixed the file outside of datastage for something that big and load the fixed file. But you can probably do it with a transformer loop as you wanted, using the names. If not, a basic routine maybe?

that is too many to do it the simple way I said for sure.
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

As UCDI has mentioned ... an option is to leverage looping transformer. (I believe that Tony Curcio had published an example for similar use case in his blog. Unfortunately, I am not able to find link to that example).

Alternatively, if it possible to stage incoming data in a sequential file ... you can use following command in FILTER condition of Sequential file stage OR in External Source Stage

Code: Select all

awk -F" " '{for(i=4;i<=NF;i++){print $1 " "  $2 " " $3 " " $i}}'
Yarobi
Participant
Posts: 7
Joined: Sun May 05, 2019 8:37 pm

Post by Yarobi »

The input is a massive amount of DB2 data :(.

I made a little example to show what I was looking for.
https://ibb.co/BcH3Kmv

I guess it's not going to work :(.
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

As source is DB2 table, you can also pivot data in DB2 SELECT statement. See theseexamples.
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

it may be easiest to just extract it to a text file, and process the text file in your favorite file fixer -- basic routine? unix shell command? Write your own program? Etc...
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Need help in creating transposed file

Post by mouthou »

This certainly a challenging scenario which is almost a pivoting scenario. I think this can be done in 2 steps for 2 needs in your result. one need is to create header row with key columns and new columns created based on Code and the other need is to map the Value to each code

not sure if it really helps you but gave a thought for some high level logic which could be a pointer.

1. to create a header row, you could use MODIFY stage which allows to create dynamic columns. since you already said you extracted the possible Code values, you could use that list to send as parameter to create new columns. you could create a record (say in a file) with just column names as key columns and columns of Code values ( as per your parameter list)

2. in this second flow, you could skip Code field and do pivot with just key columns and Value column. this will give the result of actual data without header (the one created above)

visualizing more of 2 above flows and funneling the individually created 2 results at the end etc I will leave it to you.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Need help in creating transposed file

Post by mouthou »

Yes you could write such IF-THEN-ELSE logic with looping but when to come to think of it, it feels strongly that DS wouldnt support 5000 columns metadata and it includes my previous MODIFY stage option too where 5000 columns are created.

Do check the maximum number of columns DS can support. In that case you would have to split the file with permissible number of columns and your end result file(s) could be unexpected then!
Post Reply