DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
Yarobi
Participant



Joined: 05 May 2019
Posts: 5

Points: 45

Post Posted: Sun May 05, 2019 9:03 pm Reply with quote    Back to top    

DataStage® Release: 11x
Job Type: Parallel
OS: Windows
Hello all,

I'm new to this forum and I realy hope someone can help me.

My problem is that I need to create a file with data from a DB2 database.

The file should look like this

Code:

  ID     ID2     Col1     Col2     Col3
  11     1        12        2         
  12     3                 15        1
  12     4         9       12        4
  16     2                 20       16


Col1 to ColX are values, and the ColX because I don't know how many columns there will be, but around 1000. And not every column has always a value.

And these values are stored in a db2-table like this
Code:

  ID     ID2    Code  Value
 11     1        Col1   12
 11     1        Col2   2     
 12     3        Col2   15
 12     3        Col3   1
 12     4        Col1   9
 12     4        Col2   12
 12     4        Col3   4
 16     2        Col2   20
 16     2        Col3   16


I already made a dataset with all possible Codes, but now I wonder if there is an easy way to transpose this dataset into columns?

Hope you understand my problem.
UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 362

Points: 3744

Post Posted: Mon May 06, 2019 3:07 am Reply with quote    Back to top    

it looks easy enough.
take your input data into a transformer.
output 1 of the transformer has
this:
11 1 Col1 12
and output #2 has this one:
11 1 Col2 2
put them into a funnel stage.
Write to target.
I assume there are a fixed number of columns that can be transposed (11,12,13,... N right).
each one gets an output. each output gets a constraint that prevents sending it if it does not exist (so 11 1 col 13, which does not exist, is created but filtered out before the funnel).

there may also be a way to do what you want with a transformer loop variable, which can generate rows with some logic. I have not used one of those in some time, and I am not sure if it can solve this easily or not.
Rate this response:  
Not yet rated
Yarobi
Participant



Joined: 05 May 2019
Posts: 5

Points: 45

Post Posted: Mon May 06, 2019 2:21 pm Reply with quote    Back to top    

there are ± 1000 columns that can be transposed, not all 1000 will be used by 1 key combination.

I hoped I could use the values in the third column (Col1, Col2, Col...) to use as column name to store the values from the fourth column .
Rate this response:  
Not yet rated
UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 362

Points: 3744

Post Posted: Mon May 06, 2019 10:29 pm Reply with quote    Back to top    

1000+ right... personally I would write something that fixed the file outside of datastage for something that big and load the fixed file. But you can probably do it with a transformer loop as you wanted, using the names. If not, a basic routine maybe?

that is too many to do it the simple way I said for sure.
Rate this response:  
Not yet rated
rkashyap



Group memberships:
Premium Members

Joined: 02 Dec 2011
Posts: 524
Location: Richmond VA
Points: 4730

Post Posted: Mon May 06, 2019 10:37 pm Reply with quote    Back to top    

As UCDI has mentioned ... an option is to leverage looping transformer. (I believe that Tony Curcio had published an example for similar use case in his blog. Unfortunately, I am not able to find link to that example).

Alternatively, if it possible to stage incoming data in a sequential file ... you can use following command in FILTER condition of Sequential file stage OR in External Source Stage
Code:
awk -F" " '{for(i=4;i<=NF;i++){print $1 " "  $2 " " $3 " " $i}}'
Rate this response:  
Not yet rated
Yarobi
Participant



Joined: 05 May 2019
Posts: 5

Points: 45

Post Posted: Thu May 09, 2019 1:19 pm Reply with quote    Back to top    

The input is a massive amount of DB2 data Sad.

I made a little example to show what I was looking for.
https://ibb.co/BcH3Kmv

I guess it's not going to work Sad.
Rate this response:  
Not yet rated
rkashyap



Group memberships:
Premium Members

Joined: 02 Dec 2011
Posts: 524
Location: Richmond VA
Points: 4730

Post Posted: Fri May 10, 2019 8:24 am Reply with quote    Back to top    

As source is DB2 table, you can also pivot data in DB2 SELECT statement. See these examples.
Rate this response:  
Not yet rated
Yarobi
Participant



Joined: 05 May 2019
Posts: 5

Points: 45

Post Posted: Sun May 12, 2019 8:57 pm Reply with quote    Back to top    

Thx, but I'm still limited then to select 225 fields.
Since my db has ± 5600 codes I need to split it up in 26 selects?
Rate this response:  
Not yet rated
UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 362

Points: 3744

Post Posted: Mon May 13, 2019 9:52 pm Reply with quote    Back to top    

it may be easiest to just extract it to a text file, and process the text file in your favorite file fixer -- basic routine? unix shell command? Write your own program? Etc...
Rate this response:  
Not yet rated
mouthou
Participant



Joined: 04 Jul 2004
Posts: 208

Points: 1991

Post Posted: Tue May 14, 2019 3:19 pm Reply with quote    Back to top    

This certainly a challenging scenario which is almost a pivoting scenario. I think this can be done in 2 steps for 2 needs in your result. one need is to create header row with key columns and new columns created based on Code and the other need is to map the Value to each code

not sure if it really helps you but gave a thought for some high level logic which could be a pointer.

1. to create a header row, you could use MODIFY stage which allows to create dynamic columns. since you already said you extracted the possible Code values, you could use that list to send as parameter to create new columns. you could create a record (say in a file) with just column names as key columns and columns of Code values ( as per your parameter list)

2. in this second flow, you could skip Code field and do pivot with just key columns and Value column. this will give the result of actual data without header (the one created above)

visualizing more of 2 above flows and funneling the individually created 2 results at the end etc I will leave it to you.
Rate this response:  
Not yet rated
Yarobi
Participant



Joined: 05 May 2019
Posts: 5

Points: 45

Post Posted: Tue May 21, 2019 9:30 pm Reply with quote    Back to top    

mouthou wrote:
...
1. to create a header row, you could use MODIFY stage which allows to create dynamic columns. since you already said you extracted the possible Code values, you could use that list to send as parameter to create new columns. you could create a record (say in a file) with just column names as key columns and columns of Code values ( as per your parameter list)
...


Suppose I can create a dataset with all the possible codes thanks to the MODIFY stage.

Is there a loop possible in a transformer to reuse this code for all the ± 5000 codes?

Code:
If Trim(L_JoinedData.C_I_CODE_CASE)="A0070" Then L_JoinedData.M_I_WAARD_MONT  Else SetNull()


And the codes or like this : A0070, A0080, A0100, B1740, B1760, B1770
there are some gaps in it Sad.
Rate this response:  
Not yet rated
mouthou
Participant



Joined: 04 Jul 2004
Posts: 208

Points: 1991

Post Posted: Wed May 22, 2019 1:49 pm Reply with quote    Back to top    

Yes you could write such IF-THEN-ELSE logic with looping but when to come to think of it, it feels strongly that DS wouldnt support 5000 columns metadata and it includes my previous MODIFY stage option too where 5000 columns are created.

Do check the maximum number of columns DS can support. In that case you would have to split the file with permissible number of columns and your end result file(s) could be unexpected then!
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours