Merging multiple rows into single row

gayathrisivakumar · Post by **gayathrisivakumar** » Mon Aug 26, 2013 6:44 pm

Hi,'

I have a requirement to merge multiple rows into one row. I tried using stage variables but it seems that logic is going wrong somewhere.

My Input Data
ColA ColB ColC
1 A 100
1 B 200
1 C 300

Desired Output
1,A,100,B,200,C,300

How can I achieve this?

I tried comparing Currkey = PrevKey and appending values to stage variables, but some of the records give desired output and others not.

Can anybody please help.

IBM Analytics Champion 2009 - 2020 · Post by **asorrell** » Mon Aug 26, 2013 7:21 pm

You are using the correct approach. Did you hash partition and sort on your key? (The column that holds the "1" in the example)

If you are, can you show some example failures or your stage variable logic?

gayathrisivakumar · Post by **gayathrisivakumar** » Mon Aug 26, 2013 8:42 pm

Thanks for the quick reply.

My job design is like this:

seqfile ->sort ->transformer->remove duplictes->transformer->seqfile

This logic works fine

when

colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40

Output comes as 1,A,10,B,20
2,c,30,D,40

But it fails when
colA colB colC
1 A 10
1 B 20
2 C 30
2 D 40
3 E 50

Output is coming as 1,A,10,B,20
2,C,30,D,40
3,A,C,10,B,20,E,50.

I am doing hash partitioning based on colA.Inside the stage variable if CurrKey = PrevKey then concatenate input columns into the stage variable.

Please let me know where am I going wrong.

Regards,
Gayathri

srinivas.nettalam · Post by **srinivas.nettalam** » Tue Aug 27, 2013 1:05 am

Just to make sure nothing wrong in your derivation.Enable KeyChange Column in sort and write your derivation like

Code: Select all

svCurrent = If KeyChange=1 Then ColB:',':ColC Else svPrev:', ':ColB:',':ColC
svPrev=svCurrent

gayathrisivakumar · Post by **gayathrisivakumar** » Tue Aug 27, 2013 7:01 am

Yes I am doing the same derivation. The logic goes wrong from the third change in key. Anybody has any insight?

chulett · Post by **chulett** » Tue Aug 27, 2013 7:08 am

If you run this on a single node, does it run correctly?

gayathrisivakumar · Post by **gayathrisivakumar** » Tue Aug 27, 2013 7:59 am

when I try to run it on single node the results are completely different.

1,A,C,10,B,20,C,30,D,40,E,50.

boxtoby · Post by **boxtoby** » Tue Aug 27, 2013 9:23 am

When I have done this sort of thing in the past I have had success but doing the following:

Sort the data by colA,hash partition by colA and set the key change indicator on.

I use three stage vars as follows:

svLastRowInGroup:
LastRowInGroup(colA)

(LastRowInGroup() function is set to 1 for true, zero for false)

svDataOut
if svLastRowInGroup and keyChange then colB:",":colC else if svLastRowInGroup then svDataOut:",":colB:",":colC else ""

(This logic checks single row records as well)

svData
if keyChange then colB:",":colC else svData:",":colB:",":colC

The transformer has a constraint of svLastRowInGroup

chulett · Post by **chulett** » Tue Aug 27, 2013 9:23 am

gayathrisivakumar wrote:when I try to run it on single node the results are completely different.

Then your logic is flawed. You'd need to supply the details of your job design to allow anyone to help without continuing to guess at this point.

What field(s) did you specify as the change keys in the Sort stage? Is the KeyChange column being set correctly? Post your exact stage variables and derivations and the output link constraint you are using.

ps. You don't need "current" and "prev" variables when the Sort stage is setting the KeyChange column.

chulett · Post by **chulett** » Tue Aug 27, 2013 10:51 am

Was going to get to the 'LastRowInGroup' part as something needs to push out the last 'group' when taking this approach but wanted to check the version you are running as it was included starting with 8.5.

boxtoby · Post by **boxtoby** » Wed Aug 28, 2013 2:04 am

ChangeKey would be colA

I wouldn't have thought the logic was that difficult to follow.....

gayathrisivakumar · Post by **gayathrisivakumar** » Wed Aug 28, 2013 4:35 pm

Thanks all for your help.

Craig I am using Datastage 8.7. I couldn't read your whole post as I am yet to renew my membership.

I rebuilt the whole job again based on your suggestions and now it is working.

Since I was busy I dint get time to debug my existing code.