Page 1 of 1

Adding rows to a single column

Posted: Mon Oct 18, 2010 11:48 am
by swadeep
Hi I have a scenario below, plz help me with the solution.

Input
-------
Key Add

1 3456 ABC lane
1 Apt 544
1 Charlotte NC
1 28217
2 101 ABC lane
2 Apt 467
2 Dallas Texas
2 75252

Output
-----------
Key Add

1 3456 ABC lane, Apt 544, Charlotte NC, 28217
2 101 ABC lane, Apt 467, Dallas Texas, 75252

Thanks in Advance

Posted: Mon Oct 18, 2010 12:29 pm
by anbu
Use stage variables

Code: Select all

PrevKey = CurrKey
CurrKey = Key
CurrData = If PrevKey = CurrKey then PrevData : Data else Data
PrevData = CurrData
Use Remove Duplicates to retain the row which has longest Data.

Also you should run this job in 1 node.

Posted: Mon Oct 18, 2010 1:50 pm
by jcthornton
With the definition of the input given, I hope the correct order is guaranteed on the input, otherwise the addresses could get a bit jumbled.

On a second note, is there a reason for recommending just one node? It should work just fine as long as your partitioning (I would probably use Hash partitioning on just the Key field) keeps all rows with the same key together.

Posted: Tue Oct 19, 2010 10:44 am
by swadeep
Thank you very muck. It works.

Posted: Wed Oct 20, 2010 12:12 am
by ravireddy25
anbu wrote:Use stage variables

Code: Select all

PrevKey = CurrKey
CurrKey = Key
CurrData = If PrevKey = CurrKey then PrevData : Data else Data
PrevData = CurrData
Use Remove Duplicates to retain the row which has longest Data.

Also you should run this job in 1 node.
HI,
I have one doubt like we can remove duplilcates based on key but how to retreive the row which has longest data.
Thanks In Advance

Posted: Wed Oct 20, 2010 12:18 am
by ray.wurlod
Use Len() functions to compare lengths.

Posted: Wed Oct 20, 2010 12:49 am
by ravireddy25
ray.wurlod wrote:Use Len() functions to compare lengths. ...
Hi,
sorry to ask again, i can use Len function but if we have data like

key data length
1 ------ 5
1 --------- 7
1 ----------- 8
2 ---------- 6
2 ------------ 7
2 -------------- 8

and my output should be like
key data length
1 ------------------ 8
2 ------------------ 8

How we can achieve this?

Posted: Wed Oct 20, 2010 1:03 am
by ray.wurlod
Generate a column containing the length and aggregate preserving MAX.

Posted: Wed Oct 20, 2010 1:34 am
by ravireddy25
ray.wurlod wrote:Generate a column containing the length and aggregate preserving MAX. ...
Hi,
Got it Thanks :)