Adding rows to a single column

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
swadeep
Participant
Posts: 3
Joined: Mon Feb 13, 2006 10:13 am
Location: USA

Adding rows to a single column

Post by swadeep »

Hi I have a scenario below, plz help me with the solution.

Input
-------
Key Add

1 3456 ABC lane
1 Apt 544
1 Charlotte NC
1 28217
2 101 ABC lane
2 Apt 467
2 Dallas Texas
2 75252

Output
-----------
Key Add

1 3456 ABC lane, Apt 544, Charlotte NC, 28217
2 101 ABC lane, Apt 467, Dallas Texas, 75252

Thanks in Advance
anbu
Premium Member
Premium Member
Posts: 596
Joined: Sat Feb 18, 2006 2:25 am
Location: india

Post by anbu »

Use stage variables

Code: Select all

PrevKey = CurrKey
CurrKey = Key
CurrData = If PrevKey = CurrKey then PrevData : Data else Data
PrevData = CurrData
Use Remove Duplicates to retain the row which has longest Data.

Also you should run this job in 1 node.
You are the creator of your destiny - Swami Vivekananda
jcthornton
Premium Member
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

With the definition of the input given, I hope the correct order is guaranteed on the input, otherwise the addresses could get a bit jumbled.

On a second note, is there a reason for recommending just one node? It should work just fine as long as your partitioning (I would probably use Hash partitioning on just the Key field) keeps all rows with the same key together.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
swadeep
Participant
Posts: 3
Joined: Mon Feb 13, 2006 10:13 am
Location: USA

Post by swadeep »

Thank you very muck. It works.
ravireddy25
Participant
Posts: 59
Joined: Wed Dec 31, 2008 5:49 am

Post by ravireddy25 »

anbu wrote:Use stage variables

Code: Select all

PrevKey = CurrKey
CurrKey = Key
CurrData = If PrevKey = CurrKey then PrevData : Data else Data
PrevData = CurrData
Use Remove Duplicates to retain the row which has longest Data.

Also you should run this job in 1 node.
HI,
I have one doubt like we can remove duplilcates based on key but how to retreive the row which has longest data.
Thanks In Advance
Ravi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use Len() functions to compare lengths.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ravireddy25
Participant
Posts: 59
Joined: Wed Dec 31, 2008 5:49 am

Post by ravireddy25 »

ray.wurlod wrote:Use Len() functions to compare lengths. ...
Hi,
sorry to ask again, i can use Len function but if we have data like

key data length
1 ------ 5
1 --------- 7
1 ----------- 8
2 ---------- 6
2 ------------ 7
2 -------------- 8

and my output should be like
key data length
1 ------------------ 8
2 ------------------ 8

How we can achieve this?
Ravi
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Generate a column containing the length and aggregate preserving MAX.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ravireddy25
Participant
Posts: 59
Joined: Wed Dec 31, 2008 5:49 am

Post by ravireddy25 »

ray.wurlod wrote:Generate a column containing the length and aggregate preserving MAX. ...
Hi,
Got it Thanks :)
Ravi
Post Reply