Plz help me on below tech tip

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
nagarjuna.dudam
Participant
Posts: 4
Joined: Wed Sep 08, 2010 10:57 am

Plz help me on below tech tip

Post by nagarjuna.dudam »

Hi All,
can anybody help me on this below.

1)I have a source below.
a
a
a
b
b
c
c

I need output like below.
a1
a2
a3
b1
b2
c1
c2



2)
src:
a
b
a
b
c
d

I need the result should be as non duplicate values like below.
c
d

Thanks,
Nag
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

What is your source? Whether it is Database (Oracle / DB2) or Flat file.

If it is Database then you can handle at Database itself by using the below queries.

1. select col||row_number() over(partition by col order by col) from yourtable;

2. select col from yourtable group by col having count(col) <=1;

The above queries are work for Oracle if your database is DB2 then simulate accordingly.
Cheers
Ravi K
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

(1) Sort and partition data using Sort stage generating key change column then use stage variables to increment/reset counter depending on whether a change has occurred.

(2) Fork join. Make a copy of your data (Copy stage) and run it through an Aggregator to count the rows for each group. Downstream of that join the two streams then filter based on count = 1.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
XRAY
Participant
Posts: 33
Joined: Mon Apr 03, 2006 12:09 am

Post by XRAY »

2)
a) One of the way is to use column generator ( assign 1 to a new column ) - Aggregator ( Sum ) - Filter

b) Another way is

Parallel Aggregator ( Count Row ) -> Sequential Aggregator ( Sum ) with Sort Merge -> Filter
vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am

Post by vishal_rastogi »

for the second thing you can use the aggregator and then use where link count = 1 in the transformer constraint
Vish
Vidyut
Participant
Posts: 24
Joined: Wed Oct 13, 2010 12:45 am

Post by Vidyut »

for 1st--
StageVar1 = (If DSLink2.Col1 = StageVar Then StageVar1 +1 Else 1)
StageVar = (DSLink2.Col1 )

Output
DSLink2.Col1:StageVar1

for 2nd: vishal_rastogi has already answered
Post Reply