need help with getting the first non null value

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
SPuneet
Participant
Posts: 28
Joined: Thu Jul 19, 2012 12:52 am

need help with getting the first non null value

Post by SPuneet »

I need to pick the first non null value for different columns for a particular id.

data scenario: the data is like this

ID C2 C3 C4 C5
-------- ------ ------ ------ ------

1000 32 A <NULL> 999

1000 <NULL> <NULL> Y 111

1000 <NULL> O N <NULL>

1000 22 <NULL> <NULL> 777

1001 <NULL> <NULL> <NULL> 666

1001 22 <NULL> Y <NULL>

1001 32 O <NULL> 555


Basically i have sorted it acocrding to id now i want is the id and the first non-null values for the rest of the columns.
my desired result is


DESIRED RESULT:


ID C2 C3 C4 C5

---- -------- ------ ------ ------

1000 32 A Y 999

1001 22 O Y 666
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What 8.x version do you have? Normally the aggregator would help out for something like this but I don't think it is a viable option here due to needing the first value rather than say the max value. The Server aggregator supports first/last but I don't believe the parallel one does.

Failing that, it would seem that you could use stage variables to do the if-then-else checking needed and having at least version 8.5 would be best for that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
SPuneet
Participant
Posts: 28
Joined: Thu Jul 19, 2012 12:52 am

Post by SPuneet »

Hi,

I am using version 8.7 and i belive you are correct , this can't be done in aggregator. Will have to use stage variables

As i haven't used stage variables in transformer much.., so would require assistance with that to acheive the desired result


regards,
SPuneet
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The stage variables route would be pretty straight-forward. Setup one for each column and just do an if-then-else check to say if your SV is at its initial value and the column is non-null store the column value otherwise do nothing... and you do 'nothing' by setting the stage variable to itself, its current value.

Once you've processed the "last record in the group", output the result for all columns. Sorry but off the top of my head I don't recall the magical new thingie that lets you know that... function? System variable?
-craig

"You can never have too many knives" -- Logan Nine Fingers
SPuneet
Participant
Posts: 28
Joined: Thu Jul 19, 2012 12:52 am

Post by SPuneet »

Something Like

SVFirstC2= IF ((In.ID= SV_ID) OR @INROWNUM = 1) THEN ( IF IsNotNull(In.C2) Then In.C2)
SV_ID= In.ID

Have i understood it correctly?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Simpler version...

If sv_ID is null and In.ID is not null then In.ID else sv_ID

You shouldn't need to worry about the row number if the initial value is set appropriately, never mind the fact that you'd have to be very careful with it when running on multiple nodes.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Has this question not come up in the last month or so? Even the sample data looks the same...
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply