Create Multiple years data from single record

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
new_ds_man
Participant
Posts: 18
Joined: Tue Jul 08, 2008 1:12 pm

Create Multiple years data from single record

Post by new_ds_man »

All,

I would really appreciate some ideas on the below requirement.

***********************************************************

Input is a single record from source, like below, with start year, end year and the total amount spent for those years:

Start_Year End_Year Amt
________________________________________
2009/07/01 2010/09/01 204.20

This needs to be transformed into the below. The spent amount is distributed between July, 2009 thru to Sep 2010 (which is 15 months in total, so Amount per month is 204.20/15 = 13.61).
All the other months need to have a value zero in the below table.
So 2 records need to be created for the 2 years and the total amount distributed equally for the 15 months. All other months not in the range will have a zero


Year Jan Feb Mar Apr May June July Aug Sep Oct Nov Dec
____________________________________________________________________________________________________________________________

2009 0 0 0 0 0 0 13.61 13.61 13.61 13.61 13.61 13.61
2010 13.61 13.61 13.61 13.61 13.61 13.61 13.61 13.61 13.61 0 0 0

*************************************************************


Thanks in advance.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Actually, the amount per month is 13.61333333333333... so you either need to alternate rounding up and then down across the months to even out the difference across the year or you need to lump the difference into the first or last month - 13.66 - otherwise you'll be short.

Isn't this just two output links with 12 columns apiece funneled back together before your target? Is this always a 15 month ('two year') range or could the date range encompass any number of months/years?
-craig

"You can never have too many knives" -- Logan Nine Fingers
new_ds_man
Participant
Posts: 18
Joined: Tue Jul 08, 2008 1:12 pm

Post by new_ds_man »

Thanks Chulett for replying. About the decimal, you are right.
Going to the question, the date range is dynamic. Any number of years/months. could be 2 months or 36 months/ 3 years.
jgreve
Premium Member
Premium Member
Posts: 107
Joined: Mon Sep 25, 2006 4:25 pm

the "big ugly transformer" technique...

Post by jgreve »

new_ds_man wrote:Thanks Chulett for replying. About the decimal, you are right.
Going to the question, the date range is dynamic. Any number of years/months. could be 2 months or 36 months/ 3 years.
In 8.5 I'd be tempted to look at a transformer loop and spin through the months, then try pivoting by year.

However, for a straight up datastage 7 solution Craig suggested an approach with his question about the size of your date range. If you limit your data to 5 or 10 year spans, that may not be too bad to do with a transformer stage. Then you could have 5 or however many output links, each with a condition to limit how many records you actually create.

But that quickly becomes tedious. Since you're on unix, I'd be tempted to call a shell script and run some perl code or something to actually loop. But for the interesting challenge of a straight up datastage solution... :-)

Suppose you have a transformer stage, and you set up some stage variables like this (take the expressions with a pinch of salt, for example I don't have the date-manipulation ready off the top of my head):

Code: Select all

var name : type    : expression             : description
  sy     : int     : start year             : extract year from start date
  sm     : int     : start month            : extract month from start date
  ey     : int     : end year               : extract from end date
  em     : int     : end month              : extract month from end date
  ny     : int     : 1 + ey - sy            : number of years to span
  nm     : int     : 1+(12-sm) + 12*ny + em : number of months to span
  mv     : decimal : total_value / num_months
Then have 5, or 10, or however many output links you need:
yr1: link condition: ny >= 1
yr2: link condition: ny >= 2
...etc...
yr10: link_condition: ny >= 10
yr_error: link_condition ny >= 11

This gets kind of ugly doing logic for each fields value.
We can skip some of the logic for each field because if the record's
year is out of range, we ignore all of it.

So now consider the question, "When do we want zero for a given month?"
Whenever our field's month is "in range", e.g. between the sm (start_month) and em (end_month).

We can get away with only considering the start_month on year1 of the range, since it just doesn't apply on later years (if any).

Code: Select all

# 
# We do always need to check the sm start_month on link year1.
# The end-month does matter; if we span 2+ years then nothing to limit.
# Adding the link condition ny >= 1 will block edge cases like if your
# data source incorrectly flips start & end dates.

YEAR1: condition: ny >= 1
month_01: if((sm > 1) or (ny=1 and em < 1)) then 0 else mv
month_02: if((sm > 2) or (ny=1 and em < 2)) then 0 else mv
month_03: if((sm > 3) or (ny=1 and em < 3)) then 0 else mv
...
month_09: if((sm > 9) or (ny=1 and em < 9)) then 0 else mv
month_10: if((sm > 10) or (ny=1 and em < 10)) then 0 else mv
month_11: if((sm > 11) or (ny=1 and em < 11)) then 0 else mv
month_12: if((sm > 12) or (ny=1 and em < 12)) then 0 else mv


# for 2nd, 3rd, etc. years, no need to check the start_month.
# We do need to check the end_month + end_year though.
YEAR2: condition: ny >= 2
month_01: if( ny=2 and em < 1 ) then 0 else mv
month_02: if( ny=2 and em < 2 ) then 0 else mv
month_03: if( ny=2 and em < 3 ) then 0 else mv
...
month_09: if( ny=2 and em < 9)  then 0 else mv
month_10: if( ny=2 and em < 10 ) then 0 else mv
month_11: if( ny=2 and em < 11 ) then 0 else mv
month_12: if( ny=2 and em < 12 ) then 0 else mv

YEAR3: condition: ny >= 3
month_01: if( ny=3 and em < 1 ) then 0 else mv
month_02: if( ny=3 and em < 2 ) then 0 else mv
month_03: if( ny=3 and em < 3 ) then 0 else mv
...
month_09: if( ny=3 and em < 9)  then 0 else mv
month_10: if( ny=3 and em < 10 ) then 0 else mv
month_11: if( ny=3 and em < 11 ) then 0 else mv
month_12: if( ny=3 and em < 12 ) then 0 else mv

...enc...

YEAR10: condition: ny >= 10
month_01: if( ny=10 and em < 1 ) then 0 else mv
month_02: if( ny=10 and em < 2 ) then 0 else mv
month_03: if( ny=10 and em < 3 ) then 0 else mv
...
month_09: if( ny=10 and em < 9)  then 0 else mv
month_10: if( ny=10 and em < 10 ) then 0 else mv
month_11: if( ny=10 and em < 11 ) then 0 else mv
month_12: if( ny=10 and em < 12 ) then 0 else mv

YEAR_ERR: condition: ny > 10
month_01: -1
new_ds_man
Participant
Posts: 18
Joined: Tue Jul 08, 2008 1:12 pm

Re: the "big ugly transformer" technique...

Post by new_ds_man »

Great info there jgreve. Thanks for that and i will try to work a solution using the same. Will let u know soon enough.

Thanks Again.
Post Reply