DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
This topic has been marked "Resolved."
Author Message
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Fri May 31, 2013 1:58 am Reply with quote    Back to top    

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Hi,

I have a requirement to survive the most frequent non blank value, incase of tie the latest record based on a date value column should survive.

not able to find a way to pick the latest record value when its a tie. any suggestions ?

Thanks
prasannakumarkk
Participant



Joined: 06 Feb 2013
Posts: 117
Location: Chennai,TN, India
Points: 607

Post Posted: Fri May 31, 2013 3:03 am Reply with quote    Back to top    

In a sort stage, make partitioning and inline sort based on key columns and have the date column partitioned as well as sorted and in the stage page give all the keys except date column. Set key change column to true. Filter the record for which key change column is 1.

_________________
Thanks,
Prasanna
Rate this response:  
Not yet rated
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Fri May 31, 2013 3:19 am Reply with quote    Back to top    

Thanks Mate, but my requirement is to survive the best record based on most frequent non blank values in case if it is a tie then the tie breaker should be the date. I was looking for the survive stage solutions. or should I restore to the age old transformer method using stage variables?
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54479
Location: Sydney, Australia
Points: 295436

Post Posted: Fri May 31, 2013 4:15 pm Reply with quote    Back to top    

If you can carry through a column such as LAST_UPDATE_TIMESTAMP you can use the highest value of that as your tie-breaker.

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Sun Jun 02, 2013 5:38 pm Reply with quote    Back to top    

I have already tried to define expressions like c.timestamp > c.timestamp and also like b.timestamp > b.timestamp but nothing fetches the desired results. the only options available is greater than but not like greatest in version 8.7

An alternate to this could be to sort the data based on the key column and the timestamp column and generate a string based on the previous values and the current values and in survivor stage select the data based on the longest string value.

I was hoping that there could be more clear way to define the greatest value.
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54479
Location: Sydney, Australia
Points: 295436

Post Posted: Sun Jun 02, 2013 8:19 pm Reply with quote    Back to top    

Curious choices. Did you try c.timestamp > b.timestamp?

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Sun Jun 02, 2013 8:35 pm Reply with quote    Back to top    

Yes I did. Its strange the functionality is limited to greater than and longest length of the string and not like highest, lowest or median values
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54479
Location: Sydney, Australia
Points: 295436

Post Posted: Sun Jun 02, 2013 9:01 pm Reply with quote    Back to top    

Not really. Keep in mind what a survive rule is doing; it's specifying a condition under which a field in the currect (c) record will replace that field in the so-far-the-best (b) record. Median cou ...

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Sun Jun 02, 2013 9:08 pm Reply with quote    Back to top    

I think I have got the solution. Run the job in sequential mode and sort the data based on the key and the timestamp column so that that latest record is processed as the last record. whenever there is a tie survive stage seems to pick the value from the last record(can't bet my life on this ) but this is what a quick test has shown.
Rate this response:  
Not yet rated
vmcburney

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup

Joined: 23 Jan 2003
Posts: 3590
Location: Australia, Melbourne
Points: 28110

Post Posted: Sun Jun 02, 2013 10:14 pm Reply with quote    Back to top    

You don't need sequential mode if you partition on the key and sort on the key and timestamp column. The partitioning will keep groups of records on the same node.

_________________
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney LinkedIn
Rate this response:  
Not yet rated
hitmanthesilentassasin
Participant



Joined: 13 Mar 2007
Posts: 150

Points: 1382

Post Posted: Sun Jun 02, 2013 11:38 pm Reply with quote    Back to top    

Agree - the job can retain parallel mode with the appropriate partitioning. I was mentioning the testing steps that yielded the results Very Happy
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours