DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic has been marked "Resolved."
Author Message
sensiva



Group memberships:
Premium Members

Joined: 22 Aug 2017
Posts: 17

Points: 323

Posted: Mon Aug 26, 2019 9:55 am

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hello,

I would like your views on using multiple Sort stages purely to create key change columns, not to actually sort the data.

Here is the scenario:
Code:
Sort Stage 1 - Sort Key Mode - Sort (all columns)
Sorting columns A, B, C, D, E

Sort Stage 2 - Sort Key Mode - Don't Sort (Previously Sorted)
Key columns A, B, C
Create key change column 1

Sort Stage 3 - Sort Key Mode - Don't Sort (Previously Sorted)
Key column A
Create key change column 2
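
To make the scenario concrete, here is a minimal Python sketch of what the two key change columns would contain. It is not DataStage code: the helper `add_key_change` and the sample rows are hypothetical, and it assumes the flag is 1 on the first row of each key group and 0 otherwise, with the input already sorted on A through E.

```python
from typing import Dict, Iterable, List, Sequence

Row = Dict[str, object]


def add_key_change(rows: Iterable[Row], keys: Sequence[str], flag: str) -> List[Row]:
    """Return copies of already-sorted rows with a 0/1 flag column added.

    The flag is 1 on the first row of each group of `keys`, 0 otherwise.
    Nothing is sorted here; the incoming order is trusted.
    """
    out: List[Row] = []
    previous = None  # key values of the preceding row
    for row in rows:
        current = tuple(row[k] for k in keys)
        out.append({**row, flag: 1 if current != previous else 0})
        previous = current
    return out


# Hypothetical sample data, already sorted on A, B, C, D, E.
rows = [
    {"A": "FR", "B": "IDF", "C": "7", "D": "p1", "E": "x"},
    {"A": "US", "B": "CO", "C": "1", "D": "p1", "E": "x"},
    {"A": "US", "B": "CO", "C": "1", "D": "p2", "E": "x"},
    {"A": "US", "B": "CO", "C": "2", "D": "p1", "E": "x"},
    {"A": "US", "B": "FL", "C": "3", "D": "p1", "E": "x"},
]

# Two passes over the sorted rows, mirroring Sort Stage 2 and Sort Stage 3.
staged = add_key_change(rows, ["A", "B", "C"], "keyChange1")
staged = add_key_change(staged, ["A"], "keyChange2")

flags = [(r["keyChange1"], r["keyChange2"]) for r in staged]
print(flags)
```

Each pass only compares a row with the one before it, which is why a pass over pre-sorted data needs to hold very little in memory.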


I read in the Knowledge Center that "Don't Sort" does not use much memory, but I am still hesitant to use multiple Sort stages on input data that will probably contain around 3 million records. Is it advisable to use a Sort stage just to create key change columns? Otherwise I would have to do it in a Transformer by comparing each row with the previous one.

Any pointers would be of great help.

Thanks
Sen

_________________
sen
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 43026
Location: Denver, CO
Points: 222086

Posted: Mon Aug 26, 2019 12:17 pm

Sorry, it's been awhile but can't you sort and create the key change column at the same time? Meaning two rather than three Sort stages. And if the sort handles the key change, I'm not sure there's a need to have a Transformer do it post-sort, unless there are rules that would need stage variables to handle properly, seeing as how the data needs to be sorted regardless.

Regardless, I don't think you need to be too concerned about the performance impact of "Don't Sort" stages but curious what others think. And 3M isn't really a large amount to sort IMHO unless your infrastructure is not up to the task.

_________________
-craig

Peaches come from a can, they were put there by a man
If I had my little way, I'd eat peaches every day
sensiva




Posted: Tue Aug 27, 2019 3:43 am

Thanks for your reply

chulett wrote:
Sorry, it's been awhile but can't you sort and create the key change column at the same time? Meaning two rather than three Sort stages.


Yes, the Sort stage does create a keyChange column while sorting, but I don't want a keyChange based on all 5 keys (A, B, C, D, E). I want one keyChange on keys (A, B, C) and another on just A.

Code:
 For example: A = COUNTRY, B = STATE, C = ORDER, D = PRODUCTS, E = xxxx

I need to sort on all of these keys to process the data, and then I need one key change down to ORDER and another just for COUNTRY, so that I can route and process the rows differently.


Quote:
I don't think you need to be too concerned about the performance impact of "Don't Sort" stages but curious what others think. And 3M isn't really a large amount to sort IMHO unless your infrastructure is not up to the task.


Our infrastructure is well built, and I could still ask for more CPU if need be. But I would really like my design to be sound, so that I can make my case when I ask for it.

I will go ahead and implement it with 3 Sort stages, 2 of them needed just for keyChange.

And definitely, as said, it would be great to have others' views as well.

Thanks
Sen

_________________
sen
Mike



Group memberships:
Premium Members

Joined: 03 Mar 2002
Posts: 1021
Location: Tampa, FL
Points: 6600

Posted: Mon Sep 02, 2019 10:50 pm

I think I would go with 1 sort stage.

Partition by A
Sort by A,B,C,D,E
LastRowInGroup(C) transformer function will give you key breaks on A,B,C
LastRowInGroup(A) transformer function will give you key break on A
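
The idea behind those two calls can be sketched in Python. This is only a model of the behaviour described here, not the DataStage function itself: `last_row_flags` is a hypothetical helper, the column names come from the COUNTRY/STATE/ORDER example earlier in the thread, and it assumes that a break on a lower-order key also fires whenever a higher-order key changes.

```python
from typing import Dict, List, Sequence


def last_row_flags(rows: List[Dict[str, str]], keys: Sequence[str]) -> List[bool]:
    """Return True for every row that is the last one of its group on `keys`.

    Rows must already be sorted on `keys`. Each row is compared with the
    row that follows it, so one row of look-ahead is all that is needed.
    """
    flags: List[bool] = []
    for i, row in enumerate(rows):
        current = tuple(row[k] for k in keys)
        if i + 1 < len(rows):
            following = tuple(rows[i + 1][k] for k in keys)
        else:
            following = None  # end of data always closes the group
        flags.append(current != following)
    return flags


# Hypothetical sample data, sorted on COUNTRY, STATE, ORDER.
orders = [
    {"COUNTRY": "FR", "STATE": "IDF", "ORDER": "7"},
    {"COUNTRY": "US", "STATE": "CO", "ORDER": "1"},
    {"COUNTRY": "US", "STATE": "CO", "ORDER": "1"},
    {"COUNTRY": "US", "STATE": "CO", "ORDER": "2"},
    {"COUNTRY": "US", "STATE": "FL", "ORDER": "3"},
]

# Analogue of LastRowInGroup(C): break on the A,B,C prefix.
last_in_order = last_row_flags(orders, ["COUNTRY", "STATE", "ORDER"])
# Analogue of LastRowInGroup(A): break on A alone.
last_in_country = last_row_flags(orders, ["COUNTRY"])

print(last_in_order)
print(last_in_country)
```

Note the difference from a key change column: these flags mark the last row of each group rather than the first.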

Partition by A since you likely want all of the A rows passing through the same processing node.
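
A rough illustration of why partitioning on A keeps groups intact, again as a Python sketch rather than anything DataStage-specific. The helper `hash_partition`, the use of `zlib.crc32` as the hash, the node count of 4, and the sample rows are all assumptions made for the example.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Set


def hash_partition(
    rows: List[Dict[str, str]], key: str, nodes: int
) -> Dict[int, List[Dict[str, str]]]:
    """Assign each row to a node using a stable hash of row[key]."""
    parts: Dict[int, List[Dict[str, str]]] = defaultdict(list)
    for row in rows:
        node = zlib.crc32(row[key].encode("utf-8")) % nodes
        parts[node].append(row)
    return dict(parts)


# Hypothetical unsorted input.
sample = [
    {"A": "US", "B": "FL", "C": "3"},
    {"A": "FR", "B": "IDF", "C": "7"},
    {"A": "US", "B": "CO", "C": "1"},
    {"A": "DE", "B": "BY", "C": "4"},
    {"A": "US", "B": "CO", "C": "2"},
]

partitions = hash_partition(sample, key="A", nodes=4)

# Sort inside each partition, as a per-node sort would.
for part in partitions.values():
    part.sort(key=lambda r: (r["A"], r["B"], r["C"]))

# Record which nodes hold each value of A.
nodes_per_a: Dict[str, Set[int]] = defaultdict(set)
for node, part in partitions.items():
    for row in part:
        nodes_per_a[row["A"]].add(node)

# Every value of A lands on exactly one node, so no group is split.
print({a: sorted(n) for a, n in nodes_per_a.items()})
```

Because equal values of A always hash to the same node, any first-row or last-row check on A (or on a key prefix that starts with A) sees the whole group on one node.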

That's all from memory as I haven't used the LastRowInGroup() function for some time now.

Mike
sensiva




Posted: Mon Sep 23, 2019 5:43 am

Thanks Mike, your solution worked great, and just one Sort stage was enough.

_________________
sen
All times are GMT - 6 Hours