DSXchange: DataStage and IBM Websphere Data Integration Forum
Seya
Participant



Joined: 29 Mar 2007
Posts: 27

Points: 271

Posted: Wed Dec 04, 2019 9:38 am

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hi,
I have a job designed as below. When it runs, DataStage runs out of memory.
The Transformer stage emits many records because of the 40 constraints defined on it.
DataStage appears to hold all of the data in memory before the Aggregator stage processes it.
Can you please share your thoughts on how to resolve this out-of-memory issue?

Dataset --> (Left join to a table) --> Transformer stage (about 40 filter conditions) --> Funnel stage --> Aggregator stage --> Modify stage --> ODBC connector

Thanks in Advance!
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 43053
Location: Denver, CO
Points: 222300

Posted: Wed Dec 04, 2019 2:44 pm

I'm sure it is the Aggregator that is holding everything in memory so that it can sort and group all of your data properly. The only way to solve that is to sort the data before it gets to the Aggregator in a manner that supports the aggregation, and then tell the Aggregator that the "input is sorted". Then it only needs to hold on to a single "group" at a time.
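To make the "single group at a time" point concrete, here is a rough sketch in Python rather than DataStage. It is only an illustration of the general idea, not how the Aggregator stage is implemented; the function names and the sample rows are made up for the example.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Sum amounts per key with no assumption about input order.

    Every distinct key stays in memory until the input is exhausted.
    """
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def sorted_aggregate(rows):
    """Sum amounts per key, assuming rows arrive sorted by key.

    Only the current group is held; its total is emitted as soon as
    the key changes.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)


# Hypothetical sample data: (grouping key, amount).
rows = [("a", 1), ("b", 5), ("a", 2), ("c", 7), ("b", 1)]
presorted = sorted(rows, key=itemgetter(0))  # stands in for an upstream sort

hash_result = hash_aggregate(rows)
sorted_result = dict(sorted_aggregate(presorted))
```

Both functions give the same totals; the difference is how much state each one has to keep while the rows are flowing through.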

_________________
-craig

"May the bridges I burn light my way forward"
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54551
Location: Sydney, Australia
Points: 295806

Posted: Wed Dec 04, 2019 5:38 pm

What Craig said. Specify Sort mode in the Aggregator stage and ensure that your data are sorted by the grouping keys, as well as appropriately partitioned.
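For anyone wondering why the partitioning matters as well as the sort, a minimal Python sketch follows. It assumes a simple hash-on-key scheme; the helper name, the use of CRC32 and the sample rows are all assumptions for illustration, not DataStage internals.

```python
from zlib import crc32


def hash_partition(rows, key_index, num_partitions):
    """Route each row to a partition chosen from a hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key_bytes = str(row[key_index]).encode("utf-8")
        partitions[crc32(key_bytes) % num_partitions].append(row)
    return partitions


# Hypothetical sample data: (grouping key, amount).
rows = [("a", 1), ("b", 5), ("a", 2), ("c", 7), ("b", 1)]
partitions = hash_partition(rows, key_index=0, num_partitions=2)

# Sort inside each partition so a sorted-input aggregation can run per partition.
partitions = [sorted(p, key=lambda row: row[0]) for p in partitions]

# Record which partitions each key landed in; every key should map to exactly one.
key_locations = {}
for index, partition in enumerate(partitions):
    for key, _ in partition:
        key_locations.setdefault(key, set()).add(index)
```

Because rows with the same key always hash to the same partition, each partition can aggregate its own groups without needing rows from any other partition.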

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Seya
Posted: Thu Dec 05, 2019 6:31 am

Thanks Craig and Ray for your reply!

I already have the Sort method set in the Aggregator stage and Hash partitioning defined on the key columns.

Just an update on the number of records going into the Transformer and Aggregator stages:
(approx. 2M records) --> Transformer --> (64M records) --> Aggregator

Is there any other way to resolve this issue?
chulett

Posted: Thu Dec 05, 2019 7:42 am

No, not really. Somehow, either before or during this, you need to sort your data. And this is not so much an 'issue' as a matter of understanding How It Works.

Even if you sort the data before it and then tell the Aggregator to sort it the same way, it will sort it again. I couldn't tell from your reply exactly what you meant, and I haven't had my hands on DS for years so I can't give you the exact setting, but make sure the Aggregator knows your data is already sorted so it skips that step. And trust me, instead of re-sorting, it will now bust you if you get that wrong, i.e. sort it in a manner that does not support the aggregation being done... so get it right. ;)

Either add a Sort between the Transformer and Aggregator, or make sure your input arrives properly sorted by (if possible) dumping it out already sorted when you build your source data.

ray.wurlod

Posted: Mon Dec 09, 2019 7:43 pm

The Sort method in the Aggregator stage is telling the stage that the data are already sorted. It does NOT sort the data. If the data are not properly sorted (by the grouping keys, in order) then th ...

UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 380

Points: 3906

Posted: Mon Dec 30, 2019 11:56 am

Are you doing something with that Aggregator stage that could be hand-rolled in a Transformer instead? That might resolve it. You may still want to sort the data.

Also, before you go from 2M to 64M, is there something you are doing there that is being undone later? Is the part that blows it up to 64M overdoing it, and the Aggregator stage undoing part of that? Maybe the whole process can be collapsed?

Dunno without details, just throwing around some things to think about.
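As a rough illustration of what "collapsing" could look like, here is a Python sketch. Everything in it is hypothetical: the constraints, the column names and the assumption that the aggregation is a plain sum per constraint are invented, since the real job details were not posted. It also assumes the number of output groups is small enough to keep the running totals in memory.

```python
from collections import Counter

# Hypothetical constraints: each maps a category name to a predicate on a row.
CONSTRAINTS = {
    "large": lambda row: row["amount"] >= 100,
    "even": lambda row: row["amount"] % 2 == 0,
    "east": lambda row: row["region"] == "east",
}


def expand(rows):
    """Emit one output row per constraint a source row satisfies."""
    for row in rows:
        for category, predicate in CONSTRAINTS.items():
            if predicate(row):
                yield category, row["amount"]


def expand_then_aggregate(rows):
    """Materialise the expanded rows, then total them per category."""
    expanded = list(expand(rows))  # this list is the row-count blow-up
    totals = Counter()
    for category, amount in expanded:
        totals[category] += amount
    return dict(totals), len(expanded)


def collapsed(rows):
    """Update the per-category totals while reading each source row."""
    totals = Counter()
    for row in rows:
        for category, predicate in CONSTRAINTS.items():
            if predicate(row):
                totals[category] += row["amount"]
    return dict(totals)


# Hypothetical sample data.
rows = [
    {"region": "east", "amount": 120},
    {"region": "west", "amount": 7},
    {"region": "east", "amount": 40},
]
two_step_totals, expanded_count = expand_then_aggregate(rows)
one_step_totals = collapsed(rows)
```

The two approaches give the same totals, but the collapsed version never builds the expanded row set. Whether something like this applies depends entirely on what the 40 constraints and the aggregation actually do.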
All times are GMT - 6 Hours