Performance Issue

Post questions here related to DataStage Server Edition, for areas such as Server job design, DS BASIC, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

yaminids
Premium Member
Posts: 387
Joined: Mon Oct 18, 2004 1:04 pm

Performance Issue

Post by yaminids »

Hello there,

I designed a job which has about 20 stages in it. From a performance standpoint, would it be a good idea to design a job with so many stages, especially when I can break it into smaller jobs?

Thanks in advance.
-Yamini
DSkkk
Charter Member
Posts: 70
Joined: Fri Nov 05, 2004 1:10 pm

Post by DSkkk »

It is generally better to break the job into smaller jobs and call them from a Sequence. A Sequence can have savepoints (checkpoints), so when one job aborts you don't have to restart the whole thing from the beginning.
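If your release's Sequence doesn't give you checkpoints, you can get much the same effect from a small job control routine. A rough sketch only - the job names are placeholders, and the "savepoint" here is simply stopping at the first failure so that, on restart, you pick up from the failed job instead of re-running the whole chain:

Code:

* Job control sketch: run jobs one after another, stop at the first failure.
* ExtractJob and LoadJob are placeholder names - substitute your own jobs.
* The $INCLUDE brings in the DSJ.* / DSJS.* constants used below.
$INCLUDE DSINCLUDE JOBCONTROL.H

JobList = "ExtractJob" : @FM : "LoadJob"

For i = 1 To Dcount(JobList, @FM)
   JobName = JobList<i>
   hJob = DSAttachJob(JobName, DSJ.ERRFATAL)
   ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
   ErrCode = DSWaitForJob(hJob)
   Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
   ErrCode = DSDetachJob(hJob)

   If Status <> DSJS.RUNOK And Status <> DSJS.RUNWARN Then
      Call DSLogWarn(JobName : " aborted - fix it and restart from this job", "JobControl")
      Exit   ;* leave the loop; jobs that already finished are not re-run
   End
Next i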
g.kiran
kiran_kom
Participant
Posts: 29
Joined: Mon Jan 12, 2004 10:51 pm

Re: Performance Issue

Post by kiran_kom »

yaminids wrote: Hello there,

I designed a job which has about 20 stages in it. From a performance standpoint, would it be a good idea to design a job with so many stages, especially when I can break it into smaller jobs?

Thanks in advance.
-Yamini
Depends......
Assuming you have enough CPU, and all your intermediate stages are active stages or lookups, you would probably be better off leaving the job in one piece. Turn inter-process row buffering on.

I personally don't like breaking a job into a sequence of smaller ones because you need to dump data to a file or table for the next job in the sequence. You lose a lot of performance to these 'data landings'.

I actually prefer having more stages in a job rather than fewer. My servers usually have 4 or 8 CPUs, so if I distribute a series of transformations across two Transformer stages, I'm better off because I'm using two CPUs, as opposed to the one CPU I'd be using if I did everything in a single Transformer.

Just my 2 cents....
andru
Participant
Posts: 21
Joined: Tue Mar 02, 2004 12:25 am
Location: Chennai

Post by andru »

We work on a 16-CPU box with IPC on. We had a job with more than 20 stages which was performing very poorly. We split it into 4 jobs with intermediate files, and performance improved considerably. So in my experience, splitting the job into chunks should improve your performance.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Re: Performance Issue

Post by T42 »

kiran_kom wrote: I personally don't like breaking a job into a sequence of smaller ones because you need to dump data to a file or table for the next job in the sequence. You lose a lot of performance to these 'data landings'.
Be careful there. While we're talking about Server here, data landings are fairly common, especially with very large sets of data (larger than what you can hold in memory). With EE, data landing is clearly defined (it uses scratch space), and high-performance disk drives make EE jobs run much faster with large datasets. With Server, it's a matter of determining how much memory you are using. If you are using a lot of memory and will be hitting swap space often, you might as well land the data for further use.
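One way to tell is to snapshot memory/swap statistics into the job log from a before/after-job subroutine while the job runs. A rough sketch only, assuming a UNIX host with vmstat available - swap in sar, top, or whatever your platform provides:

Code:

* Before/after-job subroutine sketch: dump a memory/swap snapshot into the job log.
* Assumes a UNIX host with vmstat on the path - adjust the command for your platform.
Subroutine LogMemoryStats(InputArg, ErrorCode)
   ErrorCode = 0                                   ;* tell DataStage we succeeded
   Call DSExecute("UNIX", "vmstat 1 3", Output, SysRtn)
   If SysRtn = 0 Then
      Call DSLogInfo("Memory snapshot:" : @FM : Output, "LogMemoryStats")
   End Else
      Call DSLogWarn("vmstat returned " : SysRtn, "LogMemoryStats")
   End
Return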
yaminids
Premium Member
Posts: 387
Joined: Mon Oct 18, 2004 1:04 pm

Re: Performance Issue

Post by yaminids »

Hello all,
Thank you very much for your suggestions.
-Yamini