Tuning Data stage jobs for performance

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
sumesh.abraham
Participant
Posts: 153
Joined: Thu May 11, 2006 1:52 am
Location: Bangalore

Tuning Data stage jobs for performance

Post by sumesh.abraham »

Hi all,

I am analyzaing a server job for performance issues. This is a server job which uses 5 look ups (Uses DRS stage). Each of the stage executes SQL queries that has multiple joins on tables (some of the tables have close to 300,000 records). In a lower environment the job completes in 4 minutes whereas in Production environment it takes close to 4 hours!!. The job runs in both environments at the same time, so not suspecting anything with table contention...etc

I am trying to see the query plan and also would be checking the stale statistics on the tables at the time of run.

I would like to explore at Data stage level, whether setting any environment variables at project / job level can better the performance. Could you please advise.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Not really. Which environment variables are different between development (where I assume you regard performance to be satisfactory) and production?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jcthornton
Premium Member
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

Assuming that your production DB is different than your lower environment(?), have you isolated the problem to DataStage? [Do the queries run in the same amount of time on the DB directly in both places? Can you connect from a different tool using the same method and run the queries faster than DS?, etc.?]

It can also be helpful to isolate the issue. Are all the lookup SQL queries taking 4 hours? Or is it a particular one? If it is just one query that is behaving that much worse, it may prove helpful to build a job with just the worst performing query in it - maybe even without the lookup included.

Last suggestion I have is to change only 1 thing at a time and re-benchmark. Knowing what gives you the best gains in performance may suggest other steps to take or point to non-DS changes that should be tried.

This can all help you identify where in the process the greatest performance gains can be made and will make it easier to track the impact of different changes.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller
Post Reply