dsenv and performance tuning
I'm trying to tune our environment and am not finding much in the docs on how to set things up. There are some descriptions with minimums for the settings, a lot of warnings about not changing them, and so on. Can anyone help with which ones are safe to change and what the values for those really mean?
Also, are there other files I can examine for tuning?
Thanks!
rjhcc
The misconception is that the DS Engine needs special configuration to accelerate your jobs. Most performance problems are related to design choices and lack of understanding of what processing demands are being made.
There's nothing in the dsenv file that will accelerate poorly written SQL or even efficient SQL that has a complicated explain plan. There's no magic configuration parm that will cause 100M rows to flow faster across a network. Likewise, there's no parameter that will make a 100M row hashed file more efficient (hashed file tuning/optimization/etc is a whole other topic).
The reasons to touch the dsenv file are usually connectivity related. This file is used to define database connections and set environment variables. Once you move into the PX world, yes, there are environmental settings that "tune" performance but that's in different files than dsenv.
You may be confusing dsenv with uvconfig. The uvconfig file has a handful of parameters that do need to be upsized for specific reasons, such as the number of dynamic files that can be open at any moment or the default hashed file bit size (32 or 64). Those can affect performance or operability of the job processes.
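As a rough illustration of inspecting those uvconfig values, here is a minimal Python sketch that parses uvconfig-style text and reports two settings. The parameter names T30FILE (dynamic files open at once) and 64BIT_FILES (default hashed file bit size), the `NAME value` line layout with `#` comments, and the sample values are all assumptions on my part rather than anything stated in this thread, so check them against your own uvconfig before relying on them.

```python
def parse_uvconfig(text):
    """Parse 'NAME value' lines, skipping blank lines and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            params[parts[0]] = parts[1].strip()
    return params


# Illustrative content only; names and values are assumptions, not defaults
# quoted from any manual.
SAMPLE = """\
# sample uvconfig-style fragment
T30FILE 200
64BIT_FILES 0
"""

params = parse_uvconfig(SAMPLE)
for name in ("T30FILE", "64BIT_FILES"):
    print(name, "=", params.get(name, "<not set>"))
```

On a real server you would pass the contents of the actual uvconfig file instead of `SAMPLE`. The sketch only reads values; it does not change anything.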
But, for the most part, you need to focus pretty much all of your attention on job design. You need to understand performance and resource monitoring in your environment. If you're on Solaris, learn what prstat shows you. For AIX, check out topas, and for HP see glance. On anything else, get top downloaded onto your server.
You need to see what your job is doing so that you can pinpoint whether you're cpu constrained. You may think a job is slow, but when you check its cpu usage you see that it's at 100%. This implies it can't go any faster. If your job is not using cpu, you need to wonder why. Read the history here: OCI lookups suck on high volume batch processing, and inserts and updates need to be separated on high volume processing.
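The cpu-constrained versus waiting distinction can be sketched in a few lines of Python. This is only an analogy: it measures the Python process itself, not a DataStage job, and the `waiting` and `crunching` workloads are made-up stand-ins for a job blocked on a database or network and a job doing pure computation. For real jobs you would use prstat, topas, glance or top as described above.

```python
import time


def cpu_fraction(work):
    """Run work() and return (cpu_seconds, wall_seconds, cpu/wall ratio)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of this process; sleep is excluded
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall, (cpu / wall if wall > 0 else 0.0)


def waiting():
    # Hypothetical stand-in for a job waiting on a lookup or the network.
    time.sleep(0.2)


def crunching():
    # Hypothetical stand-in for a job that is purely cpu bound.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


for name, work in (("waiting", waiting), ("crunching", crunching)):
    cpu, wall, ratio = cpu_fraction(work)
    print(f"{name}: cpu={cpu:.3f}s wall={wall:.3f}s ratio={ratio:.0%}")
```

A low ratio is the cue to go looking for what the process is waiting on; a ratio near 100% suggests the work itself is the limit.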
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Certainly, DataStage doesn't behave the same everywhere with the same settings. You need to optimize its performance to suit your environment's requirements and what is acceptable there.
Do you have routine housekeeping or a regular maintenance cycle set up for the DataStage server? I've seen environments that gained a lot of performance once a housekeeping cycle was put in place.
- James
Maintenance tasks, such as periodically cleaning out each project's &PH& directory, setting sensible job log purging intervals, etc., all impact job performance. In addition to processing its data, a job also has to write log entries and the like, so a huge job log file can affect runtime. I didn't mention this as configuration tuning because it's not really "engine" configuration. Likewise, deciding whether to run 100 jobs simultaneously is not configuration tuning; it's more a matter of sensible choices.
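As a sketch of what a housekeeping pass over a directory such as &PH& might look like, the Python below lists regular files that have not been modified for a given number of days. It is a dry run that only reports candidates and deletes nothing. The seven-day threshold, the file names and the temporary demo directory are hypothetical; a real cleanup would point at the project's own directory and should only run when no jobs are active.

```python
import os
import tempfile
import time


def find_stale_files(directory, max_age_days, now=None):
    """Return sorted paths of regular files not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue  # skip subdirectories and symlinks
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)


# Demo against a throwaway directory so the sketch is safe to run anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    old = os.path.join(demo_dir, "old_output")        # hypothetical file name
    recent = os.path.join(demo_dir, "recent_output")  # hypothetical file name
    for path in (old, recent):
        open(path, "w").close()
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old, (ten_days_ago, ten_days_ago))       # backdate one file
    for path in find_stale_files(demo_dir, max_age_days=7):
        print("would purge:", path)
```

Listing first and deleting as a separate, deliberate step keeps a mistaken path or threshold from doing damage.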
Kenneth Bland
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
The ability to capture execution statistics for active stages should be your first port of call, noting what Ken (kbland) had to say. By this means you can identify the "hot spots" where, say, a Transformer stage is spending most of its time, and address those areas first.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Came back today after the weekend and was thankful to see all of your input.
I have to say that all of these prove to be true. I wondered about the claim of dsenv not being a place to tune and was glad to see the reply stating that it is. It contains disk cache, memory, and file handling parameters among others.
I would appreciate any other posts or suggestions!
rjhcc
rjhcc wrote: I wondered about the claim of dsenv not being a place to tune and was glad to see the reply stating that it is. It contains disk cache, memory, and file handling parameters among others.
No, it doesn't. As noted earlier, you are confusing dsenv with uvconfig.
-craig
"You can never have too many knives" -- Logan Nine Fingers
shin0066 wrote: Thanks for sharing very good information, gurus.
I wonder - is there a way to get a job's statistics in terms of CPU consumption?
By having those we can work on those jobs to reduce the CPU time to complete the job.
Thanks,
Job Monitor would also give you the %CPU utilisation of that job.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
If you're tuning, you need to be aware of everything running on your DS server. Looking at cpu utilization from DS Monitor is deceiving - it only shows you what the job used. It doesn't tell you why it didn't use 100%. If a job is all file based processing, there's no network or database delays. A job will run until it runs into disk delays swapping/paging/reading/writing or cpu wait time.
Monitor is like a speedometer in your car. It tells you how fast you are going, it doesn't tell you if you're behind a slow car, stuck in traffic, going around tight corners, climbing a hill, or pulling a trailer. It doesn't tell you the circumstances dictating the speed. Tuning is identifying the circumstances and making adjustments.
Kenneth Bland
%CPU as reported by the monitor is usually wrong, for the same reasons that rows/sec is usually wrong. The clock is running even when rows are not being processed, and the time is rounded to whole seconds before the division has been done - %CPU is calculated as (CPU seconds) / (clock seconds) * 100. So it's approximate at best. The CPU figures obtained with stage tracing are in microseconds if the platform supports them; in milliseconds otherwise, and are reported raw (not as a percentage).
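To see how much rounding to whole seconds can distort the figure, here is a small Python sketch of the formula quoted above, (CPU seconds) / (clock seconds) * 100, computed with and without rounding first. Whether the monitor rounds to nearest or truncates is not stated in this thread; the use of `round()` and the sample timings are assumptions made for illustration.

```python
def exact_cpu_percent(cpu_seconds, clock_seconds):
    """%CPU computed from unrounded timings."""
    return cpu_seconds / clock_seconds * 100


def rounded_cpu_percent(cpu_seconds, clock_seconds):
    """%CPU after rounding both timings to whole seconds (assumed behaviour)."""
    cpu = round(cpu_seconds)
    clock = round(clock_seconds)
    if clock == 0:
        return None  # nothing sensible to report for a sub-second run
    return cpu / clock * 100


# Made-up timings: two short runs and one longer run.
for cpu, clock in ((1.4, 2.6), (0.4, 1.2), (45.3, 60.2)):
    exact = exact_cpu_percent(cpu, clock)
    rounded = rounded_cpu_percent(cpu, clock)
    print(f"cpu={cpu}s clock={clock}s exact={exact:.1f}% rounded={rounded:.1f}%")
```

The short runs are distorted the most, because a fraction of a second is a large share of the total; the longer run comes out nearly the same either way.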
An exercise in miscommunication...
Again I return to be thankful for the suggestions!
However, I must apologize, and agree with the postings in regard to dsenv and uvconfig. Thanks for your patience with that one. I repeatedly and incorrectly called it dsenv.
Respectfully,
rjhcc
The manual Administering UniVerse contains the best coverage.