DS 9.1 - poor XML stage performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

mczelej
Participant
Posts: 2
Joined: Fri Apr 04, 2014 3:27 am

DS 9.1 - poor XML stage performance

Post by mczelej »

Hello,
in our company we recently found a problem with very (I mean very) poor performance in DataStage 9.1. We use XML Pack 3.0 (only one XML stage) to simply transform 5 input columns (from Row Generators) into an XML file. In the XML assembly I use minimal schema validation with mapping of 5 required elements (the schema consists of about 50 attributes).
With the "Enable logging" option enabled I got a throughput of about 100 rows per second. With the same settings but with the option disabled, I reached about 3000 rows per second.
Why does the performance of the XML stage depend so strongly on logging? Every row is schema-correct and no warnings are logged. I have tried different log level settings with no result.
I would like to add that CPU, memory and disks are OK. I mean there is some load, but the machine is not running at 100%. The CPU is utilized the most, but there is still some headroom.
This schema is cut down, because the original one consists of about 5000 attributes and the performance is so bad that the debugging process is impossible (with logging enabled, of course).

Do you have any advice/info on why this option is so critical? We are supposed to have logging enabled in the production environment, so we cannot simply disable it.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard.

Are you using "OR logging"? This is known to cause severe performance degradation. Prefer local (non-OR) logging.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

...Or are you talking about the logging inside the Stage itself? That should only be used for debugging... it is VERY voluminous. What are you trying to log? By default, the stage does just fine with the normal DataStage log in the Director... and 3000 rows/second sounds great!
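
To give a rough feel for why per-row logging is so expensive, here is a hypothetical Python analogy (not DataStage or XML Pack code; every name in it is made up) that does the same tiny per-row work with and without one log write per row:

import logging
import time

# Hypothetical analogy only -- one log event per processed row.
logging.basicConfig(filename="xml_stage_analogy.log", level=logging.WARNING)
log = logging.getLogger("xml_stage_analogy")

ROWS = 100_000

def compose_row(row_id, log_each_row):
    # Stand-in for the real work: mapping a handful of columns into XML text.
    xml = "<row id='%d'><a/><b/><c/><d/><e/></row>" % row_id
    if log_each_row:
        # Mimics an "Enable logging" style option: one log write per row.
        log.warning("row %d composed: %s", row_id, xml)
    return xml

for enabled in (False, True):
    start = time.perf_counter()
    for i in range(ROWS):
        compose_row(i, enabled)
    elapsed = time.perf_counter() - start
    print("per-row logging %s: about %.0f rows/second"
          % ("on" if enabled else "off", ROWS / elapsed))

In this toy case it is the log write, not the XML work, that limits throughput; the stage-level logging writes far more detail per row, so the effect there is even larger.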

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
mczelej
Participant
Posts: 2
Joined: Fri Apr 04, 2014 3:27 am

Post by mczelej »

Thanks for your replies.

In the post above I mean the logging inside the XML stage, below the "Limit Output Rows" option.
I tested the behavior of this option (Enable logging): if I disable it and any record does not validate against the schema, no message is displayed (the record is quietly dropped).
The only option I found is to set all validations to FATAL in the Assembly properties of the XML_composer stage, but this is not the solution I prefer (I would like to know that, say, 3 records were rejected, but have the job complete successfully).
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Ah. Ok. That's a nice "convenience" feature that is useful during development. Don't use that for production. There are far better and more performant ways to limit rows on DataStage links and send them anywhere, from the job log (Peek) to a file, dataset, etc.
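
For the "some records rejected, but the job still completes" requirement, the pattern amounts to something like this rough standalone Python/lxml sketch (not DataStage or XML Pack code; the schema, data and file names are invented): validate each record, write failures to a reject target, count them, and finish normally.

from lxml import etree

# Invented minimal schema: a <row> with two required attributes.
XSD = b"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="row">
    <xs:complexType>
      <xs:attribute name="id" type="xs:int" use="required"/>
      <xs:attribute name="amount" type="xs:decimal" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""
schema = etree.XMLSchema(etree.fromstring(XSD))

records = [
    b'<row id="1" amount="10.50"/>',
    b'<row id="2" amount="oops"/>',   # invalid: amount is not a decimal
    b'<row id="3" amount="7.25"/>',
]

rejected = 0
with open("rejects.xml", "wb") as reject_file:
    for rec in records:
        doc = etree.fromstring(rec)
        if schema.validate(doc):
            pass  # good record: would go downstream into the composed XML
        else:
            rejected += 1
            reject_file.write(rec + b"\n")  # keep the bad record for later inspection

print("finished OK, %d record(s) rejected" % rejected)

In DataStage terms, a reject link feeding a Sequential File, Data Set or Peek plays the role of rejects.xml here, and the row count on that link tells you how many records were rejected without failing the job.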

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The XML stage documentation does warn about logging affecting performance.
Enable logging
Enter Yes to enable logging and set the log level to Warning. By default, logging is not enabled. Enabling logging affects performance.
Choose a job you love, and you will never have to work a day in your life. - Confucius