Routine maintenance

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Routine maintenance

Post by thompsonp »

Apart from purging job logs as per project- or job-level settings, what maintenance activities are recommended to ensure DataStage doesn't slow down, either in job performance or in responsiveness when opening, editing, or compiling jobs?

I'd expect some routine tasks would be beneficial on the xmeta database and perhaps the installation file system, but I've never done anything apart from purging job logs. Nor can I find any guidance in the documentation, apart from a technote about xmeta logging writing logs to the xmeta repository.

Does a DataStage installation eventually reach a tipping point where, if nothing is done, performance or responsiveness suffers due to the size of the projects, the size of the xmeta database, or fragmentation of the corresponding file systems?
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

The &PH& directory will fill up in each project path. Too many files in that path can affect performance, like when 50K files pile up in one directory.


Scratch disk can fill up with orphaned files. If the scratch disk was never moved from the original default path (under the $DSHOME mount), you might fill up that mount and eventually tank your DataStage setup.
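
For both the &PH& directories and the scratch area, a cron'd sweep of stale files keeps this from creeping up on you. Here's a minimal sketch in Python; the paths and the 7-day cutoff are placeholders for your own environment, so run it dry first:

```python
#!/usr/bin/env python3
"""Prune stale files from &PH& and scratch directories (dry run by default)."""
import os
import time

# Placeholder locations -- substitute your real project &PH& and scratch paths.
TARGET_DIRS = [
    "/opt/IBM/InformationServer/Server/Projects/MYPROJ/&PH&",
    "/data/datastage/scratch",
]
MAX_AGE_DAYS = 7   # arbitrary cutoff; tune to your retention needs
DRY_RUN = True     # flip to False only after reviewing the output

cutoff = time.time() - MAX_AGE_DAYS * 86400

for target in TARGET_DIRS:
    if not os.path.isdir(target):
        continue
    for name in os.listdir(target):
        path = os.path.join(target, name)
        # Plain files only; leave subdirectories and symlinks alone.
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        if os.path.getmtime(path) < cutoff:
            print(("would remove: " if DRY_RUN else "removing: ") + path)
            if not DRY_RUN:
                os.remove(path)
```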

Version 8 had issues with logging. If you used xmeta logging, then after about a week of use... you felt the pain. Converting (back) to UniVerse logging solved that speed issue.

OSH processes can be orphaned as well, and those PIDs need to be terminated.

Random junk that users run on the host can also chew up memory/CPU and may need to be killed.


Lots of this stuff is not mentioned in the books and only comes from experience of how your environment lives and breathes.


Example:

I've seen "ls --color" commands orphaned and linger forever on a host. 50 of those pids months old...

I've seen countless orphaned OSH pids because of weird aborts/crashes.

GIGs of scratch disk files never cleaned up by aborted jobs.

orphaned DSAPI slave processes...

the list goes on.
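
A quick way to spot that kind of leftover is to look for known process names that have been reparented to init (PPID 1) and have been running for days. A heuristic sketch in Python; the watch list and the one-day threshold are assumptions based on the examples above, so review the report before killing anything:

```python
#!/usr/bin/env python3
"""Report long-lived processes reparented to init (PPID 1).

Heuristic only: PPID 1 plus days of elapsed time is a hint, not proof,
that a process was abandoned by a crashed job or session.
"""
import subprocess

WATCHLIST = {"osh", "dsapi_slave", "ls"}  # names seen lingering above

# ps columns: PID, PPID, elapsed time, command name.
out = subprocess.run(
    ["ps", "-eo", "pid,ppid,etime,comm"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines()[1:]:
    pid, ppid, etime, comm = line.split(None, 3)
    # etime shows "dd-hh:mm:ss" once a process is a day old or more.
    if comm in WATCHLIST and ppid == "1" and "-" in etime:
        print(f"suspect orphan: pid={pid} cmd={comm} elapsed={etime}")
```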


A good admin is proactive and cleans the environment so that the users never see the issues to begin with.
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Everything on PaulVL's list is important.

I would also add space monitoring.

Never let your $TMPDIR location fill up.
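
A cron'd check with Python's shutil.disk_usage covers this; the mount points and the 85% threshold below are placeholders for your own $TMPDIR, scratch, and install file systems:

```python
#!/usr/bin/env python3
"""Warn when DataStage-related mounts cross a usage threshold."""
import shutil

# Placeholder mounts -- substitute the file systems you care about.
MOUNTS = ["/tmp", "/data/datastage/scratch", "/opt/IBM/InformationServer"]
THRESHOLD_PCT = 85

for mount in MOUNTS:
    usage = shutil.disk_usage(mount)
    pct = usage.used * 100 / usage.total
    if pct >= THRESHOLD_PCT:
        print(f"WARNING: {mount} is {pct:.0f}% full")
```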

Developers never think about deleting old and unnecessary datasets until there is a crisis. I like to routinely delete any dataset that is more than x (e.g. 30) days old.
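
Here's a sketch of that age-based sweep, assuming datasets are registered as .ds descriptor files under a known directory and that orchadmin is on the PATH (source dsenv first); the directory and the 30-day cutoff are placeholders. The point of going through orchadmin delete is that it removes the data segment files along with the descriptor, which a plain rm on the .ds file would leave behind:

```python
#!/usr/bin/env python3
"""Delete DataStage datasets whose descriptor files are older than N days."""
import glob
import os
import subprocess
import time

DATASET_DIR = "/data/datastage/datasets"  # placeholder path
MAX_AGE_DAYS = 30
DRY_RUN = True  # flip to False only after reviewing the output

cutoff = time.time() - MAX_AGE_DAYS * 86400

for ds in glob.glob(os.path.join(DATASET_DIR, "*.ds")):
    if os.path.getmtime(ds) < cutoff:
        if DRY_RUN:
            print("would delete:", ds)
        else:
            # orchadmin removes the data segments as well as the descriptor.
            subprocess.run(["orchadmin", "delete", ds], check=True)
```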

Mike